Python Grabber scripts

Reply #50 – 2010-03-15 22:00:28

Quote from: tore on 2010-03-15 21:34:54

I made something that works

Now that's a feeling

Reply #51 – 2010-03-15 22:14:22

Quote from: 2E7AH on 2010-03-15 22:00:28

Quote from: tore on 2010-03-15 21:34:54
I made something that works

Now that's a feeling

No, I've tested it on thousands of files already and it does work - at least on my computer, my foobar and my media library. Maybe not always, but most of the time. You should know the code well .. You wrote most of it after all.

edit:

By the way, to all who are interested in such scripts - it should also be possible to fetch information about artists from Wikipedia. It sometimes contains information, like a band's or artist's country of origin, which might not be available elsewhere. One obvious problem is that a search for "KISS", for example, will land you on an article about kissing and not the band. However, I found a blog entry which has a simple workaround for this problem and provides a Python code example. The solution is to use the MusicBrainz database to find the right Wikipedia article, as that link is stored in the entries there.

If any Python-savvy people want to look into it, check here: http://www.davidcraddock.net/2008/09/22/sc...artists-part-2/
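
A rough sketch of the second half of that idea: once an artist record has come back from MusicBrainz, pick the Wikipedia link out of it. It does no network access. The record layout (`relations`, `type`, `url`, `resource`) is an assumption about the JSON the MusicBrainz web service returns, not something taken from the blog entry, and `wikipedia_url` is a made-up helper name.

```python
def wikipedia_url(artist_record):
    """Return the first Wikipedia link in a MusicBrainz-style artist record, or None."""
    for relation in artist_record.get("relations", []):
        if relation.get("type") == "wikipedia":
            return relation.get("url", {}).get("resource")
    return None


# Hand-written sample record (assumed layout, not real API output).
sample = {
    "name": "Kiss",
    "relations": [
        {"type": "official homepage", "url": {"resource": "http://www.kissonline.com/"}},
        {"type": "wikipedia", "url": {"resource": "http://en.wikipedia.org/wiki/Kiss_(band)"}},
    ],
}

print(wikipedia_url(sample))
```

Because the link comes from the artist's own database entry rather than from a text search, the "KISS" versus "kissing" ambiguity never comes up.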

Reply #52 – 2010-03-15 23:26:05

For an example of what information could be pulled from Wikipedia/DBpedia, check out this page, using the band KISS as an example (again):

http://dbpedia.org/page/Kiss_%28band%29

Reply #53 – 2010-03-19 14:05:02

To anyone having a problem with the discogs 5000 requests limit

The usefulness of 2E7AH's discogs tagging scripts is somewhat limited because of the current 5000 requests limit per 24 hours from the discogs database. 5000 files is a rather small part of my library - and I need to request both genre and style, limiting it to 2500 files for both tags. Because of this, I tried to come up with a solution to this problem - and I have.

First, 2E7AH's style and genre scripts ask for information about the artist and album, but not track title. If you can tag one file from an album, you already know what the genre and style tags should be for the rest of the tracks on that album. If you could tag one file in an album and then copy those tags to the rest of the tracks in that album, you could tag 2500 albums (2500 requests for style + 2500 for genre) rather than 2500 songs.

The practical solution was to do a query (autoplaylist) in the library that shows only the first track of each album. Then I tagged as many of those as I could with Genre and Style from Discogs. Those that were successful, I gave a new tag, GSTAGGED. Then I rewrote one of 2E7AH's Python scripts into one that copies the tags from that track 1 file and applies them to the other files in the selection. The way it works is way too simple to be foolproof, so you have to use it with care. Every time it meets a file that has the GSTAGGED tag, it writes that file's Genre and Style into a .txt file for storage. When it meets a file which does not contain GSTAGGED, it tags it with the information stored in that text file.

Every time it meets a new GSTAGGED file, the text file is rewritten with the new info. In effect, it works perfectly as long as you tag a selection of albums where each album's first song has been tagged correctly by Discogs. If you tag two albums where the first album's first track is correctly tagged and none of the files of the second album are tagged (or more specifically, contain the GSTAGGED tag), then the second album will get the tags from the first album. This is why it's not foolproof, so you have to take a little care with your selection.
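
The copy logic described above can be sketched in a few lines of plain Python. This is not the actual grabber script: tracks are modelled as dictionaries, a variable stands in for the .txt file, and `propagate_tags` is a made-up name.

```python
def propagate_tags(tracks):
    """Copy genre/style from the most recent GSTAGGED track to the untagged ones after it."""
    stored = None  # stands in for the .txt file used for storage
    for track in tracks:
        if track.get("GSTAGGED"):
            # A tagged track: remember its values, replacing whatever was stored.
            stored = (track["genre"], track["style"])
        elif stored is not None:
            # An untagged track: give it whatever is currently stored.
            track["genre"], track["style"] = stored
    return tracks


tracks = [
    {"album": "A", "title": "A1", "GSTAGGED": "1", "genre": "Rock", "style": "Hard Rock"},
    {"album": "A", "title": "A2"},
    {"album": "B", "title": "B1"},  # first track of album B was never tagged
]
propagate_tags(tracks)
print(tracks[1]["genre"])  # copied from A1, as intended
print(tracks[2]["genre"])  # also copied from A1: the pitfall described above
```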

The output of the script is "%genre%//%style%", so if I write that to a new tag, e.g. COPYTAGS, I can use "automatically fill values" to split that tag into the correct Discogs genre / style tags.
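
For anyone who wants to do that split outside foobar, a minimal Python equivalent might look like this (`split_copytags` is a made-up name; inside foobar the "automatically fill values" pattern does the same job):

```python
def split_copytags(value):
    """Split a '%genre%//%style%' string into a (genre, style) pair."""
    genre, _, style = value.partition("//")
    return genre.strip(), style.strip()


print(split_copytags("Electronic//Techno"))
```

If the separator is missing, everything ends up in the genre part and the style part is empty.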

It still requires a little manual labour - making sure you get the right selection of files from your library that you apply the script on (although making a check against album shouldn't be hard even for me to incorporate), but that's about it. I don't wanna post the script here because it's pretty sloppy and I'm 95% sure a lot of those previously active in this thread can find a better way to handle this problem with less chance of getting your stuff tagged wrong. More, I'm posting this as a suggestion for how to address this problem. Maybe someone can refine my approach and make a better script or they could be inspired to find a different solution. Of course, if anyone really wants my modification of 2E7AH's script, just tell me here or send me a PM.
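
The check against album mentioned above could look something like this. Again only a sketch with tracks as plain dictionaries, not the real script: stored tags are applied only to tracks whose album matches the album they were read from.

```python
def propagate_tags_checked(tracks):
    """Like the simple copy approach, but never carries tags over to another album."""
    stored_album = None
    stored = None
    for track in tracks:
        if track.get("GSTAGGED"):
            stored_album = track["album"]
            stored = (track["genre"], track["style"])
        elif stored is not None and track["album"] == stored_album:
            track["genre"], track["style"] = stored
        # Tracks from any other album are left untouched.
    return tracks


checked = [
    {"album": "A", "title": "A1", "GSTAGGED": "1", "genre": "Rock", "style": "Hard Rock"},
    {"album": "A", "title": "A2"},
    {"album": "B", "title": "B1"},
]
propagate_tags_checked(checked)
print("genre" in checked[2])  # album B keeps its (missing) tags
```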

Reply #54 – 2010-03-19 17:52:49

I didn't bother with the 5000 limit on Discogs, which BTW, as you posted earlier, will be changed in a month or so, but I did for the AMG scripts (for obvious reasons):

Code:

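# tmp = album of the previous track, amg = the value fetched for that album
try:
    if album == tmp:
        # same album as before: reuse the value, no new request
        result.append(amg)
    else:
        # new album: remember it and clear the old value
        tmp = album
        amg = ''
        # (excerpt ends here; the rest of the try block is not shown)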

So feel free to upload your modification here as it will be also much faster this way
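
To show why keeping the last album in memory saves requests, here is a self-contained sketch of the same idea. `tag_tracks` and `fake_fetch` are made-up names, and the fake fetch function stands in for the real web request so the number of calls can be counted.

```python
def tag_tracks(tracks, fetch):
    """Call fetch(artist, album) once per run of consecutive tracks from the same album."""
    result = []
    last_album = None
    cached = ''
    for artist, album in tracks:
        if album != last_album:
            last_album = album
            try:
                cached = fetch(artist, album)
            except Exception:
                cached = ''  # a failed lookup leaves the whole album untagged
        result.append(cached)
    return result


calls = []

def fake_fetch(artist, album):
    # Stand-in for the real web request; records every call it receives.
    calls.append(album)
    return 'tags for ' + album


tracks = [('Kiss', 'Destroyer')] * 3 + [('Kiss', 'Love Gun')] * 2
print(tag_tracks(tracks, fake_fetch))
print(len(calls))  # number of requests made for five tracks
```

Note that this only helps when the tracks of an album arrive one after another, so sort the selection by album before running such a script.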

Another solution in such cases would be using a proxy in the foobar networking preferences, though it won't be faster, nor can you use proxy lists easily

Problem with transferring tag value from one album item to the rest seems interesting: I would have used text tools component and skip/duplicate line feature for outputting the result for additional processing

Reply #55 – 2010-03-20 01:03:21

Here's a modification of what I described above.

This Python script, a modification of 2E7AH's scripts, fetches Genre and Style from Discogs. When it meets a track from an album it has not processed before, it writes those tags to a text file. If it meets a track from the album it processed last, it does not check Discogs - rather, it just fetches the tags from those text files. In essence, it fetches Discogs genre and style while minimizing the number of requests. This will allow you to tag thousands more files per 24 hours. For temporary storage of information, it uses text files.

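My text files are in the folder C:\Python26 .. Edit the script and change the paths to whatever you want to use! Both files (!album.txt and !tag3.txt) have to exist before the first run, since the script reads them before it ever writes them.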

Code:

import sys, traceback
import urllib, urllib2, gzip, cStringIO
import xml.etree.ElementTree
from xml.dom import minidom
from encodings import utf_8
from grabber import LyricProviderBase

# Temporary storage: the album handled last and the tags found for it.
# Change these paths to whatever you want to use.
ALBUM_FILE = r"C:\Python26\!album.txt"
TAG_FILE = r"C:\Python26\!tag3.txt"

def read_text(path):
    with open(path, "r") as f:
        return f.read()

def write_text(path, text):
    with open(path, "w") as f:
        f.write(text)

def fetch_xml(url):
    # Ask discogs for gzip-compressed XML and return it unzipped
    request = urllib2.Request(url)
    request.add_header('Accept-Encoding', 'gzip')
    data = urllib2.urlopen(request).read()
    return gzip.GzipFile(fileobj = cStringIO.StringIO(data)).read()

def get_items(tree, list_name):
    # Text of every entry in the release's <genres> or <styles> list
    item_list = tree.find('release').find(list_name)
    if item_list is None:
        return []
    return [(i.text or '').encode('utf-8') for i in item_list]

class Discogs_GetGenre(LyricProviderBase):
    def GetName(self):
        return 'Discogs GenStyles'

    def GetVersion(self):
        return '0.1'

    def GetURL(self):
        return 'http://www.discogs.com'

    def Query(self, handles, status, abort):
        result = []
        api_key = '783001745d'

        for handle in handles:
            status.Advance()

            if abort.Aborting():
                return result

            artist = handle.Format("[%artist%]")
            album = handle.Format("[%album%]")
            last_album = read_text(ALBUM_FILE)

            try:
                if album == last_album:
                    # Same album as the previous track: reuse the stored tags
                    result.append(read_text(TAG_FILE))
                else:
                    # New album: search discogs and take the release id of the first hit
                    URL_s = 'http://www.discogs.com/search?type=all&q=' + artist.lower().replace(' ','+') + '+' + album.lower().replace(' ','+') + '&f=xml&api_key=' + api_key
                    res = minidom.parseString(fetch_xml(URL_s))
                    uri_1 = res.getElementsByTagName("uri")[0]
                    rel_id = uri_1.childNodes[0].data.encode('utf-8').rpartition('/')[2]

                    # Fetch that release and read its genres and styles
                    URL_r = 'http://www.discogs.com/release/' + rel_id + '?f=xml&api_key=' + api_key
                    tree = xml.etree.ElementTree.fromstring(fetch_xml(URL_r))
                    lyric = '; '.join(get_items(tree, 'genres') + ['//'] + get_items(tree, 'styles'))

                    # Store album and tags first, then hand the tags to foobar
                    write_text(ALBUM_FILE, album)
                    write_text(TAG_FILE, lyric)
                    result.append(lyric)
            except Exception:
                # Log the error, leave this track untagged and remember the album
                # so that its remaining tracks do not repeat the failed request
                traceback.print_exc(file=sys.stdout)
                result.append('')
                write_text(ALBUM_FILE, album)
                write_text(TAG_FILE, '')
                continue

        return result

if __name__ == "__main__":
    LyricProviderInstance = Discogs_GetGenre()

The output to the tag will be "%genre%; //; %style%". I don't know how to get rid of the semicolons just yet, but you can clean those up as you split the tag into genre and style tags.
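
One possible way to avoid the stray semicolons around the separator is to join the two lists separately and only then glue them together with "//". A small sketch, not part of the posted script (`format_tags` is a made-up name):

```python
def format_tags(genres, styles):
    """Build 'genre1; genre2//style1; style2' from two lists of strings."""
    return '; '.join(genres) + '//' + '; '.join(styles)


print(format_tags(['Rock'], ['Hard Rock', 'Glam']))
```

Multiple values inside genre or style are still separated by "; ", but the "//" no longer gets semicolons of its own.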

And yes, it's written quick and dirty. I haven't yet beautified it and the script probably won't look the same tomorrow.

edit :

I've tested it on some 5000 files and it works perfectly

Reply #56 – 2010-03-21 22:05:58

Is it possible to use these scripts with a foo_run-service and python directly? Similar to

python.exe AMG_Review.py "Artist Name" "Album Name" OutputFilePath.txt

Reply #57 – 2010-03-21 22:21:42

If you have Python, I guess you know something about it, so stripping the scripts to work outside foobar wouldn't be that hard
One benefit: outside foobar you can debug them properly, whereas with python grabber you only get the foobar console as your friend
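
As a starting point, a stand-alone wrapper taking the three arguments from the question could be shaped like this. It is only a skeleton: `lookup_review` is a placeholder where the fetching code from a grabber script's Query() would go, and nothing here talks to AMG.

```python
import os
import sys
import tempfile


def lookup_review(artist, album):
    # Placeholder: the fetching code from the grabber script would go here.
    return 'No review fetched for %s - %s' % (artist, album)


def main(argv):
    """Expects: script name, artist, album, output file path."""
    if len(argv) != 4:
        sys.stderr.write('usage: %s "Artist Name" "Album Name" OutputFilePath.txt\n' % argv[0])
        return 2
    artist, album, out_path = argv[1:]
    with open(out_path, 'w') as out:
        out.write(lookup_review(artist, album))
    return 0


# Demo call with a temporary output file instead of real command-line arguments;
# a real script would end with sys.exit(main(sys.argv)) instead.
demo_path = os.path.join(tempfile.mkdtemp(), 'review.txt')
print(main(['AMG_Review.py', 'Kiss', 'Destroyer', demo_path]))
```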

Why would you like to do that anyway?

Reply #58 – 2010-03-27 22:15:35

Quote from: 2E7AH on 2009-11-10 15:22:51

Enjoy
I didn't forget about the composer/performer conversation, I'll post that soon

Here is a masstagger script for cleaning the %amg% tag (Canar's version): just run it after the script (if %genre% and %style% should be preserved, delete the first two actions from the masstagger script):
[attachment=5484:AMG_release_MTS.rar]

Hey 2E7AH,

Did you ever manage to grab the composer/performer from AMG using Python Grabber?
On another note, it would be nice to be able to grab the track titles too...

Reply #59 – 2010-03-29 00:10:40

Are you really counting on that?

Script is there, and it's working reasonably - change it slightly for fetching custom info from the matched page:

- open some AMG release page source
- find regexp that matches info that you need (you'll need some tool for that)
- change the regexp string in the script (the second one)
- try it a couple of times while watching the foobar console
+ for composer or titles you'll need to do some more thinking, as that info is per track and the script works per album

Regexp is an easy, if not very efficient, method, but AFAIK you can't access the AMG web services unless you pay
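
To illustrate the regexp steps listed above, here is a tiny example in Python. The page fragment is invented for the purpose; real AMG markup looks different and changes over time, so both patterns would have to be adapted to the actual page source.

```python
import re

# Made-up page fragment, standing in for the source of a release page.
page = '<td class="label">Genre</td><td><a href="/g/1">Rock</a>, <a href="/g/2">Pop</a></td>'

# Step 1: isolate the table cell that follows the "Genre" label.
cell = re.search(r'Genre</td><td>(.*?)</td>', page, re.S)

# Step 2: pull the link texts out of that cell.
genres = re.findall(r'<a [^>]*>([^<]+)</a>', cell.group(1)) if cell else []
print(genres)
```

Matching in two steps keeps each pattern short, which makes it easier to repair when the site layout changes.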

I'll try to make some AMG script one of these days, that doesn't need python grabber (nor foobar?), but will see stars aspect first so that I don't get called like this again if I don't do it

BTW here are more correct masstagger scripts for the AMG release script (the originally posted ones had a glitch):
[attachment=5811:AMG_release_MTS.zip]

Reply #60 – 2010-03-31 21:13:22

I was thinking of writing a small program (evaluated VB or Python) and have it tag the files using dBpoweramp's OLE COM object.
But then I noticed that VB became a lot more complicated than what I remembered from 10 years ago and Python is still very hard to understand for me.
Anyway, I haven't had the time in the last few months to really try to get something going, but I should be able to start this summer.

Oh about getting track and composer info, would it be possible to tag all the info into a structured tag on the first file of the album?
For example putting this info in a tag:
<1>Title 1
<2>Title 2
<3>Title 3

And then use a masstagger script to tag the other files with the correct data.
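
Reading such a structured tag back is straightforward. A small Python sketch of the parsing side, assuming one `<number>Title` entry per line as in the example above (`parse_titles` is a made-up name):

```python
import re


def parse_titles(tag_value):
    """Map track number -> title from lines such as '<1>Title 1'."""
    titles = {}
    for line in tag_value.splitlines():
        match = re.match(r'<(\d+)>(.*)', line.strip())
        if match:
            titles[int(match.group(1))] = match.group(2).strip()
    return titles


print(parse_titles('<1>Title 1\n<2>Title 2\n<3>Title 3'))
```

Each remaining file would then look up its own track number in the result.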

Oh I have a question for you, you mentioned this:

Quote

- find regexp that matches info that you need (you'll need some tool for that)

I wrote some regexp expressions a few years back in order to do the exact same thing in The Godfather, but I did it from scratch. You mentioned a tool, is there an easier more automated way?

Oh just found this site, pretty useful!

Reply #61 – 2010-04-01 03:15:34

I'm just working on the script, and it's almost finished. I need to test it a bit and compile it today or tomorrow, then I'll post it in another thread. It fetches everything possible, even the song review if it exists, and it's loaded with regexp. As for which tool, I found RegexBuddy most comfortable, but as it's not free, there are some other tools like Regulator, Kodos, Kiki... that are useful too

Reply #62 – 2010-04-04 22:15:52

Quote from: 2E7AH on 2010-04-01 03:15:34

I'm just working on the script, and it's almost finished. I need to test it a bit and compile it today or tomorrow, then I'll post it in another thread. It fetches everything possible, even the song review if it exists, and it's loaded with regexp. As for which tool, I found RegexBuddy most comfortable, but as it's not free, there are some other tools like Regulator, Kodos, Kiki... that are useful too

Cool! Looking forward to testing it!!
Once you release it, I'll probably try to adapt it for the classical albums too. (They're very different from the standard ones, but the data they contain is really nice)

Reply #63 – 2010-08-06 00:55:44

Does anyone still have 2E7AH's scripts for LyricsTXT, LyricsTime, and Songmeanings? The attachments seem to be broken. Thanks.

Reply #64 – 2010-08-07 21:03:18

Found a Lyrics-site with API: ChartLyrics

Here is ChartLyrics.py for foo_grabber_python + foo_lyricsgrabber:

Many false results...
but better than no results.

EDIT: (Upload failed. You are not permitted to upload this type of file) Renamed to ChartLyrics.txt. Hope that works.

Reply #65 – 2010-08-09 20:14:29

Quote from: grimes on 2010-08-07 21:03:18

Found a Lyrics-site with API: ChartLyrics

Here is ChartLyrics.py for foo_grabber_python + foo_lyricsgrabber:

Many false results...
but better than no results.

EDIT: (Upload failed. You are not permitted to upload this type of file) Renamed to ChartLyrics.txt. Hope that works.

This works, but like you said, there are many false results. Maybe instead of using SearchLyricDirect, you can use SearchLyric to obtain the lyricId and lyricChecksum, and while parsing that file do a string compare between the artists/titles to ensure that you're pulling the lyrics to the right song before calling GetLyric. That should eliminate the false results, right? Otherwise, nice find. They have a decent selection.
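
The string compare suggested above might be sketched like this. The search results are modelled as a list of dictionaries; the key names (`LyricId`, `LyricChecksum`, `Artist`, `Song`) are an assumption about what SearchLyric returns, and `normalise` / `pick_match` are made-up helper names. No request is sent here.

```python
def normalise(text):
    """Lower-case and drop everything except letters and digits."""
    return ''.join(ch for ch in text.lower() if ch.isalnum())


def pick_match(results, artist, title):
    """Return the first search result whose artist and title both match, or None."""
    for item in results:
        if (normalise(item.get('Artist', '')) == normalise(artist)
                and normalise(item.get('Song', '')) == normalise(title)):
            return item
    return None


# Hand-written stand-in for parsed SearchLyric results.
results = [
    {'LyricId': 1, 'LyricChecksum': 'aaa', 'Artist': 'Kiss', 'Song': 'Beth (Live)'},
    {'LyricId': 2, 'LyricChecksum': 'bbb', 'Artist': 'KISS', 'Song': 'Beth'},
]
match = pick_match(results, 'Kiss', 'Beth')
print(match['LyricId'] if match else 'no match')
```

Only when a match is found would the script go on to call GetLyric with that entry's id and checksum; otherwise it returns nothing rather than a wrong lyric.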

Reply #66 – 2010-10-06 13:06:54

For all that are interested...I am coding in PHP an AMG extractor. It currently is pretty basic but is a bit bloated in its coding. It is, though, able to successfully extract Genre, Styles, Moods & Themes and present them in a usable format without issue, if it is given the URL to search. I unfortunately do not have any experience in Python, but it is better than nothing

Reply #67 – 2010-10-23 23:05:31

Quote from: Beta4Me on 2010-10-06 13:06:54

For all that are interested...I am coding in PHP an AMG extractor. It currently is pretty basic but is a bit bloated in its coding. It is, though, able to successfully extract Genre, Styles, Moods & Themes and present them in a usable format without issue, if it is given the URL to search. I unfortunately do not have any experience in Python, but it is better than nothing

I would be very interested Beta... By the way, has anyone noticed that 2E7AH's AMG scripts have stopped working?

Reply #68 – 2010-10-24 00:41:16

yep. amg scripts not working here either

Reply #69 – 2010-10-24 06:38:49

Quote from: tberman333 on 2010-10-23 23:05:31

Quote from: Beta4Me on 2010-10-06 13:06:54
For all that are interested...I am coding in PHP an AMG extractor. It currently is pretty basic but is a bit bloated in its coding. It is, though, able to successfully extract Genre, Styles, Moods & Themes and present them in a usable format without issue, if it is given the URL to search. I unfortunately do not have any experience in Python, but it is better than nothing

I would be very interested Beta...

Sometime in December I'll probably get them done and put them to public testing, as I'm busy with exams (and then schoolies) until the end of November.

Reply #70 – 2010-10-24 09:41:43

Quote from: lo-fi on 2010-10-24 00:41:16

yep. amg scripts not working here either

It seems that the search URL on AMG has changed, but I didn't manage to get the new one right

Reply #71 – 2010-10-25 05:54:57

Since most of the script links seem to be broken (looks like I'm too late), wouldn't it be a better idea to upload them to services like pastebin? I was looking for the AMG ones, if someone can share them, but if they are not working then there's no point in it. Thanks.

Reply #72 – 2010-10-26 02:47:51

How did the original python grabber scripts work? Did you have to input the AMG ID or did it scan files and search and find everything for you etc.? Also, what tags did it pull from allmusic?

Reply #73 – 2010-10-26 02:53:18

AMG redesigned their site and search methods mid-October.

that's why everything stopped working.

i haven't messed with the python scripts yet, but the mp3tag scripts were completely broken by the HTML redesign.

i rewrote the mp3tag tag scripts to match the new html, it probably wouldn't be that difficult to re-do the python scripts.

Reply #74 – 2010-10-26 03:59:30

Quote from: mrinferno on 2010-10-26 02:53:18

AMG redesigned their site and search methods mid-October.

that's why everything stopped working.

i haven't messed with the python scripts yet, but the mp3tag scripts were completely broken by the HTML redesign.

i rewrote the mp3tag tag scripts to match the new html, it probably wouldn't be that difficult to re-do the python scripts.

I understand that. My PHP one works perfectly with the new design.
What I'm trying to do, is understand how the original Python script worked: "Did you have to input the AMG ID or did it scan files and search and find everything for you etc.? Also, what tags did it pull from allmusic?"
