
Scraping script/component ?

Hi everybody,

As I said in a previous topic ( https://hydrogenaud.io/index.php/topic,116084.new.html#new ), I'm using foobar as a tag manager for video files (with the m-TAGS component), and so far it works amazingly well  :D I will gladly share my whole config once it's finished.

Foobar has great tagging capabilities and ease of use, which make it better IMHO than other "fancier" software for database management purposes (and believe me, I have tried LOTS of "video managers", free and not free - most of them are just eye-candy crap and/or do not fit my needs - I'm an old foobar user so I guess I have my own foobar habits, like using foobar's query language to make complex queries ;) )

I now have a personal database of about 1000 movies, partially tagged. Every movie has proper %title% and %date% tags, plus other misc info.

Now I'd like to add some other info, such as %genre% (multi-tag field), %director%, %actor% (multi-tag field), and so on. But doing it by hand will be long and painful. Especially since such info is already available online !

Most video software is able to fetch info from huge online databases like TMDB. foobar can already do the same for audio files. So would it be possible to do the same for video files ?

I guess it would only take a small script/component, where we would choose the tag fields that we want to fill, then everything would be fetched from TMDB and added to the tags automatically. The script/component would simply use the already existing %title% and %date% info from each file, and then fetch the missing info.
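To make the first half of that idea concrete, here is a minimal Python sketch of turning the existing %title% and %date% values into a TMDB search request. It only builds the URL and makes no network call. The endpoint and the `api_key`, `query` and `year` parameters are written from memory of TMDB's v3 API and should be checked against the official documentation; the function names are made up for illustration.

```python
import re
from urllib.parse import urlencode

# Assumed TMDB v3 movie search endpoint (verify against the official docs).
TMDB_SEARCH_URL = "https://api.themoviedb.org/3/search/movie"


def extract_year(date_tag):
    """Return the first 4-digit run found in a %date% value, or None."""
    match = re.search(r"\d{4}", date_tag or "")
    return match.group(0) if match else None


def build_search_url(title, date_tag, api_key):
    """Build a search URL from the %title% and %date% tags of one file."""
    params = {"api_key": api_key, "query": title}
    year = extract_year(date_tag)
    if year:
        # Narrow the search to the release year when %date% provides one.
        params["year"] = year
    return TMDB_SEARCH_URL + "?" + urlencode(params)
```

The year is optional on purpose: a file with an empty or malformed %date% still produces a usable title-only search.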

Could somebody help me achieve this purpose ? Many thanks in advance !  :D

Re: Scraping script/component ?

Reply #1
Here's a screenshot of the work in progress. Take some time to analyze it and you'll understand its potential.  ;)
Lots of things are automated, including colors, autoplaylists, forced sorting, and even some masstagger scripts for automatic and proper naming of movies, featurettes, etc.

foobar's advanced tagging and filtering capabilities allow for some unique things, that most video management software just can't match. All that's missing is a scraping script/component, which should be feasible : we can already fetch audio tags and lyrics, so why not fetch video tags ?  :)

Re: Scraping script/component ?

Reply #2
I did something similar to this in the past as well but didn't continue developing it because my focus is more on music than on movies. I did however use Columns UI and ELPlaylist, and configured the double-click action of the playlist viewer to open VLC and play the movie using foo_run.
About your initial question, I'm guessing you're talking about IMDB? I'm not sure that's possible since, as far as I'm aware, IMDB doesn't have a public API. So if you mean an FB component that fetches information the way WilB's biography component fetches information from Allmusic and Last.fm, I doubt this is possible with no public API available.
Scraping the website is another option but probably not ideal, and I've never seen an FB component that does that.
A simple, fast option could be to tag each movie with its IMDB identifier (the number in the URL of the IMDB movie page). With this you can quite easily create a button and, using foo_run, create an action that takes you to whatever page/subpage of a movie you want. The info won't be available as tags inside foobar, but it would be a good alternative to nothing.
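As a rough illustration of that last suggestion, the sketch below builds an IMDB page URL from a stored identifier. The `https://www.imdb.com/title/tt…/` pattern is the one IMDB uses at the time of writing and may change; the function name and the idea of keeping the identifier in a custom tag are assumptions, not something from an existing component.

```python
def imdb_url(imdb_id, subpage=""):
    """Build an IMDB title URL from an identifier such as '0133093' or 'tt0133093'.

    `subpage` is an optional path segment, e.g. 'fullcredits'.
    """
    imdb_id = imdb_id.strip()
    if not imdb_id.startswith("tt"):
        # Accept the bare number as well as the full 'tt' form.
        imdb_id = "tt" + imdb_id
    url = "https://www.imdb.com/title/" + imdb_id + "/"
    if subpage:
        url += subpage.strip("/") + "/"
    return url
```

A foo_run action could pass the tag value to a script like this, or simply embed the same URL pattern directly in the command it runs.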


Re: Scraping script/component ?

Reply #4
I wasn't familiar with TMDB at all, hence I thought you were talking about IMDB. TMDB seems to be smaller than IMDB but it does indeed have a public API. Now all you need is someone to code a component for you. Good luck.

Re: Scraping script/component ?

Reply #5
TMDB has become a standard. Some huge media centers like Plex use it by default instead of IMDB.

I hope somebody will be interested in helping me with this. There are much weirder/useless components after all.

Again, I am willing to share my whole config once it's finished, to make this a working "proof of concept" along with the scraping script/component.

Re: Scraping script/component ?

Reply #6
I doubt you'll get much response; similar topics about using foobar as a movie media manager have sprung up in the past without much coming out of them. Foobar simply isn't meant for what you're using it for in this context. It makes sense Plex doesn't use IMDB, no API remember.

Re: Scraping script/component ?

Reply #7
jazzthieve, I really appreciate you trying to help, but if you're not a developer then I guess you can't help any further here. Thank you.
As for Plex, please read me again : I said TMDB, not IMDB.

foobar may not be meant for video, but a few days ago @Peter himself approved a feature request of mine for reading codec information in video files : https://hydrogenaud.io/index.php/topic,116084.0.html
foobar is an audio player, but it's also a wonderful tag manager. That is the part I intend to use. The fact that I want to use it for managing tags of video files instead of audio files is irrelevant.

Now I need the help of somebody willing to develop such a script/component. It's really not a big task : input (%title% and %date%) ---> TMDB API ---> output (all required tag fields). That's all.
@bubbleguuum , @WilB , @ohyeah , you guys have developed similar fetching components. Maybe one of them could be adapted for this purpose. What do you think ?
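For the output end of that pipeline, here is a hedged Python sketch that maps an already-fetched TMDB movie record to the tag fields mentioned in this thread. It assumes the record is the JSON of a v3 movie details request made with `append_to_response=credits`, with `genres`, `credits.cast` and `credits.crew` entries shaped as I remember them; those field names should be verified against the TMDB documentation. `tmdb_to_tags` and `max_actors` are hypothetical names, and the sample record is hand-written placeholder data, not real API output.

```python
def tmdb_to_tags(details, max_actors=5):
    """Map a TMDB movie details dict to multi-value tag fields."""
    credits = details.get("credits", {})
    genres = [g["name"] for g in details.get("genres", [])]
    # Crew entries are assumed to carry a "job" field; keep only directors.
    directors = [c["name"] for c in credits.get("crew", [])
                 if c.get("job") == "Director"]
    # Cast is assumed to be listed in billing order; keep the first few.
    actors = [c["name"] for c in credits.get("cast", [])][:max_actors]
    return {"genre": genres, "director": directors, "actor": actors}


# Hand-written placeholder record for illustration only.
sample = {
    "genres": [{"id": 1, "name": "Genre A"}, {"id": 2, "name": "Genre B"}],
    "credits": {
        "cast": [{"name": "Actor One"}, {"name": "Actor Two"}],
        "crew": [{"job": "Director", "name": "Director One"},
                 {"job": "Editor", "name": "Editor One"}],
    },
}
tags = tmdb_to_tags(sample, max_actors=1)
```

Each value is a list so that multi-tag fields such as %genre% and %actor% can be written as separate values. Writing them into the files is the part that still needs a foobar component or an external tagger.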

Re: Scraping script/component ?

Reply #8
Ok, as I said....good luck.

Btw, you clearly never looked into the code of existing components if you can brush off scripting as "really not a big task". Just take a look at WilB's Biography JScript and see how "not a big task" it looks.
It would also behoove you to show some respect to the only person willing to at least reply and help you as best I can. But I guess I'm not a dev so I shouldn't be replying anymore. I gave you some pointers at the beginning, things you can accomplish yourself, simply because I once built the exact same thing you're building. I also had some alternative options in mind, using Python web scraping and Mp3tag to import text into tag values. But it seems you want devs to do the work for you and only want dev input.

Yeah, I knew you said TMDB the first time and as I explained I was talking about IMDB in my reply. Jeez read me again.

Re: Scraping script/component ?

Reply #9
Please @jazzthieve , stop hijacking my thread and go elsewhere. Thank you.
Also, stop lecturing me about development. I have looked at many scripts, and it's all relative : it's not a big task when compared to much bigger tasks.
Actually stop lecturing me about anything. Heck, you didn't even know what TMDB was 24 hours ago. Lol.
I thank you for your good intentions, despite the fact that they're useless here. Now please just go (I said 'please' several times, and 'thank you' several times). Thanks again.

 