Audio identification task / question

I am trying to identify a batch of unidentified tracks (about 100 of them, mp3s). None of these are "Shazam-able", but I suspect many of them come from a certain YouTube channel. This channel has about 6,000 videos, all with artist and title info (the videos are just uploaded audio tracks with a still image for video). I have downloaded the 6,000 or so YouTube tracks as .mp3, keeping the artist and title info.

I am trying to match as many of the 100 untitled mp3s as possible to one of the 6,000 titled mp3s (and hence identify them).

I have had some limited success by sorting all 6,000 + 100 mp3s by track length. I then find one of the untitled mp3s in the list and listen to the titled mp3s that have the same track length (or are within +/- 3 seconds) to see if they match.
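The shortlisting part of this could be scripted so that only the listening is left to do by hand. Below is a minimal sketch in Python, assuming the track lengths (in seconds) have already been read into two dictionaries; the function name, the file names and the durations are made up for illustration, and reading the lengths from the mp3 files would need a separate tagging library.

```python
from bisect import bisect_left, bisect_right


def candidates_by_duration(unknown, titled, tolerance=3.0):
    """Map each untitled track to the titled tracks within `tolerance` seconds of its length."""
    # Sort the titled tracks by length once, so each lookup is a binary search.
    ordered = sorted(titled.items(), key=lambda item: item[1])
    durations = [length for _, length in ordered]

    result = {}
    for name, length in unknown.items():
        lo = bisect_left(durations, length - tolerance)
        hi = bisect_right(durations, length + tolerance)
        # List the closest lengths first, since those are the likeliest matches.
        nearby = sorted(ordered[lo:hi], key=lambda item: abs(item[1] - length))
        result[name] = [title for title, _ in nearby]
    return result


# Hypothetical example data: track name -> length in seconds.
titled = {
    "Artist A - Song 1": 201.0,
    "Artist B - Song 2": 204.5,
    "Artist C - Song 3": 260.0,
}
unknown = {"untitled_07.mp3": 203.0}

print(candidates_by_duration(unknown, titled))
```

This only narrows down what to listen to. Tracks of the same length can still be different songs, and an untitled mp3 that was trimmed or padded by more than the tolerance will be missed.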

Finally, my question: is there a tool out there that can do this automatically, either identifying the tracks outright or rating them by similarity to help identify the rest? This would of course have to be based on how the tracks sound, so I imagine it would involve some kind of audio fingerprinting algorithm.


Re: Audio identification task / question

Reply #2
If you can spend the time/space/resources on doing the acoustic ID yourself, an accessible project like https://github.com/worldveil/dejavu might give you a head start. From the README.md, it seems that you should be able to use their code directly, with minimal changes, for your exact use case. Warning: this project will create on the order of 10 MB of fingerprint information per song, which does not seem overly optimised.
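For a sense of scale, here is a back-of-the-envelope estimate in Python. It assumes the rough 10 MB figure holds for every track and that only the 6,000 titled tracks need to be fingerprinted and stored (the 100 untitled ones would only be queried against them). The function name is made up for illustration, and the real figure will depend on track length and settings.

```python
def fingerprint_storage_gb(track_count, mb_per_track=10.0):
    """Rough fingerprint storage estimate in GB (1 GB taken as 1000 MB)."""
    return track_count * mb_per_track / 1000


# 6,000 titled tracks at roughly 10 MB of fingerprint data each.
print(f"~{fingerprint_storage_gb(6000):.0f} GB")
```

So plan for something in the region of 60 GB of database space before starting.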

 

Re: Audio identification task / question

Reply #3
This is just what I was looking for. Thanks a lot!