HydrogenAudio

Topic started by: musicmusic on 2008-08-08 15:21:03

Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-08 15:21:03
I found an old copy of libFooID and decided to try and put it to some use. How is another question..

Anyway I came up with a little component that allows you to calculate FooIDs and save them to the file, as well as compare the FooID of two files. Some downloads below:

FooID binary (put this in your foobar2000 folder) (http://yuo.be/download/9EE08622-2AE2-42d3-A85F-31BB45DF450F/FooID.7z)
foobar2000 component version 0.3 (experimental) (http://yuo.be/download/9EE08622-2AE2-42d3-A85F-31BB45DF450F/foo_biometric-0.3.7z)
foobar2000 component version 0.2 (experimental) (http://yuo.be/download/9EE08622-2AE2-42d3-A85F-31BB45DF450F/foo_biometric-0.2.7z)


libFooID source (http://yuo.be/download/9EE08622-2AE2-42d3-A85F-31BB45DF450F/libfooid-src.7z)

Update: Garf has restored the Foosic homepage (http://foosic.org) now

I am not sure how this is useful.. Maybe finding matches within a set of files would be more useful (?) If someone actually downloads this have fun : P
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-08 23:22:40
Eager to try this. I've tried to get this working, but I can't. I've put the FooID folder in foobar2000 as specified and I slipped foo_biometric into components. Upon starting Foobar, I get an error:
Code:
Failed to load DLL: foo_biometric.dll
Reason: This component is missing a required dependency, or was made for different version of foobar2000.

I'm using fb2k 0.9.5.4. Do I need to do something with the source code?

As for a suggestion: Would it be possible to generate an ID (As I suppose FooID means fingerprint) from the fingerprints so we can use other search utilities to search for matches as well?

It would also be useful to compare one file against, or compare fingerprints within, a set of files.

EDIT: Never mind the error. Merely a misunderstanding.
Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-08 23:47:45
Hello,

Check that FooID.dll is in the same folder as foobar2000.exe. (OK, seems you have solved this now.)

If you run the save command it will save the fingerprint in the FINGERPRINT_FOOID field. But to compare two fingerprints you need to interpret the data, so plain text searching won't achieve anything. So maybe some APIs would be useful if anyone would actually use them

If you save the fingerprints first the compare command will use those instead of recalculating the fingerprint (that command doesn't save them).

Agree on the usefulness of comparing one file against a set or finding matches within a set.. The main obstacles are probably creating the UI for these things
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-09 01:40:51
I doubt it, but would it be possible to compare fingerprints and find songs with similar acoustic characteristics? Kind of like MusicIP's Mixer (earlier Music Magic Mixer), although I reckon they use a technique other than fingerprint information for this.
Title: Song fingerprinting tools
Post by: foosion on 2008-08-09 11:40:50
I doubt it, but would it be possible to compare fingerprints and find songs with similar acoustic characteristics?
The libFooID documentation contains information on how to do this (see copy of FooID documentation on archive.org (http://web.archive.org/web/20070602230635/www.foosic.org/libfooid.php)). Note that this is intended to identify different encodings or maybe even performances of the same song, it is not designed to be able to - say - correlate two songs containing violins.
Title: Song fingerprinting tools
Post by: hesher on 2008-08-10 06:59:36
This is genius! Where can I find more information about it? I'd love to be able to "fingerprint" my mp3 library! (I know I have duplicates of a lot of files.)
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-10 13:12:43
I've tried running it to compare some files and I'm surprised that some of my acoustic vs. non-acoustic performances get a rather high score, despite notable differences in drums, electric guitar and some audience (3 Doors Down's Kryptonite 3:55 vs. Kryptonite (acoustic) 3:59 = 35%).

But I keep wondering - would it be possible to have a slider of some sort to change thresholds to, say, include less similar live/acoustic/instrumental performances?
Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-10 23:54:35
This is genius! Where can I find more information about it? I'd love to be able to "fingerprint" my mp3 library! (I know I have duplicates of a lot of files.)
Hi.. if you want to know more about FooID foosion's link can tell you more, or a little bit of general information is available on Wikipedia (http://en.wikipedia.org/wiki/Acoustic_fingerprint).

Currently you can only compare two files at a time with the component.. something for multiple files will be added later.

I've tried running it to compare some files and I'm surprised that some of my acoustic vs. non-acoustic performances get a rather high score, despite notable differences in drums, electric guitar and some audience (3 Doors Down's Kryptonite 3:55 vs. Kryptonite (acoustic) 3:59 = 35%).
I can get about 33% on different songs by the same artist, although they are slightly similar. But anyway I am not sure how significant or useful these low scores are.. testing on many files may give some answers.

But I keep wondering - would it be possible to have a slider of some sort to change thresholds to, say, include less similar live/acoustic/instrumental performances?
For the two-file compare it is not particularly useful, but for multi-file commands, yes, I was probably going to do that, as it would only return tracks within the set thresholds. As for whether changing the thresholds would have any useful effect, that is another question.. For reference, the levels are currently set at:

no: 0 <= i < 50
maybe: 50 <= i < 75
yes: 75 <= i <= 100

where i is the given percentage
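
As a rough illustration of the table above, the levels could be written as a small Python function. This is only a sketch: the function name `match_level` is made up for the example and is not taken from the component's source.

```python
def match_level(percent):
    """Map a match percentage (0-100) to the level names in the table above."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent < 50:
        return "no"      # 0 <= i < 50
    if percent < 75:
        return "maybe"   # 50 <= i < 75
    return "yes"         # 75 <= i <= 100


for score in (35, 50, 74.9, 75, 100):
    print(score, match_level(score))
```

A user-adjustable slider would amount to replacing the fixed 50 and 75 with parameters.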
Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-14 23:55:38
Version 0.2 released: With the new multiple file mode I was able to do some more testing. My observations about match levels are below:

75-100%: Correct matches or slight variations of the same track (extra featured artist or other slight difference in vocals)
50-75%: Generally different versions of the same track (extra featured artist etc.), and a few correct positives.
0-50%: matches don't seem to be of any useful significance
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-15 11:07:36
I tried your new version and like the new mass comparison. I've noticed, though, that every match will show up twice. For instance, "Track a" will match "Track b", but at the same time "Track b" will match "Track a". Is this behavior intended?
Title: Song fingerprinting tools
Post by: Garf on 2008-08-15 11:23:11
Hey,

great that someone found a nice use for libFooID.

I will try to bring back the libFooID site. The fingerprint server probably will not come back. The resource usage on the server is just too big.
Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-15 12:38:20
I tried your new version and like the new mass comparison. I've noticed, though, that every match will show up twice. For instance, "Track a" will match "Track b", but at the same time "Track b" will match "Track a". Is this behavior intended?
It's kind of intended.. Song b may match some other songs that song a does not. It could group them all together but it somewhat reduces the quality of information provided, so I have not.
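
For anyone who does want each pair listed only once, a minimal sketch is to key every result on the unordered pair. This is not the component's code; the `matches` list and its `(track, other_track, score)` layout are assumptions made up for the example.

```python
def unique_pairs(matches):
    """Collapse (a, b, score) and (b, a, score) into one entry per unordered pair."""
    seen = {}
    for a, b, score in matches:
        key = frozenset((a, b))
        # Keep the first entry seen for the pair; both directions are
        # assumed to carry the same score.
        seen.setdefault(key, (a, b, score))
    return list(seen.values())


matches = [
    ("Track a", "Track b", 92.0),
    ("Track b", "Track a", 92.0),
    ("Track b", "Track c", 81.5),
]
print(unique_pairs(matches))
```

The cost is the one described above: the per-track view (everything that song b matches) is no longer visible at a glance.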

Hey,

great that someone found a nice use for libFooID.

I will try to bring back the libFooID site. The fingerprint server probably will not come back. The resource usage on the server is just too big.
Hi Garf. Yes, I surprised myself and found a few duplicates in my library. Yes, it would be nice to get the libFooID site back.

Whilst I was looking around on the net, I found some discussion of an oddity in regress.c. On line 66, it does this test:
Code:
if (ssyy <= 0.0f + EPSILON)
and sets *r, but it then divides by ssyy and sets *r again. Basically, the test does nothing. It seemed there was a missing else statement so I changed that (in the downloads above also), but it would be nice if you could confirm..
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-15 15:02:56
After trying to remove some duplicates, I have a suggestion. The window notifying the user about duplicates isn't really useful, as you can only go to one file at a time. I see how this is useful, but it doesn't allow the user to compare the files himself on a grand scale. How about making a "duplicate-sorted" playlist with the results?
Title: Song fingerprinting tools
Post by: musicmusic on 2008-08-15 15:16:46
What about being able to play the tracks directly from the results window (double click) instead? That I will add anyway. Maybe a delete command on the shortcut menu as well.

Otherwise yes I could add a button to flatten the tree into a playlist, but the resultant playlist may be slightly confusing.
Title: Song fingerprinting tools
Post by: MartDann on 2008-08-17 19:24:58
I think all of your suggestions could be very useful.
For example: I could make a run over my complete music library, put all doubles in a playlist and check this playlist later when I have time.
Title: Song fingerprinting tools
Post by: mpioner on 2008-08-26 13:36:54
musicmusic
Maybe you can make a local version of foosic.org?
Title: Song fingerprinting tools
Post by: Nemphael on 2008-08-26 14:26:48
With the ability to generate FooID fingerprints and possibly have a local Foosic database, what about making a database component to pick up where foo_custominfo left off? It's a component I really wish would see further development. FooIDs should allow the database to match up files and data, which could allow, for instance, play count updates across multiple copies of a song (released on multiple albums?), or following songs moved with Explorer or similar.
Title: Song fingerprinting tools
Post by: 2E7AH on 2008-09-01 02:27:36
so, this component is only for finding duplicates among songs.

even if i set min threshold to 25, i only get songs which have a similar length (though not all tracks of similar length), and some of them have no audio similarity at all!?

i don't know the contents of the fingerprint data (424 hex numbers; the 3rd, 4th and 5th are the length, i guess), nor the "algorithms" from foosion's link or the apparent limitation to song length, but isn't there a way to code an option for finding similar tracks in terms of audio (music) content, or something other than just looking for duplicates?

no disrespect to the work on this component intended, but for most duplicates search scripts are sufficient (and faster).
Title: Song fingerprinting tools
Post by: odyssey on 2008-10-19 23:34:24
After trying to remove some duplicates, I have a suggestion. The window notifying the user about duplicates isn't really useful, as you can only go to one file at a time. I see how this is useful, but it doesn't allow the user to compare the files himself on a grand scale. How about making a "duplicate-sorted" playlist with the results?

I second this request, simply because I don't wanna delete all duplicates, but maybe I'll mark a track or do other stuff with it.
Title: Song fingerprinting tools
Post by: musicmusic on 2008-10-20 17:06:04
even if i set min threshold to 25
You are not supposed to set it to 25 (for anything other than testing). See my table in post #8.

no disrespect to the work on this component intended, but for most duplicates search scripts are sufficient (and faster).
I had a few duplicates under different song titles - how are you going to find that with a text search ?! Fingerprinting has other potential uses also which aren't exploited here. Maybe you don't find it useful but at least I do
Title: Song fingerprinting tools
Post by: odyssey on 2008-10-20 20:50:09
no disrespect to the work on this component intended, but for most duplicates search scripts are sufficient (and faster).
I had a few duplicates under different song titles - how are you going to find that with a text search ?!

I had planned to use php's similar_text() to find duplicates. But still, a fingerprint is more reliable.
Title: Song fingerprinting tools
Post by: musicmusic on 2008-10-20 22:25:34
I meant completely different.
Title: Song fingerprinting tools
Post by: 2E7AH on 2008-10-21 06:31:19
i've read the posts, but i was playing with the component so i post some of that

I had a few duplicates under different song titles - how are you going to find that with a text search ?!
but for most duplicates search scripts are sufficient (and faster).
my library and my incomes are quite neat, so it's subjective matter anyway.
fingerprinting is time consuming.

perhaps i should have asked only my main question: isn't there a way to code an option for finding similar tracks in terms of audio (music) content, or something else (not depending on the track's length)?
i've read the posts #4 & #5

Fingerprinting has other potential uses also which aren't exploited here.
can you explain, if it isn't too much trouble?
Title: Song fingerprinting tools
Post by: Nemphael on 2008-10-21 12:10:39
perhaps i should have asked only my main question: isn't there a way to code an option for finding similar tracks in terms of audio (music) content, or something else (not depending on the track's length)?
i've read the posts #4 & #5


This is what foo_biometric does - it creates fingerprints of audio data and compares them. It has a quick match of +-30 seconds, I believe (Could be extended, could it not? Any benefits from this, you think?), as it intends to identify duplicate tracks rather than to extract certain elements. If I'm not mistaken, foo_biometric is built from FingerPrint.java, which to me looks more like a duplicate finder than an audio comparator.
Title: Song fingerprinting tools
Post by: OCedHrt on 2008-12-12 15:28:37
Hey,

great that someone found a nice use for libFooID.

I will try to bring back the libFooID site. The fingerprint server probably will not come back. The resource usage on the server is just too big.


What kind of load are we looking at?
Title: Song fingerprinting tools
Post by: odyssey on 2008-12-15 10:25:10
musicmusic: Would you consider an update that improves the usability of matched dupes? I.e. putting the dupes into a playlist?
Title: Song fingerprinting tools
Post by: Nemphael on 2008-12-15 10:29:58
Will you consider checking one file against a dataset? For instance, I'd like to know if I have any duplicates of file A in ABCADBB.
Title: Song fingerprinting tools
Post by: nilesr on 2009-01-15 18:03:05
I was wondering if anyone had a copy of the matching.zip file from the foosic website (now down). I have downloaded the one from archive.org, but it was reported as an invalid archive when I tried to unzip it (I tried two different zip programs; the other archives unzipped fine).
I have been playing with libfooid and would like to play with the matching portion of the library.

I have hosted a couple of websites for years and would be more than happy to put up a mirror of the old foosic site, minus the database portion, for people looking for the code and descriptions later on.

Niles
Title: Song fingerprinting tools
Post by: Garf on 2009-01-15 23:15:32
I restored the relevant part of http://foosic.org (http://foosic.org)
Title: Song fingerprinting tools
Post by: nilesr on 2009-01-19 03:21:58
I restored the relevant part of http://foosic.org (http://foosic.org)



Great thanks.
Title: Song fingerprinting tools
Post by: sabelosimelane on 2009-02-24 10:16:30
I think all of your suggestions could be very useful.
For example: I could make a run over my complete music library, put all doubles in a playlist and check this playlist later when I have time.


Could somebody help, I am a java developer and I'm trying to develop an application that compares two audio files acoustic fingerprints. When I tried loading the audio file from java, I got a FileFormatNotSupportedException. Has anyone been able to implement this in Java? Please help....thanx.
Title: Song fingerprinting tools
Post by: musicmusic on 2009-03-11 23:22:12
My bad, I just had a look and it seems the compiler settings weren't set sanely for this project. After correcting this, the compare fingerprints command takes nearly a quarter of the time it took before (for me anyway). New build to fix that in a couple of days 
Title: Song fingerprinting tools
Post by: odyssey on 2009-03-12 11:25:31
musicmusic, did you see this thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=70228)? I think it has great potential to use acoustic fingerprinting. Did you think through the possibility of an API for this?

What do you think about the possibility of storing an identifier to all clustered tracks specified by a certain threshold? That could help (at least me) creating playlists with duplicate tracks and sorting out in these. In addition something like "remove duplicates", but based on acoustic fingerprinting could be possible on playlists.
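
To make the clustering idea concrete, here is a minimal sketch using a union-find structure. It is not part of foo_biometric; the function name `cluster_ids`, the `(track, other_track, score)` input format and the default threshold of 75 are assumptions for illustration, and how the resulting number would be stored (for instance in which tag field) is left open.

```python
def cluster_ids(tracks, scored_pairs, threshold=75.0):
    """Give every track a cluster number; tracks linked by a pair scoring
    at or above the threshold end up with the same number."""
    parent = {t: t for t in tracks}

    def find(t):
        # Follow parent links to the root, shortening the path on the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b, score in scored_pairs:
        if score >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    numbers = {}
    result = {}
    for t in tracks:
        root = find(t)
        result[t] = numbers.setdefault(root, len(numbers) + 1)
    return result


tracks = ["A", "B", "C", "D"]
pairs = [("A", "B", 98.0), ("B", "C", 80.0), ("C", "D", 40.0)]
print(cluster_ids(tracks, pairs))
```

Note that this kind of grouping is transitive: A and C land in the same cluster because both match B, even if A and C never matched each other directly.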
Title: Song fingerprinting tools
Post by: musicmusic on 2009-03-14 19:45:36
0.3 released:

-Comparing more than two pre-calculated fingerprints takes about one-eighth of the time it did before with the above change and some further optimisations
-Calculating fingerprints uses the given length instead of continuing to decode the remainder of the file to get the actual decoded length. Seems reasonable to me, and of course there is a performance benefit this way. If anyone has any opinion on this feel free to share it...
-Added a few extra actions previously mentioned in the more-than-two files compare fingerprint results display
Title: Song fingerprinting tools
Post by: Milloflex on 2009-03-15 12:42:25
Maybe try hosting it on a google app engine and see how long it lasts?

http://code.google.com/appengine/docs/what...eappengine.html (http://code.google.com/appengine/docs/whatisgoogleappengine.html)

Hey,

great that someone found a nice use for libFooID.

I will try to bring back the libFooID site. The fingerprint server probably will not come back. The resource usage on the server is just too big.


Title: Song fingerprinting tools
Post by: Garf on 2009-03-15 13:02:20
Maybe try hosting it on a google app engine and see how long it lasts?

http://code.google.com/appengine/docs/what...eappengine.html (http://code.google.com/appengine/docs/whatisgoogleappengine.html)


That requires Python, which would make it a factor 1000 slower than it already is. And the storage they provide is WAY too small for the db.
Title: Song fingerprinting tools
Post by: Garf on 2009-03-15 13:15:36
Whilst I was looking around on the net, I found some discussion of an oddity in regress.c. On line 66, it does this test:
Code:
if (ssyy <= 0.0f + EPSILON)
and sets *r, but it then divides by ssyy and sets *r again. Basically, the test does nothing. It seemed there was a missing else statement so I changed that (in the downloads above also), but it would be nice if you could confirm..


Yes. There should be an else around the rest of the computation if ssyy = 0.

The code is doing a linear regression on the spectrum represented in dB, and calculating the regression coefficient. The spectrum might be all zeroes.
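
The guard being discussed can be sketched in Python as follows. This is not a port of regress.c: the `EPSILON` value, the extra check on `ssxx`, and the value returned for the degenerate case are all assumptions for illustration. Only the shape of the fix (an `else` around the division) comes from the thread.

```python
EPSILON = 1e-9  # assumed tolerance; the value used in regress.c is not shown here


def correlation(xs, ys, degenerate_r=0.0):
    """Correlation coefficient of a straight-line fit of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ssxx = sum((x - mean_x) ** 2 for x in xs)
    ssyy = sum((y - mean_y) ** 2 for y in ys)
    ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if ssyy <= 0.0 + EPSILON or ssxx <= 0.0 + EPSILON:
        # Flat data (e.g. an all-zero spectrum): dividing would blow up,
        # so return a fixed value. Which value libFooID uses is an assumption.
        return degenerate_r
    else:
        # Without this else, the division below would run anyway and
        # overwrite the result of the test above.
        return ssxy / (ssxx * ssyy) ** 0.5


print(correlation([0, 1, 2, 3], [0, 0, 0, 0]))  # flat input takes the guarded path
print(correlation([0, 1, 2, 3], [1, 3, 5, 7]))  # perfectly linear input
```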
Title: Song fingerprinting tools
Post by: Nemphael on 2009-03-15 13:52:35
I still can't believe just how fast it's become - here it takes a mere 24 seconds to compare 14k tracks! Also, the new playlist interface is extremely useful. A big thank you from me!
Title: Song fingerprinting tools
Post by: MartDann on 2009-03-15 14:51:24
Wow, incredible fast.
Thank you very much for the new Version.
Title: Song fingerprinting tools
Post by: odyssey on 2009-03-20 12:09:57
musicmusic: Thanks for the ability to save a playlist!  Much appreciated - You probably just saved me a ton of space!
Title: Song fingerprinting tools
Post by: odyssey on 2009-03-20 12:45:58
I was comparing 32320 items in a playlist and after roughly 25%, foobar2000 hard-crashed with: The exception unknown software exception (0x0000409) occurred in the application at location 0x0324731b. Click OK to terminate the program
Title: Song fingerprinting tools
Post by: musicmusic on 2009-03-20 13:19:34
Do you mean 0xC0000409? That's STATUS_STACK_BUFFER_OVERRUN... hmmm
"The system detected an overrun of a stack-based buffer in this application. This overrun could potentially allow a malicious user to gain control of this application."

Strange. Did all those files have the fingerprints saved to the file already?

I don't think there's anything wrong on my side, but I can't say what caused that.
Title: Song fingerprinting tools
Post by: odyssey on 2009-03-23 11:37:34
Strange. Did all those files have the fingerprints saved to the file already?

I have a few possible causes

1. I had streams in my selected items (Edit: Judging by the speed it took without those, they are likely the reason it failed at first )
2. No, not all files actually had fingerprints

I'm trying again now... (Edit: Works fine!)

Is it possible to use NG grouping with the inserted silence entries?
Title: Song fingerprinting tools
Post by: Nemphael on 2009-05-24 13:41:17
Transcoding means a change in audio data, which will yield a different fingerprint than the original. Some people, including myself, might forget this and have files with "invalid" fingerprints in their libraries. It's probably not very important, but how about an option to "verify fingerprints" for files? I'm sure this is kind of stupid (workload, and "Hey, it's the same song, so why care?"), as actually generating new fingerprints for all files might be faster, but it would at least not change files unnecessarily.

Upon testing something, I've found that different levels of FLAC compression will have different fingerprints. For instance, I tried converting some files (Originally FLACs at -8) to -0 through -7, then compared all eight. -0 through -2 had a 99.6% with -3 through -8. The first set had a 100% against each other (Comparing -0 through -2) and same for the other (-3 through -8), but not when comparing them to each other. I don't know if this signifies anything, but I'm kind of curious.
Title: Song fingerprinting tools
Post by: Patsoe on 2009-05-24 14:54:09
Upon testing something, I've found that different levels of FLAC compression will have different fingerprints.


This would be a bug in the fingerprinting tool (or in the flac decoder, or something else in the chain, but a bug of some kind). In principle, fingerprints should be robust even on transcoding, because they are based on the spectral composition and not on the binary representation of the music, but in any case, there's no way fingerprints may differ between various levels of lossless compression.

The other explanation would be that your hardware is somehow failing...
Title: Song fingerprinting tools
Post by: Nemphael on 2009-05-24 15:26:05
I've tried this on two different computers, so I don't think it's my hardware. I'm using foobar2000 with the official flac.exe. What I'm curious about is why it's almost always ~99.6% (I tried some other files too: 99.6%~99.8% for these)... When I convert a given "99.6% file" from -3 through -8 to -0, it gives me a clean 100% when compared with the -0 to -2 files.

Can anyone else confirm this? Convert a file so that you have one FLAC -0 and one FLAC -8, then compare them with foo_biometric.
Title: Song fingerprinting tools
Post by: musicmusic on 2009-05-24 16:37:32
I confirm, looking into it...
Title: Song fingerprinting tools
Post by: 2E7AH on 2009-07-28 00:32:41
musicmusic, can fingerprint data be somehow interpreted?
I downloaded FingerPrint.java and ServerUtil.java in hope that I'll find the answer but I got lost with all those shifts and operators, which I don't understand
Is it built from 424 2-figure hex numbers, or something more complicated?

Why? - I was trying to represent the data with color, without any success, and I don't know if it's possible in the first place
Title: Song fingerprinting tools
Post by: odyssey on 2009-11-06 09:20:39
musicmusic: If you try to "Save fingerprints to file" on a streaming file (asx in my case), it will try to scan and then crash sometime. Can you fix it so it will simply ignore if you try to do something this stupid?

(It's a real problem when I use it with New File Tagger that automatically saves fingerprints to new files)
Title: Song fingerprinting tools
Post by: musicmusic on 2009-11-06 12:36:37
Do you have a crash log? (Not sure if the crash submitter gives an ID/reference to look it up..)
Title: Song fingerprinting tools
Post by: odyssey on 2009-11-14 14:04:04
It crashes hard, not with the crash-handler. I figured that your tool correctly ignores regular streams, but the problem I experience is while using foo_mslive to stream wma-audio. foo_biometric doesn't see these as streams and tries to fingerprint them. Taking one at a time doesn't make foobar2000 crash, it just waits until it times out or something like that. However, take 5 streams or so at the same time, and it will crash after a while.

Did you fix the FLAC-problem?
Title: Song fingerprinting tools
Post by: musicmusic on 2009-11-14 23:00:29
It crashes hard, not with the crash-handler.
Not sure if you mean it hangs, or just exits with no error. If it hangs, you can create a dump using Task Manager.

I figured that your tool correctly ignores regular streams, but the problem I experience is while using foo_mslive to stream wma-audio. foo_biometric doesn't see these as streams and tries to fingerprint them. Taking one at a time doesn't make foobar2000 crash, it just waits until it times out or something like that. However, take 5 streams or so at the same time, and it will crash after a while.
Possibly foo_mslive doesn't handle decoding multiple streams at the same time properly. Don't think I have any special handling for streams etc., I'll try and have a look as you are probably right that it shouldn't bother scanning them.

Did you fix the FLAC-problem?
As far as I could see, it seemed like the floating-point output of the FLAC decoder was different in the two cases. fb2k normally decodes to floating-point, and FooID takes floats as input so I'm not sure that I'm doing anything inherently wrong, unless anyone else has any input.. foo_bitcompare is happy though but I don't know exactly what it does.
Title: Song fingerprinting tools
Post by: partneriflight on 2009-11-24 18:52:52
I restored the relevant part of http://foosic.org (http://foosic.org)


So the site seems to be down again. Anywhere I could find matching.zip?

Thanks!
Title: Song fingerprinting tools
Post by: odyssey on 2010-04-21 03:04:21
I'm a little puzzled about the results (and/or the threshold levels).

Often it finds a similar track (which it should imho) like the instrumental track (from a cdm), a different cut or slightly different version, but at the same time completely ignores some tracks that should be the exact same track just on a different album. Now I tried to lower the threshold to min 50% and the results are mostly the same, except I now get some large groups with completely different songs in them - It seems to easily confuse especially electronic tracks and club mixes that mostly begin with a simple beat.

Can someone clarify?
Title: Song fingerprinting tools
Post by: romor on 2012-11-02 05:46:39
Can I revive this a bit?

I realize that the developer is unavailable, but maybe someone else can shed some light.

I downloaded Garf's source library out of curiosity, although I'm C illiterate.
From what I can guess browsing it, I assume that the track is downsampled and sliced into a fixed number of parts, then the dominant harmonic is extracted for each part?

I then converted fingerprint hex string:
Code:
dec = array([int('0x' + d['FINGERPRINT_FOOID'][x:x+2], 16) for x in xrange(0, 424, 2)])

So splitting the string on every 2nd char, and converting to decimal I get an array. However I can't get any matching between similar tracks, doing standard correlation tests.
I then made histograms of tested arrays with 10 and then 16 bins, doing same simple correlation tests (Pearson, Spearman, Kendall ... ) on histogram values, but again no luck

Any tips from someone more knowledgeable?
Title: Song fingerprinting tools
Post by: foosion on 2012-11-02 09:27:02
Check the definition of the t_fingerprint structure in common.h (http://code.google.com/p/libfooid/source/browse/tags/1.0/common.h):
Code:
/*
    fingerprint storage
*/
struct t_fingerprint
{
    /*
        fingerprint version
    */
    short version;
    /*
        length in centiseconds
    */
    int length;
    /*
        average line fit, times 1000
    */
    short avg_fit;
    /*
        average dominant line, times 100
    */
    short avg_dom;
    /*
        spectral fits, 4 bits times 16 bands = 32 times 87 frames
        -> 348 bytes
    */
    unsigned char r[348];
    /*
        spectral doms, 6 bits times 87 frames = 65.25
    */
    unsigned char dom[66];
};
The content of this structure is packed into a byte array in fp_calculate function in fooid.c (http://code.google.com/p/libfooid/source/browse/tags/1.0/fooid.c):
Code:
    memcpy(buff, &(fi->fp.version), sizeof(short));
    buff += sizeof(short);
    memcpy(buff, &(fi->fp.length), sizeof(int));
    buff += sizeof(int);
    memcpy(buff, &(fi->fp.avg_fit), sizeof(short));
    buff += sizeof(short);
    memcpy(buff, &(fi->fp.avg_dom), sizeof(short));
    buff += sizeof(short);
    memcpy(buff, &(fi->fp.r), sizeof(unsigned char) * 348);
    buff += sizeof(unsigned char) * 348;
    memcpy(buff, &(fi->fp.dom), sizeof(unsigned char) * 66);
    buff += sizeof(unsigned char) * 66;
As you can see different parts of the fingerprint contain different values. If you need more details about the algorithm, you could try to contact Garf. Since he developed FooID during his time at university, he might have written a paper about it.
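
Going by the structure and the `memcpy` sequence above, the packed fingerprint is 2 + 4 + 2 + 2 + 348 + 66 = 424 bytes. A minimal Python sketch for splitting such a byte string back into its fields could look like this. It assumes the fingerprint was written on a little-endian machine with a 2-byte `short` and 4-byte `int` (`memcpy` keeps the native byte order), and that the tag value has already been turned into raw bytes; if the tag holds a hex string, as the snippet earlier in the thread assumes, `bytes.fromhex(tag)` would do that. The names `FOOID_LAYOUT` and `parse_fingerprint` are made up for the example.

```python
import struct

# short version, int length, short avg_fit, short avg_dom,
# 348 bytes of spectral fits (r), 66 bytes of spectral doms (dom).
# "<" means little-endian with no padding (assumption, see above).
FOOID_LAYOUT = struct.Struct("<hihh348s66s")


def parse_fingerprint(raw):
    """Split a packed 424-byte FooID fingerprint into named fields."""
    if len(raw) != FOOID_LAYOUT.size:
        raise ValueError("expected %d bytes, got %d" % (FOOID_LAYOUT.size, len(raw)))
    version, length, avg_fit, avg_dom, r, dom = FOOID_LAYOUT.unpack(raw)
    return {
        "version": version,
        "length_seconds": length / 100.0,  # stored in centiseconds
        "avg_fit": avg_fit / 1000.0,       # stored times 1000
        "avg_dom": avg_dom / 100.0,        # stored times 100
        "r": r,                            # still bit-packed
        "dom": dom,                        # still bit-packed
    }


# Synthetic fingerprint, only to show the round trip (not real data).
sample = FOOID_LAYOUT.pack(0, 23500, 512, 1234, bytes(348), bytes(66))
fields = parse_fingerprint(sample)
print(FOOID_LAYOUT.size, fields["length_seconds"], fields["avg_fit"], fields["avg_dom"])
```

The `r` and `dom` fields come out as raw bytes here because their contents are packed at the bit level and need a further decoding step.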
Title: Song fingerprinting tools
Post by: romor on 2012-11-02 11:40:30
Thanks foosion. That was naive of me...
Can I ask for further assistance on how to decode this structure?

For example, variable `r`, if I convert each "unsigned char" [10:358] to ordinal integer, I don't get expected results (doing correlation), so I assume it's not how it should be done.
In common.h (as quoted nicely) there is an equation like: "4 bits times 16 bands = 32 times 87 frames -> 348 bytes" which I can't make sense of.
4*87 is 348, so if I divide it (`r`) on 87 frames, I'll get 4 byte values, or?
Title: Song fingerprinting tools
Post by: romor on 2012-11-02 12:41:25
OK, I get that I should group each 4 bytes in this subsequence, then convert each char to bits, slice in 2 and convert this 4 bits to integer, so that I get 16 bins from each 4bytes. Results aren't satisfactory, still

Nevermind
Title: Song fingerprinting tools
Post by: romor on 2012-11-03 10:39:19
My previous post is not correct. It may make sense if we decode each character in FOOID tag, but that's just wrong of course, as we have string that actually represent byte stream.

I assume there are 8 (4bit) bands in 4 bytes. There must be a typo in the code.

Here is my notebook, which I cleaned a bit now, in case anyone gets similar idea: http://nbviewer.ipython.org/url/dl.dropbox...nb/foo_id.ipynb (http://nbviewer.ipython.org/url/dl.dropbox.com/u/30782742/ipynb/foo_id.ipynb)
Title: Song fingerprinting tools
Post by: foosion on 2012-11-06 17:04:39
According to the code in spectrum.c (http://code.google.com/p/libfooid/source/browse/tags/1.0/spectrum.c) the r field contains 16 bands per frame, but only 2 bits are used per band.
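
With 2 bits per band the arithmetic works out: 16 bands x 2 bits = 32 bits = 4 bytes per frame, times 87 frames = 348 bytes. A minimal Python sketch of unpacking the r field on that basis is below. The order of the 2-bit values inside each byte (most significant pair first) and the frame-by-frame layout are assumptions that should be checked against spectrum.c; the function name `unpack_fits` is made up for the example.

```python
FRAMES = 87
BANDS = 16
BYTES_PER_FRAME = BANDS * 2 // 8  # 16 bands x 2 bits = 32 bits = 4 bytes


def unpack_fits(r):
    """Split the 348-byte r field into 87 frames of 16 two-bit values (0..3)."""
    if len(r) != FRAMES * BYTES_PER_FRAME:
        raise ValueError("r must be %d bytes long" % (FRAMES * BYTES_PER_FRAME))
    frames = []
    for f in range(FRAMES):
        chunk = r[f * BYTES_PER_FRAME:(f + 1) * BYTES_PER_FRAME]
        bands = []
        for byte in chunk:
            # ASSUMPTION: most significant bit pair comes first in each byte.
            for shift in (6, 4, 2, 0):
                bands.append((byte >> shift) & 0b11)
        frames.append(bands)
    return frames


# Synthetic input: every byte holds the pairs 00 01 10 11.
demo = unpack_fits(bytes([0b00011011]) * 348)
print(len(demo), len(demo[0]), demo[0][:4])
```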
Title: Song fingerprinting tools
Post by: romor on 2012-11-06 18:27:58
Thanks foosion. I discarded that idea, because of 2bit capacity

I tried again now, with what I initially thought of: cross-correlating not the values but histograms. Here, for example, is a plot for all 16 bands (same playlist as in the example): http://i.imgur.com/9Q3tc.png (http://i.imgur.com/9Q3tc.png)
It was just a silly idea. Things can't work that way.

Coincidentally, just yesterday I had a tweet about Mel-frequency cepstral coefficients (MFCC). They are used for voice recognition and could also provide genre classification. Computing MFCCs is relatively easy, especially in Python: `import mfcc`. Searching further I found some papers, but they deal with classification (machine learning), not track-to-track comparison, which should be much easier and deducible from just these coefficients. I'm not there yet, though.

Title: Song fingerprinting tools
Post by: foosion on 2012-11-06 18:50:00
I remember that Garf talked on IRC about the fingerprint matching algorithm he used on foosic.org. What I don't remember is how he did the matching.
Title: Song fingerprinting tools
Post by: Garf on 2012-11-08 12:37:36
I uploaded some Python source that reads and matches fingerprints here:
http://sjeng.org/ftp/fooid.py (http://sjeng.org/ftp/fooid.py)

If you have specific questions, I'll try to answer.

The fits data represents how "flat" or "spiky" the band is, quantized to a value from 0..3.
Title: Song fingerprinting tools
Post by: romor on 2012-11-09 00:45:01
Thanks Garf.

I tried your suggestion, and here is an example result: http://nbviewer.ipython.org/url/dl.dropbox...ynb/fooid.ipynb (http://nbviewer.ipython.org/url/dl.dropbox.com/u/30782742/ipynb/fooid.ipynb)
It's the same playlist as in my previous post. Decoding is done differently, without the `struct` module, but the results are the same, which I checked just in case.

So I assume you know I'm looking for similar tracks, but result doesn't show any candidates. Do you perhaps have further suggestions?
foosion mentioned you may have some paper written for your fingerprinting - is it so, and is it public?
Title: Song fingerprinting tools
Post by: Garf on 2012-11-12 21:43:21
Quote
So I assume you know I'm looking for similar tracks, but result doesn't show any candidates. Do you perhaps have further suggestions?


I would say that it's working as intended. Only the exact same song, possibly after having gone through lossy compression etc, should match with high confidence.

If you want to search for similar songs, lower the confidence threshold, investigate a large sample, and have a look at the mutually closest matches. But in my experience the songs that are judged close won't really sound that similar to a human listener. I think we use different criteria to judge that than the ones libfooid measures - even if they're psycho-acoustically very robust.

There are exceptions - you will likely see "live" recordings match the studio ones fairly well.
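
The "mutually closest" idea can be sketched in Python as follows. This is only an illustration and not the matching code from fooid.py: it assumes you already have a square matrix of pairwise similarity scores (higher means closer), and both the function name `mutual_best_matches` and the scores in the example are made up.

```python
def mutual_best_matches(sim, min_score=0.0):
    """Return (i, j) pairs, i < j, where each track is the other's best match.

    sim is a square matrix (list of lists) of similarity scores, higher
    meaning closer; the diagonal is ignored. Pairs scoring below
    min_score are dropped.
    """
    n = len(sim)
    best = []
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        # Best match of track i among all other tracks.
        best.append(max(candidates, key=lambda j: sim[i][j]) if candidates else None)
    pairs = []
    for i, j in enumerate(best):
        if j is not None and i < j and best[j] == i and sim[i][j] >= min_score:
            pairs.append((i, j))
    return pairs


# Made-up scores for four tracks, purely for illustration.
scores = [
    [1.00, 0.62, 0.10, 0.15],
    [0.62, 1.00, 0.12, 0.20],
    [0.10, 0.12, 1.00, 0.35],
    [0.15, 0.20, 0.35, 1.00],
]
print(mutual_best_matches(scores, min_score=0.3))  # prints: [(0, 1), (2, 3)]
print(mutual_best_matches(scores, min_score=0.5))  # prints: [(0, 1)]
```

Lowering `min_score` is the "drop the confidence" step; requiring the match to be mutual filters out tracks that merely happen to be the least-bad neighbour of something else.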

Quote
foosion mentioned you may have some paper written for your fingerprinting - is it so, and is it public?


I have a PowerPoint presentation, but it doesn't go in depth, and it's in Dutch.
Title: Song fingerprinting tools
Post by: ncmaothvez on 2012-11-13 01:53:44
romor, are you using Garf's (?) fooid library as is or are you trying to rewrite it in another language? If you're using it as is then you might have run into the same problem I had when I messed around with fooid. It's been nearly a year since I last touched the project so the details are a bit fuzzy.

The short version
As far as I can tell, fooid has a problem with detecting the start of some songs. This throws off the analysis and the resulting fingerprints are too different to indicate a good match, even when two songs sound the same. In some cases the fingerprints have 0% match even though the songs sound identical. So, it could be that your fingerprint comparison algorithm is actually OK but the fingerprints coming from fooid are bad.

fooid will always produce fingerprints with a 100% match if you compare two identical file copies of the same song, though.


The long version
Line 101 in fooid.c, downloadable from the Google project page, says:
Quote
if (fabs(data[(pos * fid->channels) + c]) >= (1.0f/32768.0f - EPSILON))

and as far as I can remember this detects when the intro silence ends and the song starts. As written, that detection threshold is just a single LSB above absolute silence!

In my case this was way too sensitive. Most duplicates in my song collection were detected properly, but a lot of songs were detected as being 0% similar, even though they were perfectly identical sound-wise. The reason was that the sampling of the song data had started at different points in time due to noise exceeding one LSB before the songs started. I worked with nearly noiseless MP3 and FLAC files, but I suspect that if one tries to compare, let's say, recordings of vinyl records with static noise before the song starts, one would probably end up with a lot more 0% matches.

When I replaced the '1.0f/32768.0f' part with a much higher value (can't remember how much, I tried several different values, the threshold is a value between 0.0 and 1.0) the results were significantly improved: identical versions of songs (regardless of encoding method) were detected as 95% similar or better, off-vocal versions of songs compared to their on-vocal versions were detected with around 75% similarity, and everything else fell below 50% similarity. Don't quote me on the exact numbers, but I remember seeing very well defined groups of similarity values.

After changing the threshold I ran into another problem though: Identical versions of songs with gradually increasing sound level at the intro rather than a well defined start beat would still be detected as 0% identical if the rate of sound level change at the intro was different between the two songs being compared.

I suspect that the threshold detection is the culprit here too. If the volume level change rate during the intro is different, then the song start level is detected at different points in time and the sampling of the song thus starts at different points in time.
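
To illustrate the effect, here is a rough Python sketch of that kind of start detection. It is a simplification and not the libfooid code: it handles a single channel only, leaves out the EPSILON term, and the function name `first_audible_sample` and the sample data are made up for the example.

```python
def first_audible_sample(samples, threshold=1.0 / 32768.0):
    """Index of the first sample whose magnitude reaches threshold, or None.

    Samples are floats in -1.0..1.0; the default threshold is one LSB
    of 16-bit audio, as in the quoted line.
    """
    for pos, value in enumerate(samples):
        if abs(value) >= threshold:
            return pos
    return None


# Made-up data: the same "music" after ten lead-in samples, but copy_b
# has a single one-LSB noise blip inside the lead-in silence.
lsb = 1.0 / 32768.0
music = [0.25, -0.3, 0.4]
copy_a = [0.0] * 10 + music
copy_b = [0.0] * 3 + [lsb] + [0.0] * 6 + music

# One-LSB threshold: the blip moves the detected start of copy_b.
print(first_audible_sample(copy_a, lsb), first_audible_sample(copy_b, lsb))    # prints: 10 3
# Higher threshold: both copies start at the same position.
print(first_audible_sample(copy_a, 0.01), first_audible_sample(copy_b, 0.01))  # prints: 10 10
```

The same sketch also shows the fade-in problem: with a gradual intro, the position where the level first crosses any fixed threshold depends on how fast the level rises.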
Title: Song fingerprinting tools
Post by: romor on 2012-11-13 02:33:27
Thanks ncmaothvez for your input.
I'm not rewriting fooid. I just thought I'd play with the data provided by foo_biometric, and maybe extract features other than same-song matching. In case I find something interesting I'll be able to provide a general script so that any foobar user can use it (in VBS perhaps).
Garf provided some formulas and I haven't tried anything other than that, but I will one of these days.

The MFCC approach should also be interesting, and I'll get back to that too, as I won't need to scan the whole song to calculate MFCCs and extract song features, just a couple of samples.
Title: Re: Song fingerprinting tools
Post by: 2tec on 2016-09-01 14:38:24
Installed Foobar2000 v1.3.11 beta 4 - Failed to load DLL: foo_biometric.dll / Reason: This component is missing a required dependency, or was made for different version of foobar2000.
Title: Re: Song fingerprinting tools
Post by: PeteG on 2016-09-01 15:16:38
You need FooID.dll in your foobar2000 program folder. See 1st posting in this thread.

Edit
Um, did you already have the component installed and working before updating foobar2000?
Title: Re: Song fingerprinting tools
Post by: 2tec on 2016-09-01 15:47:28
Quote
You need FooID.dll in your foobar2000 program folder. See 1st posting in this thread.

Hmmm, forgot to do that, thanks!
Title: Re: Song fingerprinting tools
Post by: Jan S. on 2022-10-16 15:05:01
Does anything similar exist today? I'd like to fingerprint all my files and look for albums that are completely duplicated elsewhere (usually in a boxset). If the files can be tagged with a fingerprint I can export to R and do the checking for duplicates there.

I am aware that there are de-duplication programs that can do it but from what I have seen they focus on finding duplicate files. That makes answering the question if the whole album is duplicated difficult...
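
Once tracks have been matched, the album-level question reduces to a set-containment check. A hedged Python sketch of that step (the same logic carries over to R): it assumes each track has already been given a `match_id` shared by all tracks judged to be the same recording, which is a hypothetical output of whatever fingerprint comparison you use, and the album names and ids below are made up.

```python
from collections import defaultdict


def fully_duplicated_albums(tracks):
    """Find albums whose every track also appears on another album.

    tracks is an iterable of (album, match_id) pairs, where match_id is
    an id shared by tracks judged to be the same recording.
    Returns a sorted list of (album, containing_album) pairs.
    """
    ids_by_album = defaultdict(set)
    for album, match_id in tracks:
        ids_by_album[album].add(match_id)
    result = []
    for album, ids in ids_by_album.items():
        for other, other_ids in ids_by_album.items():
            # Subset test: every recording on album is also on other.
            if album != other and ids <= other_ids:
                result.append((album, other))
    return sorted(result)


# Made-up library: "Album A" is contained in "Box Set", "Album B" is not.
library = [
    ("Album A", 1), ("Album A", 2),
    ("Box Set", 1), ("Box Set", 2), ("Box Set", 3),
    ("Album B", 4),
]
print(fully_duplicated_albums(library))  # prints: [('Album A', 'Box Set')]
```

Two albums with identical contents are reported in both directions, so you can decide yourself which copy to keep.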
Title: Re: Song fingerprinting tools
Post by: regor on 2022-10-17 12:22:02
Playlist tools (my sig) contains tools to calculate FooID or Chromaprint fingerprints for files in batch. Binaries are included for both. You can find duplicates with FooID within foobar using the tools; for Chromaprint, in 32-bit there is not enough memory for large libraries.

Anyway, once you've tagged your library, if you know how to export the tags and do the checks yourself, that would do.
Title: Re: Song fingerprinting tools
Post by: Jan S. on 2022-10-21 16:54:44
Thanks! I thought fooid was dead but I managed to get it going with the files from your SMP scripts.