Binary Comparator version 2.0



Follow-ups
- After re-encoding some tracks with CUETools, "Bit-compare tracks" in foobar shows differences

2014-12-09 14:29:07

This is a total rewrite of the old Binary Comparator component.

Current version: 2.0

http://www.foobar2000.org/components/view/foo_bitcompare

Highlights:

Added the ability to detect offset tracks and automatically rerun the comparison with the offset corrected.

Reply #1 – 2014-12-09 17:14:05

Nice, thanks for this!
A minor suggestion: when offset tracks match, add that information to the summary at the top of the results box, something like "Differences found in x out of y track pairs, but no differences in decoded data found after applying offset in a out of b track pairs".

Reply #2 – 2014-12-11 12:45:39

Beta 2 posted, improved comparison summary formatting.

Reply #3 – 2015-01-09 10:28:47

Version 2.0 final posted.

Reply #4 – 2015-01-12 17:18:55

Quote from: Peter on 2014-12-09 14:29:07

Added the ability to detect offset tracks and automatically rerun the comparison with the offset corrected.

Huh, this passed me by until now, thanks.

Wishlist:
Splice tracks as follows:

Selecting

A
B
C
a
b
c

to compare. Consider the Bb pair, assumed identical or near-identical modulo offset: fill up, say, the beginning of B with samples from the end of A, and the end of b with samples from the beginning of c.

Also consider (saving stupid questions from idiots like myself): When reporting e.g. "Compared 195951000 samples", write instead "Compared 2 channels, 195951000 samples each"

Reply #5 – 2015-01-15 21:45:27

Played around a bit with this, and as much as I think the trim-and-compare-intersection-set feature is cool, it is so dangerous that I think it should be possible to turn it off. I recently had a disk crash, and the comparator now leaves me with a lot of on-the-surface reassuring comments about no differences found in the compared range (which could be a subset of just a few seconds).

Oh, one more thing: when invoking the three-step procedure of
(1) different length, check intersection
(2) match up for offset
(3) repeat step (1);
does it actually then compare the intersection of the aligned streams? Or are the samples discarded in step (1) discarded for good, even though they could be relevant after applying the offset?

Reply #6 – 2015-02-23 08:08:35

Still for wishlist: warn me when I (by mistake) try to compare a file to the same file.

Reply #7 – 2015-04-15 07:12:17

This could use some explanation:

[blockquote]Differences found in all track pairs, out of which 9 became identical after applying offset and truncating first/last samples.
Non-zero offset detected in 11 out of 11 non-identical track pairs.[/blockquote]

OK, 9 are supposed to have become identical. But 10, not 9, yield the "No differences in decoded data found within the compared range."
The one that stands out is track 2, with the slightly longer message:

[blockquote]Comparing again with corrected offset...
Length mismatch : 4:10.554240 vs 4:10.426667, 11049442 vs 11043816 samples.
Compared 11043816 samples, with offset of -7898 discarding last/first samples from total of 11051714, discarded last 5626 samples from the longer file.
No differences in decoded data found within the compared range.[/blockquote]

as opposed to e.g. track 1:

[blockquote]Comparing again with corrected offset...
Compared 13988266 samples, with offset of -6134 discarding last/first samples from total of 13994400.
No differences in decoded data found within the compared range.[/blockquote]

Reply #8 – 2015-07-06 12:38:40

And, since I'm having this nice little conversation in my own bad company:

I just played around with a CD rip where all tracks but the first were - modulo offset - reported as identical.
The first track was longer in CD number I than in CD number II - seems it started a bit abruptly in the latter one. So there were two differences:
- One quarter of a second more audio at the beginning of CD I.
- One 10th of a second offset - same for each track, but fb2k fails to detect it for the first one.

fb2k cannot detect this.
I aligned the offset using CUETools.
fb2k still cannot detect this, because it insists on cropping the last quarter of a second from the longer file. A quick solution: try dropping the last N samples first, and then try dropping the first N samples.
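The quick solution suggested above could look roughly like the following. This is only a hypothetical Python sketch, not how foo_bitcompare works internally: `compare_with_trim` and its return values are made-up names, and the decoded streams are represented as plain lists of sample values.

```python
def compare_with_trim(longer, shorter):
    """Compare two decoded streams of unequal length.

    Tries dropping the surplus samples from the end of the longer
    stream first, then from its beginning. Returns "tail", "head",
    or None if neither trimmed version matches the shorter stream.
    """
    surplus = len(longer) - len(shorter)
    if surplus < 0:
        raise ValueError("first argument must be the longer stream")
    # Attempt 1: drop the last N samples of the longer stream.
    if longer[:len(shorter)] == shorter:
        return "tail"
    # Attempt 2: drop the first N samples instead.
    if longer[surplus:] == shorter:
        return "head"
    return None


# CD I has extra audio at the start; the rest is identical.
cd1_track = [7, 7, 7, 1, 2, 3, 4, 5]
cd2_track = [1, 2, 3, 4, 5]
print(compare_with_trim(cd1_track, cd2_track))  # head
```

Returning which end was trimmed lets the report say where the extra audio sits, instead of only saying that the compared range matched.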

Reply #9 – 2015-07-23 08:00:50

On the component homepage it says: "All listed components are compatible with foobar2000 1.1 and newer.", but it will not run under v1.2.8, which I'm stuck with. Is there an older version of foo_bitcompare available somewhere?

Reply #10 – 2015-07-23 17:10:02

Quote from: Manchesterer on 2015-07-23 08:00:50

On the component homepage it says: "All listed components are compatible with foobar2000 1.1 and newer.", but it will not run under v1.2.8, which I'm stuck with. Is there an older version of foo_bitcompare available somewhere?

1) The latest foobar2000 is 1.3.8: http://www.foobar2000.org/download
2) The component works with the latest version.

What's your issue again?

Reply #11 – 2015-07-23 17:37:33

The issue looks clear enough to me. The website wrongly states that all components only require v1.1 or above, which is not true. Yes, everyone could use the latest version of foobar2000, which would solve the problem, but that doesn't change the fact that the website is giving out wrong information.

It looks like foo_bitcompare is built with the latest SDK, which requires at least v1.3. Therefore the website (or at least the page for that component) should be updated to say so.

Reply #12 – 2017-04-09 00:00:45

I have these two CD rips of the same album, but different prints.
I tried to compare them because I wanted to find out if there was any difference worth keeping.
foobar didn't detect an offset and just said they were different.
I used an audio editor and manually found a 15632 sample offset. I aligned them by cutting the excess from the start of the one that lagged and pasting it at the end.
Now foobar2000 says they are identical.
I guess there's a limit on how far fb2k will look for an offset, which seems reasonable. But, how much is it? 10000 samples? 1/4 of a second?

Reply #13 – 2017-04-09 19:59:59

I think I have posted this before, but it is still so with the just-released version:

Bitcompare:
All tracks decoded fine, no differences found.

But no, they do not decode fine. Verify yields:
Warning: Reported length is inaccurate : 4:04.912086 vs 4:04.859841 decoded
Error: MPEG frame checksum mismatch.
Error: MP3 decoding failure: Unsupported format or corrupted file
Error: MPEG stream error at 2971881 bytes

Reply #14 – 2017-04-10 09:13:00

Quote from: Porcus on 2017-04-09 19:59:59

I think I have posted this before, but it is still so with the just-released version:

Bitcompare:
All tracks decoded fine, no differences found.

But no, they do not decode fine. Verify yields:
Warning: Reported length is inaccurate : 4:04.912086 vs 4:04.859841 decoded
Error: MPEG frame checksum mismatch.
Error: MP3 decoding failure: Unsupported format or corrupted file
Error: MPEG stream error at 2971881 bytes

Thanks for reporting, fixing.

I really hate it when software tells you lies, and this can be considered such. A fix will be out shortly.

Reply #15 – 2017-04-10 10:27:58

A suggestion on the matter:
Since one of my uses has been to verify mass conversion, I need somehow to distinguish between "both corrupted" and "one corrupted": I want to know if I have converted a corrupted file into a valid file with "corrupted content", only removing the proof that the original file was broken.

And another one, merely semi-related:
Issue a warning when I have compared a file against itself. Guarding against an all-too-simple human error that completely invalidates the result.
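A guard of this kind might be sketched as below. This is a hypothetical Python illustration, not the component's code: `normalized` and `self_compare_pairs` are invented names, and comparing normalized path strings is an assumption that will miss hard links or symlinks pointing at the same file (for files that exist on disk, `os.path.samefile` would be the stricter check).

```python
import os


def normalized(path):
    """Return an absolute, case-normalized form of a path for comparison."""
    return os.path.normcase(os.path.abspath(path))


def self_compare_pairs(pairs):
    """Return the indices of pairs whose two paths name the same file."""
    return [i for i, (a, b) in enumerate(pairs)
            if normalized(a) == normalized(b)]


pairs = [
    ("rip1/01.flac", "rip2/01.flac"),
    ("rip1/02.flac", "rip1/./02.flac"),  # same file, spelled differently
]
for i in self_compare_pairs(pairs):
    print(f"Warning: pair {i + 1} compares a file against itself")
```

Collecting the indices rather than aborting on the first hit leaves room to report every affected pair in one go.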

Reply #16 – 2017-04-11 09:33:42

Quote from: radorn on 2017-04-09 00:00:45

I have these two CD rips of the same album, but different prints.
I tried to compare them because I wanted to find out if there was any difference worth keeping.
foobar didn't detect an offset and just said they were different.
I used an audio editor and manually found a 15632 sample offset. I aligned them by cutting the excess from the start of the one that lagged and pasting it at the end.
Now foobar2000 says they are identical.
I guess there's a limit on how far fb2k will look for an offset, which seems reasonable. But, how much is it? 10000 samples? 1/4 of a second?

The limit was at 8192 samples. It will be bumped to 65536 samples in the next version.
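To illustrate why a search limit makes larger offsets invisible, here is a minimal brute-force sketch in Python. It is an assumption-laden toy, not the component's actual algorithm (which is not described in this thread): `find_offset` is a made-up name, the streams are plain lists, and a real implementation would likely use something faster than trying every shift.

```python
def find_offset(a, b, max_offset):
    """Return the smallest-magnitude shift s with |s| <= max_offset such
    that a[i + s] == b[i] over the whole overlapping range, or None.
    """
    # Try 0, +1, -1, +2, -2, ... so the smallest correction wins.
    candidates = [0]
    for k in range(1, max_offset + 1):
        candidates += [k, -k]
    for s in candidates:
        # Index range of b that has a counterpart in a under shift s.
        start = max(0, -s)
        stop = min(len(b), len(a) - s)
        if stop <= start:
            continue
        if all(a[i + s] == b[i] for i in range(start, stop)):
            return s
    return None


a = [0, 0, 5, 6, 7, 8]
b = [5, 6, 7, 8, 0, 0]
print(find_offset(a, b, max_offset=1))  # None
print(find_offset(a, b, max_offset=4))  # 2
```

In the same way, the 15632-sample offset reported above lies beyond a limit of 8192 samples but within one of 65536.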

Quote from: Porcus on 2017-04-10 10:27:58

A suggestion on the matter:
Since one of my uses has been to verify mass conversion, I need somehow to distinguish between "both corrupted" and "one corrupted": I want to know if I have converted a corrupted file into a valid file with "corrupted content", only removing the proof that the original file was broken.

And another one, merely semi-related:
Issue a warning when I have compared a file against itself. Guarding against an all-too-simple human error that completely invalidates the result.

OK, information about which file is problematic will be shown.
Guard against compare-against-self added as well.

Version 2.1.1 will be posted shortly.
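For illustration only, reporting which file is problematic could follow a pattern like this. The function name, its parameters and the message wording are all hypothetical; the actual output of version 2.1.1 is not shown in this thread.

```python
def describe_pair(ok_a, ok_b, identical):
    """Summarize one compared pair.

    ok_a / ok_b: whether each file decoded without errors.
    identical: whether the decoded audio matched.
    """
    if ok_a and ok_b:
        status = "both files decoded fine"
    elif not ok_a and not ok_b:
        status = "both files reported decoding errors"
    elif not ok_a:
        status = "file #1 reported decoding errors"
    else:
        status = "file #2 reported decoding errors"
    verdict = "no differences found" if identical else "differences found"
    return f"{status}, {verdict}"


# A corrupted source converted into a clean-looking file with the same audio:
print(describe_pair(ok_a=False, ok_b=True, identical=True))
# file #1 reported decoding errors, no differences found
```

Keeping the decode status separate from the comparison verdict is what makes the "one corrupted" and "both corrupted" cases distinguishable.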

Reply #17 – 2017-04-11 22:49:09

@Peter Thank you very much!

Reply #18 – 2017-04-12 10:47:59

Thank you for the update. Tested it, works here.

But some never get satisfied, do they/we: what about a "Compare the rest" button for when it detects a compare-against-self among a larger set of pairs? And then in the final report, a "N files compared against self" in the summary on top and for each file the appropriate information?
(Then the question: if the user actually selects to compare anyway, should it skip or should it read twice the against-selfs? Will anyone ever have use for the latter for testing purposes?)

Reply #19 – 2017-04-23 15:37:21

The following makes me curious. Comparing two pressings, and most tracks have different length, non-zero offset and end with "the tracks became identical after applying offset and truncating first/last samples."

But then there is one track-pair that does not. It still ends up with an identical subset, but is reported different. I see the catch, but I wonder what the issue is.
Differences found: length mismatch - 6:55.533333 vs 6:55.040000, 18325020 vs 18303264 samples.
Compared 18303264 samples, discarded last 21756 samples from the longer file.
[...]
Compared 18303264 samples, with offset of -10677 discarding last/first samples from total of 18313941, discarded samples were not silent in either file, discarded last 11079 samples from the longer file.
No differences in decoded data found within the compared range.

Yet the heading says "Differences found in compared tracks." Of course there are differences, but not in the subset.

Which brings me to the issue:

Is it so that it truncates, calculates offset, truncates again if it has to, and then compares?
Shouldn't it rather, after calculating offset, "start anew"? Otherwise it may "have already discarded the wrong samples", wouldn't it?
And is that what happened, and the reason why it reports "Differences found" despite having discarded itself down to a bit-identical intersection?
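The arithmetic behind "starting anew" can be checked with a small Python sketch. The helper `aligned_overlap` and its sign convention (a positive `shift` means the second stream starts that many samples into the first) are assumptions made for this example; the component reports the offset as -10677, and its own convention is not documented here.

```python
def aligned_overlap(len_a, len_b, shift):
    """Describe the overlap when b[0] lines up with a[shift].

    Returns (compared, skipped_head_a, skipped_tail_a,
             skipped_head_b, skipped_tail_b), all in samples.
    """
    start_a = max(0, shift)          # first compared index in a
    start_b = max(0, -shift)         # first compared index in b
    compared = max(0, min(len_a - start_a, len_b - start_b))
    return (compared,
            start_a, len_a - start_a - compared,
            start_b, len_b - start_b - compared)


# Full lengths and offset magnitude from the report above.
print(aligned_overlap(18325020, 18303264, 10677))
# (18303264, 10677, 11079, 0, 0)

# Truncating both to the shorter length first, then shifting.
print(aligned_overlap(18303264, 18303264, 10677))
# (18292587, 10677, 0, 0, 10677)
```

Under these assumptions the first result reproduces the reported figures (18303264 samples compared, last 11079 samples of the longer file discarded), which would be consistent with the overlap being recomputed from the full lengths in this case. It does not explain the "Differences found" heading.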

Reply #20 – 2021-05-19 18:14:50

Hi there.

This component is very difficult to understand (at least for me...) and I couldn't find a basic tutorial that explains the terminology...

Please, what does this mean:

Code:

Differences found in compared tracks.
Zero offset detected.

Comparing:
"A:\_VIP\Output 2021\FLAC\02 Easy Rocker.flac"
"A:\_VIP\Output 2021\FLAC\02 Easy RockerSAM.flac"
Compared 14430578 samples.
Differences found: 9602710 values, 0:00.000045 - 5:27.223946, peak: 0.009277 (-40.65 dBTP) at 4:46.922993, 2ch
Channel difference peaks: 0.005249 (-45.60 dBTP) 0.009277 (-40.65 dBTP)
File #1 peaks: 0.988586 (-0.10 dBTP) 0.988586 (-0.10 dBTP)
File #2 peaks: 0.988586 (-0.10 dBTP) 0.988586 (-0.10 dBTP)
Detected offset as 0 samples.

Total duration processed: 5:27.224
Time elapsed: 0:01.505
217.49x realtime

Thanks in advance.

Reply #21 – 2021-05-19 20:22:40

It means that the files have a lot of samples that differ, but the peak difference is as low as 40.65 dB below digital full scale, which is 40.55 dB below the loudest peak in the file (this is because of the "-0.10").
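The figures can be reproduced from the linear peak values in the log with the usual amplitude-to-decibel conversion. A minimal Python sketch follows; `to_db` is just an illustrative helper name, and treating the reported peaks as plain linear amplitudes relative to full scale is an assumption that happens to match the numbers shown.

```python
import math


def to_db(linear):
    """Convert a linear amplitude (1.0 = digital full scale) to decibels."""
    return 20 * math.log10(linear)


diff_peak = to_db(0.009277)   # peak of the difference signal
file_peak = to_db(0.988586)   # peak of either file
print(f"{diff_peak:.2f} dB")              # -40.65 dB
print(f"{file_peak:.2f} dB")              # -0.10 dB
print(f"{diff_peak - file_peak:.2f} dB")  # -40.55 dB
```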

For dB ballparks, see https://ehs.yale.edu/sites/default/files/files/decibel-level-chart.pdf ; if you play like at 96 dB, the loudest difference is like having a refrigerator turned on or off.

Then the other part: zero offset means that the files are aligned to start and end at the same time (i.e., if both have a tiny bit of fade-in, music starts at the same millisecond and even the same 1/44.1th of a millisecond). Why would that be an issue? Because CD readers and writers differ and disagree on precisely where to begin, so even files that were digitally identical when they left the studio may get a different offset (get shifted left or right by some samples) by the way they end up on plant A's CDs and plant B's CDs. foo_bitcompare can compensate for that, which is very useful.

Reply #22 – 2021-05-19 20:51:53

@Porcus

Great! Your explanation illuminated everything - from now on I will be able to get along better with this component!

Thank you very much!

Reply #23 – 2022-05-01 04:33:58

Hello. When bit compare compares tracks, does it extract the raw audio stream from each track? I recently discovered that FLAC files carry an MD5 checksum of the raw audio stream, like this: Audio MD5 : 4A25EF5A772377157CF29A227712F953. I think bit compare could read this information first, since that is probably faster than extracting the raw audio streams, and extract the raw audio streams only if the MD5 sums do not match.

Reply #24 – 2022-05-01 16:12:32

It decodes to raw PCM (as 32-bit floating-point) and compares bit by bit.

If you would rather compare stored MD5 sums, you can make a custom column "MD5" with the code [%__md5%] and see if they match (if you have many, sorting by MD5 is probably the easiest). Stored MD5 sums are not unique to FLAC, but FLAC has them enabled by default; WavPack, TAK and OptimFROG have an optional MD5 sum (optional in that it will not be stored unless you apply an option upon encoding); Monkey's Audio makes a checksum out of the encoded data, so it isn't comparable with anything except Monkey's at the same setting.

Also if you are looking for duplicates, you might use foo_facets with the statistics reporting: it can then sort MD5s by the number of occurrences. That does not mean you want to delete them - I have a lot of 1234DD57F3AF7775D57493B54D59BCEB (4 seconds of silence on CD tracks), and unfortunately a few D41D8CD98F00B204E9800998ECF8427E as well.
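As a rough illustration of the "check stored MD5 first" idea, here is a Python sketch. The names `pcm_md5` and `quick_verdict` are hypothetical, and how the stored checksum would be read from a file is left out entirely; only the standard `hashlib` module is used.

```python
import hashlib


def pcm_md5(pcm_bytes):
    """MD5 of raw PCM bytes, as an uppercase hex string."""
    return hashlib.md5(pcm_bytes).hexdigest().upper()


def quick_verdict(md5_a, md5_b):
    """Decide from stored MD5s alone, where possible.

    Returns "identical", "different", or "unknown" when either
    checksum is missing, in which case a full decode is needed.
    """
    if not md5_a or not md5_b:
        return "unknown"
    return "identical" if md5_a.upper() == md5_b.upper() else "different"


# The second value quoted above is the MD5 of zero bytes of input.
print(pcm_md5(b""))  # D41D8CD98F00B204E9800998ECF8427E
```

A "different" verdict from checksums alone says nothing about offsets or how large the differences are, and a matching stored checksum does not prove that the file still decodes correctly, so a shortcut like this could complement the full decode-and-compare but not replace it.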
