computing accurip sums

I wish to generate checksums that fully identify all of the audio resulting from a rip operation.

I have the following two questions, which relate to whether the AccurateRip algorithms would be helpful in this respect:

  • Does AccurateRip include an algorithm for a single sum over all audio on the disc, or only a separate sum for each track?
  • If an operation is defined for a whole-disc sum, how can such a value easily be computed from a local audio file, preferably on a Linux command line?

Re: computing accurip sums

Reply #1
If you are ripping to one lossless file per disc, why not use the audio MD5 instead? It is more widely supported.

* AccurateRip is track-based. The code may of course be run on the entire disc.
Alternatively, you can look up the CUETools database's checksum. At least originally, that was developed for the full CD.

* If you want to compare across drives, there is the offset issue: drive A will start the bitstream some samples to the left or to the right of drive B. For this reason as well, AccurateRip omits the very beginning and the very end.
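
To illustrate the offset issue, here is a minimal Python sketch. It is not AccurateRip's algorithm; it only shows that a plain hash over the samples changes when the same audio is read with a different offset, while the interior samples still agree once the offset is undone. The sample data, the `shift` helper and the offset of 6 samples are made up for the illustration.

```python
import hashlib
import struct


def pcm_md5(samples):
    """MD5 over the samples packed as 16-bit signed little-endian PCM."""
    raw = struct.pack("<%dh" % len(samples), *samples)
    return hashlib.md5(raw).hexdigest()


def shift(samples, offset):
    """Simulate a drive read offset (hypothetical helper).

    A positive offset drops leading samples and pads zeros at the end;
    a negative offset pads zeros at the start and drops trailing samples.
    """
    n = len(samples)
    if offset >= 0:
        return samples[offset:] + [0] * offset
    return [0] * (-offset) + samples[:n + offset]


# Made-up "audio": a deterministic sequence within the 16-bit range.
audio = [((i * 37) % 2001) - 1000 for i in range(10000)]

drive_a = audio            # drive A reads with no offset
drive_b = shift(audio, 6)  # drive B starts 6 samples later

# Same audio, different offset: the whole-stream hashes disagree.
print(pcm_md5(drive_a) == pcm_md5(drive_b))  # False

# Undo the offset and ignore the edges: the interior is identical.
corrected = shift(drive_b, -6)
print(corrected[6:] == drive_a[6:])  # True
```

The samples lost at the edges cannot be recovered by shifting back, which is consistent with a scheme that leaves the very beginning and the very end out of the checksum.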

Re: computing accurip sums

Reply #2
Are tools available for local computation of the full-CD checksum?

(I thought the entire purpose of the system was for comparison across drives.)

Re: computing accurip sums

Reply #3
Quote: "Are tools available for local computation of the full-CD checksum?"
The audio MD5? FLAC and WavPack calculate and store them as standard. They are used for integrity checks.

Quote: "(I thought the entire purpose of the system was for comparison across drives.)"
Yes, and therefore the CD rippers set the offset before ripping. Newer versions of the rippers also do cross-pressing verification by calculating the sums as if the offset were different. But you don't want to implement that yourself, I suppose.

Actually, if you have ripped with a different offset, CUETools can fix the offset, typically to the nearest one that verifies against AccurateRip.

 

Re: computing accurip sums

Reply #4
I would like to compute a checksum of the audio in a disc, preferably according to some common or standard approach, whatever may be available.

The reason is that using metadata for file names in archived images is too arduous, due to collisions as well as ambiguities. I want to name the files according to a checksum, and then later generate copies with more human-friendly names. By using checksums as names, I ensure that I store exactly one copy of each unique audio image.

Re: computing accurip sums

Reply #5
The reference FLAC and WavPack encoders will do that, and it is standard to the format.

I am using tracks, rather than images, and across tens of thousands of MD5 sums I have no collisions: those with identical MD5 are identical audio.
(Yes, it happens that a track ends up on two albums, in particular with compilations; also, several albums that have a bonus track 99 are padded with a bunch of completely silent 4.00-second tracks.)

So if you want a directory with a complete image file with embedded cuesheet, you would be safe first using a naming pattern like
D82451F63441095B1AB43206F2E43A0E.wv
and then later go to a naming pattern of the form
Oldfield, Mike {1990} Amarok ¨ D82451F63441095B1AB43206F2E43A0E.wv

Edit: If you change the audio, e.g. by using CUETools' repair function, the name will not reflect the new MD5, of course.

Re: computing accurip sums

Reply #6
Yes, the scheme certainly would be safe from collisions, as you say.

My preference is for a sum algorithm that describes the audio data, separate from format or encoding. I was hoping also to find a system that is in somewhat widespread use.

The Python library audiodiff appears to supply this functionality, though the algorithm is exclusive to the particular utility.

Re: computing accurip sums

Reply #7
The FLAC/WavPack MD5 is on the decoded audio, the raw PCM. Widespread? Most lossless compressed formats support it: FLAC, WavPack, Monkey's Audio, OptimFROG - and it is supported (but not mandated) in TAK. (The main exception is that it is not supported for ALAC in MP4.)
What you need, then, is for your audio renaming application to be able to read it. I don't know what you are using; I'm on foobar2000, which certainly does.
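
As a rough illustration of "MD5 on the raw PCM", the following Python sketch hashes only the sample data of a WAV file, ignoring the header. The function name `wav_pcm_md5` is made up for this example. It is limited to 16-bit PCM, where the WAV sample bytes are already signed little-endian and interleaved; the expectation that the result equals the MD5 stored by a FLAC or WavPack encoder for the same audio is an assumption to verify against your own files, not something this thread confirms.

```python
import hashlib
import sys
import wave


def wav_pcm_md5(path, chunk_frames=65536):
    """Return the MD5 hex digest of the raw PCM frames in a 16-bit WAV file.

    Only the audio frames are hashed, so files that differ merely in
    header or metadata chunks produce the same digest.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            # Other bit depths may need conversion before hashing;
            # this sketch does not attempt that.
            raise ValueError("this sketch only handles 16-bit PCM")
        digest = hashlib.md5()
        while True:
            frames = wav.readframes(chunk_frames)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python wav_pcm_md5.py file1.wav file2.wav ...
    for name in sys.argv[1:]:
        print(wav_pcm_md5(name), name)
```

Hashing in chunks keeps memory use flat even for a full-disc image.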


Re: computing accurip sums

Reply #8
I can handle the renaming through scripting as long as I have access to a command that writes the sum.

Is the sum you mention, the MD5, computed in exactly the same way for both FLAC and WavPack? That is, would a file in each format show the same value for this sum, as long as both encode identical copies of an audio source?

Re: computing accurip sums

Reply #9
Yes, it is the same.

To get it from a .flac:
metaflac --show-md5sum filename.flac

To get it from a .wv:
wvunpack -f7 filename.wv
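
For the renaming scheme discussed earlier in the thread, a minimal Python sketch could wrap the `metaflac` command above. The function names (`read_flac_md5`, `checksum_name`, `plan_renames`) are made up for this example, and it only plans the renames rather than touching any files. It assumes `metaflac` is on the PATH and that an all-zero sum means the encoder did not store an MD5.

```python
import subprocess
import sys
from pathlib import Path


def read_flac_md5(path):
    """Read the stored audio MD5 by calling `metaflac --show-md5sum`."""
    result = subprocess.run(
        ["metaflac", "--show-md5sum", str(path)],
        check=True, capture_output=True, text=True,
    )
    md5 = result.stdout.strip()
    if set(md5) == {"0"}:
        # Assumption: an all-zero sum means no MD5 was stored.
        raise ValueError("no MD5 stored in %s" % path)
    return md5


def checksum_name(path, md5):
    """Build a name like D82451F6....flac in the same directory."""
    path = Path(path)
    return path.with_name(md5.upper() + path.suffix)


def plan_renames(paths, read_md5=read_flac_md5):
    """Map each unique file to its checksum-based name.

    Returns (plan, duplicates): plan maps original path to new path,
    duplicates lists (path, first_path_with_same_md5) pairs.
    """
    plan, first_seen, duplicates = {}, {}, []
    for path in paths:
        md5 = read_md5(path)
        if md5 in first_seen:
            duplicates.append((path, first_seen[md5]))
        else:
            first_seen[md5] = path
            plan[path] = checksum_name(path, md5)
    return plan, duplicates


if __name__ == "__main__":
    # Usage: python plan_renames.py *.flac
    plan, duplicates = plan_renames(sys.argv[1:])
    for old, new in plan.items():
        print(old, "->", new)
    for dup, original in duplicates:
        print("duplicate audio:", dup, "==", original)
```

Passing the reader as a parameter leaves room for a `wvunpack -f7` based reader for .wv files without changing the rest.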

Re: computing accurip sums

Reply #10
If the algorithm is the same for all the formats mentioned, then it exactly satisfies my needs. I have no WavPack files available to verify this, but I am content as long as your understanding is that the sum would be the same for a matching source.