Hydrogenaudio Forum => Scientific Discussion => Topic started by: Logarithmus` on 2021-11-12 01:33:27

Title: Improving compression of multiple similar songs
Post by: Logarithmus` on 2021-11-12 01:33:27
Hello. This is my first post ever on this forum. Maybe you know about the tar program. It combines multiple files/directories into one. After that, one can compress the result with gzip, zstd, xz, etc. When compressing similar files, the compression achieved by .tar.gz will be better than that of .zip. That's because .zip compresses each file separately, while .tar.gz compresses the whole .tar archive as one stream.
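The effect is easy to demonstrate with Python's zlib as a stand-in for gzip. The two "files" below are synthetic, just for illustration: the second is a slightly altered copy of the first, which is the situation where solid compression wins.

```python
import random
import zlib

random.seed(0)
# Two similar "files": the second is the first with a few bytes changed.
a = bytes(random.randrange(256) for _ in range(20000))
b = bytearray(a)
for i in range(0, len(b), 1000):
    b[i] ^= 0xFF
b = bytes(b)

# .zip-style: compress each file separately and sum the sizes.
separate = len(zlib.compress(a, 9)) + len(zlib.compress(b, 9))

# .tar.gz-style: concatenate first, then compress once ("solid" mode).
solid = len(zlib.compress(a + b, 9))

print(separate, solid)
```

Because random bytes are incompressible on their own, the separate sizes add up to roughly the raw size, while the solid pass encodes the second file almost entirely as back-references into the first (it fits inside zlib's 32 KiB window).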

So I've got an idea: suppose there are multiple similar songs, maybe from the same band or album. Would applying the same principle improve compression of the songs? If so, how much would lossy vs. lossless algorithms benefit from the technique?

P. S. I'm just a layman who is interested in video/audio codecs, but I'm not familiar with codecs' internals.
Title: Re: Improving compression of multiple similar songs
Post by: saratoga on 2021-11-12 01:48:47
Practical audio/video codecs only analyze a very short amount of material (milliseconds to at most a few seconds) at a time to keep memory usage reasonable.  Using more memory allows you to exploit longer term correlations in the data stream and so improves compression somewhat, but it usually isn't worthwhile.
Title: Re: Improving compression of multiple similar songs
Post by: Markuza97 on 2021-11-12 03:06:57
The closest thing is .cue + audio file. (.flac for example)
By doing that you can save a couple of bytes, maybe?
Like saratoga said, it's just not worth it.
Title: Re: Improving compression of multiple similar songs
Post by: ktf on 2021-11-12 07:34:41
For example, the FLAC codec never looks back more than 32 samples, and in subset CDDA files (which are the overwhelming majority) no more than 12 samples. 12 samples is about 0.3 milliseconds. So, with FLAC, you're not going to get any compression benefit.

If you want to do this, you should create a format that removes long-term redundancy (by comparing blocks and subtracting one from the other) and then feeds the result through FLAC. However, real-time decoding of such a file would be very difficult.
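A toy illustration of that subtract-then-compress idea, with zlib standing in for FLAC and hypothetical sample data (a "remaster" that is the original plus a small per-sample offset):

```python
import random
import zlib

random.seed(1)
# Stand-ins for two masterings of the same song (synthetic data).
original = [random.randrange(-20000, 20000) for _ in range(10000)]
remaster = [s + random.randrange(-3, 4) for s in original]

def pack(samples):
    # Naive 16-bit little-endian packing of a sample list.
    return b"".join(s.to_bytes(2, "little", signed=True) for s in samples)

# Compress the remaster directly (zlib stands in for a real codec).
direct = len(zlib.compress(pack(remaster), 9))

# Remove the long-term redundancy first: store only the residual
# against the original, then compress that.
residual = [r - o for r, o in zip(remaster, original)]
delta = len(zlib.compress(pack(residual), 9))

print(direct, delta)  # the residual compresses far better

# Lossless reconstruction: remaster = original + residual.
assert [o + d for o, d in zip(original, residual)] == remaster
```

The catch, as noted above, is that decoding the residual stream requires having the reference track on hand and aligned, which is what makes real-time playback of such a format awkward.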
Title: Re: Improving compression of multiple similar songs
Post by: Porcus on 2021-11-12 09:41:31
I've been thinking of the same, and maybe I even posted it here - or maybe someone else did. And the answer, like others have already hinted at:

Audio formats are supposed to be played back in real time. They aren't made for you to decompress your entire collection every time you want to play back a song. If you had a workstation where you made a demo tape by recording the chorus once and inserting it five times into a song, then obviously it would save space if you could point choruses 2, 3, 4 and 5 back to the first occurrence.
But if you are streaming the song by successively delivering chunks of the encoded file, and someone tunes in after chorus 1, they wouldn't have that data. So it isn't "streamable" from file - imagine the buffering.

I am not saying that it wouldn't be possible to do what you suggest, but it would require other workarounds - and besides, for end-users: how much space would you really save? If you have two different masterings of the same album, I doubt it would be much; if you have fourteen different rips of a single song, all with errors, and want to keep them all until CUETools gets a correction file that can be useful, then sure - but this isn't a big part of your hard drive, hopefully, and the overall gains from implementing such a monster would be quite small.

In theory it could be done at the file system level, with a file system constructed specifically for the purpose of compressing and deduplicating audio. That sounds like more than a Master's thesis worth of work, to say the least ... and compression and block-level deduplication demand quite a bit of computing power - saving bytes is not done for free. The file-system-compressed bytes would then be decompressed prior to streaming. (Buffer issue? Full file in RAM before you play.)
Title: Re: Improving compression of multiple similar songs
Post by: rutra80 on 2021-11-12 20:27:48
You might try some archiver with solid mode and a large dictionary, but those seek identical, not merely similar, data. And even a slight difference in any audio characteristic will result in non-identical data.
Such an idea seems better suited to lossy codecs, but so far nothing like that exists (maybe deep-learning codecs will be something like that). Also, something like that would be more of an archiver than a codec, with poor seeking, quite a lot of I/O, etc.
MIDI and module files are something similar - they consist of samples which you play at different notes, volumes and effects, creating sequences which may repeat, etc. But you make them like that at the time of creation - you can't quite convert something to MIDI or a mod.
Title: Re: Improving compression of multiple similar songs
Post by: Porcus on 2021-11-13 00:36:17
Such idea seems more suited for lossy codecs, but so far nothing like that exists
In principle, a hybrid codec - a lossy stream plus a correction file to make it lossless - does something like that. By design when encoding, of course.

Title: Re: Improving compression of multiple similar songs
Post by: kode54 on 2021-11-13 01:58:33
Even many lossless codecs are internally a lossy predictor with correction data. And even Fraunhofer tried to do something lossless with MP3 when they made MP3HD, which was doomed to fail because it was a closed standard.

Something like MP3HD requires a reference MP3 decoder that works purely in integer space and produces consistent 16-bit or greater output for a given MP3 stream, regardless of which CPU type it is compiled for, since the compressed residual correction data needs to apply precisely to a uniform lossy input.
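The predictor-plus-correction structure can be sketched in a few lines. Here simple integer quantisation stands in for the MP3 layer, and the sample data is hypothetical; the point is that reconstruction is only lossless because the "lossy" stage is deterministic integer arithmetic:

```python
import random

random.seed(2)
# Hypothetical PCM samples.
samples = [random.randrange(-30000, 30000) for _ in range(5000)]

STEP = 64  # coarse quantiser standing in for the lossy codec

# "Lossy" base layer: integer quantisation. This must be bit-exact and
# deterministic, like the integer reference decoder MP3HD needs.
base = [(s // STEP) * STEP for s in samples]

# Correction data: the residual between the source and the lossy decode.
correction = [s - b for s, b in zip(samples, base)]

# Lossless reconstruction only works if the decoder reproduces `base`
# exactly; any floating-point drift would corrupt the output.
assert [b + c for b, c in zip(base, correction)] == samples
```

If two decoders disagreed about `base` by even one least-significant bit, adding the stored correction would no longer reproduce the source - which is why a bit-exact integer reference decoder is a hard requirement for this design.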
Title: Re: Improving compression of multiple similar songs
Post by: Porcus on 2021-11-18 23:39:47
Experiment: Crossing channels between the same song in two different masterings, to see if bitrate would go up a lot.
If yes - an indication that deduplication wouldn't compress much (due to the different masterings); if no - an indication that it could. Not a rigorous test by any means, not sure how valuable - but at least an indication.

And edit: here I had to change my conclusion because ... actually, stereo decorrelation doesn't help more than about a percent on these albums. Bummer, should have thought of that.

Oh well, here I did initially:

I have three King Diamond albums where the remasters have exactly the same track lengths - to the sample! - as the original CDs. Evidently Mr. LaRocque Allhage has gone back to the files and just applied some new settings. They do measure quite differently:
the originals compress to 918 kbit/s at FLAC -8; RG album gains are around -5, -9, -7; dynamic ranges are 13, 9, 10;
the remasters compress to 1026 kbit/s, also at -8; RG album gains are around -9, -11, -12; dynamic ranges are 9, 8, 6.
(They are Spider's Lullabye, Voodoo, House of God.)

Using ffmpeg, I took each CD and created a stereo .wav with the left channel being the left channel from the original master, and the right channel being the right channel from the remaster.  Then compressed.

* The albums - all 2x3 averaged: 972 kbit/s with flac.exe -8, and a TAK -p4m test encoding reported 67.46 percent.
* The ones I generated by crossing channels: 983 kbit/s, resp. 68.27 percent.
* The 2x3 mono files average to 492 kbit/s. Multiply that by two, and ...
* So then I realized the idea was quite a bit flawed, and generated stereo files with the left channel from the old and the left channel from the new. How would stereo decorrelation then change? Not much: 985, while the left-channel mono files are 493.

What I did not do: try to align peaks. So there could be an offset misalignment.
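The channel-crossing step itself can be sketched with Python's standard wave module. The three-sample "masterings" below are synthetic placeholders (the actual experiment used ffmpeg on full albums); the sketch just shows building a stereo file from the left channel of one source and the right channel of another:

```python
import io
import struct
import wave

def make_stereo(left, right):
    # Build an in-memory 16-bit 44.1 kHz stereo WAV from two sample lists.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(b"".join(struct.pack("<hh", l, r)
                               for l, r in zip(left, right)))
    buf.seek(0)
    return buf

def read_channels(buf):
    # Split an in-memory stereo WAV back into (left, right) sample lists.
    with wave.open(buf, "rb") as w:
        raw = w.readframes(w.getnframes())
    pairs = list(struct.iter_unpack("<hh", raw))
    return [p[0] for p in pairs], [p[1] for p in pairs]

# Hypothetical stand-ins for the original master and the remaster.
orig = make_stereo([100, 200, 300], [-100, -200, -300])
rema = make_stereo([110, 210, 310], [-110, -210, -310])

# Cross the channels: left from the original, right from the remaster.
orig_left, _ = read_channels(orig)
_, rema_right = read_channels(rema)
crossed = make_stereo(orig_left, rema_right)

cl, cr = read_channels(crossed)
print(cl, cr)  # [100, 200, 300] [-110, -210, -310]
```

A real version would also need the peak alignment mentioned above, since any sample offset between the two masterings would defeat the comparison.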