HydrogenAudio

Lossless Audio Compression => Lossless / Other Codecs => Topic started by: Porcus on 2021-11-23 10:00:01

Title: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-11-23 10:00:01
So after I made a quite thoughtless across-tracks-decorrelation experiment (https://hydrogenaud.io/index.php?topic=121739.msg1005004#msg1005004), and played a bit back and forth with different codecs on that to sort out my confusion, I thought, why not run a test on some files I've already thrown at @ktf's FLAC betas (https://hydrogenaud.io/index.php?topic=120158.msg1003738#msg1003738).

I was in particular curious about one thing: OptimFrog's claim to use smarter decorrelation (http://losslessaudio.org/). Turns out, OptimFrog and Monkey's and TAK can get more out of stereo on CDDA than WavPack-below-x4 and FLAC do - but that is not the case for 96/24. In any case, it cannot explain OptimFrog's small file sizes; more likely the format is simply complex in general and puts your CPU to work to keep you warm during the winter.

The results should be interpreted with so much caution that I initially thought they might not be useful unless something striking showed up. For example, you cannot expect results to all point in the same direction:
If encoder X spends more time than encoder Y getting stereo file x smaller than stereo file y, we cannot tell whether it is more "efficient" at stereo or just spends more effort searching for patterns. We don't even know the theoretical compressibility of the L−R difference signal.

Anyway, I think we can take home a few findings:
* TTA cannot do mono! Yeah, it can handle multi-channel, but it cannot read mono .wav input files  *shrugs*
* The small-files compressors OFR/TAK/MAC compress both the mono and the stereo well. OFR's and TAK's heavier modes increase the stereo diff.
* WavPack at x4: consistent with tests at https://hydrogenaud.io/index.php?topic=120454.msg1004854#msg1004854 , WavPack "needs" x4 to compress hirez well. Maybe that is what it takes to get differences in the ultrasonic octave compressed?
* FLAC: the beta implements double-precision calculations that improve results quite a bit, especially for higher rez. A reasonable speculation on the "good" stereo reduction for stock 1.3.1 could be that it compresses away some of the effects of a "bad" roundoff to single precision that makes for more digits common to all channels. Bad roundoff common to both channels -> can in part be compressed away.
* FLAC's "-M" does pretty well. With that switch it does not fully evaluate L/R vs mid/side before deciding which one to use.


Columns: mono size (my locale uses comma for decimal separator), stereo gain in ppm; then stereo gain per sub-corpus. Not displayed: gains per file. To get an idea of per-file overhead, see instead the first FLAC column, and consider that there were 71 + 42 + 1 file(s).

Codec & setting                       | mono GB | stereo gain ppm (all) | CDDA, rock | hirez, rock | hirez, jazz/cl.
WAVE                                  | 10,61   |    2,9 |    7,4 |    1,8 |   0,0

FLAC  irls-2021-09-21 -8 --no-mid-side|  6,21   |    580 |    783 |    512 |   468
FLAC  irls-2021-09-21 -8 -M           |  6,21   |  5 500 | 12 290 |  3 073 | 1 864
FLAC  irls-2021-09-21 -8              |  6,21   |  6 485 | 13 459 |  4 432 | 2 312
FLAC  irls-2021-09-21 -5              |  6,26   |  6 608 | 13 670 |  4 548 | 2 364
FLAC 1.3.1 -8                         |  6,38   |  7 940 | 13 567 |  8 155 | 2 707
FLAC 1.3.1 -5                         |  6,44   |  8 274 | 13 978 |  8 718 | 2 747

WavPack -f                            |  6,58   |  4 744 | 11 979 |  3 077 |   −45
WavPack default                       |  6,40   |  4 953 | 10 967 |  3 994 |   546
WavPack -hx1                          |  6,27   |  5 782 | 16 284 |  1 987 |   199
WavPack -hhx4                         |  6,24   | 15 832 | 18 105 | 22 998 | 6 667

Monkey's normal                       |  6,29   |  6 923 | 17 691 |  3 403 |   828
Monkey's insane                       |  6,22   |  6 931 | 16 887 |  4 016 |   957

TAK -p2                               |  6,15   |  6 907 | 17 994 |  2 735 | 1 176
TAK -p4m                              |  6,09   |  7 440 | 18 453 |  3 389 | 1 654

OFR --preset 2                        |  6,06   |  6 457 | 17 286 |  1 785 | 1 456
OFR --preset 10                       |  5,98   |  7 604 | 18 657 |  3 706 | 1 631

Corpus in more detail:
The first two columns are the corpus from https://hydrogenaud.io/index.php?topic=120158.msg1003738#msg1003738 .
The "hirez jazz/cl." is one file where I merged together 106 minutes 96/24 from http://www.2l.no/hires/ , jazz/classical acoustic recordings sometimes recorded in multi-ch and downmixed. Same file as mentioned at the bottom of https://hydrogenaud.io/index.php?topic=120158.msg1001334#msg1001334 .
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: rutra80 on 2021-11-24 12:35:21
Could you please elaborate a bit on the columns? I'm too dumb to figure out what "ppm" means :)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-11-24 12:52:25
ppm = "parts per million". 1/10k of a percentage point.

So for the biggest overall effect, WavPack -hhx4: the mono files are 624/1061 of WAV size, that is 58.8 percent; the stereo files are around 1.6 percentage points less, 57.2 percent of WAV size.
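To make that arithmetic concrete, a small sketch (the figures are taken from the table above; variable names are mine, and 10 000 ppm = 1 percentage point):

```python
# WavPack -hhx4 figures from the table above (GB; comma locale -> dot here)
wav_gb = 10.61           # uncompressed WAV size
mono_gb = 6.24           # the two mono files together, compressed
stereo_gain_ppm = 15832  # stereo gain, in parts per million of WAV size

mono_pct = 100 * mono_gb / wav_gb              # mono files as percent of WAV
stereo_pct = mono_pct - stereo_gain_ppm / 1e4  # 10 000 ppm = 1 pct point

print(round(mono_pct, 1), round(stereo_pct, 1))  # 58.8 57.2
```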

Here you get the same with different formatting, where mono file sizes are in percent of .wav, and the differences are in percentage points:
Codec & setting                       | mono compression | 2ch diff in pctpts (all) | CDDA rock | hirez rock | hirez jazz/cl.
WAVE                                  | 100.0% | 2.94E−04 | 7.42E−04 | 1.84E−04 | 4.36E−06
FLAC  irls-2021-09-21 -8 --no-mid-side|  58.5% | 0.06 | 0.08 | 0.05 | 0.05
FLAC  irls-2021-09-21 -8 -M           |  58.5% | 0.55 | 1.23 | 0.31 | 0.19
FLAC  irls-2021-09-21 -8              |  58.5% | 0.65 | 1.35 | 0.44 | 0.23
FLAC  irls-2021-09-21 -5              |  59.0% | 0.66 | 1.37 | 0.45 | 0.24
FLAC 1.3.1 -8                         |  60.2% | 0.79 | 1.36 | 0.82 | 0.27
FLAC 1.3.1 -5                         |  60.7% | 0.83 | 1.40 | 0.87 | 0.27
WavPack -f                            |  62.0% | 0.47 | 1.20 | 0.31 | 0.00
WavPack default                       |  60.4% | 0.50 | 1.10 | 0.40 | 0.05
WavPack -hx1                          |  59.1% | 0.58 | 1.63 | 0.20 | 0.02
WavPack -hhx4                         |  58.8% | 1.58 | 1.81 | 2.30 | 0.67
Monkey's normal                       |  59.3% | 0.69 | 1.77 | 0.34 | 0.08
Monkey's insane                       |  58.6% | 0.69 | 1.69 | 0.40 | 0.10
TAK -p2                               |  58.0% | 0.69 | 1.80 | 0.27 | 0.12
TAK -p4m                              |  57.4% | 0.74 | 1.85 | 0.34 | 0.17
OFR --preset 2                        |  57.1% | 0.65 | 1.73 | 0.18 | 0.15
OFR --preset 10                       |  56.4% | 0.76 | 1.87 | 0.37 | 0.16
It may be surprising to see WavPack -hhx4 not out-compress FLAC, but that is because most of the corpus is high sample rate where WavPack doesn't shine as much and where the new FLAC beta improves a lot.
WAV file sizes:
CDDA rock: 3.05 GB (5h09min)
hirez rock: 3.4 GB (1h47)
hirez jazz/cl.: 3.4 GB (1h46)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: rutra80 on 2021-11-24 21:41:19
2chdiff - is that full file but stereo, or is that only the extracted mid/side or l/r difference signal?
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-11-24 21:42:46
That is the difference
one file in stereo - (one file for the left channel + one file for the right channel)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: rutra80 on 2021-11-24 22:04:34
Hmm, the question is whether codecs treat the difference as a separate signal and compress it separately, or whether it is somehow used for predictors etc. If the latter, I'm not sure compressing the difference signal alone tells much - it's not music and predictors aren't tuned for it... It somewhat resembles analysing lossy codecs by listening to the difference signal...
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-11-24 22:31:08
No, not "difference" as in difference signal - as difference in size.

What I did was split a stereo file into a left-channel file and a right-channel file.
Compressed the left-channel file and the right-channel file. That is a safe way to get "dual mono" of the same audio.
Compressed the original file too, with the same setting.

Then a measure of how much use the encoder makes of channel correlation is: how many percent does it gain when it can look at both channels?
A measure - I didn't say a precise one. But FWIW I think it says something about the FLAC revision and about some WavPack settings - and it suggests that OptimFrog's secret doesn't lie in exceptional handling of stereo, but rather in throwing heavy artillery at every signal.
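As a sketch of that measure (function and variable names are mine; sizes can be in any consistent unit - the normalization by uncompressed WAV size is the one used in the tables in this thread):

```python
def stereo_gain_ppm(stereo, left, right, wav):
    """Gain from encoding one stereo file instead of two mono files,
    in parts per million of the uncompressed WAV size."""
    return 1e6 * (left + right - stereo) / wav

# toy numbers: dual mono compresses to 620 MB, the stereo file to 605 MB,
# out of a 1000 MB WAV -> 15 000 ppm, i.e. 1.5 percentage points
print(stereo_gain_ppm(605, 310, 310, 1000))  # 15000.0
```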
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2021-11-28 19:00:50
Quite clear results. I only know the FLAC format very well; I just looked up WavPack and Monkey's Audio. Put simply, it seems WavPack and Monkey's Audio only implement a conversion to mid-side audio, while FLAC also has stereo decorrelation modes called left-side and right-side. To me it seems that for WavPack (and probably Monkey's Audio, though I'm not sure), the two channels are treated separately after either converting from left-right to mid-side or not.

So, apparently, the gain is not in the stereo decorrelation itself but in the way a mid or a side channel can be compressed. The best explanation I can come up with (but please note this is purely guesswork) is that FLAC is less equipped to deal with the small signals that might occur in the side channel of highly-correlated stereo. This would also explain why FLAC's deficit is only present for 16-bit (CDDA) material and not for 24-bit signals.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-11-29 00:14:29
Not sure what the right guesses are for CDDA vs hirez ... one thing is that a big part of hirez is uncorrelated noise. (Also, I have not timed these. No info here about how much more effort froggy puts into stereo than into mono, for example.)

Anyway, here is the CDDA-only part, where I have added columns that compare each mono to the FLAC beta at -8 mono, and each stereo to new -8 stereo. The final column is, say, (filesize FLAC -8 stereo minus Monkey's insane stereo) minus (filesize FLAC -8 mono minus Monkey's insane mono); that is: we know that Monkey's insane compresses more than FLAC does - how much bigger is that difference in stereo than in dual mono?

CDDA part only                        | mono compression | stereo compression | 2ch diff in %pts | 1ch vs new FLAC -8 | 2ch vs new FLAC -8 | diff of previous two
WAVE                                  | 100.0% | 100.0% | 7.42E−04 |       |       |
FLAC  irls-2021-09-21 -8 --no-mid-side|  67.2% |  67.1% | 0.08 |  0.00 | −1.27 |
FLAC  irls-2021-09-21 -8 -M           |  67.2% |  66.0% | 1.23 |  0.00 | −0.12 |
FLAC  irls-2021-09-21 -8              |  67.2% |  65.9% | 1.35 |  0.00 |  0.00 |
FLAC  irls-2021-09-21 -5              |  67.6% |  66.2% | 1.37 | −0.34 | −0.32 |
FLAC 1.3.1 -8                         |  67.3% |  65.9% | 1.36 | −0.08 | −0.07 |
FLAC 1.3.1 -5                         |  67.7% |  66.3% | 1.40 | −0.46 | −0.40 |
WavPack -f                            |  68.7% |  67.5% | 1.20 | −1.47 | −1.62 | −0.15
WavPack default                       |  67.4% |  66.3% | 1.10 | −0.20 | −0.45 | −0.25
WavPack -hx1                          |  66.9% |  65.3% | 1.63 |  0.32 |  0.60 |  0.28
WavPack -hhx4                         |  66.7% |  64.9% | 1.81 |  0.55 |  1.01 |  0.46
Monkey's normal                       |  66.1% |  64.3% | 1.77 |  1.15 |  1.57 |  0.42
Monkey's insane                       |  65.1% |  63.4% | 1.69 |  2.12 |  2.47 |  0.34
TAK -p2                               |  66.3% |  64.5% | 1.80 |  0.95 |  1.40 |  0.45
TAK -p4m                              |  65.8% |  63.9% | 1.85 |  1.47 |  1.97 |  0.50
OFR --preset 2                        |  65.4% |  63.7% | 1.73 |  1.81 |  2.19 |  0.38
OFR --preset 10                       |  64.4% |  62.5% | 1.87 |  2.82 |  3.34 |  0.52
We see that getting a stereo signal enables the higher-compressing codecs to increase their compression advantage over FLAC - not unexpected; and for WavPack, TAK and OptimFrog (but not Monkey's!) the higher modes do even better.
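Spelled out with the Monkey's insane row as an example (using the rounded table figures, which give ~0.4 where the unrounded data gave 0.34):

```python
# percent-of-WAV figures from the CDDA table above
flac8_mono, flac8_stereo = 67.2, 65.9   # new FLAC -8
ape_mono, ape_stereo = 65.1, 63.4       # Monkey's insane

mono_advantage = flac8_mono - ape_mono        # Monkey's lead in dual mono
stereo_advantage = flac8_stereo - ape_stereo  # Monkey's lead in stereo
extra = stereo_advantage - mono_advantage     # the table's final column
print(round(extra, 2))  # 0.4
```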
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-12-03 17:13:55
One bad-ass track, for what it is worth:
Merzbow: "I Lead You Towards Glorious Times". (https://www.youtube.com/watch?v=OzWNJtN86kU)

There are genres that mess up good codecs, and this is the least compressible CDDA track in my collection. Some compression tests here. (https://hydrogenaud.io/index.php?topic=120158.msg1001917#msg1001917) ("my final" test, my ass ... too curious. Rehab is for quitters.)

Anyway, because this is noise, one would not expect much help from stereo - indeed, the tests show that the stereo file isn't compressible much below PCM, so there cannot be much to find, eh? Still, codecs behave differently. These are sorted by stereo size (mono size maintains that order except between LA -high and -high -noseek).

stereo kB | mono kB | stereo − mono | codec & setting  | left − right
58782     | 58722   |    60 | ape normal        | −280
58716     | 58652   |    63 | ape extra high    | −269
58691     | 58647   |    43 | ape insane        | −290
58354     | 58545   |  −191 | tak default       | −216
58214     | 58362   |  −148 | tak -p3e          | −230
58184     | 58340   |  −157 | tak -p4m          | −233
58177     | 58177   |     0 | wav uncompressed  |    0
57879     | 57699   |   180 | wv default        |  −41
57571     | 57615   |   −44 | flac -5           | −258
57508     | 57552   |   −44 | flac -8           | −285
57507     | 57551   |   −44 | flac -9pe         | −286
54306     | 54693   |  −387 | ofr --preset max  | −221
53942     | 53916   |    25 | wv -hx4           |   80
53921     | 53914   |     7 | wv -hhx6          |   83
53909     | 53587   |   322 | la -high -noseek  | −311
53793     | 53588   |   204 | la -high          | −310
53757     | 53564   |   193 | la default        | −299
52224     | 52149   |    75 | ofr default       |  744
51401     | 51888   |  −486 | ofr --preset 10   | 1064
I tried to get three settings from each encoder: normal, high and maximal. OFR 10 was chosen as maximal by mistake, kept as "high" when I discovered --preset max, and then ... other choices were a bit ... arbitrary. The new FLAC beta produces bit-identical audio to 1.3.1, so the new "-9 -p -e" was a candidate for max (it didn't squeeze out much).

* The file fools OptimFrog's --preset max big time. And the LAs come out in the wrong order.
* I have long known that TAK doesn't like this piece of music, and Monkey's is even worse. Those return bigger files than the WAV. But TAK at the very least can utilize stereo.
* Indeed, fewer than half of these files get any help from stereo here. FLAC does, as these presets (which include -m) pretty much brute-force search the stereo options. TAK does even better. Two OFR modes do too, so here there actually might be some support for froggy's claim that it can make pretty good sense out of stereo.
* The two good WavPacks and the two good OFRs disagree with everything else about which channel should be smallest. Maybe because they are the only ones to make good sense out of the right channel.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: bryant on 2021-12-05 01:13:05
Sorry I’m a little late to respond here, but I have been following along. Thanks for analyzing this, it’s definitely interesting to see how the different compressors respond to stereo! I also was a little confused at first on what the columns meant, but you clarified it nicely.

I am only really familiar with how WavPack handles stereo, and how the “extra” modes work, so I’ll clarify that a little which will hopefully add something to the discussion.

In the “fast” and “normal” modes the default behavior (as was guessed) is just converting left-right to mid-side, and then treating the two channels completely independently. It can be turned off for comparison (-j0), but it’s almost always better. All of the “extra” modes check to make sure mid-side is improving things.
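The transform bryant describes can be sketched as below. This is the FLAC-style formulation (WavPack's bit-level details differ, but the idea is the same): halve the sum, keep the full difference, and recover the bit dropped by the halving from the side channel's LSB, so nothing is lost.

```python
def to_mid_side(left, right):
    # mid drops one bit in the >> 1; it is recovered on decode from
    # the side channel's LSB, so the transform is lossless
    return ((left + right) >> 1, left - right)

def from_mid_side(mid, side):
    total = 2 * mid + (side & 1)  # restore the halved sum exactly
    left = (total + side) >> 1    # ((l + r) + (l - r)) / 2 = l
    return (left, left - side)

# round-trip over some awkward sample pairs, including extremes
for pair in [(3, 2), (-3, 2), (0, 0), (-32768, 32767)]:
    assert from_mid_side(*to_mid_side(*pair)) == pair
print("round-trip ok")
```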

There is obviously still going to be some correlation between the channels even after mid-side encoding, and so the “high” and “extra high” modes take advantage of this. The filters with negative term values (-1, -2, and -3) employ this “cross-correlation”.

As for the “extra” modes, when I created the filters that are available at levels -x1 to -x3, there was very little high-resolution material out there (I think I had three tracks I captured somehow from a DVD) and so I didn’t use that in my corpus. Everyone was just comparing compression using CD audio and so I optimized for that.

The higher modes (-x4 to -x6) create new filters from scratch, so it makes perfect sense to me that those would be best for high-resolution (they have no preconceived notions).
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2021-12-30 20:42:24
So I knocked TTA for the wrong reason:
a few findings:
* TTA cannot do mono!
Yes, it can do mono! What this TTA version refuses to handle are ffmpeg-generated .wav files (https://hydrogenaud.io/index.php?topic=121924).

So I tested it. Same corpus. First table: the three rightmost figures are unsurprising, not far from WavPack default or -hx1: sixteen thousand ppm, five thousand ppm, and three hundred ppm.
But the Merzbow mono files fooled it. The monos sum up to 23 kbit/s worse than Monkey's normal, while stereo is 25 better.  Mono: worst by far; stereo: between flac -5 and wavpack default.  So it is a signal that fools it, and luckily it is a mono signal (less interesting), such that stereo finds what to do about it.


I also was a little confused at first on what the columns meant, but you clarified it nicely.
No wonder there is confusion when I cannot even make up my mind on whether size s(h)avings should be positive or negative numbers.


And ... :
while FLAC also has stereo decorrelation modes called left-side and right-side
It was only after reading this that it dawned on me that left-side and right-side are stereo decorrelation strategies - not weird channel configurations that FLAC chose to support. That explains my ignorant comment here (https://hydrogenaud.io/index.php?topic=121478.msg1004551#msg1004551).
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2021-12-31 10:12:46
It was only after reading this that it dawned on me that left-side and right-side are stereo decorrelation strategies - not weird channel configurations that FLAC chose to support.

Yes. It seems most lossless formats do only left & right or mid & side encoding, but FLAC can also choose to encode left & side or right & side. This is beneficial when there is some form of stereo correlation, but the resulting mid channel is more complex to encode than either left or right.
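A toy illustration of that choice (the cost function below is a crude stand-in for a real encoder's per-subframe bit estimate, not FLAC's actual one; all names are mine):

```python
def est_bits(ch):
    # crude cost proxy: bits for first-order residuals of the channel
    return sum((2 * abs(a - b) + 1).bit_length() for a, b in zip(ch, ch[1:]))

def pick_stereo_mode(left, right):
    mid = [(a + b) >> 1 for a, b in zip(left, right)]
    side = [a - b for a, b in zip(left, right)]
    costs = {
        "left-right": est_bits(left) + est_bits(right),
        "left-side":  est_bits(left) + est_bits(side),
        "right-side": est_bits(right) + est_bits(side),
        "mid-side":   est_bits(mid) + est_bits(side),
    }
    return min(costs, key=costs.get)

# a smooth left channel and a slightly messier right channel:
# the side channel is tiny, and left is cheaper than mid or right,
# so left-side wins - exactly the situation ktf describes
left = list(range(0, 2000, 7))
right = [x + i % 3 for i, x in enumerate(left)]
print(pick_stereo_mode(left, right))  # left-side
```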
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-04-06 18:16:35
Tested: 7.1 files - as they are and with channels exported to quad+quad or to stereo+stereo+stereo+stereo
Purpose: Effects of "multi-channel" decorrelation

... also purpose: lobbying @TBeck for TAK to do 7.1, as it would slay anything that isn't awfully slow. I have included the new TAK beta too!

Caveat: who knows what a representative 7.1 signal looks like - these files turned out to be something OptimFROG doesn't like, which is uncommon.

There are some Dolby Digital trailer files at https://thedigitaltheater.com/dolby-trailers/ . I retrieved the 7.1 files and remuxed the Dolby TrueHD audio streams to .mka. Note to those who want to play with the same thing: ffmpeg -acodec copy picks only the first audio stream, but luckily the lossless stream was first in all of them, saving me work.
After deleting a duplicate I was down to 18 files, a total of no more than 22 minutes; all but one are 48/24, but with lots of wasted bits.
Exported each to .wav in three ways:
* as-is: 7.1
* two quad files: "front" channels FL+FR+FC+LFE in one file, and "side/back" channels BL+BR+SL+SR in another (for TAK to handle it!)
* four stereos, in order 1&2, 3&4, 5&6, 7&8

... and then let encoders run for a couple of nights.

A major surprise emerged after OptimFROG'ing the stereos: this is material where OptimFROG performs worse than FLAC.
A not-so-big surprise: Monkey's performs badly - due to not utilizing wasted bits. That is, a 16-bit sample in a 24-bit container (padded with zeroes) compresses much worse than 16 in 16.  FLAC, WavPack and TAK compress them as well as 16 in 16; MPEG-4 ALS does so with the "-l" switch.
Several of these files appear to have parts where fewer than 24 bits are at work - but not necessarily during the entire file.
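The wasted-bits trick in code form. This follows FLAC's definition - trailing zero bits shared by every sample in a block are shifted out before prediction and only their count is stored - though the function and variable names here are mine:

```python
def wasted_bits(samples, bit_depth):
    # a trailing zero bit is "wasted" only if every sample has it;
    # OR-ing the magnitudes keeps exactly the shared trailing zeros
    ored = 0
    for s in samples:
        ored |= abs(s)
    if ored == 0:
        return bit_depth                    # digital silence
    return (ored & -ored).bit_length() - 1  # count trailing zeros

# 16-bit samples padded into a 24-bit container: everything is << 8,
# so 8 bits per sample can be shifted out and compressed away
padded = [s << 8 for s in (-3, 5, 0, 100)]
print(wasted_bits(padded, 24))            # 8
print(wasted_bits([-3, 5, 0, 100], 24))   # 0
```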

Why all these ALS settings tested?
-t# is ... well, what is it? The help file says it covers "two modes": joint stereo and #-channel decorrelation, where the channel count must be a multiple of the number. I take it that on a 7.1 file, -t4 means it tries joint stereo and tries 4ch groupings and picks the best.
In the very least, it gives an idea of what the encoder can make of considering several channels at once.
-l makes use of wasted bits.
-p is slow; -7 is awfully slow; -7 -p even worse.


Encoding time here is ... often just too expensive. Monkey's Extra High encodes in a minute; TAK -p4m (on two quad files per signal) in two. The ALS encoder hasn't seen much optimization, and the slow ALS modes take over two hours on 22 minutes of material (and much worse on the single 96/24 file).  The only thing slower is ffmpeg's WavPack encoder at "-compression_level 8".  No, not reference WavPack's -hhx6; ffmpeg has its own implementation, and its level "8" took 12x as much time as wavpack.exe -hhx6.

size/1024 | Codec (& fileset) & setting | Remarks
476340  | ALS -l -7 -p -t8 | 150-ish minutes. -7 is the slow mode. -p is "long-term" prediction (slower)
476445  | ALS -l -7 -t8 | ... and -t8 makes full 8ch decorrelation I think?
477487  | ALS 4ch+4ch -l -7 -p -t4 |
478512  | ALS -l -7 -p -t4 | Hm, quad decorrelation is slightly worse than splitting in quads
478654  | ALS -l -7 -t4 |
480668  | ALS 2ch+2ch+2ch+2ch -l -7 -p -t2 |
481019  | ALS 4ch+4ch -l -7 -p -t2 |
481134  | ALS 5.1file_AND_stereoSLSRfile -l -7 -p -t2 |
481410  | ALS -l -7 -p -t2 |
481461  | TAK 4ch+4ch -p4m BETA232 | Saves ~ 2 percent over four stereos
481521  | TAK 4ch+4ch -p4m | Takes 2 minutes
482031  | ALS -l -7 |
482050  | ALS 5.1file_AND_stereoSLSRfile -l -7 |
482057  | ALS 4ch+4ch -l -7 -t0 | (This had a "-t0" by mistake, I realize)
482062  | ALS 4ch+4ch -l -7 -t1 |
482083  | ALS 2ch+2ch+2ch+2ch -l -7 |
483200  | TAK 4ch+4ch -p4 BETA232 |
483263  | TAK 4ch+4ch -p4 |
485650  | TAK 5.1file_AND_stereoSLSRfile -p4m BETA2.3.2 | The 5.1 part is ~ 1.5 percent smaller than three stereos
485712  | TAK 5.1file_AND_stereoSLSRfile -p4m |
486690  | TAK 5.1file_AND_stereoSLSRfile -p4 |
488162  | ALS -l -7 -i | The "-i" is supposed to shut off joint stereo. But -7 is the heavy slow mode
489350  | TAK 2ch+2ch+2ch+2ch -p4m BETA232 |
489415  | TAK 2ch+2ch+2ch+2ch -p4m |
489509  | TAK 5.1file_AND_stereoSLSRfile -p2 BETA232 |
489540  | TAK 5.1file_AND_stereoSLSRfile -p2 |
490276  | TAK 2ch+2ch+2ch+2ch -p4 BETA232 |
490340  | TAK 2ch+2ch+2ch+2ch -p4 |
490826  | ALS -l -p -t8 | No "-7", takes "only" 40 minutes.
491952  | ALS -l -t8 | -t8 makes around 1.8 percent difference
494991  | ALS -l -t4 | -t4 takes out 2/3rds of the "-t8" effect
493132  | TAK 2ch+2ch+2ch+2ch -p2 BETA232 |
493164  | TAK 2ch+2ch+2ch+2ch -p2 |
499654  | ALS 5.1file_AND_stereoSLSRfile -l -t2 | Half a percent over the next.
501180  | ALS -l | = wasted bits. This ALS encodes at TAK -p4m speed
505319  | flac 2ch+2ch+2ch+2ch ktfs_irlspost -9 | ktf's IRLSPOST build, here FLAC utilizes stereo decorrelation
505444  | ALS -l -p -i | Uses wasted bits, but no joint stereo. "Long term"
507620  | ALS -l -i | Uses wasted bits, but no joint stereo.
509926  | flac 2ch+2ch+2ch+2ch -8ep |
510731  | flac ktfs_irlsbeta -9 | FLAC encodes as 8x mono.
510856  | flac 2ch+2ch+2ch+2ch -8e |
512365  | flac 2ch+2ch+2ch+2ch -8p |
513169  | flac 5.1file_AND_stereoSLSRfile -8e |
515495  | flac -8pe |
516245  | WavPack -hx4 |
516298  | WavPack 4ch+4ch -hx4 |
516441  | flac -8e |
516680  | WavPack -hx4 j0 | j0 is supposed to switch off channel decorr, does that work with 7.1 at -hx4?
517032  | WavPack 5.1file_AND_stereoSLSRfile -hx4 |
518110  | flac -8p |
520595  | OptimFROG 2ch+2ch+2ch+2ch --preset max | Whiskey.Tango.Frogxtrot?!
521995  | OptimFROG 2ch+2ch+2ch+2ch --preset 10 |
523309  | WavPack 2ch+2ch+2ch+2ch -hx4 |
523589  | flac -5 |
523604  | OptimFROG 2ch+2ch+2ch+2ch --preset 8 |
526597  | OptimFROG 2ch+2ch+2ch+2ch --preset 5 |
532411  | OptimFROG 2ch+2ch+2ch+2ch --preset 2 |
562507  | WavPack -f |
618871  | MLP 5.1file_AND_stereoSLSRfile | ffmpeg's MLP encoder. (Does not handle 7.1.) TrueHD: 1 KB smaller
below: CODECS W/O WASTED BITS CAPABILITY
710654  | Monkey 5.1file_AND_stereoSLSRfile EXTRA | Extra High better than Insane.
711777  | Monkey 5.1file_AND_stereoSLSRfile INSANE |
714264  | Monkey EXTRA | Takes 57 sec. Weaker than 6ch decorr + stereo decorr
714990  | ALS default | Takes 80 sec. No -l, so it does not utilize wasted bits.
715475  | Monkey INSANE |
719188  | Monkey 2ch+2ch+2ch+2ch EXTRA |
719587  | Monkey 2ch+2ch+2ch+2ch INSANE |
757712  | Monkey 4ch+4ch EXTRA |
759119  | Monkey 4ch+4ch INSANE |
776954  | tta 2ch+2ch+2ch+2ch | Seems that TTA also only decorrelates stereo. Four files ...
782144  | tta 5.1file_AND_stereoSLSRfile | One file is stereo
797231  | tta |
799264  | refalac | 70 sec. Reassigned BL,BR to make ALAC work!
802489  | tta 4ch+4ch | ... two quads are worse than a 7.1, is that file/block overhead?
803519  | refalac 5.1file_AND_stereoSLSRfile |
803806  | refalac 2ch+2ch+2ch+2ch |
971694  | DolbyTrueHD | Muxed out of the downloaded files. Much worse than ffmpeg's TrueHD encoder
1532054 | wav | Uncompressed PCM 7.1
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2022-04-06 19:03:49
Interesting list. Did you perhaps check whether the TrueHD original was pure MLP or an AC3 with correction?

I really don't understand how FLAC outperforms OptimFROG here; I've only seen that with chiptune before. The relatively small difference between FLAC 8-channel and FLAC as 4 stereos (1%), and the modest gains achieved by the much smarter algorithms employed by TAK and ALS, suggest that these files don't have much interchannel correlation to begin with.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-04-06 19:32:04
Did you perhaps check whether the TrueHD original was pure MLP or a AC3 with correction?

Oh. Damn. I'm not doing this over again :-o

13 minutes of the 22: ffmpeg -i says Stream #0:1(eng): Audio: truehd, 48000 Hz, 7.1, s32 (24 bit) (default)
9 minutes were .m2ts files where ffmpeg -i filename -acodec copy m2ts.mka gives something like
Code:
[SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 90k tbn
  Stream #0:2[0x1100]: Audio: truehd (AC-3 / 0x332D4341), 48000 Hz, 7.1, s32 (24 bit)
  Stream #0:3[0x1100]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 640 kb/s
  Stream #0:4[0x1101]: Audio: eac3 (AC-3 / 0x332D4341), 48000 Hz, 7.1, fltp, 1664 kb/s
  Stream #0:5[0x1102]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 640 kb/s
Output #0, matroska, to 'm2ts.mka':
  Metadata:
    encoder         : Lavf59.16.100
  Stream #0:0: Audio: truehd ([255][255][255][255] / 0xFFFFFFFF), 48000 Hz, 7.1, s32 (24 bit)
Stream mapping:
  Stream #0:2 -> #0:0 (copy)
What's it doing, really?
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2022-04-06 20:12:58
Did you perhaps check whether the TrueHD original was pure MLP or a AC3 with correction?

Oh. Damn. I'm not doing this over again :-o
That comment was mostly to put the very bad performance of TrueHD into perspective. It probably has a lossy stream as fallback embedded, which would explain why it compressed so badly.

What's it doing, really?
It copies stream 0:2 to the new (mka) file. Stream 0:2 is truehd. This doesn't tell us anything about whether that truehd stream is AC3+correction data or 'pure' MLP.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-04-06 21:12:44
Yes, it doesn't tell when in .mka, you are right!

Code:
> ffmpeg -i .\Chameleon.m2ts -acodec copy -vn -sn m2ts-to.m2ts

[...]
  Stream #0:2[0x1100]: Audio: truehd (AC-3 / 0x332D4341), 48000 Hz, 7.1, s32 (24 bit)
[...]
  Stream #0:0: Audio: truehd (AC-3 / 0x332D4341), 48000 Hz, 7.1, s32 (24 bit)
Stream mapping:
  Stream #0:2 -> #0:0 (copy)
[...]

This one does say something about it. But then info on the output file:
 
Code:
> ffmpeg -i .\m2ts-to.m2ts

[...]
  Stream #0:0[0x1100]: Audio: truehd ([131][0][0][0] / 0x0083), 48000 Hz, 7.1, s32 (24 bit)
Now "truehd (AC-3 / 0x332D4341)" has become "truehd ([131][0][0][0] / 0x0083)"


Over to Matroska:

Code:
> ffmpeg -i .\m2ts-to.m2ts -acodec copy .\m2ts-to.m2ts-to.mka

[...]

  Stream #0:0[0x1100]: Audio: truehd ([131][0][0][0] / 0x0083), 48000 Hz, 7.1, s32 (24 bit)
  No Program
  Stream #0:1[0x1100]: Audio: ac3, 0 channels, fltp
Output #0, matroska, to '.\m2ts-to.m2ts-to.mka':
  Metadata:
    encoder         : Lavf59.16.100
  Stream #0:0: Audio: truehd ([255][255][255][255] / 0xFFFFFFFF), 48000 Hz, 7.1, s32 (24 bit)
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
 
and info on the output file:

Code:
ffmpeg -i .\m2ts-to.m2ts-to.mka

[...]

  Stream #0:0: Audio: truehd, 48000 Hz, 7.1, s32 (24 bit)


Now it has become just "truehd". Which means that the absence of AC3 information does not rule out it being AC3. Oh.


Well, at least it wasn't an outright transcode.  That was my worry. Not that I know whether there is anything to worry about from a testing point of view. (Or maybe there is? Is there any lossy codec that, when decoded, is friendlier towards lossless compressor X than Y - without it being related to wasted bits?)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: kode54 on 2022-04-07 10:37:38
It's not that it's plain AC3. The question is if it's possible to find out if it's an AC3 elementary stream with a correction stream, or if it's pure MLP. Clearly, FFmpeg isn't telling you. Maybe that requires verbose output?
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-04-07 15:41:42
ffmpeg -i mkvfile.mkv -acodec copy -vn -sn mkv-to.m2ts
and then
ffmpeg -loglevel 40 -hide_banner -i .\mkv-to.m2ts
yields
Code:
[mpegts @ 000002854760b3c0] max_analyze_duration 7000000 reached at 7000000 microseconds st:0
[mpegts @ 000002854760b3c0] start time for stream 1 is not set in estimate_timings_from_pts
[mpegts @ 000002854760b3c0] stream 1 : no TS found at start of file, duration not set
[mpegts @ 000002854760b3c0] Could not find codec parameters for stream 1 (Audio: ac3, 0 channels, fltp): unspecified sample rate
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mpegts, from '.\mkv-to.m2ts':
  Duration: 00:00:40.13, start: 1.400000, bitrate: 5754 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
  Stream #0:0[0x1100]: Audio: truehd ([131][0][0][0] / 0x0083), 48000 Hz, 7.1, s32 (24 bit)
  No Program
  Stream #0:1[0x1100]: Audio: ac3, 0 channels, fltp
At least one output file must be specified
[AVIOContext @ 0000028547613f00] Statistics: 4886672 bytes read, 3 seeks
... whatever that means.

(Using -codec copy yields pretty much the same audio-relevant output.)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-02 09:21:02
Tested: near-mono stereo (CDDA)

Background: some time ago, when I did the test on the least-compressible CD in my collection (the above Merzbow; compression figures here (https://hydrogenaud.io/index.php?topic=120158.msg1001917) and some mildly shocking ones here (https://hydrogenaud.io/index.php?topic=122040.msg1010086#msg1010086)), I recalled that I had also once tested the other end of my CD collection (an Édith Piaf compilation).
There I stumbled upon a known deficiency in WavPack 4: bad mono optimization, kept for compatibility.
WavPack 5 has sacrificed the stone-age decoder compatibility, so why not do that test over again?


Closer inspection reveals the Piaf CD as almost mono indeed; only a few thousand samples differ between the channels, and never by more than the LSB (difference peak at −90.31 dBTP).  Differences in all tracks, yes, but in nine of the twelve only in the last second.
The job was probably outsourced to El Cheapo Basement Mastering.

So, as that is maybe as good as "mono encoded as stereo", I generated another near-mono that isn't mono: I took the "CDDA" corpus used here (https://hydrogenaud.io/index.php?topic=120158.msg1003738#msg1003738) - 71 tracks arbitrarily chosen by sorting my non-classical collection by track MD5 sum - and generated one file of the first 10 seconds of each, i.e. 11 min 50 seconds.
Extracted the left channel, ffmpeg-resampled it to 88.2 kHz and back to 44.1 kHz, and made a "stereo" file from the unaltered left channel and the re-resampled channel.
The result is very highly correlated channels, witnessed by encoders that can switch off stereo decorrelation: doing so with FLAC -4 to -8, WavPack -f and WavPack default created 61 to 65 percent larger files.  A bit more or less with other options (actually, WavPack -h isn't too good here).

In both I included a couple of oddball codecs - not because I believe they will be used, but to get a gut feeling for what it takes to get these kinds of signals number-crunched, compared to ordinary ones. After all, there is a long development path from "works on my small development sample" to "robust enough not to be made a fool of on someone else's music", and I am sure the developers of the alive-and-kicking codecs know that.


Test number 1: 41 minutes, most samples "mono as stereo", some LSBs differing at the end of every track:
size (% of wav) | codec | setting | remarks
100,0% | wav |  | Piaf 41 minutes almost mono
.. dual mono & the like:
44,4% | shn | (default) |
40,4% | alac | --fast |
40,1% | flac | -0 |
37,9% | wv-ffmpeg | default | ffmpeg's WavPack runs dual mono.
36,9% | flac | -3 |
31,5% | 7z | ultra |
30,2% | als | -i | -i means dual mono. Adding -l changes nothing.
.. Some of these are ... quite underwhelming:
25,4% | sac | --optimize=normal --sparse-pcm | 0.01x realtime for this. Complete failure.
20,8% | wv 4.80 | -hx | Old WavPack cannot cope, we knew that.
20,4% | wv 4.80 | -hhx6 | 2x realtime
20,3% | wv-ffmpeg | -compression_level 8 | 0.263x realtime and no match for WavPack 5.
20,2% | flac | -1 | -2 about the same
19,4% | tta |  |
19,2% | wv | -fj0 | New WavPack ... but shouldn't j0 be dual mono!?
18,7% | rka | -l3 | RKAU's heaviest mode does not impress
18,5% | alac |  |
18,4% | flac | -5 |
18,0% | la | -high -noseek | LA does not impress!
17,9% | ape | FAST |
17,8% | wv | -fx |
17,7% | rka | -l1 | RKAU's lightest mode beats the heaviest
17,6% | la | -normal | LA normal beats high
17,4% | flac | -8 | and -8p between -8 and WavPack default
17,2% | wv | default | FLAC -5 and WavPack default shouldn't beat all of the ones above here, eh? ;-)
.. TAK starts here:
17,0% | tak | -p0 | TAK files are in the right size order -p0, -p0e etc.
16,9% | flac | -8e | -e makes more sense on CDDA than -p
16,7% | flac | -8ep |
16,5% | tak | -p0m |
16,5% | flac | flake -5 |
16,4% | sac | --high --optimize=fast --sparse-pcm | 52 hours
16,4% | flac | flaccl -11 | Not as good as -8
16,4% | flac | flake -6 |
16,2% | flac | flaccl -8 |
16,0% | ofr | --preset 0 | --preset 0 has none of the frog's "optimizations" (compare WavPack's -x settings)
16,0% | wv | -hx |
15,9% | flac | flake -8 |
15,9% | wv | -hx4 |
15,9% | flac | flake -11 |
15,8% | wv | -hhx6 | 4x realtime, twice the speed of 4.80's -hhx6 (and saves a quarter size)
15,7% | ape | INSANE | Insane ape fooled again!
15,4% | flac | ktf's irlspost build -9ep | also ktf's double-precision build lines up here
15,4% | flac | ffmpeg -11 |
15,4% | flac | ffmpeg -12 -cholesky 12 | ffmpeg wins the FLAC game, but:
15,4% | ape | NORMAL |
15,4% | flac | ffmpeg -12 -cholesky 6 | ... ffmpeg-flac: 6 passes beats 12.
15,3% | ape | HIGH |
15,3% | ofr | --preset 1 | OptimFROGs are in the right size order except 9 beats 10
15,1% | ape | EXTRA HIGH |
15,1% | tak | -p1 |
15,1% | als | default | = "-l". A few KB better than with -t2
15,0% | als | -p |
15,0% | tak | -p2 |
14,8% | tak | -p4m | Between TAK -p2 and -p4m there is nothing but other TAK
14,4% | als | -7 -p | Smallest ALS (also tried -z3 -p, avoid)
14,4% | ofr | --preset 2 | This is OptimFROG's default
14,3% | sac | (default, i.e. "--normal") | < 0.5x realtime speed.
13,9% | sac | --normal --sparse-pcm | < 0.4x realtime. Only improving SAC option on this file.
13,8% | ofr | --preset 6 | frog-only territory from here
13,6% | ofr | --preset 9 | beats 10 narrowly
13,5% | ofr | --preset max | 1.85x realtime
.
Comments for this corpus:

* WavPack: predictably, it now does well - though not ffmpeg's version.
* FLAC: Here ffmpeg (with contributions from Justin Ruggles, creator of the original Flake) is doing something damn good.
* TAK: damn good.  And, it is damn hard to find signals where TAK is not in size order.

Then the ones that work more asymmetrically: OptimFROG is pretty much where you would think it is - --preset 9 beating --preset 10 is one of those little glitches in its machinery, but it doesn't do much to the overall picture.

* Monkey's: we are getting used to Monkey's Insane getting fooled.  Again I think the blocksize just isn't appropriate. 
* LA, RKAU and sac: It takes more than just a run-of-the-mill development corpus!
sac in particular: it is utterly useless except for this kind of benchmarking. Using the entire battery of sac's options brings it down to 0.007x realtime - gnawing at the same segment for forty-five minutes without a disk write, spending four days to make a compressed CD image file that is unsuited for playback. But ... sometimes there is something to be learned: even this brute force, where it can chew on a segment for an hour looking for the right model to encode, is quite worthless without some engineering craft.  While sac's "--sparse-pcm" modelling twist might be mildly interesting, its --optimize (apparently inspired by its idol OptimFROG, which in turn I think took the idea from WavPack's -x?) is of no use.
... on this signal.  On the Merzbow - baffling that it is even possible to out-frog OptimFROG on a signal that Monkey's and TAK cannot distinguish from static.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-02 09:50:22
Test number 2: 11 minutes with one channel an up-and-then-downsample of the other
That actually means most samples are different - but music-wise pretty much transparent to each other (admittedly I didn't do any listening test, but they should be). So highly correlated, but very few dual monos.

Since WavPack 4 vs WavPack 5 was one of the points of testing this, let me mention why they aren't both included in the table: Turns out the difference is small (WavPack 5 has slightly bigger overhead per block, some kilobytes - if you don't like that, just choose a bigger blocksize). Apparently, WavPack 4's deficiency in mono was in ... mono! Not in highly correlated stereo.
But WavPack is in for other surprises, as you will see.

Edit: Note again, this is 11 minutes, the previous one was 41. Read the times accordingly. Also, timings are not particularly rigorous, take them as "what ballpark".
(size)  (codec)    (settings)                          (comment)
..dual mono & the like.
 71,4%  bz2
 71,1%  flac       -0
 71,0%  shn
 68,0%  alac       --fast
 67,9%  wv         -fj0                                Penalty for WavPack 5: <0.1 points - and vanishing at high --blocksize setting
 67,8%  wv-ffmpeg  ffmpeg default
 67,4%  flac       -3
 66,6%  wv         -j0
 66,6%  flac       -8p --no-mid-side                   forces dual mono
 66,5%  als        -i                                  forces dual mono
 64,3%  xz                                             General purpose compressors beating dual mono
 61,4%  7z         PPMd
 59,7%  7z         LZMA2 ultra
..Some of these are ... quite underwhelming.
 52,9%  la         (default)                           Again LA disappoints
 51,5%  wv         -hj0                                Shouldn't j0 force dual mono?
 50,8%  la         -high                               LA high beats default, but nothing special.
 47,6%  flac       -1
 44,0%  ape        FAST                                Monkey's in the right order, but ...
 43,7%  tta
 43,2%  wv         -h                                  That wasn't good?
 43,0%  ape        NORMAL
 42,8%  ape        HIGH
 42,6%  wv         -hh
 42,5%  ape        EXTRA HIGH
 42,4%  ape        INSANE                              ... but every monkey beaten by WavPack -f?!
 41,8%  wv         -f                                  -f beats -hh?! Same with 4.80
 41,1%  flac       -8p                                 -4 to -8 between wv -f and here
 40,9%  wv         -hx
 40,5%  wv         (default)                           WavPack default beats -hx
 40,4%  flac       ffmpeg -12 -cholesky 6
 40,4%  wv         -x                                  At least -x improves
 40,4%  flac       ffmpeg -8                           -compression_level 8 beats ... ?
 40,3%  wv         -x4                                 Max blocksize squeezes 0.11 points
 40,2%  flac       ffmpeg -12
 40,1%  wv         -hhx4                               107 seconds
 40,1%  wv-ffmpeg  -compression_level 4                183 seconds
 40,1%  wv         -hhx6                               230 seconds
 40,0%  wv-ffmpeg  -compression_level 6                356 seconds. WavPack format 4, and beaten by WavPack.exe 4.80
 40,0%  wv 4.80    -hhx4                               Only half as fast as 5.40, but still beats ffmpeg ...
 40,0%  wv-ffmpeg  -compression_level 8                ... except ffmpeg at speeds you do not want to endure in daily use
 40,0%  alac       (refalac!)                          At 7 seconds, ALAC is surprisingly good
 39,9%  flac       flake -11                           19 seconds
 39,9%  flac       flaccl -11                          10 seconds
 39,8%  flac       flac-irls -9                        ktf's IRLSPOST build takes 66 seconds
 39,7%  flac       irls -9p                            3 minutes. In between here: regular -8ep and double precision -8ep
..TAK starts here.
 39,3%  tak        -p0                                 3 SECONDS. (And on spinning drive.)
 38,7%  als        (default)                           12 seconds. Bit-exact to "-l".
 38,6%  sac        --normal --optimize=normal          SAC's "optimize" only wastes *hours* on these 11 minutes
 38,4%  tak        -p1                                 all TAK are nicely ordered
 38,3%  rka        -l2                                 RKAU 2 better than 3
 38,2%  ofr        --preset 0                          Bit-exact same file as --preset 1
 38,1%  tak        -p2                                 TAK default shaves a point off -p0
 37,9%  sac        --high --optimize=high --sparse-pcm 51 hours compressing 11 minutes.
 37,8%  tak        -p4                                 7 seconds
 37,7%  tak        -p4m                                13 seconds. Two points s(h)aved off the smallest FLAC, one off the second-smallest fast codec ALS
 37,4%  als        -l -7 -p                            1.2x realtime, like OptimFROG --preset max. -7 about same size
 37,1%  ofr        --preset 5                          5 and 4 worse than 2 ...
 36,3%  ofr        --preset 2                          15 seconds
 36,2%  sac        --normal --sparse-pcm               "only" 26 minutes or 0.4x realtime
 36,0%  sac        --high --sparse-pcm                 32 minutes
 35,5%  ofr        --preset 3                          22 seconds, and smaller files than presets up to 7
 35,4%  ofr        --preset 8                          65 seconds to improve over --preset 3
 35,1%  ofr        --preset 10                         150 seconds
 35,0%  ofr        --preset max                        Still > 1x realtime. Tweaking options at this level yields the same file

The "faster" (de)compressors:
* WavPack: uh, this wasn't as it should be - -f beating -hh?
Also, it is much appreciated that the ffmpeg team supports WavPack encoding - but WavPackers might still as well use its default setting and then recompress with WavPack 5. Unless you use the maddest -compression_level settings, and those don't really pay off that well.
* FLAC/WavPack: WavPack has only one stereo decorrelation strategy (mid/side), and I have a hunch that FLAC's willingness to try different strategies is what makes it better on this file.
* TTA: Not horrible, not impressive ... nobody cares?
* ALAC: !! What the f**k is it doing there? That is good!
* MPEG-4 ALS is kinda the second-best fast codec here: its default operation is faster than the slowest TAK -p4m and smaller than any FLAC/ALAC/WavPack (/tta)
* TAK: Again TAK starts where FLAC ends. (ALS? The 10 minutes ALS isn't a fast decoder either.)
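(A side note on the mid/side transform under discussion: it stays lossless despite the halved sum, because the dropped bit survives as the parity of the side channel. A minimal Python sketch - not any particular codec's actual code:)

```python
def ms_encode(left, right):
    """Lossless stereo decorrelation: mid = floor((L+R)/2), side = L-R."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """L+R and L-R share parity, so the bit lost in the shift is side & 1."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = 2 * m + (s & 1)   # restore the dropped parity bit
        left.append((total + s) // 2)
        right.append((total - s) // 2)
    return left, right

# round-trip on a nearly-mono pair, including the 16-bit extremes
L = [100, -5, 32767, -32768]
R = [101, -5, 32766, -32768]
assert ms_decode(*ms_encode(L, R)) == (L, R)
```

On a nearly-mono signal the side channel is almost all zeros, which is the whole point of the mode.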

The "heavier" ones:
* Monkey's (and LA?): no clever stereo decorrelation strategy - maybe a naive mid/side like WavPack? That is more of an objection to something that sets out to win the compression game than to WavPack, which has its priorities more balanced. But at least Monkey's and LA get their sizes in order of the presets this time.
* OptimFROG: surprise that its size orders are so mixed here. Preset 3 beating preset 5 by 1.6 points is quite a bit out of froggy character. 
* sac: Again, a good codec takes more than CPU time, it has to be spent wisely.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-03 11:45:35
Oh, on TAK:
* Also tested the TAK 2.3.2 beta, since ... well, it isn't targeting improvements, but if the fixes applied were to matter anywhere, it could very well be on oddities? Nothing to write home about: max difference 9 KB on forty-five to seventy megabytes. Although the difference is larger on p4 than on p3 than ... etc., and none are worse (read off in integer KB only, though).
* Actually it isn't completely true that all TAK are in order, there is one exception: in test 1 (the most-samples-mono Édith Piaf), -p1 is slightly better than -p1e. Both in 2.3.1 and the beta. Around 0.01 percentage points.


And I see that I have included so many that they obscure the magnitudes. And also that the compression is so high that I should maybe have quoted savings in percent rather than points - well I did point out that WavPack 5 saves "a quarter" size off WavPack 4 on the essentially-mono first corpus.
But also, note how patient FLAC users can save like 16 percent over FLAC -5 on the first CD, but only four percent on the second signal (which is much more varied!).
While on the other hand, the savings in going FLAC to TAK or TAK to OFR are about the same percentwise ballpark in both tests. (Five plus/minus one percent (not point!) - respectively eight plus/minus one.)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: bryant on 2022-05-05 04:43:14
Thanks again Porcus for your meticulous testing! Like you say, not the most useful files for choosing a compressor, but interesting to get down to the nuts and bolts!

And you guessed right about the WavPack mono-optimization thing. It does only apply to stereo that’s 100% mono. The original omission was to not check for and make special provision for truly identical channels (i.e., encode only one channel) and we were instead encoding it as mid/side. Of course the “side” would have been total silence, and this would have been fine if the silence detector worked on individual channels, but it doesn’t. So there it is.

And you may very well be right about the second test results and the ability of some codecs (FLAC) to choose from more than left+right and mid+side. I suspect the other two are left+side and right+side and I think that actually very old WavPack 3 could do this (or at least I experimented with it at one point). In the end though I decided that the improvement was so rare and small that it didn’t justify the extra time to check.
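A toy version of such a four-way mode decision, with a crude bits-per-sample proxy standing in for the real cost estimate (actual encoders measure the residual after prediction, not the raw channels):

```python
def cost(xs):
    """Rough per-sample cost: bit length of the mean absolute value."""
    mean = sum(abs(x) for x in xs) // len(xs)
    return (mean + 1).bit_length()

def pick_stereo_mode(left, right):
    """Pick the cheaper of left/right, mid/side, left/side, right/side."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    modes = {
        "left/right": cost(left) + cost(right),
        "mid/side": cost(mid) + cost(side),
        "left/side": cost(left) + cost(side),
        "right/side": cost(right) + cost(side),
    }
    return min(modes, key=modes.get)

# highly correlated channels -> a side-based mode wins (ties broken by dict order)
mode = pick_stereo_mode([1000, 2000, 3000], [1001, 1999, 3001])
```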

There is one other interesting thing here, and that’s the --sparse-pcm option of sac. I believe that this refers to the situation where some PCM values are missing or there are probabilistic peculiarities in the PCM value’s distribution (the former being just a specific case of the latter).

The simplest and most universally encountered and easily handled version of this is the zeroed-LSBs phenomenon that, for example, lossyWAV takes advantage of. This is trivial to implement and seems to crop up in real samples more than one would expect. Interesting fact: WavPack's version also handles cases where the LSBs are 1s instead of 0s, and the case where they're either 1s or 0s but still identical for each sample. I have no idea how often those cases come up, or if other compressors bother, but it was so trivial to add after the 0s case that I could not resist.
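That constant-LSB generalization (all 0s, all 1s, or any fixed bottom pattern) can be detected in a single pass; a sketch of the idea, not WavPack's actual code:

```python
def wasted_bits(samples, bits=16):
    """Return (n, pattern): the number of trailing bits that are identical
    across all samples, and the constant pattern they hold.
    Covers zeroed LSBs, all-ones LSBs, and any fixed mix of the two."""
    if not samples:
        return 0, 0
    pattern = samples[0]
    agree = ~0
    for s in samples:
        agree &= ~(s ^ pattern)   # bits where every sample matches sample 0
    n = 0
    while n < bits and (agree >> n) & 1:
        n += 1
    return n, pattern & ((1 << n) - 1)

assert wasted_bits([8, 16, 24]) == (3, 0)   # three zeroed LSBs
assert wasted_bits([7, 15, 23]) == (3, 7)   # three all-ones LSBs
```

An encoder would then code `(s - pattern) >> n` and store `n` and the pattern in the header.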

Funny story: when I submitted a patch to handle this for FFmpeg, Michael resisted a little, because this is unnecessarily complicating the code to handle something that should not exist. In a purist sense it's simply wrong, which might be why some codecs refuse to deal with it altogether.

But the zeroed LSB case is just the tip of the iceberg. Imagine the cases where every represented PCM value is a multiple of some other, non-power-of-two integer, like 3 or 5. This is not that different from the zero LSBs case, except it’s no longer trivial to detect, and probably not that common.
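For the exact-multiple case (no truncation yet), detection is just a gcd over the track; the truncated variant described next needs fuzzier matching than this:

```python
from functools import reduce
from math import gcd

def common_factor(samples):
    """Largest k such that every PCM value is an exact multiple of k.
    Dividing by k before encoding (and storing k) saves log2(k) bits/sample."""
    return reduce(gcd, (abs(s) for s in samples), 0)

# e.g. a track whose values were all scaled by 3:
# common_factor([0, 3, -6, 300, -32766]) -> 3
```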

I actually wrote a program at some point to analyze PCM audio for these kinds of uneven distributions and then ran it on a sampling of thousands of CD tracks in my collection. The cases are surprisingly common. One of the most common types I found was where the PCM values had obviously been multiplied by some value greater than 1 and just truncated (not dithered) resulting in regularly spaced missing values. I was able to squeeze these out (losslessly, of course) and get significantly better compression, in some cases then blowing WavPack past all the competitors. I even went down this road pretty far devising easy ways to detect these cases and encode the “formula” to convert back to the original sample values. This was complicated by cases I found where this process had occurred twice, and there were variations on how positive and negative values were handled.

Another situation, which I also encountered, was where there were no missing codes, but adjacent codes that should have been equally probable were not. So imagine a file where even values were twice as probable as odd ones. Quite a bit of entropy there to take advantage of (maybe close to half a bit per sample?) but not really obvious how to take advantage of it.
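For what the LSB skew alone is worth, the arithmetic is quick (a real signal could offer more if the skew extends beyond the bottom bit):

```python
from math import log2

# even PCM values twice as probable as odd ones -> the LSB is a biased coin
p_even, p_odd = 2 / 3, 1 / 3
h_lsb = -(p_even * log2(p_even) + p_odd * log2(p_odd))  # about 0.918 bits
saving = 1 - h_lsb  # about 0.082 bits per sample from the LSB alone
```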

In the end I decided that this was not worthwhile. For one thing I feared that perhaps my rather old CD collection was not representative and modern mastering tools would not create audio like this any more. And of course this was going to be slow and create enormously complex code. It was an interesting exercise and it would have been fun to release a WavPack that significantly bettered existing encoders on specific files, but in the end thought better of it.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-05 11:29:16
The --sparse-pcm feature of sac is also "relatively cheap" - relatively meaning a 20 to 25 percent time penalty. That isn't much compared to a factor of 50 (not percent - that would be like 5000!) for letting it do its frog-inspired --optimize thing.
Source available (https://github.com/slmdev/sac) - didn't see a license, but ideas are out of the bag [insert rant about certain codec license here]
Also on the Merzbow track the --sparse-pcm makes much more difference than the other options to put into it. (I wrote wrong here, should be that "Normal" mode beats "High" at compression. (https://hydrogenaud.io/index.php?topic=122040.msg1010086#msg1010086))


But the zeroed LSB case is just the tip of the iceberg. Imagine the cases where every represented PCM value is a multiple of some other, non-power-of-two integer, like 3 or 5. This is not that different from the zero LSBs case, except it’s no longer trivial to detect, and probably not that common.
I think you mentioned this once here yes.  But you only checked for integer multiples?
Hunch: what if the "last three steps" to the final 44.1/16 file are resampling to 44.1, peak normalization and dithering down to 16.
Peak normalization is scaling.

Also I have scratched my head over ... more quiet parts (= blocks) could in principle be handled by
(1) upscaling, remembering the scaling factor, and
(2) wasted bits in the upscaled signal.
Worth it? Maybe not? Had the 14-bit DTS CDs achieved world domination, it would have been a different thing.
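A sketch of that two-step idea (hypothetical helper names): scale a quiet block up by 2**k, and the created zero LSBs are exactly what a wasted-bits detector strips again - so the win, if any, is in whether k can be stored more cheaply than the headroom it replaces:

```python
def upscale(block, bits=16):
    """(1) Scale a quiet block up by 2**k so its peak approaches full scale;
    remember k."""
    peak = max((abs(s) for s in block), default=0)
    k = 0
    while peak and (peak << (k + 1)) < (1 << (bits - 1)):
        k += 1
    return [s << k for s in block], k

def downscale(scaled, k):
    """(2) Decoder side: shift back down (the k LSBs are zero by construction,
    i.e. wasted bits)."""
    return [s >> k for s in scaled]

quiet = [3, -7, 12, 0, -2]
scaled, k = upscale(quiet)
assert downscale(scaled, k) == quiet
```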


For one thing I feared that perhaps my rather old CD collection was not representative and modern mastering tools would not create audio like this any more.
If my above hunch has anything to it, then we shouldn't even be surprised if "modern mastering tools" do this more often. Edit: when I hit the submit button, TBeck was already writing that he only found it in old recordings. Oh well ... that's what I get for making layman's speculations. Which, anyway, follow unedited from here:
Modern mastering tools include the musicians' own computers to an extent unheard of in 1990. And who knows whether they do things in the "correct" order, when the outcome anyway has so much more than enough resolution that nothing will be audible.

Also I have scratched my head over: suppose for example I create a 16-bit signal with peak of 8 bits below digital full scale, and then pad to 24. WavPack/FLAC/TAK handle the 8 wasted LSBs. They also certainly benefit from the 8 MSBs being zero (all numbers are smaller!), but do they exploit this fully?

Padding to 24 isn't farfetched. There must be a lot of files around that once were 16 bits, but were then imported into say Audacity for a minor adjustment (like peak normalization?!) and then exported to 24 bits ... dithered at the bottom. Which means there is a linearly transformed "bunch of wasted bits" signal that differs from this by ... some noise that anyway needs to be stored as residual.
(Insert obvious analogy to least-squares fit here.)
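A small demonstration of that 16-padded-to-24-with-dither case (synthetic data; the 1-LSB dither model is a stand-in for whatever the editor actually does): the wasted-bits trick finds nothing, yet rounding to the nearest multiple of 256 recovers a shifted 16-bit signal, leaving only the dither as residual:

```python
import random

random.seed(1)
src16 = [random.randint(-32768, 32767) for _ in range(1000)]

padded = [s << 8 for s in src16]                            # 16 bits padded to 24
dithered = [s + random.choice((-1, 0, 1)) for s in padded]  # dither at the bottom

# no constant LSBs survive, but the structure is still recoverable:
recovered = [((d + 128) >> 8) << 8 for d in dithered]
residual = [d - r for d, r in zip(dithered, recovered)]
assert recovered == padded
assert max(abs(x) for x in residual) <= 1   # only the dither is left to store
```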
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: TBeck on 2022-05-05 11:30:54
Thanks again Porcus for your meticulous testing! Like you say, not the most useful files for choosing a compressor, but interesting to get down to the nuts and bolts!
Full consent!

One of the most common types I found was where the PCM values had obviously been multiplied by some value greater than 1 and just truncated (not dithered) resulting in regularly spaced missing values. I was able to squeeze these out (losslessly, of course) and get significantly better compression, in some cases then blowing WavPack past all the competitors.
I tried the same for TAK when I noticed that OptimFROG had implemented such a feature. OptimFROG also seems to take advantage of non-uniform quantization (12-bit DAT, for example). My test corpus was quite small, but I too got the impression that most affected files were comparatively old recordings. I am not even sure there was even one example in the music I have bought in the last 15 years.

In the end I decided that this was not worthwhile. For one thing I feared that perhaps my rather old CD collection was not representative and modern mastering tools would not create audio like this any more. And of course this was going to be slow and create enormously complex code. It was an interesting exercise and it would have been fun to release a WavPack that significantly bettered existing encoders on specific files, but in the end thought better of it.
Same conclusion here. And I have to admit I wasn't smart enough to find a detection algorithm that would have been fast enough by TAK's standards. Furthermore, OptimFROG's savings were still twice as large for more than a few of my test files. To do better I would have to modify far more of the (quite well-tuned) existing codec than I wanted to.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-05 12:02:14
I tried  the same for Tak when i noticed that Optimfrog had implemented such a  feature. Optimfrog also seems to take advantage of non uniform quantization (12 bit DAT e.g.).

You guys and Florin - who has been fiddling around with an asymmetric frog extension too! (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.638.4853&rep=rep1&type=pdf) - should stick your heads together to see what you can unify.
Take one letter from each of WavPack, TAK and OptimFROG, in that order, and hope that it doesn't make World of Warcraft (https://fileinfo.com/extension/wtf) crash. If that is a problem, bring on board the ALS for WT A(ctual) F.  :D

(The serious part of this is that it is unrealistic to extend FLAC, as it is probably found in too many non-upgradeable embedded devices - but it could bring optimizations to WavPack, Matroska to TAK and playback support to the frog.)

Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-06 11:04:27
Oh, and you must leave it to @TBeck to announce it. Just make sure he doesn't miss the date by less than an hour (https://hydrogenaud.io/index.php?topic=122317.msg1009743#msg1009743).


... and to new HA readers, here is a classic in the history of lossless audio: 
Once upon a time, on April 1st, someone posted about a codec compressing to sizes smaller than Monkey's High and at speeds like this (https://hydrogenaud.io/index.php?topic=43179.msg377983#msg377983).
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: rutra80 on 2022-05-08 21:16:31
Would it be possible for any lossless codec to get anywhere close to RAR with stuff like this? It's ZX Spectrum beeper PWM.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-08 22:55:15
I suspected that these files have repeating patterns, and yes: when I cut one of them - the biggest, the DarkFusion - into one second segments and compressed each separately with xz -9 -e, file sizes increased by forty percent.

Here is the point: if you process 162 files with a general purpose compressor that looks for inter-file patterns, it will detect that file 3 and file 151 are identical, and virtually compress away one of them.
An audio format meant to be streamable doesn't do that. After two and a half minutes you would get an instruction that says "repeat the segment at seconds 3-4" - yeah sure, the decoder hasn't kept that around.
An audio format not meant to be streamable ... go ahead if you want to, and you will be able to outfox some other codecs on some samples yes, but ... is it worth it?
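The point is easy to demonstrate with any window-based general-purpose compressor (zlib here; xz's far larger window just extends the reach across whole files):

```python
import random
import zlib

random.seed(42)
# "one second of audio": 16 KiB of random, essentially incompressible bytes
segment = bytes(random.randrange(256) for _ in range(16384))

separate = 2 * len(zlib.compress(segment, 9))      # two independent streams
joint = len(zlib.compress(segment + segment, 9))   # one stream sees the repeat

# the second copy becomes almost entirely back-references into the first,
# which a seconds-at-a-time streamable audio format cannot do
assert joint < separate
```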

(WavPack beats all the audio codecs on that track though - but you have to use reference WavPack, not ffmpeg.)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: mycroft on 2022-05-10 09:40:41
Really have you proof for your "claims"? FFmpeg wavpack encoder is marginally different from wavpack reference ones.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-10 10:08:38
Really have you proof for your "claims"?
Files appear too big to attach, so please go to https://www.sendspace.com/filegroup/JEjBiar2HYKPs2fq9zFAhQ to find encodes of the DarkFusion.wav audio:
* a 7.47 MB file produced using reference WavPack.
* the 8.74 MB file produced by recompressing the former by ffmpeg 5.0 using the awfully slow -compression_level 8.

Now open the WavPack file and see what mode it is. Hint: it is not "high". In case you will bother to even look at it - last time I took the effort to upload to you any evidence in form of a file, you just ignored it (https://hydrogenaud.io/index.php?topic=122094.msg1008187#msg1008187) and never came back to the topic at all.


FFmpeg wavpack encoder is marginally different from wavpack reference ones.
ffmpeg WavPack produces version 4 files.
Above I used the newest reference WavPack.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: mycroft on 2022-05-12 08:27:30
Yes. I just tried to download those files, and sendspace is on scheduled maintenance. Why I ever bothered to visit this thread to get proofs for "bold" claims.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-05-12 09:18:26
Are you calling fake just because Sendspace temporarily is on maintenance? Do we need to teach you how to download WavPack and ffmpeg and try for yourself?

In "honor" of this bullshit of yours, I re-ran it - this time through WavPack 4.80 with and without --optimize-mono, which explains the whole thing:
(https://i.imgur.com/27W1yGi.png)
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: rutra80 on 2022-05-12 09:28:51
These files are technically stereo but the content is mono - both channels should be the same.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: mycroft on 2022-06-07 08:23:29
Again I get attacked by brave user of this forums, next time use -optimize_mono switch if you will ever be bothered again.

Hope your smart team is doing well.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: mycroft on 2022-06-07 08:29:17
Why I ever bothered to write decoder for TAK? Users are typically ungrateful as always. Wanted to write RKA decoder too, but looking how are users very ungrateful I will pick something else to do and will never ever be bothered with evil users again.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2022-06-12 21:10:10
Hope your smart team is doing well.
Mine? "team"? I'm just my arrogant swiney self, not affiliated with any codec developer(s). Bark up a different tree.

My issue with you here at the forum is not what code you make elsewhere (if you implemented the TAK decoder, then I am grateful even if I don't use TAK - and it was a better choice than RKAU yes). But your behaviour here isn't ... constructive.

Like, going full denial just because sendspace has a temporary downtime (that still lets you see the files!)?

And had you a solution to the TTA error handling crap or had you not (https://hydrogenaud.io/index.php/topic,122094.msg1008187.html#msg1008187)? Hard to tell, and with that attitude, people will just shrug it off as useless.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2023-04-04 20:38:20
@bryant ... this about wasted bits:
Interesting fact: WavPack’s version also handles cases where the LSBs are 1s instead of 0s and the case where they’re either 1s or 0s, but still all identical for each sample. I have no idea how often those cases come up, or if other compressors bother, but it was so trivial to add that after the 0’s case that I could not resist.

I wonder whether that is what is going on in this track? https://relapsesampler.bandcamp.com/track/elysium-of-dripping-death
You can avoid the $1 price tag by downloading the entire compilation: https://relapsesampler.bandcamp.com/album/relapse-sampler-2015

But if so, it seems that --pre-quantize zeroes out the bits? Which more or less kills WavPack's advantage. Filesizes follow - "pre-quantized" FLAC files are re-encodes of the same WavPack-generated signals.

399515646   WAVE
270847726   ALAC (ffmpeg)
264226120   ALAC (refalac)
263107430   Monkey's FAST
254236383   TTA
250527682   Monkey's NORMAL
250377278   Monkey's HIGH
249947571   ALS default
248810066   Monkey's INSANE
247930508   TAK -p0
246826374   Monkey's EXTRA HIGH
241284186   OptimFROG --preset 0 and --preset 1
234564608   OptimFROG --preset 2 (default)
234100276   FLAC ... -8pe I think
233799103   OptimFROG --preset 3 beats presets 0, 1, 2, 5, 4, 7, 6
233695667   FLAC -8pel32 -r8 -A subdivide_tukey(4) (That.Was.Slow.)
233066234   OptimFROG --preset max
232621364   TAK -p4m
221559078   23-bit pre-quantized WavPack at -hhx4
219641625   23-bit pre-quantized FLAC at -8p
197709053   ALS -7 -p
190900370   21-bit pre-quantized WavPack at -hhx4
189654475   21-bit pre-quantized FLAC at -8p
153522746   WavPack -f.  On the original file yes. NOTE: to the byte same filesize as:
153522746   17-bit pre-quantized WavPack at -f
141018808   WavPack -g (default): to the byte same filesize as:
141018808   17-bit pre-quantized WavPack at -g (default)
131267306   WavPack -hx6
129890175   17-bit pre-quantized FLAC at -8p
129650300   17-bit pre-quantized WavPack at -hhx4
129650300   WavPack -hhx4.  Same filesize as 17-bit!
129464830   WavPack -hhx6

Shouldn't exist ... and so WavPack slays everything else by 1/3 (... ALAC by 1/2 ...). Quite a showcase.
That is, until you try to squeeze more out of it in a lossy manner. Now you "have to" make --pre-quantize smarter, don't you? ;D If you allow it to zero out the LSB, you should allow it to add one too ... ahem, unless the sample is full scale all ones. Oops.

Edit: I wonder, is there any "known software" that does any polarity inversion like sending every sample S --> -(S+1)?
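(That mapping is just bitwise NOT in two's complement: flipping all bits of S gives -(S+1). So any tool that "inverts polarity" by flipping bits rather than negating-and-clipping would produce exactly it - whether any real software does so, I don't know.)

```python
# two's-complement identity: ~S == -(S + 1), for the full 16-bit range too
for s in (0, 1, -1, 12345, 32767, -32768):
    assert ~s == -(s + 1)
```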
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: darkalex on 2023-04-04 22:43:37
Hey Porcus,

So what's your conclusive finding after running this multitude of tests on lossless compressors?

As the dev said, FLAC nails it for 16-bit CDDA audio, but isn't so helpful with 24-bit high-res stuff. Which lossless compressor can give compression comparable to FLAC on 16-bit, but for 24-bit 96 kHz?

Users with MacBooks and other computers with soldered storage would share my pain of keeping the 256 GB drive healthy ;-;
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2023-04-04 23:38:39
For a computer with 256 GB, the solution is of course an external drive ;-)  Besides you want backup anyway. You aren't keeping the only copy of your music collection on a portable unit ... I hope?

Concerning compression performance, both CPU power and storage have gotten cheaper - but if you can barely fit your entire collection on your 256 GB, then sure, go ahead and find a format that compresses better. But then on a MacBook? The most impressive overall performing codec is arguably TAK, and that is a closed-source Windows encoder ... although ffmpeg can play it. Compatibility is an issue too - still!
What do you want to accomplish, really?

Now, I am not testing "overall" codec performance - for that, look to ktf's tests: https://hydrogenaud.io/index.php/topic,122508.msg1024541 for CD audio, and more at audiograaf.nl/losslesstest/
What I have been doing is curiously putting codecs (/encoders/decoders) through some peculiarities, to explain why - or whether - some handle this better than others. Some of it is performance related, but remember that for performance you want the grand total, not the odd counterexamples that don't matter much beyond a very few files. Here was a special kind of signal that WavPack can detect and compress easily - but if such files were common enough to be a big part of the average, they would show up in the totals in ktf's tests.
Although a wide corpus like his doesn't get the picture for your collection if you have more of one genre than of others ... and most of us do.

Myself I am using FLAC. And then WavPack for some odder signals - those FLAC cannot contain, but also (long story) some that FLAC can actually handle.
But codec choice won't solve the "big problem" about 24-bit signals: the last few bits are typically noise. Not good for anything but taking up space, and pretty much incompressible.
WavPack, FLAC and TAK (and OptimFROG) handle well one particular aspect of certain 24-bit files: those which aren't really "24 bits", but where the last few bits are padded with zeroes. (WavPack also handles the very few cases where those last few bits are all 1 rather than all 0.) And then the files with higher sample rates are quite diverse - is it noise up there in the audible range, is it ... nothing? It is possible to squeeze out a bit more of FLAC using heavier settings than -8, but don't expect miracles (except on very few samples, some impressive but no miracles in the big picture). Again, see ktf's tests. You will see that for WavPack on high resolution signals, the -x4 switch (or in high or very high mode, -hx4 or -hhx4) can squeeze out quite a bit. (There are even heavier switches, -x5 or -x6, but they don't offer that much bang for the buck, they are very slow. (https://hydrogenaud.io/index.php/topic,120454.msg1004848.html#msg1004848)).

I don't use appleware, so I don't know what is easier to handle there. Of course Apple wants you to use ALAC, which isn't a good codec.


Blah blah unsorted ramblings, sorry ... but what are your needs? Most lossless needs are perfectly well covered by FLAC. If you want to go for WavPack ... make sure your player can handle it.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2023-04-05 06:38:06
As the dev said, FLAC nails it for 16-bit CDDA audio, but isn't so helpful with 24 bit high res stuff.
Huh? Which dev said that where?
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: darkalex on 2023-04-05 13:04:44
As the dev said, FLAC nails it for 16-bit CDDA audio, but isn't so helpful with 24 bit high res stuff.
Huh? Which dev said that where?

This dev said that right here:
This would also explain why FLACs benefit is only present for 16-bit (CDDA) material and not for 24-bit signals.

Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: ktf on 2023-04-05 15:13:10
You're taking these results (and my quote) completely out of context. What is tested in this topic is one specific aspect of lossless audio compression, trying to find intricacies. The title says it: 'the effect of stereo decorrelation'. This is in no way meant as a thorough comparison.

FLAC up until version 1.3.4 did reasonably well with 24-bit material, but was not up to par with the other codecs. FLAC 1.4.0 and later do very well with 24-bit material. I am confusing things. FLAC up until version 1.3.4 did reasonably well with high samplerate material with no actual content above 20kHz, but was not up to par with the other codecs. FLAC 1.4.0 and later do very well with this. FLAC (any version after 1.2.1) does fine with 24-bit material in general.
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2023-04-05 16:36:33
I think that the FLAC "irls-2021-09-21" build I used in this test incorporates the change that made 1.4 improve on higher sampling rates, namely doing some number-crunching with higher precision. I also used a stock 1.3 build that does not.
The high resolution files used in this particular test are not only 24 bits, but also higher sampling rates.

And just to confirm ktf's explanation of not only what I did, but also the purpose and limitations thereof: this is not an assessment of overall performance. The purpose is to isolate the impact of specific features (in the codecs/encoders or in the signals), and the relevance for "overall performance" is unknown. Potential relevance:
* Understanding why. Geeky curiosity.
* Improvements. However, those would often be constrained by format specifications. For example, it seems that only WavPack can handle "wasted bits all being 1", but WavPack cannot do, say, Left+Side stereo decorrelation.
* ... which means that sometimes one just has to accept that we are twenty years too late to implement an improvement, and will likely just have to let <this competitor> keep that advantage.
* Understanding that performance over a broad test corpus - which is what developers may aim at - doesn't necessarily match performance on more special signals. Possibly that might save devs from "but my collection doesn't measure that way!" questions. That different music compresses differently is the "obvious" part; the step up in understanding is that performance *differences* between codecs also vary with the material.
* And even if the outcome is a boring null result: write it up and post it when I have done the test job, because "nothing to see here, move on" also is information.
Besides, a clever question might spawn ... oh there might be more than completely nothing going on. https://hydrogenaud.io/index.php/topic,122056.msg1007356.html#msg1007356
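For reference, the mid/side decorrelation the thread title refers to can be sketched like this - a generic lossless variant along FLAC's lines, as an illustration rather than any codec's actual code. The trick is that (L+R) and (L-R) always have the same parity, so the bit dropped by the floored average is recoverable from the side channel:

```python
def ms_encode(left, right):
    """Lossless mid/side: floored average plus difference."""
    mid = (left + right) >> 1   # drops one bit ...
    side = left - right         # ... which equals side's parity bit
    return mid, side

def ms_decode(mid, side):
    """Exact inverse: (L+R) and (L-R) share parity, so side & 1
    restores the bit the floored average dropped."""
    left = mid + ((side + (side & 1)) >> 1)
    return left, left - side
```

A correlated pair like (1000, 998) becomes (999, 2): the side channel stays close to zero and is cheap to code, which is the whole point of the transform - and why the benefit shrinks when the channels are uncorrelated.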
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: bryant on 2023-04-05 19:36:41
@bryant ... this about wasted bits:
Interesting fact: WavPack’s version also handles cases where the LSBs are 1s instead of 0s and the case where they’re either 1s or 0s, but still all identical for each sample. I have no idea how often those cases come up, or if other compressors bother, but it was so trivial to add that after the 0’s case that I could not resist.

I wonder whether that is what is going on in this track? https://relapsesampler.bandcamp.com/track/elysium-of-dripping-death
You can avoid the $1 price tag by downloading the entire compilation: https://relapsesampler.bandcamp.com/album/relapse-sampler-2015

But if so, it seems that --pre-quantize zeroes out the bits? Which more or less kills WavPack's advantage. Filesizes follow - "pre-quantized" FLAC files are re-encodes of the same WavPack-generated signals.
I downloaded both the sampler and the individual track and only got 16/44.1 versions. Not sure if that's because I don't "belong" or they changed them in the meantime.

But regardless, your theory as to why WavPack would slay this file seems like the only possibility. It might be that '1's are used for padding intentionally to make it harder to detect that the files were just upsampled? Or there was a logical complement in the chain somewhere (which is a valid operation on signed audio values, unlike negation which, as you point out, can overflow). In any event, it's not a good look.

And it makes me happy I included that check. It seems counter-intuitive that a compressor would do well on a set of data, but do poorly on its complement.  :D
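As an aside on the complement-vs-negation point above: one's complement maps the signed range onto itself, while negation overflows on the most negative value. A quick pure-Python illustration (my own, just to show the arithmetic on 16-bit values):

```python
def wrap16(v):
    """Reduce an integer to its 16-bit two's-complement value."""
    return ((v + 32768) & 0xFFFF) - 32768

# One's complement stays in range for every int16 value ...
assert wrap16(~(-32768)) == 32767
assert wrap16(~32767) == -32768

# ... while negation overflows on the most negative value:
assert wrap16(-(-32768)) == -32768   # wraps back to itself
```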
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: Porcus on 2023-04-06 13:31:54
Ooops. Will PM you the file in question.

That file was part of a different compilation announced for the label's 30th anniversary right before Xmas 2020 (https://www.overdrive.ie/relapse-records-reveal-free-30th-anniversary-playlist/).
It is now deleted from Bandcamp, which explains why I got a different - previously published - version when I googled up the track to get the link.

Which makes me wonder what kind of creative "remastering" some wise person therein did for this compilation (and whether that was the reason it is gone?). But of course that is not necessarily what happened ... I did some detective work on the other downloads - and on CD samplers from the label. Among those 241 tracks (https://www.discogs.com/release/22645568-Various-Relapse-30-Year-Anniversary-Sampler)
* 34 have same MD5s as earlier compilations. Fine.
* 7 are "remastered" to 48kHz/16, 7 to 48kHz/24 and 7 to 44.1kHz/24. I am not sure if that means remastering, or maybe that they picked a file from before it was downmixed to CDDA.
* 24 are "remastered" to 88.2kHz/24 or 96/24. Or again, it could be that they are retrieved from the DAW and that the previously released ones were finalized.
Among those 24, three have this feature that WavPack beats FLAC by say forty percent. A few more are in the -10 percent league; I guess things like that just happen when using x4 - especially when one of them is 20 percent smaller than the -hx encode.
* And a handful I also have on early 2000s CD promo compilations; they are all bit-different here, although also in 44.1/16.

Anyway, nearly all the tracks can be listened to in a slightly different Spotify playlist (https://open.spotify.com/album/2lW6wArcGFBPo5IGo1NNWC).
Title: Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)
Post by: bryant on 2023-04-06 23:14:30
Thanks for the files!

So, for the ones that WavPack slays everything without even trying, those indeed have the redundant LSBs. I checked only one in detail, but it was converted from 16-bit with the new byte either all zeros (0x00) or all ones (0xff), in bursts (not randomly sample-to-sample). So even checking for all ones in the LSBs wouldn't catch this, but my code checks for some number of LSBs being identical, and in this file the lower 7 bits (0-6) always match the next bit (7), so my code essentially truncates 7 bits. I have no idea how that could happen except intentionally, but with what intention?
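To spell out why truncating those 7 bits is lossless when bits 0-6 always match bit 7 (my own sketch, not WavPack's code): the kept bit doubles as a template for the dropped ones, so replicating it restores the sample exactly.

```python
def truncate(sample, n=7):
    """Drop n redundant LSBs; the new LSB (old bit n) survives."""
    return sample >> n

def restore(truncated, n=7):
    """Replicate the kept LSB back into the n dropped positions."""
    low = ((1 << n) - 1) if (truncated & 1) else 0
    return (truncated << n) | low

# Round-trips exactly whenever bits 0..n-1 all equal bit n:
for s in (0x1234FF, 0x123400, 0xABCDFF):
    assert restore(truncate(s)) == s
```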

The others where the WavPack extra modes improve compression by 10% or so, those are upsampled from 44.1 with varying filters. In one it was a slow rolloff starting at 17 kHz, while in another it was a steep rolloff at 20 kHz. Funny that the upsampled ones might even have less HF content than the 16/44!