
"Tested": codecs for the effect of stereo decorrelation (mid/side)

So after I made a quite thoughtless across-tracks-decorrelation experiment, and played a bit back and forth with different codecs on that to sort out my confusion, I thought, why not run a test on some files I've already thrown at @ktf's FLAC betas.

I was in particular curious about one thing: OptimFrog's claim to use a smarter decorrelation. It turns out that OptimFrog, Monkey's and TAK can get more out of stereo on CDDA than WavPack-less-than-x4 and FLAC do, but that is not the case for 96/24. In any case, it cannot explain OptimFrog's small file sizes; those are probably because the format is generally complex and puts your CPU to work to keep you warm during the winter.

The results should be interpreted with so much caution that I initially thought they might not be useful unless something striking showed up. For one thing, you cannot expect results to go in the same direction here:
If encoder X spends more time than encoder Y getting stereo file x smaller than stereo file y, we cannot tell whether it is more "efficient" or just spends more effort searching for patterns. We don't even know the theoretical compressibility of the L-R difference signal.

Anyway, I think we can take home a few findings:
* TTA cannot do mono! Yeah it can handle multi-channel, but cannot read input files that are mono .wav  *shrugs*
* The small-files compressors OFR/TAK/MAC compress both the mono well and the stereo well. OFR and TAK's heavier modes increase the stereo diff.
* WavPack at x4: Consistent with tests at , WavPack "needs" x4 to compress hirez well. Maybe it is to get differences in the ultrasonic octave compressed?
* FLAC. The beta implements double-precision calculation that improves quite a bit, especially for higher-rez. A reasonable speculation on the "good" stereo reduction for stock 1.3.1 could be that it compresses away some of the effects of a "bad roundoff" to single precision that makes for more digits common to all channels. Bad common roundoff to zeroes in common -> can in part be compressed.
* FLAC's "-M" does pretty well. With that switch it does not fully calculate L/R vs mid/side before deciding which one to use.

Columns: Mono size (my locale uses comma for decimal separator), stereo gain in ppm; then stereo gain per sub-corpus. Not displayed: gains per file to get an idea of per-file overhead, see instead the first FLAC column, and consider that there were 71 + 42 + 1 file(s).

encoder / setting | mono GB | 2chdiff ppm | CDDA, rock | hirez, rock | hirez, jazz/cl.
FLAC irls-2021-09-21 -8 --no-mid-side | 6,21 | 580 | 783 | 512 | 468
FLAC irls-2021-09-21 -8 -M | 6,21 | 5 500 | 12 290 | 3 073 | 1 864
FLAC irls-2021-09-21 -8 | 6,21 | 6 485 | 13 459 | 4 432 | 2 312
FLAC irls-2021-09-21 -5 | 6,26 | 6 608 | 13 670 | 4 548 | 2 364
FLAC 1.3.1 -8 | 6,38 | 7 940 | 13 567 | 8 155 | 2 707
FLAC 1.3.1 -5 | 6,44 | 8 274 | 13 978 | 8 718 | 2 747
WavPack -f | 6,58 | 4 744 | 11 979 | 3 077 | −45
WavPack default | 6,40 | 4 953 | 10 967 | 3 994 | 546
WavPack -hx1 | 6,27 | 5 782 | 16 284 | 1 987 | 199
WavPack -hhx4 | 6,24 | 15 832 | 18 105 | 22 998 | 6 667
Monkey's normal | 6,29 | 6 923 | 17 691 | 3 403 | 828
Monkey's insane | 6,22 | 6 931 | 16 887 | 4 016 | 957
TAK -p2 | 6,15 | 6 907 | 17 994 | 2 735 | 1 176
TAK -p4m | 6,09 | 7 440 | 18 453 | 3 389 | 1 654
OFR --preset 2 | 6,06 | 6 457 | 17 286 | 1 785 | 1 456
OFR --preset 10 | 5,98 | 7 604 | 18 657 | 3 706 | 1 631

Corpus in more detail:
The first two columns are the corpus from .
The "hirez jazz/cl." is one file where I merged together 106 minutes 96/24 from , jazz/classical acoustic recordings sometimes recorded in multi-ch and downmixed. Same file as mentioned at the bottom of .
High Voltage socket-nose-avatar

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #1
Could you please elaborate a bit on the columns? I'm too dumb to figure out what "ppm" means :)

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #2
ppm = "parts per million". 1/10k of a percentage point.

So for the biggest overall effect, WavPack -hhx4, the mono files are 624/1061 of WAV size, that is 58.8 percent; the stereo files are around 1.6 percentage points less, 57.2 percent of WAV size.
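
The same arithmetic as a small Python sketch, for anyone who wants to redo it for other rows. The numbers are the WavPack -hhx4 ones quoted above (6,24 GB mono out of 10,61 GB WAV, 15 832 ppm stereo gain); the function name is made up for illustration.

```python
def ppm_to_pct_points(ppm):
    """1 ppm is 1/10 000 of a percentage point."""
    return ppm / 10_000

mono_pct = 100 * 6.24 / 10.61          # mono size in percent of WAV size
gain_pts = ppm_to_pct_points(15_832)   # stereo gain for WavPack -hhx4
stereo_pct = mono_pct - gain_pts       # stereo size in percent of WAV size

print(f"mono {mono_pct:.1f}%, gain {gain_pts:.2f} pts, stereo {stereo_pct:.1f}%")
```

Rounded to one decimal this gives the 58.8 and 57.2 percent mentioned above.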

Here you got the same with different formatting, where mono filesizes are in percent of .wav, and where the differences are in percentage points:
encoder / setting | mono compression | 2chdiff in pct pts | CDDA rock | hirez rock | hirez jazz/cl.
FLAC irls-2021-09-21 -8 --no-mid-side | 58.5% | | | |
FLAC irls-2021-09-21 -8 -M | 58.5% | 0.55 | 1.23 | 0.31 | 0.19
FLAC irls-2021-09-21 -8 | 58.5% | 0.65 | 1.35 | 0.44 | 0.23
FLAC irls-2021-09-21 -5 | 59.0% | 0.66 | 1.37 | 0.45 | 0.24
FLAC 1.3.1 -8 | 60.2% | 0.79 | 1.36 | 0.82 | 0.27
FLAC 1.3.1 -5 | 60.7% | 0.83 | 1.40 | 0.87 | 0.27
WavPack -f | 62.0% | 0.47 | 1.20 | 0.31 | 0.00
WavPack default | 60.4% | 0.50 | 1.10 | 0.40 | 0.05
WavPack -hx1 | 59.1% | 0.58 | 1.63 | 0.20 | 0.02
WavPack -hhx4 | 58.8% | 1.58 | 1.81 | 2.30 | 0.67
Monkey's normal | 59.3% | 0.69 | 1.77 | 0.34 | 0.08
Monkey's insane | 58.6% | 0.69 | 1.69 | 0.40 | 0.10
TAK -p2 | 58.0% | 0.69 | 1.80 | 0.27 | 0.12
TAK -p4m | 57.4% | 0.74 | 1.85 | 0.34 | 0.17
OFR --preset 2 | 57.1% | 0.65 | 1.73 | 0.18 | 0.15
OFR --preset 10 | 56.4% | 0.76 | 1.87 | 0.37 | 0.16
It may be surprising to see WavPack -hhx4 not out-compress FLAC, but that is because most of the corpus is high sample rate where WavPack doesn't shine as much and where the new FLAC beta improves a lot.
WAV file sizes:
CDDA rock: 3.05 GB (5h09min)
hirez rock: 3.4 GB (1h47)
hirez jazz/cl.: 3.4 GB (1h46)

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #3
2chdiff - is that full file but stereo, or is that only the extracted mid/side or l/r difference signal?

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #4
That is the difference
one file in stereo - (one file for the left channel + one file for the right channel)

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #5
Hmm the question is if codecs treat the difference as separate signal and compress it separately, or is it somehow used for predictors etc. If the latter, then I'm not sure if compressing the difference signal alone tells much - it's not music and predictors aren't tuned for it... It somehow resembles analysing lossy codecs by listening to difference signal...


Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #6
No, not "difference" as in difference signal - as difference in size.

What I did was split a stereo file into a left channel file and a right channel file.
Compressed left channel file and right channel file. That is a safe way to get "dual mono" of the same audio.
Compressed the original file too, with the same setting.

Then a measure of how much use the encoder makes of channel correlation, is: how many percent does it gain when it can look at both?
A measure, but I didn't say it was a precise one. But FWIW I think it says something about the FLAC revision, about some WavPack settings - and, it suggests that OptimFrog's secret doesn't lie in exceptional handling of stereo, but rather in throwing heavy artillery at every signal.
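
For anyone who wants to repeat this, here is a minimal sketch of the splitting step and of the size comparison, using Python's standard `wave` module. It assumes plain PCM .wav input; the function names are hypothetical, and the encoders themselves still have to be run separately on the three files. Expressing the gain relative to the WAV size is my reading of how the ppm figures above relate to the percentages.

```python
import wave

def split_stereo(src, left_path, right_path):
    """Write the left and right channels of a stereo PCM .wav as two mono files."""
    with wave.open(src, "rb") as w:
        assert w.getnchannels() == 2, "expects a stereo file"
        width = w.getsampwidth()            # bytes per sample, e.g. 2 for CDDA
        rate = w.getframerate()
        frames = w.readframes(w.getnframes())

    frame = 2 * width                       # one interleaved left+right frame
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame):
        left += frames[i:i + width]
        right += frames[i + width:i + frame]

    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(bytes(data))

def stereo_gain_ppm(stereo_size, left_size, right_size, wav_size):
    """Size saved by encoding both channels together, in ppm of the WAV size."""
    return 1e6 * (left_size + right_size - stereo_size) / wav_size
```

The byte-by-byte loop is slow on long files but keeps the sketch independent of sample width; a positive result from `stereo_gain_ppm` means the stereo file is smaller than the two mono files together.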

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #7
Quite clear results. I only know the FLAC format very well; I just looked up WavPack and Monkey's Audio. Put simply, it seems WavPack and Monkey's Audio only implement a conversion to mid-side audio, while FLAC also has stereo decorrelation modes called left-side and right-side. It seems to me that in WavPack (and probably in Monkey's Audio, though I'm less sure there) the two channels are treated separately after that step, whether they are left and right or were converted to mid and side.

So, apparently, the gain is not in the stereo decorrelation but in the way a mid or a side channel can be compressed. The best explanation I can come up with (but please note this is purely guesswork) is that FLAC is less equipped to deal with small signals that might occur in the mid channel of highly-correlated stereo. This would also explain why FLAC's benefit is only present for 16-bit (CDDA) material and not for 24-bit signals.
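
For readers unfamiliar with the modes mentioned above, here is a minimal Python sketch of the four ways a FLAC-style encoder can store a stereo pair. It only illustrates the channel arithmetic on integer samples; the function and mode names are hypothetical, and nothing here is taken from any codec's source or says anything about how WavPack or Monkey's Audio do it.

```python
def encode(left, right, mode):
    """Return the two channels that would be handed to the predictor."""
    side = [l - r for l, r in zip(left, right)]
    if mode == "left-right":
        return left, right
    if mode == "left-side":
        return left, side
    if mode == "right-side":   # FLAC stores this pair as side first, then right
        return side, right
    if mode == "mid-side":
        mid = [(l + r) >> 1 for l, r in zip(left, right)]
        return mid, side
    raise ValueError(mode)

def decode(ch0, ch1, mode):
    """Invert encode() exactly, so the round trip is lossless."""
    if mode == "left-right":
        return ch0, ch1
    if mode == "left-side":
        return ch0, [l - s for l, s in zip(ch0, ch1)]
    if mode == "right-side":
        return [s + r for s, r in zip(ch0, ch1)], ch1
    if mode == "mid-side":
        left, right = [], []
        for m, s in zip(ch0, ch1):
            m = (m << 1) | (s & 1)   # restore the bit dropped by >> 1
            left.append((m + s) >> 1)
            right.append((m - s) >> 1)
        return left, right
    raise ValueError(mode)
```

The mid channel loses its lowest bit in the shift, but since left + right and left - right always have the same parity, that bit can be recovered from the side channel.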
Music: sounds arranged such that they construct feelings.

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #8
Not sure what are the right guesses CDDA vs hirez ... one thing is that a big part of hirez is uncorrelated noise. (Also I have not timed these. No info here about how much more effort froggy puts in stereo than in mono, for example.)

Anyway, here is the CDDA-only part, where I have added columns that compare each mono to the FLAC beta at -8 mono, and each stereo to new -8 stereo. The final column is, say, (filesize FLAC -8 stereo minus Monkey's insane stereo) minus (filesize FLAC -8 mono minus Monkey's insane mono); that is: we know that Monkey's insane compresses more than FLAC does, so how much bigger is that difference in stereo than in dual mono?

CDDA part only | mono compression | stereo compression | 2chdiff in %pts | 1ch vs new FLAC -8 | 2ch vs new FLAC -8 | diff previous two
FLAC irls-2021-09-21 -8 --no-mid-side | 67.2% | 67.1% | 0.08 | 0.00 | −1.27 |
FLAC irls-2021-09-21 -8 -M | 67.2% | 66.0% | 1.23 | 0.00 | −0.12 |
FLAC irls-2021-09-21 -8 | 67.2% | 65.9% | 1.35 | 0.00 | 0.00 |
FLAC irls-2021-09-21 -5 | 67.6% | 66.2% | 1.37 | −0.34 | −0.32 |
FLAC 1.3.1 -8 | 67.3% | 65.9% | 1.36 | −0.08 | −0.07 |
FLAC 1.3.1 -5 | 67.7% | 66.3% | 1.40 | −0.46 | −0.40 |
WavPack -f | 68.7% | 67.5% | 1.20 | −1.47 | −1.62 | −0.15
WavPack default | 67.4% | 66.3% | 1.10 | −0.20 | −0.45 | −0.25
WavPack -hx1 | 66.9% | 65.3% | 1.63 | 0.32 | 0.60 | 0.28
WavPack -hhx4 | 66.7% | 64.9% | 1.81 | 0.55 | 1.01 | 0.46
Monkey's normal | 66.1% | 64.3% | 1.77 | 1.15 | 1.57 | 0.42
Monkey's insane | 65.1% | 63.4% | 1.69 | 2.12 | 2.47 | 0.34
TAK -p2 | 66.3% | 64.5% | 1.80 | 0.95 | 1.40 | 0.45
TAK -p4m | 65.8% | 63.9% | 1.85 | 1.47 | 1.97 | 0.50
OFR --preset 2 | 65.4% | 63.7% | 1.73 | 1.81 | 2.19 | 0.38
OFR --preset 10 | 64.4% | 62.5% | 1.87 | 2.82 | 3.34 | 0.52
We see that getting a stereo signal enables the higher-compressing codecs to increase their compression advantage over FLAC, which is not unexpected; and for WavPack, TAK and OptimFrog (but not Monkey's!) the higher modes do even better.
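
The last three columns can be recomputed like this. It is only a sketch with a made-up function name, fed with the rounded percentages for FLAC beta -8 and Monkey's insane from the CDDA table, so it will not reproduce the table's second decimal, which presumably comes from the unrounded file sizes.

```python
def advantage_growth(flac_mono, flac_stereo, codec_mono, codec_stereo):
    """All arguments are sizes in percent of the WAV size.

    Returns (1ch advantage, 2ch advantage, difference of the two),
    where positive means the codec compresses better than FLAC -8.
    """
    one_ch = flac_mono - codec_mono
    two_ch = flac_stereo - codec_stereo
    return one_ch, two_ch, two_ch - one_ch

# Monkey's insane vs. the FLAC beta at -8, CDDA part, rounded table values
one_ch, two_ch, diff = advantage_growth(67.2, 65.9, 65.1, 63.4)
print(f"{one_ch:.1f} {two_ch:.1f} {diff:.1f}")
```

With these inputs the result is roughly 2.1, 2.5 and 0.4 percentage points, in line with the 2.12, 2.47 and 0.34 in the table.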

Re: "Tested": codecs for the effect of stereo decorrelation (mid/side)

Reply #9
One bad-ass track, for what it is worth:
Merzbow: "I Lead You Towards Glorious Times".

There are genres that mess up good codecs, and this is the least compressible CDDA track in my collection. Some compression tests here. ("my final" test, my ass ... too curious. Rehab is for quitters.)

Anyway, because this is noise, one would not expect much help from stereo - indeed, tests show that the stereo file isn't much compressible from PCM, so there cannot have been much, eh? Still, codecs behave differently. These are sorted by stereo size (mono size maintains that order except between LA -high and -high -noseek).

stereo kB | mono kB | stereo − mono | left − right
I tried to get three settings from each encoder: normal, high and maximal. OFR 10 was chosen as maximal by mistake, and kept as high when I discovered --preset max, and then ... Other choices were a bit ... arbitrary. The new FLAC beta produces bit-identical audio to 1.3.1, so the new "-9 -p -e" was a candidate for max (it didn't squeeze much out).

* The file fools OptimFrog's --preset max big time. And the LAs come out in the wrong order.
* I've long known that TAK doesn't like this piece of music, and Monkey's is even worse. Those return bigger files than the WAV. But TAK at the very least can utilize stereo.
* Indeed less than half of these files can get help from stereo here. FLAC does, as these presets (which include -m) pretty much brute-force search the stereo options. TAK does even better. Two OFR modes do, so here there actually might be some support for froggy's claim that it can make pretty good sense out of stereo.
* The two good WavPacks and the two good OFRs disagree with everything else about which channel should be smallest. Maybe because they are the only ones to make good sense out of the right channel.