Lossless codec comparison - part 1: multichannel (Mar '22)

Hi all,

In the past I've written a few reports comparing the performance of lossless codecs, the last of which was published in March 2015. Over the last few months I've been working on an update, and today I'm publishing the first part of it, which compares the performance of codecs encoding multichannel material.

The report can be found here: http://www.audiograaf.nl/losslesstest/Lossless audio codec comparison - revision 5 - multichannel.html

I intend to publish reports on high-res performance and CDDA performance later this year.

The main result of the test (comparing only 5.1 surround sources) is inserted below:
[Image: graph of the main result for 5.1 surround sources; available in the report linked above]

For more results and an analysis/discussion refer to the report linked above.
Music: sounds arranged such that they construct feelings.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #1
TAK absolutely kills here. Saving nearly four percentage points over FLAC -8 (nearly ten percent file size) - which in turn saves even more over TTA.
Already in your Revision 1 - with the LotR - TAK won clearly and TTA lost clearly, and it seems the relationship FLAC-ALAC-WavPack(blue) is about the same as then.


And multichannel TAK can be played back by ffmpeg.
... iOS users: does that mean you can actually play these files?

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #2
foobar2000 for iOS can play TAK.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #3
Multichannel too? Easier to ask here than trying to get those iPhoneiTunesiEverything friends of mine to play that .tak file I sent ...

With an increasing number of people using mobile OSes for playback, we are kinda back fifteen years to the "can I play this?" question, which might be where Monkey's back then kept its upper hand over TAK - except, cue 2022, TAK might be perceived as at least as cross-platform compatible as Monkey's. Due to multi-channel support (implemented in, let's say, a way that one could rant a bit over).


Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #5
Thank you, ktf. Nice comparison.

For me it's FLAC or FLAC anyway. TAK is good too, but its software support is not even close to FLAC's.

Looking forward to part 2, as stereo is much more interesting for me.  :)

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #6
Looking forward to part 2, as stereo is much more interesting for me.  :)
Me too, actually, but it's interesting to see how different the codec performances still are on multichannel content. Since the number of contenders here is smaller than I thought: does that mean that other lossless codecs (e.g., OptimFrog, see http://losslessaudio.org/InProgress.php) don't have full multichannel support yet?

Chris
If I don't reply to your reply, it means I agree with you.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #7
OptimFROG is mono or stereo only.
Monkey's Audio got multichannel in August 2019, you find it by going to https://monkeysaudio.com/versionhistory.html and text-searching for "backwards compatibility broken". Only the official build supports it - not in ffmpeg (yet?), not in the *n*x port (ever?).

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #8
Since the number of contenders here is smaller than I thought: does that mean that other lossless codecs (e.g., OptimFrog, see http://losslessaudio.org/InProgress.php) don't have full multichannel support yet?

When you look at the other graph in the report, there are even fewer contenders. WMA, TAK, TTA and ALAC do not support 7.1.


Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #10
My testing of codecs for support has indeed not been very thorough, I simply tried to compress a 7.1 surround file (L R C Lfe Lr Rr Ls Rs) with all codecs, and those 4 returned some kind of error.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #11
Multichannel is fishy matter, it seems. And that is actually an issue in itself: if you encode something to format X and get it back in the wrong channel configuration, then what? Tempting to ask "is that even lossless?".
(While the FLAC format doesn't prescribe WAVE_FORMAT_EXTENSIBLE channel mask support: do the known multichannel-aware FLAC implementations all support it?)


* ALAC: Is refalac's "5 fronts" a limitation to ALAC-in-mp4? Could it be circumvented by ALAC-in-mkv?
Anyway, don't use ffmpeg -acodec alac on a 7.1 unless you want to lose a channel.

* TTA and 8 channels: ffmpeg can do that, and it seems ffmpeg-tta is just the same algorithm ported over, producing the same files (except when tta.exe refuses the input - and of course, except you have to strip tags).
Omitting TTA on 7.1, is that much of a loss? Nah ... the format has no channel allocation support; the foobar2000 input component claims to support 8 channel input, but refuses with "Error: Unsupported file format" (it doesn't matter whether I use the TTA devs' version from sourceforge or @kode54's update).
When it cannot even be played back by the official tool that claims to support it, who cares that it compresses badly? This doesn't make much of a case for opening the "3rd party" can of worms even if ffmpeg is a better choice than tta.exe.

* TAK: Though not supported by the only published encoder, the file format apparently allows for 16 channels, according to https://wiki.multimedia.cx/index.php/TAK .
TAK seems to have some bizarre choices for the file header: sampling rate starts at 8000 - oh, great for those who make files sampled at between 262144 and 270143. And, supporting bit depths from 8 to 39.

* OptimFROG: According to https://wiki.multimedia.cx/index.php/OptimFROG, the file format has only one bit to represent the number of channels. The file format has changed before, but I have a hunch that multichannel has been left off pending some other changes mentioned in the InProgress document. Indeed, the developer published an IEEE paper nearly ten years ago on the "asymmetric" extension. It's a long journey from working algorithm to release when you are one person and the product isn't going to win world domination anyway.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #12
Thank you so much for your excellent work!

TAK absolutely kills here. Saving nearly four percentage points over FLAC -8 (nearly ten percent file size) - which in turn saves even more over TTA.
In my experience Tak performs that much better than most other codecs when it can exploit similarities between channel pairs (joint stereo, channel decorrelation).

Most codecs support channel decorrelation of stereo files, and for good reason: it helps a lot. But for multichannel files this support is often limited or nonexistent.

Most of the work on Tak's multichannel codec dealt with the implementation and tuning of channel decorrelation, and with making it fast: for 6 channels there are 15 possible pairings that have to be evaluated, and Tak provides several methods that have to be tested for each pair. To make it acceptably fast I had to find heuristics that avoid having to try everything out. That was quite some work, and the tests also took very long because my hardware back then was much slower than it is now.

* TAK: Though not supported by the only published encoder, the file format apparently allows for 16 channels, according to https://wiki.multimedia.cx/index.php/TAK .
That's true. There is no support for more than 6 channels, because I haven't tuned the codec for more than 6 (with 8 channels there are 28 possible channel pairs). This would again be a considerable amount of work, and since I have the impression (possibly wrongly) that this feature would be useful for very few users, it's not very high on my to-do list.
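
For anyone wondering where the 15 and 28 come from: they are simply the number of unordered channel pairs, n(n-1)/2. A minimal Python sketch of that count follows. The helper name channel_pairs is made up for illustration and says nothing about how Tak actually enumerates or prunes the pairs.

```python
from itertools import combinations
from math import comb


def channel_pairs(n_channels):
    """Return every unordered pair of channel indices (hypothetical helper)."""
    return list(combinations(range(n_channels), 2))


for n in (2, 6, 8):
    pairs = channel_pairs(n)
    # Sanity check against the closed form n * (n - 1) / 2.
    assert len(pairs) == comb(n, 2)
    print(f"{n} channels: {len(pairs)} candidate pairs")
```

Each pair then has to be multiplied by the number of decorrelation methods tried per pair, which is why a brute-force search gets expensive quickly.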

TAK seems to have some bizarre choices for the file header: sampling rate starts at 8000 - oh, great for those who make files sampled at between 262144 and 270143. And, supporting bit depths from 8 to 39.
But the people who control my mind and tell me what to do are using those rates!

Seriously: I wouldn't do it this way today. It will require some needlessly complicated specification to implement higher sampling rates (as you requested elsewhere).

Now back to work. Tak 2.3.2 should be released within the next few days. No speed or compression improvements, so it does not affect the currency of this comparison.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #13
Tak 2.3.2
Now you missed a minor history-repeating-itself by 35 minutes ...
("New" HA readers: look at the date of that thread. Somebody posted a "hey I'm beating Monkey's!" on April 1st ...)

... if time permits for 2.3.2, a couple of neat possible wishlist items that flac.exe and WavPack support: novice users may appreciate drag and drop - and others may appreciate recompress-in-place like takc -e -p4 takfile.tak. By the way, I don't really understand why I have to type "-e" when I have specified a "-p4", that implies encoding.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #14
That's true. There is no support for more than 6 channels, because I haven't tuned the codec for more than 6 (with 8 channels there are 28 possible channel pairs). This would again be a considerable amount of work, and since I have the impression (possibly wrongly) that this feature would be useful for very few users, it's not very high on my to-do list.
If a mapping existed to embed TAK into MKV, such as it exists for FLAC, WavPack, TTA and ALAC, it might be useful for people who rip their Blu-rays to MKV. FLAC and WavPack already compress better than the Blu-ray lossless codecs (Dolby TrueHD and DTS-HD MA), but as the results show, TAK would provide much more benefit. ffmpeg already has a TAK decoder built in, so it would enable people to play such files with players like VLC.

I think this is also the main use case for multichannel FLAC currently.

edit: before anyone gets too excited, this is obviously a lot of work. Besides defining a mapping, software support needs to be added. Either TAK needs a Matroska writer, or ffmpeg/mkvmerge need a way to read TAK frames and write them to an MKV file without re-encoding. It would still be a lot of work for few users.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #15
Multichannel is fishy matter, it seems. And that is actually an issue in itself: if you encode something to format X and get it back in the wrong channel configuration, then what? Tempting to ask "is that even lossless?".
I'd say it's lossless if it's reversible. Swapping L and R is a lossless process.

* ALAC: Is refalac's "5 fronts" a limitation to ALAC-in-mp4? Could it be circumvented by ALAC-in-mkv?
ALAC's channel layout is described in the "chan" box in MP4 (or the chan chunk in CAF) like this:

[chan: Audio Channel Layout Box]
    position = 477
    size = 24
    version = 0
    flags = 0x000000
    channelLayoutTag = 0x007f0008
    channelBitmap = 0x00000000
    numberChannelDescriptions = 0

Apple's reference software defines only 8 channel layouts for ALAC, each of which is mapped to a specific channelLayoutTag.
However, the structure of the chan box is far more flexible than that, and is actually the same as the one defined in the CAF format:
https://developer.apple.com/library/archive/documentation/MusicAudio/Reference/CAFSpec/CAF_spec/CAF_spec.html#//apple_ref/doc/uid/TP40001862-CH210-BCGCIJCF

So, you can at least craft an ALAC file with an arbitrary channel layout if you exploit this, although I don't recommend doing so.
ffmpeg quite naturally uses the 8 defined layouts, following the reference implementation, and doesn't look inside the chan box.
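
As a rough illustration of how the dump above can be read, here is a minimal Python sketch that unpacks a 24-byte chan box carrying the same field values. It assumes the box is laid out as size, type, version, flags, channelLayoutTag, channelBitmap and numberChannelDescriptions in big-endian order, and that the low 16 bits of the tag carry the channel count, as with Core Audio layout tags. parse_chan_box is a hypothetical helper, not something taken from refalac or ffmpeg.

```python
import struct

# Assumed layout of a chan box without channel descriptions (24 bytes total).
CHAN_BOX = struct.Struct(">I4sB3sIII")


def parse_chan_box(box):
    """Unpack the fixed part of a chan box (hypothetical helper)."""
    size, box_type, version, flags, tag, bitmap, n_desc = CHAN_BOX.unpack(box[:24])
    return {
        "size": size,
        "type": box_type.decode("ascii"),
        "version": version,
        "flags": int.from_bytes(flags, "big"),
        "layout_id": tag >> 16,        # high 16 bits: which predefined layout
        "channels": tag & 0xFFFF,      # low 16 bits: channel count (assumption)
        "channelBitmap": bitmap,
        "numberChannelDescriptions": n_desc,
    }


# Rebuild the box from the values shown in the dump, then parse it back.
example = CHAN_BOX.pack(24, b"chan", 0, b"\x00\x00\x00", 0x007F0008, 0, 0)
print(parse_chan_box(example))
```

With the tag 0x007f0008 this gives layout_id 127 and 8 channels. An arbitrary layout would instead need the bitmap or per-channel descriptions to be filled in, which this sketch does not attempt.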


Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #16
Now back to work. Tak 2.3.2 should be released within the next few days.
Nice to hear. Any chance of a 64-bit DLL or the source release? A 64-bit foobar2000 release is getting closer and FFmpeg is suboptimal for TAK decoding.

* TTA [...] the foobar2000 input component claims to support 8 channel input, but refuses with "Error: Unsupported file format" (doesn't matter whether I use the TTA devs' version from sourceforge nor @kode54's update).
When it cannot even be played back by the official tool that claims to support it, who cares that it compresses badly?
If you wish, you could drop a TTA problem sample archive my way. I have converted the foobar2000 TTA component to the new component architecture and added 64-bit support, so I might as well take a look at other known problems.
I did this before I knew kode54 had taken over the component. Though it seems kode's version has zero changes, just recompiled with new SDKs and compilers.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #17
I certainly acknowledge that coming up with some heuristic that decorrelates well on all thinkable weird 8 channel combinations might not be a feasible job to do - and that something that does it badly might better remain unreleased (I wouldn't say that Monkey's multichannel handling was outright good for its image ...), but:
Yes, 7.1 in the Blu-Ray allocation would be a Good Thing for TAK. Easy for me to say, who won't do the work.
And a Matroska profile too. If Real Lossless can have an MKV profile, then TAK should have one. (Easy for me to say ...)
(That IETF draft does on numerous occasions link to https://github.com/mbunkus/mkvtoolnix/ which is 404 - such things happen, but mbunkus is one of the authors ... but I digress, it is a draft and will hopefully be updated.)

Not that audio takes up as much hard drive as video on a BD rip - but we are here to get impressed, and TAK does that. Another "Easy for me to say".
Actually I think we'd even be impressed over a -p4TakeYourTimeDecorrelatingTheSeventhAndEighthChannelsAsLongAsYouDecodeFasterThanAnythingExceptFLAC

Also if @TBeck can ask those people who control his mind whether they can do without the few high prime numbers > 270000, so they can be reserved for higher sampling rates? And it isn't "needlessly complicated" unless the spec prescribes a primality testing algorithm and forbids the lazy hardcoding of the 270001, 270029, 270031, 270037, 270059, 270071, 270073, 270097, 270121, 270131, 270133, 270143 list?
Just kidding. But more seriously, as there has never been official support above 192000, one might even consider to forbid the top few (like the top 7999) and reserve for 352800, 384000, 512000, 705600, 768000 and future use.


@Case : If you take the files at https://wiki.hydrogenaud.io/index.php?title=FLAC_decoder_testbench and run a for /r %f IN (*.flac) DO ffmpeg -i "%f" "%f.tta", which ones play as they should? We could compare.
Also, since reference TTA does some horrible things upon errors, I wonder whether those three people who even use that codec, should rather stick to ffmpeg.
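
For those not on Windows, the for /r loop above could be approximated with a small Python script. This is only a sketch: ffmpeg_commands and convert_all are names invented here, and it assumes ffmpeg is on the PATH and picks the TTA encoder from the .tta extension, as the command above does.

```python
import subprocess
from pathlib import Path


def ffmpeg_commands(root):
    """Build one ffmpeg command per .flac file found below root."""
    commands = []
    for src in sorted(Path(root).rglob("*.flac")):
        # Mirror "%f" -> "%f.tta": keep the full file name and append .tta.
        dst = src.with_name(src.name + ".tta")
        commands.append(["ffmpeg", "-i", str(src), str(dst)])
    return commands


def convert_all(root):
    """Run the conversions and return the sources for which ffmpeg failed."""
    failed = []
    for cmd in ffmpeg_commands(root):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[2])
    return failed
```

Calling convert_all on the unpacked testbench folder would then return the list of files ffmpeg could not convert, which is a starting point for comparing against what the reference encoder and the foobar2000 component accept.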

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #18
@Case : If you take the files at https://wiki.hydrogenaud.io/index.php?title=FLAC_decoder_testbench and run a for /r %f IN (*.flac) DO ffmpeg -i "%f" "%f.tta", which ones play as they should? We could compare.
The three tracks with 7 and 8 channels failed to decode. Here's a version that solves that: <edit: link removed, see below>.
Bitcompare against the original FLACs reveals 100% correct decoding, but the format doesn't store channel maps, so the speaker layout gets changed. Also, the 8 bps track apparently gets stored as 16 bps.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #19
The three tracks with 7 and 8 channels failed to decode.
Yep. And surprisingly, the component decodes 3 and 5 channels even if it says it is unsupported.
Your component works as you describe here as well.

Also, the 8 bps track apparently gets stored as 16 bps.
Unrelated bug in ffmpeg, but I have not gotten reference TTA to support 8 bits at all.
* ffmpeg refuses to "acknowledge 8 bit flac as 8 bit": it will convert it to 16 no matter what. Try ffmpeg -i "23 - 8 bit per sample.flac" "23 - 8 bit per sample.flac-ffmpeg-to.flac"  or to .wv instead of to .flac - it comes out as 16 bits.
But, if you decode the 8 bit flac file with flac -d and then ffmpeg the .wav to .tta, you get a .tta with 8 bits.
* tta.exe refuses to decode the latter 8-bit .tta file. Also the fb2k component (yours too).
* tta.exe also refuses to encode 8-bit .wav files.
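
A quick way to check which bit depth actually came out, after decoding a result back to .wav, is to read the WAV header. Below is a minimal Python sketch using the standard wave module. wav_bit_depth is a name made up here, and note that some Python versions reject WAVE_FORMAT_EXTENSIBLE headers, so this is mainly useful for plain PCM files such as the 8-bit test track.

```python
import wave


def wav_bit_depth(path):
    """Return the container bits per sample of a PCM WAV file.

    The wave module reports whole bytes per sample, so e.g. 20-bit audio
    stored in 3 bytes would be reported as 24 here.
    """
    with wave.open(path, "rb") as wav:
        return wav.getsampwidth() * 8
```

If the file that went in as 8 bit reports 16 here after a round trip, the widening happened somewhere in the conversion chain rather than in the player.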

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #20
Ah, I didn't think to use FFmpeg for creating 8 bit TTA. I noticed the official encoder refused such input. Here's a fixed foobar2000 component.

Update 2022-05-19: New modification to the component: it can now read the channel mask from the WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag field.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #21
Following @TBeck 's points about multi-channel decorrelation in #12 above, I tried to test some of the impact of multi-channel decorrelation. It appears that even if TAK would only implement some "let's try a few decent ideas" for decorrelating channels 7&8, it would be much better than anything that isn't several times as CPU intensive - but see reservations over there. Still, the People's Front For Matroskification of 7.1 TAK will soon be marching in the streets.


But going back to this corpus and looking closer at ktf's figures, there are some surprises.  One thing is to save ten percent even at this already-high compression (gut feeling is that it is easier to save a lot if your competition hasn't already exploited it all eh?) - in the CD test of revision 4, TAK would only save five percent over FLAC. Even with a couple of percent for channel decorrelation, it seems TAK handles the step up from 2 to 6 channels very well.

But then there's Hans Zimmer: Inception - WavPack already at -fx4 beats TAK -p4m.  One should not put too much into a counterexample, but still. There are well-known cases that TAK isn't too happy about.  But Zimmer isn't noise music, it is very far away from it. From what I can hear at https://www.youtube.com/watch?v=vnkiVa4A-F8 , this is a soundtrack with lots of gloomy-but-dreamy synths in between the drama.
(Or is it the same?  Blu-Ray, is that the 39 minutes on disc 2 here? That is shorter than the soundtrack album, and ...)
OTOH, it is actually the second-to-least compressible source.  (After Pink Floyd's quad mix of DSOTM, which is ... which is music, which is what I guess the codecs were developed on.)    And by the gut feeling that TAK and Monkey's often follow each other in compressibility, the observation that Monkey's isn't ecstatic over this either, makes one wonder that there is something to it more than the usual coincidences. But if it isn't noisiness, what is it?


Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #22
OTOH, it is actually the second-to-least compressible source.  (After Pink Floyd's quad mix of DSOTM, which is ... which is music, which is what I guess the codecs were developed on.)    And by the gut feeling that TAK and Monkey's often follow each other in compressibility, the observation that Monkey's isn't ecstatic over this either, makes one wonder that there is something to it more than the usual coincidences. But if it isn't noisiness, what is it?
I can't say anything definitive without having a piece of the sample, but one thing does come to mind. If there are portions that contain digital silence in one channel and something else in the "paired" channel (e.g., left-rear vs. right-rear) then even adding a -x would greatly improve the results. The reason is that basically the only thing that -x adds is checking whether encoding left-right is better than mid-side, and in this pathological situation that makes a huge improvement. And the only thing that WavPack does to take advantage of interchannel correlation in 5.1 files is to group the appropriate channels into 2 mono and 2 stereo "streams" which are then compressed independently using the existing methods.

I did a quick experiment where I created a 16-bit stereo file with digital silence in the left channel and digital Billie Eilish in the right channel. When I compressed this with WavPack default it reduced to 55.9%. Adding just -x improved that to 34.96%!

Of course, this doesn't explain everything because even without -x WavPack does extraordinarily well on this sample. But I'm guessing that it still has to do with the way multichannel and silence is handled rather than another characteristic of the audio (like noisiness, etc.)
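
To illustrate why the left-right vs. mid-side check matters when one channel is digitally silent, here is a toy Python sketch. It is not WavPack's algorithm: it uses the sum of absolute sample values as a crude stand-in for coding cost, skips prediction entirely, and all function names are made up here.

```python
import math


def abs_sum(samples):
    """Crude cost proxy: total magnitude of the samples."""
    return sum(abs(s) for s in samples)


def to_mid_side(left, right):
    """Convert a left/right pair to mid/side (integer approximation)."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def cheaper_representation(left, right):
    """Pick whichever representation has the lower proxy cost."""
    lr_cost = abs_sum(left) + abs_sum(right)
    mid, side = to_mid_side(left, right)
    ms_cost = abs_sum(mid) + abs_sum(side)
    choice = "left-right" if lr_cost <= ms_cost else "mid-side"
    return choice, lr_cost, ms_cost


# Silence in the left channel, a 440 Hz tone in the right channel.
n = 1000
right = [int(10000 * math.sin(2 * math.pi * 440 * i / 44100)) for i in range(n)]
left = [0] * n

print(cheaper_representation(left, right))   # silent channel: left-right wins
print(cheaper_representation(right, right))  # identical channels: mid-side wins
```

With a silent channel, mid-side smears the one active signal over both mid and side, so an encoder that always uses mid-side pays for that signal roughly one and a half times under this proxy. An encoder that checks both options simply keeps left-right.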

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #23
digital silence in the left channel and digital Billie Eilish in the right channel

Damn, new channel layout just dropped.

Re: Lossless codec comparison - part 1: multichannel (Mar '22)

Reply #24
But then there's Hans Zimmer: Inception - WavPack already at -fx4 beats TAK -p4m.  One should not put too much into a counterexample, but still. There are well-known cases that TAK isn't too happy about.  But Zimmer isn't noise music, it is very far away from it. From what I can hear at https://www.youtube.com/watch?v=vnkiVa4A-F8 , this is a soundtrack with lots of gloomy-but-dreamy synths in between the drama.

I too have no clue why WavPack performs so much better than elsewhere. I just checked the logs, and the advantage is spread out evenly over all 10 tracks. The track where WavPack has the least advantage (but still does better than on other albums) is Mombasa, which is very lively and percussive. The track with the largest advantage for WavPack is Time, which is indeed gloomy-but-dreamy and very slowly building up in volume.

(Or is it the same?  Blu-Ray, is that the 39 minutes on disc 2 here? That is shorter than the soundtrack album, and ...)
Yes, that's the one. Tracks probably won't differ much from the album ones.