
Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Does anyone know this piece of history in more detail?


I mean, channel allocation is a format property of the audio - it is nothing like a <TITLE> tag and that sort of stuff.
FLAC lets several frame-header properties refer to STREAMINFO - that is, to a metadata block intended to be immutable - and not to the Vorbis comments.

So, is there any reason why one didn't dedicate one of the >100 reserved metadata BLOCK_TYPEs to a channel mask, and rather left it for a WAVE_FORMAT_EXTENSIBLE Vorbis comment, which could be messed up by an unaware tagger?  Even a FLAC-aware application like foobar2000 had this problem as recently as 2021, thread: https://hydrogenaud.io/index.php/topic,120954.0.html .
That is, "any reason" except the obvious candidate: it was easier to introduce that way in a sorta-quasi-official manner and hope it would be supported?
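For readers who have not looked at the tag itself, here is a minimal sketch of what reading it involves. The tag name WAVEFORMATEXTENSIBLE_CHANNEL_MASK is the one quoted in Reply #1, and the bit names follow the Microsoft speaker positions. The function names, and the assumption that the value is stored as hexadecimal text such as 0x0603, are mine for illustration.

```python
# Speaker positions in WAVEFORMATEXTENSIBLE order: list index == bit number.
SPEAKERS = [
    "FRONT_LEFT", "FRONT_RIGHT", "FRONT_CENTER", "LOW_FREQUENCY",
    "BACK_LEFT", "BACK_RIGHT", "FRONT_LEFT_OF_CENTER", "FRONT_RIGHT_OF_CENTER",
    "BACK_CENTER", "SIDE_LEFT", "SIDE_RIGHT", "TOP_CENTER",
    "TOP_FRONT_LEFT", "TOP_FRONT_CENTER", "TOP_FRONT_RIGHT",
    "TOP_BACK_LEFT", "TOP_BACK_CENTER", "TOP_BACK_RIGHT",
]
TAG = "WAVEFORMATEXTENSIBLE_CHANNEL_MASK"


def channel_mask_from_comments(comments):
    """Return the channel mask as an int, or None if no comment carries it.

    `comments` is a list of "NAME=value" strings (hypothetical input shape).
    """
    for entry in comments:
        name, sep, value = entry.partition("=")
        if sep and name.upper() == TAG:
            # Assumes hexadecimal text; int() accepts an optional "0x" prefix.
            return int(value, 16)
    return None


def speakers_in_mask(mask):
    """List the speaker names whose bit is set, in bit order."""
    return [name for bit, name in enumerate(SPEAKERS) if mask & (1 << bit)]
```

A tagger that rewrites the comment block without this entry leaves `channel_mask_from_comments` returning None, which is exactly the failure mode described above.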

Yes, there are a lot of applications that would not understand a new BLOCK_TYPE - but most of those would not honour it as Vorbis comment either, so better keep it out of their reach.
Yes, CUESHEET type vs Vorbis comment cuesheet could be a mess - but that also means, developers of those applications that fully support FLAC, are aware of that issue already: to those, the "problem type" is known and handled.

So if there were defined a BLOCK_TYPE for channel mask - and given that it started out in a Vorbis comment, one would likely have to use both - what then? 
 * Some applications would be ignorant about the whole thing.  Makes no difference to them, except the next bullet item:
 * Some applications might allow a Vorbis comment to be destroyed.  Here is where a channel mask in BLOCK_TYPE would safeguard.
 * To retrieve the channel mask when a Vorbis comment is destroyed, some applications would have to be upgraded.  But if not, then they will miss out on what is otherwise gone forever when some other application destroyed the Vorbis comment.
So far, a win-win.  But:
 * If something deliberately changes the dedicated block and not the Vorbis comment, then some other application could read it wrong.
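The bullet points above amount to a small decision table. A rough sketch, assuming a dedicated channel-mask block existed (it does not; `resolve_channel_mask` and the "block wins on conflict" policy are made up purely for illustration):

```python
def resolve_channel_mask(block_mask, comment_mask):
    """Pick a channel mask from two possible sources.

    block_mask:   mask from a HYPOTHETICAL dedicated metadata block, or None
    comment_mask: mask from the Vorbis comment, or None
    Returns (mask, note).
    """
    if block_mask is None and comment_mask is None:
        return None, "no mask stored: fall back to the default layout"
    if block_mask is None:
        return comment_mask, "Vorbis comment only"
    if comment_mask is None:
        # The safeguard case: a tagger destroyed the comment.
        return block_mask, "comment missing: recovered from dedicated block"
    if block_mask == comment_mask:
        return block_mask, "both agree"
    # The problem case from the last bullet: the two sources disagree.
    # Preferring the block is an assumed policy, not anything specified.
    return block_mask, "conflict: dedicated block preferred (assumed policy)"
```

An application that only reads the comment would still see the stale value in the conflict case, which is the one real cost listed above.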


Was that ever considered so crucial that one decided against a BLOCK_TYPE, or did one just ... not get to that decision ever?

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #1
I've found only one place where a new block type was mentioned as a possibility for that purpose:

http://lists.xiph.org/pipermail/flac-dev/2013-January/003582.html
Quote
If it is important to support other configurations, we could _in
addition_ define an extension mechanism, e.g. with the
WAVEFORMATEXTENSIBLE_CHANNEL_MASK in a vorbis comment or custom metadata
block and have that override the default setting from the frame header.

But at that point flac already supported the vorbis comment tag, so it's a bit weird that it was proposed again as one of the solutions.

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #2
The mechanism in the format itself is lacking. Other formats adopted the WAVEFORMATEXTENSIBLE thing and it got de facto added to FLAC. Probably it never got to a decision to add a new block type or change the format because this was good enough.

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #3
I was curious about that as well.

I'm working on an Icecast-oriented utility that can do things like re-mux and/or re-encode to HTTP Live Streaming, take metadata and convert it to timed ID3 metadata, and so on.

One issue I ran into for all the FLAC-in-other-encapsulation cases (FLAC in Ogg, FLAC in MP4) - almost nobody supports the WAVEFORMATEXTENSIBLE thing. In MP4 especially, it seems like all decoders/demuxers only parse the STREAMINFO block since that's the only block type that could affect decoding behavior. This makes it hard to have non-standard layouts in HTTP Live Streaming scenarios (though to be fair - there are other issues affecting other codecs that make any kind of explicitly-signaled layouts not super feasible).

This seems like a good use-case for expanding the STREAMINFO block to include a channel layout, with a value of 0 indicating "one of the default layouts." I can see not wanting to add to the STREAMINFO block, but then again, the block header dictates the size of the metadata block, so a channel mask could have been added after the STREAMINFO MD5 checksum and (hopefully) most decoders would just ignore the extra data.

STREAMINFO is the only required block and, as Porcus mentioned, frame headers are allowed to refer to it, meaning it is already allowed to affect decoding behavior. So adding it to STREAMINFO would make sense. Maybe update the channel field in the frame header so that one of the reserved values means "get from the STREAMINFO block". Though this would mean you can't signal whether you're using left/side, right/side, or mid/side stereo, and therefore can't perform stereo decorrelation.
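To make the proposal concrete, here is a sketch of a STREAMINFO parser that tolerates trailing bytes. The 34-byte layout is the existing one; the 4-byte channel mask after the MD5 is the hypothetical extension discussed here and is not part of FLAC. The function name and the `strict` switch are assumptions for illustration.

```python
import struct

STREAMINFO_SIZE = 34  # bytes in STREAMINFO as currently defined


def parse_streaminfo(body, strict=False):
    """Parse a STREAMINFO block body (without the 4-byte block header)."""
    if len(body) < STREAMINFO_SIZE:
        raise ValueError("STREAMINFO too short")
    if strict and len(body) != STREAMINFO_SIZE:
        # A decoder that refuses anything but the classic size.
        raise ValueError("unexpected STREAMINFO size")
    min_block, max_block = struct.unpack(">HH", body[0:4])
    # 64 bits packed as: 20 sample rate, 3 channels-1, 5 bps-1, 36 samples.
    packed = int.from_bytes(body[10:18], "big")
    info = {
        "min_block_size": min_block,
        "max_block_size": max_block,
        "min_frame_size": int.from_bytes(body[4:7], "big"),
        "max_frame_size": int.from_bytes(body[7:10], "big"),
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
        "md5": body[18:34],
        "channel_mask": None,
    }
    extra = body[STREAMINFO_SIZE:]
    if len(extra) >= 4:
        # HYPOTHETICAL extension: a 32-bit mask appended after the MD5.
        info["channel_mask"] = int.from_bytes(extra[:4], "big")
    return info
```

With `strict=False` an old-style reader would simply never look past byte 34; with `strict=True` it rejects the larger block outright. Which of those two behaviours real decoders show is exactly what the following replies argue about.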

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #4
Changing STREAMINFO would break everything, so it cannot be done; there's no wiggle room. Even if there was a way to extend it without a breaking change (there isn't), it would still likely be met with a hard no, because while it may be true that parts of the format technically have left room for extension, the fact is there are now hundreds of implementations that may or may not rely on the format being what it is; so even the "extensible" parts of the format are not really extensible.

Sticking channel allocation in the Vorbis comments is not ideal but also not terrible. A new block type could also work. Yes, there are hundreds of implementations and maybe a few don't skip unknown block types, but they really should. But is it worth it just for channel allocation, are there any other nice-to-haves that are currently missing or being hacked in via vorbis?
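For what "skip unknown block types" means in practice, here is a minimal sketch of walking the metadata blocks of a native FLAC stream. Each block starts with a 4-byte header (1 bit last-block flag, 7 bits type, 24 bits length), so a reader can step over a block it does not recognise. The function name and the error handling are my own choices.

```python
# Block types that are defined today; everything else up to 126 is reserved.
BLOCK_NAMES = {
    0: "STREAMINFO", 1: "PADDING", 2: "APPLICATION", 3: "SEEKTABLE",
    4: "VORBIS_COMMENT", 5: "CUESHEET", 6: "PICTURE",
}


def iter_metadata_blocks(data):
    """Yield (block_type, name, body) for each metadata block in `data`."""
    if data[:4] != b"fLaC":
        raise ValueError("not a native FLAC stream")
    pos = 4
    while True:
        if pos + 4 > len(data):
            raise ValueError("truncated metadata block header")
        is_last = bool(data[pos] & 0x80)
        block_type = data[pos] & 0x7F
        length = int.from_bytes(data[pos + 1:pos + 4], "big")
        body = data[pos + 4:pos + 4 + length]
        if len(body) != length:
            raise ValueError("truncated metadata block body")
        if block_type == 127:
            raise ValueError("block type 127 is invalid")
        # Unknown types are reported, not rejected: the length field is
        # all that is needed to step over them.
        yield block_type, BLOCK_NAMES.get(block_type, "UNKNOWN"), body
        pos += 4 + length
        if is_last:
            break
```

A reader written this way would pass a new channel-mask block through untouched; one that hard-codes the known types would be the "few" that don't.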

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #5
But is it worth it just for channel allocation,
I kinda think that a "no" is best justified by "don't use FLAC for this, use WavPack instead". Which IMHO is not a good argument when one has started using it in Vorbis comments.


are there any other nice-to-haves that are currently missing or being hacked in via vorbis?
Musing aloud:

* ReplayGain. That's in Vorbis comments by now, fine with me: it is a tag that players alter upon a deliberate decision - for several audio formats and several tagging formats, so it is fine to have in a tags section. Besides it is a bit up to user discretion (like algorithms and whatnot), much more so than channel allocation.

* "Even worse than the WAVE channel mask" is when channels are repeated in the same stream but for different devices. IMHO it is more of a thing for multiple streams in a suitable container, and even more so for the native FLAC container which is capped at eight.
What is going on? You rip an Australian broadcast, there are eight channels and they happen to be in the following order:
L for stereo only - R for stereo only - L for 5.1 - R for 5.1 - C for 5.1 - LFE for 5.1 - Ls for 5.1 - Rs for 5.1
You rip a Swedish one with the same eight channels, but then the "for 5.1" are first and the "for stereo only" are at 7 and 8.
For the WAVE format channel mask, the order is given in the format, and the mask just indicates with a bit whether it is present or not. You cannot change the order of them, and you cannot have more than one.
But again ... who cares.

* An indicator for A-law or µ-law or similar. Technically it isn't hard to store these signals. I mused about it in https://hydrogenaud.io/index.php/topic,124142.msg1027036.html#msg1027036 , but why care? Except out of morbid curiosity, to which I plead guilty, or sometimes the Fifth, or sometimes the First. (Heck, if anyone wants that extension, couldn't one then abuse CUESHEET and implement a rule not only for FLAGS PRE, but hack a different FLAGS into it?)

* Not nice-to-haves for us, but maybe others would want to infest FLAC with junk like ... fingerprints and keys to control scaMQA playback?

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #6
Changing STREAMINFO would break everything, so it cannot be done; there's no wiggle room. Even if there was a way to extend it without a breaking change (there isn't), it would still likely be met with a hard no, because while it may be true that parts of the format technically have left room for extension, the fact is there are now hundreds of implementations that may or may not rely on the format being what it is; so even the "extensible" parts of the format are not really extensible.

So the way I see it is this. No matter how you do it, you'll have broken decoders. If you put it in the Vorbis Comments, you'll have decoders ignore it and decode incorrectly. If you put it in a new metadata block type, you'll either have decoders ignore it and decode incorrectly, or croak on seeing an unknown block type. Put it in an expanded STREAMINFO block, same issue - either decoders decode it incorrectly, or they croak on the larger STREAMINFO block.

I'd rather have something throw an error than silently decode incorrectly. If I get an error I can figure out what's going on and try to fix it. If I silently get the wrong thing I may not even realize what I'm hearing is wrong. Like in an incorrectly-decoded 2.1 setup I'll have LFE get sent to my center speaker. Hopefully those lower frequencies get filtered out somewhere else, somehow.

So, you could make the STREAMINFO size variable - if it's larger than the current, that means you have a channel mask. Encoding an audio stream with a newer version of FLAC that uses a standard layout? The size of the STREAMINFO block remains the same. Existing decoders have no problem. Everybody goes on living their lives.

Need a custom channel layout? Expand the size of the STREAMINFO block, put it in there. You may still have some decoders ignore the new parts of STREAMINFO, there's no perfect way about it. But again I'd rather get an error that I need to fix, than get the wrong data.

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #7
Except that not doing an optional thing, like reading settings from Vorbis comments or a new block type, is not broken in the nuclear sense. Going tits up when a decoder can't handle a core format change that no one expected is breaking. In the time STREAMINFO has stayed the way it is, kids who weren't yet born have had kids of their own. Changing it isn't just going to break a few key players; it'll likely break nearly everything, including hardware players and long-dead or proprietary software players that just won't get updated.

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #8
There might be one argument against a BLOCK_TYPE (edit: well not necessarily against "both"): would it survive recompression .flac -> .flac?
If not, that is a case for Vorbis comments.

I see the point in rather having an application reject what it cannot handle than pretending it can handle it and outputting the wrong thing - and from that point of view, extending STREAMINFO kinda makes sense. But, seriously: STREAMINFO is so set in stone that changing it would come close to "then don't use FLAC for this!". Then it isn't far to "use WavPack instead!".
Sure there are tons of devices/applications/OSes which support FLAC and not WavPack, but do they play the multi-channel files in question correctly anyway? If they don't play them correctly and you are in the "then please reject!" camp, then ... use WavPack.

... after all, WavPack also handles more channels than people have brains  :)) 


Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #9
Just to clarify, the idea here isn't to change STREAMINFO for all new FLAC streams.

Make STREAMINFO expandable. If you're producing a stream that uses the existing channel layouts? Then have the encoder produce the same STREAMINFO block it always has. The file will be decoded correctly by all existing decoders. No fuss, no muss.

Producing a file that uses a different layout? Use the expanded STREAMINFO block. Would it break on some decoders? It sure would. But I wouldn't get incorrect sound sent to the wrong speakers.

I'm all for maintaining backwards compatibility, but once you hit the point that the "backwards compatibility" means existing decoders are doing the wrong thing - I would argue you're not actually maintaining compatibility. You're introducing silent, ignored errors.

I'd much rather get an error than have it keep "working" while actually doing the wrong thing. Maybe the vendor fixes it, maybe I just have to update a FLAC library, maybe I have to change to WavPack, whatever. The point is software should error out and not do the wrong thing.

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #10
Just to clarify, the idea here isn't to change STREAMINFO for all new FLAC streams.

[...]
I'm all for maintaining backwards compatibility, but once you hit the point that the "backwards compatibility" means existing decoders are doing the wrong thing - I would argue you're not actually maintaining compatibility. You're introducing silent, ignored errors.

In that case, if compatibility is really on the table: update the frame header specification - to make for streamability. A frame header has half a byte to specify channel allocation; values 0 to 10 are taken, values 11 to 15 are "reserved" and could then be brought into use.

In that case. Which isn't going to happen I think. Not because your argument is rubbish, but I suspect it is too much trouble over too little benefit.
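For reference, the half byte in question can be sketched like this. Values 0 to 7 mean that many plus one independently coded channels, 8 to 10 are the three stereo decorrelation modes, and 11 to 15 are the reserved ones. The function names are mine; the byte offset assumes a frame header starting at the 2-byte sync code.

```python
def channel_nibble(frame_header):
    """Extract the 4-bit channel assignment from the start of a frame header.

    Byte 3 (0-based) holds: 4 bits channel assignment, 3 bits sample size,
    1 reserved bit.
    """
    return frame_header[3] >> 4


def describe_channel_assignment(nibble):
    """Map the 4-bit channel assignment value to a human-readable meaning."""
    if not 0 <= nibble <= 15:
        raise ValueError("channel assignment is a 4-bit value")
    if nibble <= 7:
        return f"{nibble + 1} independent channel(s)"
    if nibble == 8:
        return "left/side stereo"
    if nibble == 9:
        return "side/right stereo"
    if nibble == 10:
        return "mid/side stereo"
    return "reserved"  # 11..15: the five values that could be brought into use
```

Five spare values are not many: enough for something like "layout is signalled elsewhere", but not for a channel mask by themselves.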




And then this, I stumbled across EBU's BWF/RF64 document that addresses it:
What is going on? You rip an Australian broadcast, there are eight channels and they happen to be in the following order:
L for stereo only - R for stereo only - L for 5.1 - R for 5.1 - C for 5.1 - LFE for 5.1 - Ls for 5.1 - Rs for 5.1
Downmix channels are 0x20000000 for LEFT and 0x40000000 for RIGHT. (0x40000000 is reserved for "ALL")
(Slightly surprising, there is no mono downmix channel. 0x10000000 becomes reserved for something else.)

You rip a Swedish one with the same eight channels, but then the "for 5.1" are first and the "for stereo only" are at 7 and 8.
For the WAVE format channel mask, the order is given in the format, and the mask just indicates with a bit whether it is present or not. You cannot change the order of them, and you cannot have more than one.
But again ... who cares.
Those who do, have made up their mind ;)
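The limitation quoted above (fixed order, no repeats) is easy to show. A small sketch using the first Microsoft speaker positions; `mask_for` is a made-up helper name:

```python
# Bit number == position in this list (WAVEFORMATEXTENSIBLE order).
SPEAKERS = [
    "FRONT_LEFT", "FRONT_RIGHT", "FRONT_CENTER", "LOW_FREQUENCY",
    "BACK_LEFT", "BACK_RIGHT",
]


def mask_for(channel_order):
    """Build a channel mask from a list of speaker names.

    The order of the list is lost: only presence is recorded. A repeated
    speaker cannot be represented, so it raises ValueError.
    """
    mask = 0
    for name in channel_order:
        bit = 1 << SPEAKERS.index(name)
        if mask & bit:
            raise ValueError(f"{name} appears twice; a mask cannot express that")
        mask |= bit
    return mask


# Two different physical channel orders give the same mask.
order_a = ["FRONT_LEFT", "FRONT_RIGHT", "FRONT_CENTER",
           "LOW_FREQUENCY", "BACK_LEFT", "BACK_RIGHT"]
order_b = ["BACK_LEFT", "BACK_RIGHT", "FRONT_LEFT",
           "FRONT_RIGHT", "FRONT_CENTER", "LOW_FREQUENCY"]
same = mask_for(order_a) == mask_for(order_b)
```

So a stream carrying both a stereo L/R pair and a 5.1 L/R pair needs extra bit positions for the second pair, which is what the EBU downmix bits provide.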

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #11
You rip a Swedish one with the same eight channels, but then the "for 5.1" are first and the "for stereo only" are at 7 and 8.
For the WAVE format channel mask, the order is given in the format, and the mask just indicates with a bit whether it is present or not. You cannot change the order of them, and you cannot have more than one.
But again ... who cares.

2 separate WavPack files multiplexed in a single Matroska container file?

Re: Why channel allocation stays in Vorbis comments rather than its own BLOCK_TYPE?

Reply #12
2 separate WavPack files multiplexed in a single Matroska container file?
Not a bad idea - but, as I just referred to: EBU put them in one BWF/RF64 with downmix channels assigned way past the Microsoft channel assignments.

Though I got this obviously wrong:
Downmix channels are 0x20000000 for LEFT and 0x40000000 for RIGHT. (0x40000000 is reserved for "ALL")
0x80000000 is reserved for "ALL".
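Putting the corrected values together, a small sketch of how the downmix bits sit "way past" the Microsoft assignments. The two downmix values are taken from this thread as stated, not checked by me against the EBU document; the constant and function names are made up for illustration. Microsoft defines speaker positions in bits 0 to 17.

```python
MS_DEFINED_BITS = (1 << 18) - 1   # bits 0..17: the Microsoft speaker positions
DOWNMIX_LEFT = 0x20000000         # value as given in this thread
DOWNMIX_RIGHT = 0x40000000        # value as given in this thread
SPEAKER_ALL = 0x80000000          # top bit, reserved for "ALL"


def split_mask(mask):
    """Separate the Microsoft-defined positions from the high flag bits."""
    return {
        "microsoft_positions": mask & MS_DEFINED_BITS,
        "downmix_left": bool(mask & DOWNMIX_LEFT),
        "downmix_right": bool(mask & DOWNMIX_RIGHT),
        "all_flag": bool(mask & SPEAKER_ALL),
    }


# The eight-channel broadcast example: 5.1 (0x3F) plus a stereo downmix pair.
broadcast_mask = 0x3F | DOWNMIX_LEFT | DOWNMIX_RIGHT
```

Note that this still says nothing about whether the downmix pair comes first (the Australian case) or last (the Swedish case) unless the order is fixed by bit position here too.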