Codecs able to change sample rate, bit depth and number of channels mid-stream

Hi all,

TL;DR: Which audio codecs (both lossy and lossless) and containers are able to change sample rate, bit depth and number of channels mid-stream?

Background: Currently an IETF working group is working on creating an RFC for the FLAC format. While the document at https://xiph.org/flac/format.html looks very nice and all, there are quite a few things that aren't written in the spec and have to be found in the libFLAC reference code to get something working.

Recently the issue came up whether a FLAC file that changes its sample rate, bit depth and number of channels mid-stream would be valid. There isn't anything about this in the spec, but libFLAC actually has provisions to allow this behaviour. For example, the function allocate_output_ in stream_decoder.c, which creates the output buffers, is called every frame, and a comment in the code states that this is there to allow the number of channels to change mid-stream. Something similar is implemented for seeking. The flac command line utility is not able to cope with this, but I think that is because such a stream cannot be decoded to WAV. However, as there is code for it in libFLAC, I'd say that at some point FLAC was envisioned to be able to carry a stream that changes at least the number of channels mid-stream.

The IETF working group was created specifically for the archiving community (which might embrace FLAC as a substitute for WAV), and there is most certainly material that changes the number of channels within a stream. For example, TV streams using AC3 can change the number of channels between a movie and a commercial break. Therefore, for the purpose the IETF working group has in mind, not limiting the spec to a single sample rate, bit depth and number of channels would be beneficial.

However, what I was wondering is how many codecs and containers are able to do this? Is it just AC3? Please comment if you know more formats and use cases.

The discussion is here: https://github.com/ietf-wg-cellar/flac-specification/issues/109
Music: sounds arranged such that they construct feelings.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #1
MP3 can technically throw mixed packets together into the same stream, but I don't know that the bit reservoir behavior would like it very much, or whether any decoder in existence can cope with format switches.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #2
Quote
TL;DR: Which audio codecs (both lossy and lossless) and containers are able to change sample rate, bit depth and number of channels mid-stream?
Without resetting the encoder? I guess no encoder can do that.
For example, libFLAC only allows calls to FLAC__stream_encoder_set_channels() and the like while the encoder is uninitialized.

With resetting the encoder... why not?
The format change will not be smooth for various reasons (codec delay, for example). Also, you cannot expect good support on the player side. But technically it should definitely be possible in many codecs.

A simple example is what is called "Ogg chaining".
Also, you can just concatenate multiple MPEG elementary audio streams (with a simple cat command).
Containers such as MPEG-TS and ISOBMFF allow multiple descriptors for exactly that purpose.

As for FLAC, the FLAC file format must start with METADATA_BLOCK_STREAMINFO, which contains things like the sample rate and number of channels, so it doesn't allow a format change in the middle.
However, you should technically be able to store a FLAC stream with a varying format in a single MP4 file, using a separate sample description for each segment.

Quote
but I don't know that the bit reservoir behavior would like it very much
With a full encoder reset, the second stream starts with zero dependencies on preceding frames, so it should not be a problem.
However, simple concatenation of multiple MP3s is usually not a good idea because of ID3 tags, Xing headers and the like.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #3
I created an example AAC (ADTS) file.
The first 210 frames are encoded at 24 kHz, the rest (423 frames) at 44.1 kHz.
It seems that fb2k can play this file surprisingly well.
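
A switch like this can be spotted without decoding any audio, because every ADTS frame header repeats the sample rate index and channel configuration. The following is a minimal sketch, not the tool used to build the file above: the function name `adts_format_runs` is made up for illustration, and it assumes the data starts on a frame boundary with no ID3 tag in front.

```python
# Sample rates indexed by the 4-bit ADTS sampling_frequency_index.
ADTS_SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                     24000, 22050, 16000, 12000, 11025, 8000, 7350]


def adts_format_runs(data: bytes):
    """Return a list of (sample_rate, channel_config, frame_count) runs.

    Walks consecutive ADTS frames from offset 0 and starts a new run
    whenever the header's sample rate or channel configuration changes.
    Stops at the first position that does not look like an ADTS header.
    """
    runs = []
    pos = 0
    while pos + 7 <= len(data):
        b = data[pos:pos + 7]
        # 12-bit syncword 0xFFF; the two layer bits must be 00.
        if b[0] != 0xFF or (b[1] & 0xF6) != 0xF0:
            break
        sf_index = (b[2] >> 2) & 0x0F
        if sf_index >= len(ADTS_SAMPLE_RATES):
            break
        # channel_configuration is 3 bits split across bytes 2 and 3.
        channel_config = ((b[2] & 0x01) << 2) | (b[3] >> 6)
        # 13-bit frame length (header included) spans bytes 3 to 5.
        frame_length = ((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5)
        if frame_length < 7:
            break
        fmt = (ADTS_SAMPLE_RATES[sf_index], channel_config)
        if runs and runs[-1][:2] == fmt:
            runs[-1] = (fmt[0], fmt[1], runs[-1][2] + 1)
        else:
            runs.append((fmt[0], fmt[1], 1))
        pos += frame_length
    return runs
```

For a file built as described in this post, one would expect two runs, the first at 24000 Hz and the second at 44100 Hz.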


Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #5
Unlike many other codecs, each FLAC frame can be decoded independently of the others. Even STREAMINFO is optional in most cases. A decoder that's expecting the parameters to change mid-stream could handle it almost seamlessly.

I'm not sure it makes much sense to require decoders to handle it when there are containers that provide the same functionality in a codec-agnostic way.
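
To illustrate why a decoder can follow such changes: each FLAC frame header carries its own sample rate, channel assignment and sample size codes. Below is a rough sketch of reading them from the first four bytes of a frame. The function name is hypothetical, and the sketch only maps the fixed codes; values deferred to STREAMINFO or stored at the end of the header come back as None.

```python
# Fixed sample rate codes from the FLAC frame header (low nibble of byte 2).
FLAC_SAMPLE_RATES = {
    0b0001: 88200, 0b0010: 176400, 0b0011: 192000, 0b0100: 8000,
    0b0101: 16000, 0b0110: 22050, 0b0111: 24000, 0b1000: 32000,
    0b1001: 44100, 0b1010: 48000, 0b1011: 96000,
}
# Fixed sample size codes (3 bits in byte 3).
FLAC_SAMPLE_SIZES = {0b001: 8, 0b010: 12, 0b100: 16, 0b101: 20, 0b110: 24}


def parse_flac_frame_header(header: bytes):
    """Decode the fixed-code fields from the first 4 bytes of a FLAC frame.

    Returns (sample_rate, channels, bits_per_sample). A field is None when
    the header defers to STREAMINFO, stores the value at the end of the
    header, or uses a code this sketch does not map.
    """
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    # 14-bit sync code 0b11111111111110 followed by a reserved zero bit.
    if header[0] != 0xFF or (header[1] & 0xFE) != 0xF8:
        raise ValueError("not a FLAC frame header")
    rate_code = header[2] & 0x0F
    channel_code = header[3] >> 4
    size_code = (header[3] >> 1) & 0x07
    if rate_code == 0x0F or channel_code > 10:
        raise ValueError("invalid or reserved code")
    sample_rate = FLAC_SAMPLE_RATES.get(rate_code)
    # Codes 0-7 mean 1-8 independent channels; 8-10 are stereo modes
    # (left/side, side/right, mid/side), so always 2 channels.
    channels = channel_code + 1 if channel_code < 8 else 2
    bits = FLAC_SAMPLE_SIZES.get(size_code)
    return sample_rate, channels, bits
```

A player could call something like this on every frame and reconfigure its output whenever the returned tuple differs from the previous one.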

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #6
Quote
As for FLAC, the FLAC file format must start with METADATA_BLOCK_STREAMINFO, which contains things like the sample rate and number of channels, so it doesn't allow a format change in the middle.
I disagree: why else would there be provisions in the libFLAC decoder for handling a changing number of channels? Certainly STREAMINFO would only be valid for a part of the file.

Quote
However, you should technically be able to store a FLAC stream with a varying format in a single MP4 file, using a separate sample description for each segment.
I'm not familiar with how MP4 is structured: would a new sample description be followed by a completely new FLAC stream, more or less? With a new STREAMINFO block and a sample counter starting at zero, much like in Ogg chaining? Would anything in this document need to change?

Quote
You probably know, but Matroska allows you to change streaming codec: https://www.matroska.org/technical/streaming.html
Well, I found something about a codec state element, but nothing in depth. I can't find anything about changing codecs on the page you're linking. At least with mkvmerge, I'm unable to append a FLAC stream to another FLAC stream.

Quote
I'm not sure it makes much sense to require decoders to handle it when there are containers that provide the same functionality in a codec-agnostic way.
I agree that doing this at a container level (being either Ogg, MKV or MP4) would be cleaner, yes.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #7
Quote
I created an example AAC (ADTS) file.
The first 210 frames are encoded at 24 kHz, the rest (423 frames) at 44.1 kHz.
It seems that fb2k can play this file surprisingly well.

In my foobar the file stays at 24 kHz the whole time. VLC plays only the first ten seconds or so.
Error 404; signature server not available.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #8
Quote
I'm not familiar with how MP4 is structured: would a new sample description be followed by a completely new FLAC stream, more or less? With a new STREAMINFO block and a sample counter starting at zero, much like in Ogg chaining? Would anything in this document need to change?
The structure of MP4 is described in that ISOFLAC document.
It's not as simple as a series of 1st description -> frames -> 2nd description -> frames.
All frames are stored as MP4 samples in an mdat box, and they are all fully indexed for random access.
All descriptions are stored as an array in a sample description box.
Samples in the mdat box can be logically divided into chunks, and each chunk can point to the sample description that describes the samples in that chunk.
Normally only one sample description is used, so all chunks use that same description.
However, ISOBMFF allows multiple descriptions, which serves our purpose. You just have to put samples with different formats in different chunks.
I don't think the ISOFLAC document needs to be changed in this regard.
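
The chunk-to-description mapping described above can be sketched roughly as follows. This is a simplified model rather than a real MP4 parser: it assumes the run-length layout of the ISOBMFF sample-to-chunk table, where each entry holds (first_chunk, samples_per_chunk, sample_description_index) and applies until the next entry begins. The function name and the example values are made up for illustration.

```python
from bisect import bisect_right


def description_for_chunk(stsc_entries, chunk_number):
    """Return the 1-based sample description index for a 1-based chunk number.

    stsc_entries is a list of (first_chunk, samples_per_chunk,
    sample_description_index) tuples sorted by first_chunk. Each entry
    applies from its first_chunk up to the chunk just before the next
    entry's first_chunk.
    """
    first_chunks = [entry[0] for entry in stsc_entries]
    # Find the last entry whose first_chunk is <= chunk_number.
    i = bisect_right(first_chunks, chunk_number) - 1
    if i < 0:
        raise ValueError("chunk precedes the first entry")
    return stsc_entries[i][2]


# Hypothetical file: chunks 1-21 use description 1, chunks 22+ use description 2.
descriptions = ["FLAC, 24 kHz, mono", "FLAC, 44.1 kHz, stereo"]
entries = [(1, 10, 1), (22, 10, 2)]
print(descriptions[description_for_chunk(entries, 30) - 1])
```

The point of the design is that the format switch lives entirely in the index: the FLAC frames themselves are stored back to back, and a reader learns which description applies by looking up the chunk.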

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #9
Attached is a FLAC file similar to nu774's AAC file: it has a few sample rate changes. To my surprise, my smart TV (which supposedly parses FLAC files with libFLAC internally, as somewhere deep in a menu there is a mention of the libFLAC copyright) plays this file perfectly. ffplay and mplayer play it nicely; VLC stops after the first sample rate change. Foobar2000 needs about 0.2 seconds to change the sample rate, and during those 0.2 seconds it plays either too slow or too fast.

Not too bad though  :D

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #10
I have never seen the sample rate in a television broadcast change from 48 kHz to anything else, but I have seen the number of channels change from 2.0 to 5.1 to 2.0 to 5.1 repeatedly. The standard in my area is AC3.

Sample rate changes could be more useful when using a codec like AAC or FLAC for data compression over a connection between two devices where latency is not important.

Putting the AAC file in an MKA container using MKVToolNix causes it to fail to play, but putting the FLAC file in an MKA container does not.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #11
Quote
Attached is a FLAC file similar to nu774's AAC file: it has a few sample rate changes. To my surprise, my smart TV (which supposedly parses FLAC files with libFLAC internally, as somewhere deep in a menu there is a mention of the libFLAC copyright) plays this file perfectly. ffplay and mplayer play it nicely; VLC stops after the first sample rate change. Foobar2000 needs about 0.2 seconds to change the sample rate, and during those 0.2 seconds it plays either too slow or too fast.
flac -ac fused.flac aborts with the following error:
Code:
fused.flac: ERROR, sample rate is 24000 in frame but 32000 in STREAMINFO
fused.flac: ERROR while decoding data
            state = FLAC__STREAM_DECODER_ABORTED
This consistency check seems to be enforced purely at the application level by the flac command, not by libFLAC.
The ISOFLAC document clearly says the following, but I couldn't find a similar statement in the official FLAC format spec:
Quote
Note that the FLAC native FRAME_HEADER structure that begins each FLAC sample redundantly encodes channel count, sample rate, and sample size.  The values of these fields must agree both with the values declared in the FLAC METADATA_BLOCK_STREAMINFO structure as well as the FLACSampleEntry box.
In the case of ISOBMFF, this is OK since we can have multiple FLACSampleEntry boxes.
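
A check of this kind can be approximated in a few lines. The sketch below is not the flac tool's actual code: the function names are hypothetical, and the message wording simply mirrors the error shown above. It assumes the 34-byte STREAMINFO body layout from the FLAC format document, in which bytes 10 to 17 pack the sample rate, channel count, bits per sample and total sample count.

```python
def parse_streaminfo(block: bytes):
    """Return (sample_rate, channels, bits_per_sample) from a STREAMINFO body."""
    if len(block) != 34:
        raise ValueError("STREAMINFO body must be 34 bytes")
    # Bytes 10-17 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(block[10:18], "big")
    sample_rate = packed >> 44
    channels = ((packed >> 41) & 0x07) + 1
    bits = ((packed >> 36) & 0x1F) + 1
    return sample_rate, channels, bits


def format_mismatches(streaminfo, frame):
    """List the fields where a frame's format differs from STREAMINFO.

    Both arguments are (sample_rate, channels, bits_per_sample) tuples.
    """
    names = ("sample rate", "channels", "bits per sample")
    return [f"{name} is {f} in frame but {s} in STREAMINFO"
            for name, s, f in zip(names, streaminfo, frame) if s != f]
```

An application that wants to tolerate mid-stream changes would log or ignore the returned list, while a strict one would abort as soon as it is non-empty.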

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #12
Quote
VLC stops after the first sample rate change. Foobar2000 needs about 0.2 seconds to change the sample rate, and during those 0.2 seconds it plays either too slow or too fast.
Not too bad though  :D

OK, I misunderstood what happened. I played this file in foobar and noticed the difference in the quality of the spoken words, but the file info stayed the same the whole time; it does not refresh and always shows 32 kHz.
I also played the file in the mpvnet player: with the file info button enabled, it shows a different sample rate at each change, but with a noticeable click and a slight pause between them. It also shows the bitrate change, which I like.
Why doesn't foobar's FLAC decoder update the bitrate (and in this case, the sample rate) while playing back a file?

 

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #13
With a resampler inserted, the transitions are smooth in Foobar when it doesn't need to reopen the audio device. Older players most often change the speed. I'd expect a SPDIF receiver to support changes in the sample rate if it can play 44100 at all, because there is no out-of-band signal for the start of a stream. What is the speech synthesizer with the English accent (sahmple rate thirty kilohertz)? I hear it often in video uploads.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #14
Quote
What is the speech synthesizer
I used Amazon Polly because I believe Amazon does not reserve any copyright on these files. Whether text-to-speech holds any copyright at all is debatable (as it is not human-made), but at least they state it.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #15
Quote
TL;DR: Which audio codecs (both lossy and lossless) and containers are able to change sample rate, bit depth and number of channels mid-stream?

I haven't tried decoding something like that with libFLAC, but looking at the Format Overview and the Frame Header spec, there should be no problem changing the sample rate, bit depth, or channels on every frame for FLAC. In libFLAC you can inspect the frame header on every decode and act accordingly. Though probably every player ever assumes this is not the case, because that's a terrible way to work. Playing that back is a nightmare.

I can somewhat see the appeal of supporting this for concatenating disparate clips into a single stream. As mentioned a couple of times, Ogg will let you just concatenate files like that (it doesn't even matter what codec each link is) without much overhead at all. I just split a song on every frame, and the overhead was only 2.6%. And that's with tags being replicated on every frame as well. Although Ogg chaining support and behaviour vary per player, it is a perfectly valid construction.
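
For anyone who wants to check whether a file is chained, the links can be counted by walking the Ogg page headers and watching the beginning-of-stream and end-of-stream flags. This is a minimal sketch under a few assumptions: the function name is hypothetical, the data starts on a page boundary, and page checksums are not verified.

```python
import struct


def count_ogg_links(data: bytes) -> int:
    """Count chain links in a physical Ogg bitstream.

    A new link is counted whenever a beginning-of-stream (BOS) page shows
    up while no logical stream is currently open, so several multiplexed
    streams that start together count as a single link.
    """
    links = 0
    open_serials = set()
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError(f"no Ogg page at offset {pos}")
        header_type = data[pos + 5]
        serial = struct.unpack_from("<I", data, pos + 14)[0]
        segment_count = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + segment_count]
        if header_type & 0x02:  # BOS: a logical stream begins on this page
            if not open_serials:
                links += 1
            open_serials.add(serial)
        if header_type & 0x04:  # EOS: the logical stream ends on this page
            open_serials.discard(serial)
        # The page body length is the sum of the lacing values.
        pos += 27 + segment_count + sum(lacing)
    return links
```

A result greater than 1 means the file is a chain, and each link may carry its own codec headers and therefore its own sample rate and channel count.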

As for lossy, lossy is lossy, so just resample, requantize, and mix as needed into a single format and encode, and if so desired do the same on decode. This would be the way to go for mastering something to FLAC as well, but not all resampling and requantization is technically lossless (sadly no float32 on FLAC), so I mention it here instead. Opus and Vorbis don't really have a bit depth, and Opus doesn't mess with sampling rate nonsense either. These terms are only really meaningful as properties of a PCM format, which is not what many lossy codecs store. If the output happens to be 16 or 24 bit samples at 48 kHz or 96 kHz, this is a decoding choice. The actual bandwidth and SNR of the signal are different.

Re: Codecs able to change sample rate, bit depth and number of channels mid-stream

Reply #16
Cog, as of version 1990, now supports both of the above test files. This required copious changes to how it operates.