How does the Bit Reservoir work?

I am really interested in fully understanding the MP3 format, and after reading through the LAME FAQs, and mp3-tech.org, I still have not found a thorough explanation of the workings of the bit reservoir. Now, I am familiar with what the term means. What I'm interested in is the mechanism in which the bits are reserved for later use, in a sense.

I would appreciate if someone directed me to some literature, or gave a short explanation of its workings.

Thanks,

UED77
wavpack 4.50 -hx3; lame 3.97 -V4 --vbr-new

Reply #1
The term "bit reservoir" is a bit misleading, IMO. What an MP3 frame actually stores (in the side info that follows its header) is the size of its data and a backward offset, relative to the frame header, saying where that data starts.

Say you have a file whose first frame is 320kbps. In this case a maximum of 1008 bytes can be stored. The frame header would say something like [offset:0 length:1000]. The last 8 bytes in this example aren't used in the first frame's data.
Now the next frame, let's say, is 128kbps, which can hold a max of 381 data bytes. However, the frame can say [offset:8 length:385]. This means that the frame's data starts 8 bytes before the frame header (that is, in the last 8 bytes of the previous frame) and fills up 377 (= 385 - 8) bytes of the current frame.

The bit reservoir, therefore, is nothing more than that offset, indicating how much of the current frame's data comes before the current frame's header.
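To make the [offset, length] notation above concrete, here is a minimal Python sketch. It follows the simplified byte-based model of this post, not the real bitstream layout (where lengths are stored in bits per granule and channel). The function name is made up for illustration, and the 0..511 range check assumes the offset is the 9-bit field usually called main_data_begin.

```python
def split_main_data(offset: int, length: int) -> tuple[int, int]:
    """Split a frame's data into (bytes before the header, bytes after it).

    offset: how far before the frame header the data starts (assumed 0..511)
    length: total size of the frame's data in bytes (simplified model)
    """
    if not 0 <= offset <= 511:
        raise ValueError("offset is assumed to be a 9-bit value (0..511)")
    borrowed = min(offset, length)      # bytes living in earlier frames
    return borrowed, length - borrowed  # the rest sits in the current frame


print(split_main_data(0, 1000))  # (0, 1000)
print(split_main_data(8, 385))   # (8, 377)
```

The second call reproduces the 128kbps example: 8 bytes sit in the previous frame and 377 in the current one.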

If that makes sense...
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #2
Omion already gave an explanation to me in this thread - you may want to take a look there too

Reply #3
I do not understand every detail of mp3 encoding, but maybe I can contribute by working out the overall data structure of mp3. None of this is new (Omion, for instance, has already said the essential things), but maybe things get clearer when concentrating on the data streams.

Frame partitioning of mp3 encoding input stream:
The musical information of a CD (to concentrate on that) comes in the form of wave samples. Each sample value is a 16-bit number, and the sampling frequency is 44.1 kHz.
For each channel, 1152 of these samples are taken together, and these 2 x 1152 samples form an input frame for mp3 encoding.
So the input stream for mp3 encoding is a stream of these input frames.

Frame partitioning of mp3 output transport stream:
Each input frame is translated into a corresponding output frame. An output frame's size is determined by its bitrate. A bitrate of 320 kbps, for instance, means the output frame has room for about 1008 bytes of encoded audio (concentrating again on CD encoding). Bitrate, and hence size, can vary from output frame to output frame (that's the case when VBR is used; Lame's specific VBR and ABR methods are both VBR).
Each output frame has an administrative overhead of 40 bytes (header, side info).
The remainder of the output frame can take up encoded musical information.
I like to call these output frames 'transport frames' as they do not necessarily contain the entire musical information of the corresponding input frame as we will see.
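For anyone who wants to check the frame sizes, here is a small Python sketch of the usual Layer III frame length formula, assuming MPEG-1 (1152 samples per frame) and bitrates in units of 1000 bit/s. The function name is just for illustration. Note that it returns the total transport frame length, header and side info included, so it comes out larger than the payload figures quoted in this thread.

```python
def frame_size_bytes(bitrate_kbps: int, sample_rate_hz: int, padding: bool = False) -> int:
    """Total length of one MPEG-1 Layer III frame in bytes (header included).

    1152 samples per frame / 8 bits per byte = 144, hence the constant.
    The division is truncated; the padding bit adds one byte when set.
    """
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + (1 if padding else 0)


for kbps in (128, 320):
    print(kbps, frame_size_bytes(kbps, 44100), frame_size_bytes(kbps, 44100, padding=True))
# 128 417 418
# 320 1044 1045
```

Subtracting the header and side info from these totals gives the space left for encoded audio.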

mp3 output audio frames:
The encoded musical information of an input frame is not necessarily contained completely in the corresponding transport frame. It can start in the preceding transport frame if there is space left there (bit reservoir!). It may even start in a transport frame further back. The restriction is that it cannot start more than 511 bytes before the current output frame. The exact offset is stored in the side info of the current output frame (the side info structure dictates the 511-byte offset limit).
This way the stream of input frames is mapped to a stream of audio frames.
Each audio frame that corresponds to a certain input frame ends within the corresponding transport frame, but can start in a preceding transport frame.
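The mapping from audio frames to transport frames can be sketched as a little bookkeeping loop. This is only a toy model in Python under stated assumptions: sizes are whole bytes, each transport frame's payload capacity is given, and leftover space beyond 511 bytes is simply treated as lost (a real encoder has to deal with that case itself). The function name and the example numbers for the third frame are invented for illustration.

```python
RESERVOIR_LIMIT = 511  # largest backward offset the side info can express


def assign_offsets(capacities, audio_sizes):
    """Return the backward offset of each audio frame, in bytes.

    capacities:  payload bytes available in each transport frame
    audio_sizes: bytes needed by the matching audio frame
    """
    reservoir = 0          # unused bytes just before the current frame header
    offsets = []
    for cap, size in zip(capacities, audio_sizes):
        if size > reservoir + cap:
            raise ValueError("audio frame does not fit, even with the reservoir")
        offsets.append(reservoir)  # the data starts this many bytes early
        # whatever is left over becomes the reservoir for the next frame
        reservoir = min(RESERVOIR_LIMIT, reservoir + cap - size)
    return offsets


print(assign_offsets([1008, 381, 381], [1000, 385, 300]))  # [0, 8, 4]
```

Each audio frame still ends inside its own transport frame; only its start moves backward.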

Bitrate:
When we're talking about bitrate, we're talking about the bitrate of the transport frames. The essential thing, however, is the bitrate of the audio frames. Thanks to the bit reservoir, an audio frame is not restricted to the 320 kbps limit (valid for the transport frames), but can have a bitrate of more than 400 kbps.
Moreover, CBR only means a constant transport frame bitrate; the audio frame bitrate can vary quite a lot.
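To put a number on "audio frame bitrate", here is a short Python sketch that converts a byte count into an equivalent bitrate for one 1152-sample frame at 44.1 kHz. The function name is hypothetical, and the inputs (1008 payload bytes plus up to 511 borrowed bytes) are simply the figures used in this post, not values taken from the standard.

```python
SAMPLES_PER_FRAME = 1152


def audio_frame_kbps(data_bytes: int, sample_rate_hz: int = 44100) -> float:
    """Equivalent bitrate (in 1000 bit/s) of one frame's worth of audio data."""
    bits = data_bytes * 8
    return bits * sample_rate_hz / SAMPLES_PER_FRAME / 1000


print(round(audio_frame_kbps(1008), 1))        # 308.7
print(round(audio_frame_kbps(1008 + 511), 1))  # 465.2
```

So a single audio frame that also uses a full reservoir would, in this model, sit well above 400 kbps even though every transport frame is at most 320 kbps.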
lame3995o -Q1.7 --lowpass 17

Reply #4
Thank you for the in-depth info.

As for the frame length and offset information, where is that stored? A web search yields few usable results... AFAIK, each frame has a 32-bit header, followed by optional CRC and the audio data...
I've also heard you and others mentioning "side info". What exactly do you mean by that? Is that related to the two channels, or is it just a confusing name for supplemental information?

If I understand correctly, mp3 trimmers work by temporarily restructuring a file so that the bit reservoir is eliminated, and then perform cutting on potentially invalid, but self-contained frames. Then the file can be collapsed losslessly again, can it not?

UED77
wavpack 4.50 -hx3; lame 3.97 -V4 --vbr-new

Reply #5
Quote
Thank you for the in-depth info.

As for the frame length and offset information, where is that stored? A web search yields few usable results... AFAIK, each frame has a 32-bit header, followed by optional CRC and the audio data...
I've also heard you and others mentioning "side info". What exactly do you mean by that? Is that related to the two channels, or is it just a confusing name for supplemental information?

The "side info" is stored right after the header (and after the CRC, if present). It's just more meta-data like the header. It is usually 32 bytes long, as opposed to the 32 bit header. It contains information like the scaling (which mp3gain changes) and also the location and length of the frame's data.
The frame data length is actually divided into 4 parts (for most MP3s). Each frame contains two "granules" (like sub-frames) and two channels, for a total of 4 "parts" of a frame.
The side info also says whether the granules use long blocks or short blocks. There is some more stuff in there, but I don't know what the rest does.
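As a rough illustration of where the offset lives, here is a Python sketch that reads it from the start of a frame. It assumes an MPEG-1 Layer III frame, where the side info begins with a 9-bit field (named main_data_begin in the standard) right after the 4-byte header and the optional 2-byte CRC. The function name and the demo bytes are made up, and nothing beyond that first field is parsed.

```python
def read_side_info_start(frame: bytes) -> tuple[int, int]:
    """Return (main_data_begin, side_info_length) for an MPEG-1 Layer III frame."""
    if len(frame) < 8:
        raise ValueError("need at least 8 bytes")
    if frame[0] != 0xFF or (frame[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync")
    version = (frame[1] >> 3) & 0b11   # 0b11 = MPEG-1
    layer = (frame[1] >> 1) & 0b11     # 0b01 = Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")
    has_crc = (frame[1] & 0x01) == 0   # protection bit 0 means a CRC follows
    is_mono = (frame[3] >> 6) == 0b11  # channel mode 0b11 = single channel
    pos = 4 + (2 if has_crc else 0)    # first byte of the side info
    # the 9-bit offset: all of one byte plus the top bit of the next
    main_data_begin = (frame[pos] << 1) | (frame[pos + 1] >> 7)
    return main_data_begin, (17 if is_mono else 32)


# Hand-built stereo header without CRC, followed by an offset of 8.
demo = bytes([0xFF, 0xFB, 0x90, 0x00, 0x04, 0x00, 0x00, 0x00])
print(read_side_info_start(demo))  # (8, 32)
```

The second value is the side info length: 32 bytes for two-channel frames, 17 for mono, which is why the 32-byte figure holds "for most MP3s".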

Quote
If I understand correctly, mp3 trimmers work by temporarily restructuring a file so that the bit reservoir is eliminated, and then perform cutting on potentially invalid, but self-contained frames. Then the file can be collapsed losslessly again, can it not?

Some mp3 trimmers just cut along frame boundaries, causing a bit of bad data. The good ones, however, work similarly to how you described: the frame sizes are changed so that no frame after the cut has data located before the cut (i.e. the bit reservoir is made 0 at the cut), and then the parts are separated.

[note: when I say "for most mp3s", I mean that it holds for 2-channel audio at 32000, 44100, or 48000 Hz]
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #6
So a frame's administrative overhead (header + side info) is usually 36 bytes.
Sorry for giving a wrong number of 40.

Another thing I seem to have gotten wrong: 320 kbps apparently means 320000 bits per second.
I thought it meant 320*1024 bits per second, but that would lead to a frame size of 1070 bytes (rounded up), which after subtracting the 36 bytes of overhead would leave more than 1008 bytes for data.
lame3995o -Q1.7 --lowpass 17

 

Reply #7
Yup, 320kbps means 320*1000 bits per second, which results in a 1044/1045 byte frame, depending on whether padding is used. And 1045-36=1009 (I guess I was a bit off, too).

Also, I'm pretty sure the standard limits the amount of data per frame to how much data can be stored in the largest frame, i.e. 1009 bytes. Some encoders stretch this definition to mean that the limit is the largest frame of all bitrates, which is more like 1404 bytes. Not quite the full 1520 that you'd get with a 320kbps frame + full bit reservoir, but close.
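The three figures above can be reproduced with a few lines of Python. This is a sketch under assumptions: 36 bytes of overhead (4-byte header plus 32-byte side info, no CRC), a 511-byte maximum reservoir, and the guess that the 1404-byte figure corresponds to a 320kbps frame at 32000 Hz, which is the largest frame the usual size formula gives. The names are made up for illustration.

```python
OVERHEAD = 36         # 4-byte header + 32-byte side info (stereo, no CRC)
RESERVOIR_MAX = 511   # largest backward offset


def max_payload(bitrate_kbps: int, sample_rate_hz: int, padding: bool = False) -> int:
    """Data bytes available in one frame after header and side info."""
    frame = 144 * bitrate_kbps * 1000 // sample_rate_hz + (1 if padding else 0)
    return frame - OVERHEAD


print(max_payload(320, 44100, padding=True))                  # 1009
print(max_payload(320, 32000))                                # 1404
print(max_payload(320, 44100, padding=True) + RESERVOIR_MAX)  # 1520
```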
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2