Hi,
In my app, I need to determine mp3 frame length (in bytes) quickly. Most of the time it works great, but I have some mp3 files whose frames are 1 byte longer or shorter than what I compute. I.e. when I scan to the next frame, I don't see the sync header (FF byte) where I expect it; it's sometimes 1 byte before or after the position I calculated. Here's an example of 2 such frames. They differ only by the padding bit, so one frame should be 1 byte longer than the other, right? But in my mp3 stream, both frames have exactly the same length, 261 bytes.
First frame is FF F2 50 C0
Next frame is FF F2 52 C0 -- the same, but the padding bit is set
Those are MPEG V2 Layer 3, 40 kbps bitrate, 22,050 Hz mono frames, one with and one without padding. Frame length should be 144*40000/22050 + padding = 261.2 + padding, truncated. So the second one should be 262 bytes, but it's 261 in my stream!
This is an older thread, but I ran across it while researching something and thought I'd fill it in for Google fodder, or in case the OP is still listening.
The problem is in the formula used for calculating frame size. The "144" is not a magic number, it's Bits_Per_Sample, which = (Samples_Per_Frame / 8).
For MPEG 1 Layer 3 (which is what most .MP3 files are) the magic value 144 would be correct. The number of samples in a frame is 1152, which divided by 8 gives 144.
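However, the sample header describes MPEG 2 Layer 3, which uses 576 samples per frame. This yields 72 for Bits_Per_Sample, so the frame length is 72*40000/22050 = 130.6, truncated to 130 bytes (131 with padding). The 261 bytes you measured is therefore most likely the size of two frames (130 + 131), not one.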
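To make the arithmetic concrete, here is a minimal sketch of the calculation for Layers 2 and 3, where a slot is one byte. The name frame_len_bytes and its parameters are made up for illustration; they are not part of the code further down.

```c
#include <stdint.h>

/* Hypothetical helper: frame length in bytes for Layer 2 or Layer 3.
 * samples_per_frame / 8 is the "144" (or 72) from the usual formula.
 * Integer division truncates, which is what the format calls for. */
static uint32_t frame_len_bytes(uint32_t samples_per_frame,
                                uint32_t bitrate_bps,
                                uint32_t sample_rate_hz,
                                uint32_t padding)
{
    return (samples_per_frame / 8) * bitrate_bps / sample_rate_hz + padding;
}
```

With the values from the question, frame_len_bytes(576, 40000, 22050, 0) gives 130 and frame_len_bytes(576, 40000, 22050, 1) gives 131. Plugging in 1152 samples by mistake gives 261, which is the number the original poster computed.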
In any case, your code should never assume that one frame immediately follows another. That will be true most of the time, but tags, broken frames, or other corruption will take your parsing routine down hard if you blindly assume the next byte starts a valid frame header.
My solution was to write and use mpg_find_frame(), which takes a buffer, a *sizeof(buffer), and an *offset within that buffer. It starts looking at the provided offset for 0xFF, and if it finds it, looks to see if the next byte satisfies (*(buf + *offset + 1) & 0xE0) == 0xE0. If so, it further checks that the version, layer, and bitrate fields are not "Reserved", and then updates *offset to the location it found. If it doesn't find the sync, it moves up one byte and checks again. It keeps doing this until it runs out of buffer, then returns an error. At that point, it's your (or your user's) call whether to keep searching for a sync pattern in a fresh buffer or give up.
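For anyone who wants a starting point, the scan described above could look roughly like this. It is a simplified sketch, not my actual mpg_find_frame(): the name find_frame_sketch, the by-value arguments, and the -1 return on failure are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical sketch: returns the offset of the first plausible frame
 * header at or after `start`, or -1 if none is found in the buffer. */
static long find_frame_sketch(const unsigned char *buf, size_t len, size_t start)
{
    /* Only consider positions where a full 4-byte header fits. */
    for (size_t i = start; i + 4 <= len; i++) {
        if (buf[i] != 0xFF)              continue; /* first 8 sync bits  */
        if ((buf[i + 1] & 0xE0) != 0xE0) continue; /* last 3 sync bits   */
        if ((buf[i + 1] & 0x18) == 0x08) continue; /* version reserved   */
        if ((buf[i + 1] & 0x06) == 0x00) continue; /* layer reserved     */
        if ((buf[i + 2] & 0xF0) == 0xF0) continue; /* bitrate reserved   */
        return (long)i;
    }
    return -1;
}
```

Note that a match here only means the bytes look like a header. Audio data can contain the same bit pattern by chance, so a more careful reader would also confirm that another header sits one frame length further on.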
I've also added code to recognize ID3/ID3v2 tags and return their size so the file reader can skip them intelligently. ID3v2 tags can be large, so you can save a lot of time by fseek'ing instead of parsing them one byte at a time.
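As an illustration of the ID3v2 part, here is a minimal sketch of reading the tag size from the 10-byte tag header so the reader can seek past it. The helper name id3v2_tag_size is hypothetical and this is not the code I actually use. It relies on the ID3v2 header layout: "ID3", two version bytes, one flags byte, then four "synchsafe" size bytes carrying 7 bits each.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total size in bytes of an ID3v2 tag starting at
 * buf[0], or 0 if no well-formed ID3v2 header is found there. */
static uint32_t id3v2_tag_size(const unsigned char *buf, size_t len)
{
    if (len < 10) return 0;
    if (buf[0] != 'I' || buf[1] != 'D' || buf[2] != '3') return 0;

    /* Synchsafe bytes must have their top bit clear. */
    if ((buf[6] | buf[7] | buf[8] | buf[9]) & 0x80) return 0;

    uint32_t body = ((uint32_t)buf[6] << 21) | ((uint32_t)buf[7] << 14)
                  | ((uint32_t)buf[8] << 7)  |  (uint32_t)buf[9];

    uint32_t total = 10 + body;              /* 10-byte header + tag body */
    if (buf[3] == 4 && (buf[5] & 0x10))      /* ID3v2.4 footer flag set   */
        total += 10;
    return total;
}
```

If this returns a non-zero value for the start of the file, the first frame header should be searched for from that offset onward. The old ID3v1 tag is simpler: it is a fixed 128 bytes at the end of the file, starting with "TAG".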
Here's part of the code I wrote for handling MPEG audio, specifically the part needed for calculating frame sizes. Hope it helps. (I'd appreciate comments if anyone spots an error.)
#include <stdint.h>

// MPEG versions - use [version]
const uint8_t mpeg_versions[4] = { 25, 0, 2, 1 };
// Layers - use [layer]
const uint8_t mpeg_layers[4] = { 0, 3, 2, 1 };
// Bitrates - use [version][layer][bitrate]
const uint16_t mpeg_bitrates[4][4][16] = {
    { // Version 2.5
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Reserved
        { 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160, 0 }, // Layer 3
        { 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160, 0 }, // Layer 2
        { 0, 32, 48, 56,  64,  80,  96, 112, 128, 144, 160, 176, 192, 224, 256, 0 }  // Layer 1
    },
    { // Reserved
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Invalid
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Invalid
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Invalid
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }  // Invalid
    },
    { // Version 2
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Reserved
        { 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160, 0 }, // Layer 3
        { 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160, 0 }, // Layer 2
        { 0, 32, 48, 56,  64,  80,  96, 112, 128, 144, 160, 176, 192, 224, 256, 0 }  // Layer 1
    },
    { // Version 1
        { 0,  0,  0,  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 0 }, // Reserved
        { 0, 32, 40, 48,  56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320, 0 }, // Layer 3
        { 0, 32, 48, 56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320, 384, 0 }, // Layer 2
        { 0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, 0 }  // Layer 1
    }
};
// Sample rates - use [version][srate]
const uint16_t mpeg_srates[4][4] = {
    { 11025, 12000,  8000, 0 }, // MPEG 2.5
    {     0,     0,     0, 0 }, // Reserved
    { 22050, 24000, 16000, 0 }, // MPEG 2
    { 44100, 48000, 32000, 0 }  // MPEG 1
};
// Samples per frame - use [version][layer]
const uint16_t mpeg_frame_samples[4][4] = {
//  Rsvd     3     2     1    < Layer   v Version
    { 0,   576, 1152,  384 }, // 2.5
    { 0,     0,    0,    0 }, // Reserved
    { 0,   576, 1152,  384 }, // 2
    { 0,  1152, 1152,  384 }  // 1
};
// Slot size (MPEG unit of measurement) - use [layer]
const uint8_t mpeg_slot_size[4] = { 0, 1, 1, 4 }; // Rsvd, 3, 2, 1
uint16_t mpg_get_frame_size (char *hdr) {
    // Work on unsigned bytes so the masks behave the same everywhere
    const unsigned char *h = (const unsigned char *)hdr;

    // Quick validity check
    if (   (h[0] != 0xFF)
        || ((h[1] & 0xE0) != 0xE0)   // 3 sync bits
        || ((h[1] & 0x18) == 0x08)   // Version rsvd
        || ((h[1] & 0x06) == 0x00)   // Layer rsvd
        || ((h[2] & 0xF0) == 0xF0)   // Bitrate rsvd
        || ((h[2] & 0x0C) == 0x0C)   // SampRate rsvd (would divide by 0)
       ) return 0;

    // Data to be extracted from the header
    uint8_t ver = (h[1] & 0x18) >> 3;   // Version index
    uint8_t lyr = (h[1] & 0x06) >> 1;   // Layer index
    uint8_t pad = (h[2] & 0x02) >> 1;   // Padding? 0/1
    uint8_t brx = (h[2] & 0xf0) >> 4;   // Bitrate index
    uint8_t srx = (h[2] & 0x0c) >> 2;   // SampRate index

    // Lookup real values of these fields
    uint32_t bitrate   = (uint32_t)mpeg_bitrates[ver][lyr][brx] * 1000;
    uint32_t samprate  = mpeg_srates[ver][srx];
    uint32_t samples   = mpeg_frame_samples[ver][lyr];
    uint32_t slot_size = mpeg_slot_size[lyr];

    // Bitrate index 0 is "free" format; the size can't be computed here
    if (bitrate == 0) return 0;

    // Frame length in slots: truncate first, then add the padding slot.
    // samples / 8 is the "144" (or 72, or 48) of the usual formula.
    // Working in slots keeps Layer 1 sizes a multiple of 4 bytes.
    uint32_t slots = ((samples / 8 / slot_size) * bitrate) / samprate + pad;

    // Convert slots to bytes
    return (uint16_t)(slots * slot_size);
}