ffmpeg: A buggy MP3 (MPEG-2 mode) decoder?

Hello there,

I'm encoding speech generated via neural Text-To-Speech (TTS) engines for use in web applications. The obvious choice is Opus, but having an MP3 fallback ensures basically 100% playback success with the <audio> tag. So I spent some time finding pleasant settings for LAME encoding. Given that the TTS engine generates 24 kHz mono PCM files, encoding them as 24 kHz MPEG-2 Layer 3 files is the obvious choice for low-bitrate, speech-only encodes.

An example file: https://maikmerten.de/public/mp3-decoder-tests/bla24-v7.mp3 (created using settings specified in https://maikmerten.de/public/mp3-decoder-tests/encode.sh)

I feel that results, overall, are pretty good. However, even with less aggressive VBR settings I couldn't completely get rid of some high-frequency chirp noises (easy to spot with headphones, not so much on speakers) when encoding 24 kHz MPEG-2 Layer 3. I assumed this might stem from trying to preserve 12 kHz of audio bandwidth and perhaps running into some sfb21 shenanigans (adding -Y or not made no difference, though), or that LAME simply isn't tuned for 24 kHz.

Turns out that the encoding most likely is fine, but that ffmpeg (which I use for a quick command-line listen) apparently has an MP3 decoder that introduces the chirp artifacts...

This is the file above, decoded with ffmpeg: https://maikmerten.de/public/mp3-decoder-tests/decoded-ffmpeg.flac

This is the same file, decoded with LAME: https://maikmerten.de/public/mp3-decoder-tests/decoded-lame.flac

And this is a 10/10 ABX result comparing those two with the graphical "abx" tool available on Ubuntu Linux: https://maikmerten.de/public/mp3-decoder-tests/abx-result.png
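
For context on what a 10/10 ABX score means: the chance of getting every trial right by pure guessing is small. A minimal sketch of that arithmetic, assuming each trial is an independent 50/50 guess (the function name `abx_p_value` is made up for this illustration and is not part of the abx tool):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` of `trials` by guessing."""
    # Sum the binomial tail: every outcome with k >= correct hits.
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

print(abx_p_value(10, 10))  # 1/1024, i.e. just under 0.1 %
```

Under that assumption, a 10/10 run is very unlikely to be luck, which is why it counts as solid evidence of an audible difference.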

(The chirp artifact is pretty obvious, e.g., around the 8.4 seconds mark)
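
To pin down where two decodes differ without relying on ears alone, one option is to subtract them and look for the loudest stretch of the difference signal. A rough sketch, assuming both decodes are already loaded as sample-aligned lists of floats (loading the FLAC files is left out, and `largest_difference_window` is a hypothetical helper, not an ffmpeg or LAME function). The demo uses synthetic signals, not the files linked above:

```python
import math

def largest_difference_window(a, b, sample_rate, window_s=0.05):
    """Return (start_time_s, rms) of the window where a and b differ most."""
    n = min(len(a), len(b))
    size = max(1, round(sample_rate * window_s))
    best_start, best_rms = 0, 0.0
    for start in range(0, n, size):
        # Difference signal for this window only.
        chunk = [a[i] - b[i] for i in range(start, min(start + size, n))]
        rms = math.sqrt(sum(d * d for d in chunk) / len(chunk))
        if rms > best_rms:
            best_start, best_rms = start, rms
    return best_start / sample_rate, best_rms

# Synthetic demo: a 220 Hz tone, and a copy with a short 9 kHz burst at 0.5 s.
RATE = 24000
clean = [0.5 * math.sin(2 * math.pi * 220 * i / RATE) for i in range(RATE)]
chirpy = list(clean)
for i in range(12000, 13200):
    chirpy[i] += 0.1 * math.sin(2 * math.pi * 9000 * i / RATE)

start_s, rms = largest_difference_window(clean, chirpy, RATE)
print(f"largest difference starts at {start_s:.2f} s (RMS {rms:.3f})")
```

With real decoder output, the two files may first need to be trimmed to the same start offset, since decoders can differ in how they handle encoder delay.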

The ffmpeg decoder doesn't appear to be entirely happy with the MP3 file and reports some warnings, such as:

Code:
[mp3float @ 0x7f3f20005b40] overread, skip -6 enddists: -1 -1=0/0   
[mp3float @ 0x7f3f20005b40] overread, skip -7 enddists: -5 -5=0/0  
[mp3float @ 0x7f3f20005b40] overread, skip -5 enddists: -2 -2=0/0  
[mp3float @ 0x7f3f20005b40] overread, skip -6 enddists: -1 -1=0/0

(Tested with current git)

Now, after this wall of text:

  • Can somebody confirm that ffmpeg and LAME produce perceivably different decoding results? Is this "normal" and to be expected, given that the MP3 specification doesn't expect bit-exact results?
  • Is the MP3 file perhaps malformed? I did tests with the "--strictly-enforce-ISO" LAME parameter, but that didn't resolve the issue.
  • Is this a case of a "bad"/buggy MP3 decoder? In 2022? Deployed basically everywhere (e.g., in Firefox and Chrome - exactly my use case)?

I'd love to gather some opinions before filing a "hard to hear, thus hard to reproduce" bug ticket for ffmpeg.

edit: Perhaps this is https://trac.ffmpeg.org/ticket/1958 - filed in 2012

Reply #1
Yes, it's easy to hear with headphones. I don't know which application this is for, but most apps today use ffmpeg for decoding, so it is a concern. But before delving into MPEG-2 Layer 3, why not AAC? By now it's quite old and widely supported, and it should be a better choice than MP3.

Reply #2
I believe that's what YouTube does.  It uses Opus and falls back on AAC.

Reply #3
Quote
Yes, it's easy to hear with headphones. I don't know which application this is for, but most apps today use ffmpeg for decoding, so it is a concern. But before delving into MPEG-2 Layer 3, why not AAC? By now it's quite old and widely supported, and it should be a better choice than MP3.

Thanks for having a listen and reporting!

Yeah, I did (and do) consider AAC, but I ran into problems with the encoders available to me:

  • fdkaac: When encoding the generated speech (https://maikmerten.de/public/mp3-decoder-tests/google-cloud-tts.wav) in CBR mode, I notice high-frequency spikes that are very unpleasant to listen to, even at 40 kbit/s. Basically, with LAME's VBR I seem to get better results than with fdkaac's CBR mode at similar bitrates, which isn't thaaat surprising given how LAME uses a wide spread of frame bitrates to do a decent job. Sadly, fdkaac's VBR mode seems to be in an experimental state, and even at the lowest VBR quality it produces an average bitrate of 71 kbit/s, well beyond what I need for really neat LAME encodes.
  • ffmpeg's native AAC encoder: In my tests (and according to ffmpeg's developers) this AAC encoder is basically always worse than fdkaac. At the sort of bitrates I'm looking for (around 40 kbit/s) it produces metallic smeariness no matter what.

So while AAC-LC should outperform MP3 on a format level in terms of capability, LAME's very mature VBR mode seems to keep MP3 competitive with AAC-LC CBR for my application, with AAC-LC VBR being "experimental" in both AAC encoders I looked at.

Reply #4
^^ see if you can use qaac | qaac wiki

Reply #5
Try  --abr xx -q4  or --vbr-old -Vx -q4

Reply #6
After browsing around through the ffmpeg source and finding that nothing at all is special about the 24 kHz mode, aside from two tables with magic values for the various sample rates, I went to the ffmpeg IRC channel to discuss things. It turns out that one of the tables had two wrong values in it.
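
To illustrate how two wrong table values can go unnoticed for so long: if such a table stores the width of each frequency band, two neighbouring wrong values can cancel out so that the total still adds up, and only one band boundary ends up misplaced. A toy sketch under that assumption (the tables below are made-up numbers, NOT the actual ffmpeg or ISO values, and the helper names are hypothetical):

```python
from itertools import accumulate

def band_edges(widths):
    """Turn per-band widths into cumulative band start/end offsets."""
    return [0, *accumulate(widths)]

def differing_edges(widths_a, widths_b):
    """List (edge index, offset in a, offset in b) where the edges disagree."""
    pairs = zip(band_edges(widths_a), band_edges(widths_b))
    return [(i, ea, eb) for i, (ea, eb) in enumerate(pairs) if ea != eb]

# Made-up toy tables for illustration only.
reference = [4, 4, 6, 8, 10]
faulty = [4, 4, 6, 7, 11]  # two neighbouring widths are off by one

print(sum(reference) == sum(faulty))       # totals still match
print(differing_edges(reference, faulty))  # yet one boundary has moved
```

A simple "do the widths sum to the frame size" sanity check would pass for both toy tables, so an error of this kind only shows up as misplaced band boundaries, not as a crash.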

Patch by Paul B Mahol: http://ffmpeg.org/pipermail/ffmpeg-devel/2022-October/302928.html

For fun I also tested the ISO "dist10" reference MP3 implementation (which isn't exactly known to be bug-free). Turns out that the dist10 decoder produces chirping noises as well! So this might be a decades old bug, with wrong values (not code!) ending up in ffmpeg from a faulty reference implementation (an unproven theory) - because the original ISO specification is behind a steep paywall...

Reply #7
Great find! Thanks for digging into the issue and getting it fixed at last!

Quote
Is this a case of a "bad"/buggy MP3 decoder? In 2022? Deployed basically everywhere (e.g., in Firefox and Chrome - exactly my use case)?
Yes, why not?  :)  The 24 kHz version of MP3 seems to be rarely used and then the bug was not obvious - not a crash, just a subtle (?) audio distortion.
Now you can chase the devs of the web browsers and other apps to update their embedded copy of ffmpeg, to include this latest bugfix within the next few years...

Quote
Patch by Paul B Mahol: http://ffmpeg.org/pipermail/ffmpeg-devel/2022-October/302928.html
The new table values in this patch agree with the values in MAD (MPEG Audio Decoder)  https://sourceforge.net/projects/mad/files/libmad/0.15.1b/   (look at sfb_24000_long in layer3.c)

MAD was known for its high-quality audio output. They even published some test results of compliance tests "conducted [...] in accordance with Annex A of ISO/IEC 11172-4"  https://www.underbit.com/resources/mpeg/audio/compliance
Back in the day, broken MP3 decoders were quite common.  :(
However, even these great MAD results are for MPEG-1 only. Your 24 kHz format is part of the MPEG-2 Audio extensions as defined in ISO/IEC 13818-3.

Reply #8

Quote
For fun I also tested the ISO "dist10" reference MP3 implementation (which isn't exactly known to be bug-free). Turns out that the dist10 decoder produces chirping noises as well! So this might be a decades old bug, with wrong values (not code!) ending up in ffmpeg from a faulty reference implementation (an unproven theory) - because the original ISO specification is behind a steep paywall...

Good job! Makes me really curious how long the bug has been around and who or how it was introduced.

Really goes to show how rarely non-standard sample rates are used plus how most people listen very uncritically to audio and don't notice stuff like that. The intersection of those groups (you) seems pretty tiny.

Reply #9
Quote
The 24 kHz version of MP3 seems to be rarely used and then the bug was not obvious - not a crash, just a subtle (?) audio distortion.
That's because 24 kHz is not a valid MP3 sampling rate. You get MP2 layer III with 22.05 and 24 kHz sampling rates, not MP3. MP3 supports only 32, 44.1, and 48 kHz.

Reply #10
Quote
Yes, why not?  :)  The 24 kHz version of MP3 seems to be rarely used and then the bug was not obvious - not a crash, just a subtle (?) audio distortion.

Yeah, playback itself works, with some subtle birds chirping in the background, which at first I dismissed as "well, sounds like MP3 alright".

Quote
Now you can chase the devs of the web browsers and other apps to update their embedded copy of ffmpeg, to include this latest bugfix within the next few years...

I'm hopeful browser vendors will eventually pick this patch up when pulling in fresh ffmpeg versions. I might submit bug reports to keep track of things, though.

Quote
The new table values in this patch agree with the values in MAD (MPEG Audio Decoder)  https://sourceforge.net/projects/mad/files/libmad/0.15.1b/   (look at sfb_24000_long in layer3.c)

Thanks for checking!

Quote
MAD was known for its high-quality audio output. They even published some test results of compliance tests "conducted [...] in accordance with Annex A of ISO/IEC 11172-4"  https://www.underbit.com/resources/mpeg/audio/compliance
Back in the day, broken MP3 decoders were quite common.  :(
However, even these great MAD results are for MPEG-1 only. Your 24 kHz format is part of the MPEG-2 Audio extensions as defined in ISO/IEC 13818-3.

Given that MPEG-2 Part 3 doesn't change the underlying algorithms, I'm pretty sure MAD is doing fine with MPEG-2 modes. I think VLC uses MAD - and it sounds fine.

Reply #11
Quote
Good job! Makes me really curious how long the bug has been around and who or how it was introduced.

Really goes to show how rarely non-standard sample rates are used plus how most people listen very uncritically to audio and don't notice stuff like that. The intersection of those groups (you) seems pretty tiny.

The file with the non-standard table entries has a 2002 date, and the bug report about 24 kHz files artifacting was filed in 2012. The bug thus survived for at least 10 years, but might be as old as 20 years.

I think this was just somewhat bad luck. Back in the days of dial-up internet, sample rates such as 22.05 and 24 kHz were somewhat popular for internet radio streams, but most people would have used something like Winamp or (later) VLC to listen to those (and neither was affected).

Firefox and Chrome picking up ffmpeg (and thus this somewhat subtle decoder bug) happened much later, with HTML5's <audio> tag. At that time, 44.1 and 48 kHz MP3 streams already were the norm.

Reply #12
Quote
The file with the non-standard table entries has a 2002 date, and the bug report about 24 kHz files artifacting was filed in 2012. The bug thus survived for at least 10 years, but might be as old as 20 years.
Here's a trick: you can use git blame. GitHub has a nice interface for that. Here: https://github.com/FFmpeg/FFmpeg/blame/master/libavcodec/mpegaudiodec_common.c#L372

As you can see, this patch was committed on the 16th of September 2001.

Reply #13
Quote
Here's a trick: you can use git blame. GitHub has a nice interface for that. Here: https://github.com/FFmpeg/FFmpeg/blame/master/libavcodec/mpegaudiodec_common.c#L372

As you can see, this patch was committed on the 16th of September 2001

Oh, that's indeed a neat trick. I wasn't aware of GitHub's blame feature and was a bit too lazy to go the CLI route.


Reply #15
Small update: Both Chrome and Firefox pulled upstream ffmpeg fixes. This means that, starting a version or two from now, 24 kHz MP3 playback in those browsers will be fine.

I also tested Safari on iOS - that one was fine already. I guess they have their own QuickTime-or-whatever-it-is-called-nowadays decoder.

Reply #16
Quote
The 24 kHz version of MP3 seems to be rarely used and then the bug was not obvious - not a crash, just a subtle (?) audio distortion.
That's because 24 kHz is not a valid MP3 sampling rate. You get MP2 layer III with 22.05 and 24 kHz sampling rates, not MP3. MP3 supports only 32, 44.1, and 48 kHz.

Given that you are specific about what is and what is not an MP3, I have to correct you.

MP3 is MPEG Layer III (three in Roman numerals).
MP2 is MPEG Layer II (two in Roman numerals). (The default audio format of Digital Video Broadcasting (DVB).)
MP1 did exist too, and is MPEG Layer I. (IIRC it was what the Digital Compact Cassette (DCC) used, but I might be wrong here.)

MPEG 1, MPEG 2, MPEG 4, MPEG 7... are MPEG standards covering different technologies, and sometimes these revisions were used to extend existing audio codecs.

Concretely, MPEG 1 Layer III (what you called simply MP3) supports sampling rates of 32 kHz, 44.1 kHz and 48 kHz.
With MPEG 2, sampling rates of 16 kHz, 22.05 kHz and 24 kHz were added to the existing Layer III audio format.
Unofficially (which is why the name "MPEG 2.5" was used), sampling rates of 8 kHz, 11.025 kHz and 12 kHz were added to Layer III as well.

So yes, a 24 kHz MP3 file is an MP3 file. It is not an MPEG 1 Layer III file, but it is an MPEG 2 Layer III one. And definitely not an MP2 file.
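
The MPEG version, layer and sampling rate can be read straight from the 4-byte frame header. A minimal sketch based on the commonly documented header layout (the function and table names are made up for this illustration; it expects the bytes of an actual frame header, so any ID3 tag at the start of a file has to be skipped first):

```python
# Two-bit field values from the MPEG audio frame header.
VERSIONS = {0b00: "MPEG-2.5", 0b10: "MPEG-2", 0b11: "MPEG-1"}    # 0b01 reserved
LAYERS = {0b01: "Layer III", 0b10: "Layer II", 0b11: "Layer I"}  # 0b00 reserved
SAMPLE_RATES = {
    "MPEG-1": (44100, 48000, 32000),
    "MPEG-2": (22050, 24000, 16000),
    "MPEG-2.5": (11025, 12000, 8000),  # unofficial extension
}

def describe_frame_header(header: bytes):
    """Return (version, layer, sample_rate_hz) for a 4-byte frame header."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(header[:4], "big")
    if (word >> 21) & 0x7FF != 0x7FF:  # 11 sync bits, all set
        raise ValueError("no frame sync found")
    version = VERSIONS.get((word >> 19) & 0b11)
    layer = LAYERS.get((word >> 17) & 0b11)
    rate_index = (word >> 10) & 0b11
    if version is None or layer is None or rate_index == 0b11:
        raise ValueError("reserved value in header")
    return version, layer, SAMPLE_RATES[version][rate_index]

# Hand-built example header, not taken from the file in this thread.
print(describe_frame_header(bytes([0xFF, 0xF3, 0x44, 0xC0])))
# ('MPEG-2', 'Layer III', 24000)
```

The layer bits are the same for all three versions, which matches the point above: a 24 kHz file is still Layer III, only the version field differs.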

Reply #17
Quote
Small update: Both Chrome and Firefox pulled upstream ffmpeg fixes. This means that, starting a version or two from now, 24 kHz MP3 playback in those browsers will be fine.
Thanks for the update! I'm surprised that they updated their ffmpeg copies so quickly.

Reply #18
Quote
Thanks for the update! I'm surprised that they updated their ffmpeg copies so quickly.

Well, for Chrome it *perhaps* helped that I pointed out a use case where a Google service delivers 24 kHz MP3 files (Google Cloud Text-To-Speech). For Firefox, it perhaps helped that Chrome already fixed their ffmpeg copy ;-)

I'm pretty sure that pointing to a ready-and-merged upstream patch accelerated things. If I had filed bugs "Hey <browser vendors>, 24 kHz MP3 files are slightly artifacty" with the expectation that Chrome/Firefox devs hunt down an upstream ffmpeg bug, this could have taken years.

Reply #19
Quote
Given that you are specific about what is and what is not an MP3, I have to correct you.
It looks to me like you just rewrote what I said in a more verbose manner. My point was that 24 kHz is not a "rarely used" variant of MP3, because "MP3" is generally assumed to mean MPEG 1 layer III; rather, 24 kHz requires MPEG 2 layer III, which is not the same thing.

Reply #20
Quote
because "MP3" is generally assumed to mean MPEG 1 layer III; rather, 24 kHz requires MPEG 2 layer III, which is not the same thing.

No it's not. mp3 is everything that can usually be decoded by the average mp3 decoder, which, at least in this millennium, always includes support for MPEG-2 Layer III.

Learn to admit when you're wrong.

Reply #21
Quote
mp3 is everything that can usually be decoded by the average mp3 decoder, which, at least in this millennium, always includes support for MPEG-2 Layer III.
I believe I was thinking about this post, which claims that MPEG-2 layer III does not have the same universal compatibility as MPEG-1 layer III. Are you saying that's not the case? Celona seems to be implying that you would only get MPEG-2 layer III decoding on a device new enough to support AAC, but maybe that's incorrect. My car stereo supports data CDs with MPEG-1 layer III or WMA, but I've never tested to see if it will play MPEG-2 layer III.

Since most music uses sampling rates supported by MPEG-1 layer III, I wouldn't be at all surprised to see some cheap hardware players without support for MPEG-2 layer III. If I'm not mistaken, the MPEG-2 layer III sampling rates were added a few years after the MPEG-1 layer III specification was released, so that may give some hardware makers wiggle room to claim that they support MP3 when they only support MPEG-1 layer III.

I do accept JAZ's correction that I should not have called MPEG-2 layer III "MP2", since MP2 is a different format entirely. I also accept the correction that MPEG-2 layer III falls under the umbrella term "MP3". I'd still like to know if an MP3 player is required to support both types, because even though MPEG-2 layer III was added to the specification many years ago, that doesn't necessarily mean hardware players will always handle it. Even though both types are technically MP3 on paper, that doesn't necessarily help if a device can claim MP3 support when it only supports MPEG-1 layer III. That's what I was trying to get at when I said they're not the same thing, even though I was wrong on some points.

Quote
Learn to admit when you're wrong.
There's no need to get huffy.


 

Reply #23
"The ISO standard ISO/IEC 11172-3 (a.k.a. MPEG-1 Audio) defined three formats: the MPEG-1 Audio Layer I, Layer II and Layer III. The ISO standard ISO/IEC 13818-3 (a.k.a. MPEG-2 Audio) defined extended version of the MPEG-1 Audio: MPEG-2 Audio Layer I, Layer II and Layer III. MPEG-2 Audio (MPEG-2 Part 3) should not be confused with MPEG-2 AAC (MPEG-2 Part 7 – ISO/IEC 13818-7)"

"MPEG-2.5 Audio Layer III    nonstandard, proprietary"