Lame MP3 re-encoding : 'Direct' vs 'Intermediate WAV'

2010-05-18 18:41:37

I'm using lame.exe 3.98.4 to convert a 320 kbps MP3 to a V4 MP3. Stupid or not, that's what I'm doing.

I can feed lame.exe the 'original' MP3 directly, OR decode it to WAV first and then feed that to lame.exe. I get a different file depending on which I do.
If I feed it the MP3 directly, the resulting file is larger and has a bitrate 3 or 4 kbps higher than when I feed it the WAV.

By the way, it seems RazorLame employs the 'feed the MP3' method and foobar uses the 'intermediate WAV' method, or at least that's how it looks from the output files I get from each.

foobar has me intrigued too. If I feed it the WAV, the output is identical to the one I get from lame.exe (foobar is using lame.exe after all). BUT if I feed it the MP3, the result is bigger and has a higher bitrate than the result I get from converting the WAV, yet it is smaller and has a lower bitrate than the one I get from feeding the MP3 directly to lame.exe (and it shouldn't be, as foobar uses lame.exe).

So I have two questions:

1. Regarding 'feed the MP3' vs 'feed the WAV' to LAME: I don't know which is the preferred method, or whether there's any advantage to using one or the other. The differences are minimal, but the fact is that there are differences, so I was wondering if someone could shed some light on the issue.

2. Regarding foobar: why, when I feed it the MP3, is the result not the same as the one I obtain from feeding the MP3 directly to lame.exe via the command line? Does it employ a different decoder than lame.exe to get a WAV and then feed that WAV to lame.exe?

Reply #1 – 2010-05-18 18:51:03

Different MP3 decoders produce slightly, and almost certainly totally inaudibly, different results--due to slight rounding/computation differences (whether in software or hardware), etc. I haven't read any tests or comparisons of decoders, but I don't think that any is objectively best/worst, and I assume that LAME's is perfectly capable.

However, you didn't specify how you were decoding: Was it via LAME or another program? Moreover, I'm curious about your experience/claims regarding foobar2000, though [JAZ] suggested some good possible reasons that I hadn't considered. (As for hip_decode, I think it may just be a function name in the code, but I'm no authority!)

Reply #2 – 2010-05-18 19:01:52

I can't tell you the exact reason for this variability (even 3 or 4 kbps is more than I would expect, but I would expect some differences after all)*

mp3 -> lame -> mp3

This situation makes LAME use its internal decoder. I am unsure at which bit depth it decodes (16-bit, 24-bit, 32-bit float?). Any MP3 decoder should produce results very similar to those of the reference decoder. The only notable differences over time have come from bugs, from using a higher bit depth, and/or from dithering the output.

wav -> lame -> mp3

This situation makes LAME use what it is fed. The WAV will usually be 16 or 24 bits. This won't exactly match the output of the decoder, but the differences, as I said, should be minimal (rarely ABXable and not discernible by the encoder).

xxxx -> foobar -> lame -> mp3

You should check how you have configured foobar for encoding. With lossy encoders, it generally outputs at the highest bit depth possible (24-bit, 32-bit, ...).
Also, there is the possibility that you have enabled dithering of the signal, which will undoubtedly change the generated output. Of course, these settings can be disabled and/or changed, in which case you should be able to recreate any of the above situations.
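To give a feel for how small the bit-depth differences discussed above are, here is a minimal sketch in Python. It is not taken from LAME or foobar2000; the test signal (a 1 kHz sine at half of full scale, sampled at 48 kHz) and the function names are assumptions made up for illustration. It rounds the same samples to a 16-bit and a 24-bit integer grid and reports the worst-case rounding error of each.

```python
import math


def quantize(sample: float, bits: int) -> float:
    """Round a sample in [-1.0, 1.0) to the nearest level of a signed integer grid."""
    scale = 2 ** (bits - 1)
    level = round(sample * scale)
    # Clamp to the representable range of a signed integer of this width.
    level = max(-scale, min(scale - 1, level))
    return level / scale


def max_quantization_error(bits: int, n: int = 4800) -> float:
    """Worst-case rounding error over a hypothetical test tone (assumption)."""
    worst = 0.0
    for i in range(n):
        # 1 kHz sine at half of full scale, 48 kHz sample rate.
        x = 0.5 * math.sin(2 * math.pi * 1000 * i / 48000)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst


err16 = max_quantization_error(16)
err24 = max_quantization_error(24)
print(f"16-bit worst-case error: {err16:.2e}")
print(f"24-bit worst-case error: {err24:.2e}")
```

Both errors are bounded by half a quantization step, which is why the two inputs differ bit for bit (and so produce different MP3s) while remaining very close as audio.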

*I think they recently made changes to the decoder. (They have had a fork of libmp3 for a long time, IIRC. Recently I've heard it called the hip decoder. Definitely not the MAD decoder.)

Reply #3 – 2010-05-18 19:25:41

1) LAME 3.98.x doesn't use ENC_DELAY and ENC_PADDING values when it transcodes MP3->MP3. This was fixed in 3.99 branch.

2) lame decodes mp3 file to 16-bit WAV file. Foobar2000 decodes mp3 to 24-bit wav file (and then sends it to LAME).

Reply #4 – 2010-05-18 20:15:45

Quote from: lvqcl on 2010-05-18 19:25:41

1) LAME 3.98.x doesn't use ENC_DELAY and ENC_PADDING values when it transcodes MP3->MP3. This was fixed in 3.99 branch.

2) lame decodes mp3 file to 16-bit WAV file. Foobar2000 decodes mp3 to 24-bit wav file (and then sends it to LAME).

Thanks dv1989, [JAZ] and lvqcl for your answers. I'm afraid I didn't explain myself well enough. The settings used for creating the MP3 were exactly the same, at least as long as foobar doesn't pass a hidden command-line option along with mine.

[JAZ]'s answer makes a lot of sense, and I believe lvqcl's explanation has fully explained what the problem was. Please allow me to recapitulate ...

1.)

wav -> lame -> mp3
wav -> foobar (lame) -> mp3

These always gave me identical results, as the only step was sending the wav to lame.exe (directly or via foobar).

2.)

mp3 -> lame -> mp3
mp3 -> foobar(lame) -> mp3

These had to be different because lame uses internal decoder and foobar uses a different external library for the process. The output can't be exactly the same.

3.) I'm not sure about this point.

Method A :

Step 1: mp3 -> lame -> wav
Step 2: wav -> lame -> mp3

Method B :

Step 1: mp3 -> lame -> mp3

As both methods use the internal LAME decoder there shouldn't be any differences... but there are.
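For anyone wanting to reproduce Method A and Method B, the two pipelines can be sketched as command lines built in Python. This is only a sketch: the file names are hypothetical, the exact options the poster used are not given in the thread, and the functions merely build the argument lists without running anything. The flags used (`--decode`, `--mp3input`, `-V`) are standard lame options.

```python
from typing import List


def two_step_commands(src_mp3: str, tmp_wav: str, dst_mp3: str,
                      quality: int = 4) -> List[List[str]]:
    """Method A: decode to WAV with lame, then encode that WAV."""
    return [
        ["lame", "--decode", src_mp3, tmp_wav],          # Step 1: mp3 -> wav
        ["lame", "-V", str(quality), tmp_wav, dst_mp3],  # Step 2: wav -> mp3
    ]


def direct_command(src_mp3: str, dst_mp3: str, quality: int = 4) -> List[str]:
    """Method B: let lame decode and re-encode in a single invocation."""
    return ["lame", "--mp3input", "-V", str(quality), src_mp3, dst_mp3]


# Hypothetical file names, for illustration only.
method_a = two_step_commands("in.mp3", "tmp.wav", "out_a.mp3")
method_b = direct_command("in.mp3", "out_b.mp3")
for cmd in method_a + [method_b]:
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` if lame is on the PATH; comparing the sizes of `out_a.mp3` and `out_b.mp3` then shows the discrepancy described above.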

Excuse me for being so dense, but does the ENC_DELAY and ENC_PADDING problem lvqcl pointed out earlier account for these differences, then?

And thanks for helping me understand this.

Reply #5 – 2010-05-18 20:34:45

Quote from: bokeron on 2010-05-18 20:15:45

As both methods use the internal LAME decoder there shouldn't be any differences... but there are.

[JAZ] explained this, at least by implication.

Quote from: [JAZ] on 2010-05-18 19:01:52

mp3 -> lame -> mp3

This situation makes LAME use its internal decoder. I am unsure at which bit depth it decodes (16-bit, 24-bit, 32-bit float?).
…
wav -> lame -> mp3

This situation makes LAME use what it is fed. The WAV will usually be 16 or 24 bits. This won't exactly match the output of the decoder, but the differences, as I said, should be minimal (rarely ABXable and not discernible by the encoder).

In other words, decoding to WAV will produce 16-bit audio, whereas when passing itself audio internally, LAME probably uses a higher bitdepth.

By extension, I don't think the difference is due to delay or padding.

Reply #6 – 2010-05-18 22:31:13

Quote from: dv1989 on 2010-05-18 20:34:45

In other words, decoding to WAV will produce 16-bit audio, whereas when passing itself audio internally, LAME probably uses a higher bitdepth.

By extension, I don't think the difference is due to delay or padding.

It makes a lot of sense. If someone could confirm the bitdepth LAME uses internally then all would fall into place nicely.

So, taking all of the above into account, which process would make more sense?
I guess there must be no audible difference whatsoever, so the preferred method ought to be the one you're more comfortable with, but still: is there any particular reason to use one over the other?

Reply #7 – 2010-05-18 23:13:16

Quote

If someone could confirm the bitdepth LAME uses internally

16 bits. LAME decodes MP3 to 16-bit PCM, then encodes it.

Any MP3 encoder adds a small amount of silence to the beginning and the end of a track. LAME can trim these extra samples, so that the decoded WAV file has the same length as the original WAV. But version 3.98 doesn't do this for direct MP3->MP3 transcoding.
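The trimming described above can be sketched with a toy model in Python. Everything here is an illustrative assumption rather than LAME's actual code: the helper names are made up, the decoder's own delay is ignored, and the padding is simply chosen to fill the last 1152-sample frame (1152 being the number of samples in an MPEG-1 Layer III frame). The delay of 576 samples is used only as an example value.

```python
FRAME = 1152  # samples per MPEG-1 Layer III frame


def padding_to_fill_frames(n_samples: int, enc_delay: int, frame: int = FRAME) -> int:
    """Toy assumption: pad the end just enough to complete the final frame."""
    return (-(n_samples + enc_delay)) % frame


def simulate_untrimmed(original, enc_delay: int, enc_padding: int):
    """What a decoder sees when the delay/padding values are NOT applied."""
    return [0] * enc_delay + list(original) + [0] * enc_padding


def trim(decoded, enc_delay: int, enc_padding: int):
    """Drop the leading delay and trailing padding to recover the original length."""
    return decoded[enc_delay:len(decoded) - enc_padding]


original = list(range(1, 1001))   # 1000 hypothetical non-zero samples
delay = 576                       # example value (assumption)
padding = padding_to_fill_frames(len(original), delay)

untrimmed = simulate_untrimmed(original, delay, padding)
trimmed = trim(untrimmed, delay, padding)
print(len(original), len(untrimmed), len(trimmed))
```

In this model the two-step route corresponds to encoding `trimmed`, while the direct 3.98 route corresponds to encoding `untrimmed`: the same audio, but longer and shifted by `delay` samples.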

Reply #8 – 2010-05-19 02:54:25

Quote from: lvqcl on 2010-05-18 23:13:16

Any MP3 encoder adds a small amount of silence to the beginning and the end of a track. LAME can trim these extra samples, so that the decoded WAV file has the same length as the original WAV. But version 3.98 doesn't do this for direct MP3->MP3 transcoding.

So, if LAME decodes to 16 bits internally, and it also decodes to 16 bits when I ask it to output a WAV, then the only difference in processing, as I understand your explanation, is these extra samples, which don't get cut when doing MP3 to MP3 but do get cut when doing the conversion in two steps (MP3 to WAV, then WAV to MP3).

Can this difference in extra samples affect the 'weight' of the file by a margin of 50 KB, or change the resulting VBR bitrate by about 3 kbps?

Thanks, once again, to all for your patience

Reply #9 – 2010-05-19 08:56:20

They don't "[not] get cut"; they get added! I'm also curious about the larger file size; I wouldn't think delay and padding would make such a difference.
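A quick back-of-the-envelope check supports that doubt. The sketch below is plain arithmetic in Python, not anything from LAME; the sample rate (44.1 kHz), the roughly one frame of extra samples, and the 160 kbps average bitrate are assumptions chosen for illustration, since the thread gives no track length or exact figures. Only the 50 KB and 3 kbps numbers come from the question.

```python
def bitrate_delta_kbps(size_delta_bytes: float, duration_s: float) -> float:
    """Change in average bitrate caused by a given change in file size."""
    return size_delta_bytes * 8 / duration_s / 1000


def implied_duration_s(size_delta_bytes: float, delta_kbps: float) -> float:
    """Track length for which a size change equals a given bitrate change."""
    return size_delta_bytes * 8 / (delta_kbps * 1000)


def bytes_for_extra_samples(extra_samples: int, sample_rate: int,
                            bitrate_kbps: float) -> float:
    """Bytes needed just to encode some extra samples at an average bitrate."""
    return extra_samples / sample_rate * bitrate_kbps * 1000 / 8


# Figures from the question: about 50 KB and about 3 kbps.
duration = implied_duration_s(50 * 1024, 3)
# Assumed figures: one frame of extra samples, 44.1 kHz, 160 kbps average.
extra = bytes_for_extra_samples(1152, 44100, 160)
print(f"50 KB at 3 kbps implies a track of about {duration:.0f} s")
print(f"one extra frame costs about {extra:.0f} bytes")
```

Under these assumptions the added length accounts for a few hundred bytes at most, so a 50 KB difference has to come from the encoder spending more bits across the whole track, not from the extra samples themselves.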

Reply #10 – 2010-05-19 10:14:17

Quote from: dv1989 on 2010-05-19 08:56:20

They don't "[not] get cut"; they get added! I'm also curious about the larger file size; I wouldn't think delay and padding would make such a difference.

Maybe different frame boundaries?

Quote from: descrates on 2010-05-19 09:10:00

mp3 -> audacity (libmad) -> wav (32-bit float) -> lame (w/ float support) -> mp3

foobar2000 + lame with float support.

Reply #11 – 2010-05-19 14:45:33

Quote from: lvqcl on 2010-05-18 19:25:41

1) LAME 3.98.x doesn't use ENC_DELAY and ENC_PADDING values when it transcodes MP3->MP3. This was fixed in 3.99 branch.

2) lame decodes mp3 file to 16-bit WAV file. Foobar2000 decodes mp3 to 24-bit wav file (and then sends it to LAME).

The 16 or 24 bit decoding wouldn't make much difference, I think.
So, the explanation is the different usage of enc_delay and padding, indeed.

Here is a similar thread about it, where people noticed the same peculiar differences in re-encoded mp3s.

The explanation of this is that when an mp3 encoder is re-encoding an mp3 file to a second-generation VBR mp3, the resulting average bitrate will vary depending on the sample offset of the input file.
That is, you take an mp3, decode it to wav, add a few extra samples of null-padding, re-encode it to VBR mp3 -- and the bitrate will be different depending on how many padding samples you added. Weird, huh?

The bitrate is usually lower, when this offset (padding) is exactly aligned with the mp3 framing. Funny, but true: when encoding a second-generation mp3, the encoder can 'feel' the presence of the previous mp3 encoding within the wav file. And when this previous mp3 framing is exactly aligned with the second-generation framing, the resulting bitrate is lower.
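The notion of the second-generation framing being "exactly aligned" with the first can be made concrete with a small Python sketch. It only models the alignment itself, not the effect on bitrate, and it is an illustration rather than anything taken from LAME: the function names are hypothetical, and the only fixed fact used is that an MPEG-1 Layer III frame holds 1152 samples.

```python
FRAME = 1152  # samples per MPEG-1 Layer III frame


def is_aligned(offset_samples: int, frame: int = FRAME) -> bool:
    """True if the new frame grid falls exactly on the old one."""
    return offset_samples % frame == 0


def misalignment(offset_samples: int, frame: int = FRAME) -> int:
    """Distance in samples from the nearest boundary of the old frame grid."""
    r = offset_samples % frame
    return min(r, frame - r)


# Hypothetical amounts of padding added in front of the decoded WAV.
for offset in (0, 100, 576, 1105, 1152):
    print(offset, is_aligned(offset), misalignment(offset))
```

Per the explanation above, offsets for which `is_aligned` is true are the ones expected to give the lower second-generation bitrate; re-encoding the same WAV with a range of offsets and plotting the resulting bitrates would be one way to check this.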

Reply #12 – 2010-05-19 14:48:33

Quote from: lvqcl on 2010-05-19 10:14:17

Maybe different frame boundaries?

Yes! Exactly.