HydrogenAudio

Knowledgebase Project => Wiki Discussion => Topic started by: mjb2006 on 2012-07-10 09:06:52

Title: LAME article edits
Post by: mjb2006 on 2012-07-10 09:06:52
Another round of updates (http://wiki.hydrogenaudio.org/index.php?title=LAME&action=historysubmit&diff=23217&oldid=23120) to the article on LAME (click for current version) (http://wiki.hydrogenaudio.org/index.php?title=LAME):

Please note your concerns or fix whatever I broke...
Title: LAME article edits
Post by: greynol on 2012-07-10 16:13:40
Well done!

Regarding 320 vs VBR, IIRC there simply isn't an abundance of evidence showing one to be better than the other; rather, depending on the sample and listener, either may show superiority when a difference can be demonstrated.

Is it worth revisiting whether it should be mentioned that the -V setting accepts fractional values?
Title: LAME article edits
Post by: lvqcl on 2012-07-10 16:33:09
Quote
Is it worth revisiting whether it should be mentioned that the -V setting accepts fractional values?

...and -V 9.999 is the lowest quality VBR setting (LAME 3.98+):

Quote
Usage: -V<number> where <number> is 0-9, 0 being highest quality, 9 being the lowest.




Also, lowpass values in "Technical information" section are valid for 3.98 (and for 44.1kHz input), but not for 3.99.
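
To make the fractional -V point concrete, here is a minimal sketch that builds a LAME argument list for a VBR encode. The helper name lame_vbr_args and the file names are hypothetical; the 0 to 9.999 range is the one quoted above for LAME 3.98+, and passing the value as a separate argument after -V is an assumption about how the command line is written.

```python
def lame_vbr_args(quality, src, dst):
    """Build an argument list for a LAME VBR encode (hypothetical helper)."""
    q = float(quality)
    # Range taken from the thread: 0 is highest quality, 9.999 the lowest (3.98+).
    if not 0.0 <= q <= 9.999:
        raise ValueError("-V expects a value from 0 (best) to 9.999 (lowest)")
    # Up to three decimals, trailing zeros dropped: 2.5 -> "2.5", 2 -> "2".
    text = f"{q:.3f}".rstrip("0").rstrip(".")
    return ["lame", "-V", text, src, dst]


if __name__ == "__main__":
    # Prints the argument list; pass it to subprocess.run() to actually encode.
    print(lame_vbr_args(2.5, "input.wav", "output.mp3"))
```

The list form avoids shell quoting issues if it is later handed to subprocess.run().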
Title: LAME article edits
Post by: [JAZ] on 2012-07-10 20:07:25
Hello.

Do you want me to update the LAME documentation in some way? Since I used part of the wiki back then, I guess it might need an update too.
Title: LAME article edits
Post by: greynol on 2012-07-10 20:15:02
Yes, please.  The discussion that led to these changes included the lame documentation.
Title: LAME article edits
Post by: Dynamic on 2012-07-10 20:41:53
Good work, mjb

I noticed that the Bit Reservoir (http://wiki.hydrogenaudio.org/index.php?title=Bit_reservoir) article needed a bit of work in the example section to avoid confusion of bits per frame with kbps, so I believe that's now fixed and essentially correct within rounding error.
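
For anyone checking the bits-per-frame versus kbps arithmetic, a rough sketch follows. It assumes MPEG-1 Layer III frames of 1152 samples and ignores padding, so a real stream alternates between whole-byte frame sizes around the computed average. The function names are illustrative only.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame length in samples


def frames_per_second(sample_rate_hz):
    """Number of MP3 frames needed to cover one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME


def bits_per_frame(bitrate_kbps, sample_rate_hz):
    """Average bits available to one frame at a given bitrate (1 kbps = 1000 bit/s)."""
    return bitrate_kbps * 1000 * SAMPLES_PER_FRAME / sample_rate_hz


if __name__ == "__main__":
    bits = bits_per_frame(128, 44100)
    print(f"{frames_per_second(44100):.3f} frames/s")
    print(f"{bits:.2f} bits/frame, about {bits / 8:.2f} bytes/frame")
```

At 128 kbps and 44.1 kHz this works out to roughly 3344 bits (about 418 bytes) per frame, which is the per-frame figure that should not be confused with the per-second bitrate.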

Returning to LAME:

Regarding this bit of wording within section 'Recommended encoder settings':
CURRENT
Quote
Maximum quality is achieved when, regardless of listening conditions, you are unable to detect a difference between the MP3 and the original. As demonstrated by blind ABX tests, LAME-encoded MP3s typically achieve this level of transparency when encoded with the default settings, at bitrates well below maximum. Encoding with other settings will have no effect on the quality.


That last sentence isn't completely clear to me, as you intimated, mjb. I'm tempted to modify as follows. Any comments on accuracy/clarity/length?

I suspect my second paragraph should appear near the end of the grey box or outside it.

My PROPOSED
Quote
Maximum quality is achieved when, under optimal listening conditions (e.g. headphones in a quiet environment), you are unable to detect a difference between the MP3 and the original. As demonstrated by blind ABX tests, LAME-encoded MP3s typically achieve this level of transparency when encoded with the standard settings, producing typical bitrates well below maximum (for example -V2 to -V3, since LAME was historically tuned for transparency and to address problem samples at -V2). Once transparency is achieved, higher settings (lower -V values) will not produce meaningfully higher quality.

The VBR scale has been carefully optimized, and permits the use of fractional values (-V9.999 being its lowest quality/bitrate setting), though some graphical user interfaces to LAME VBR restrict selection to the commonly-used integer values only. A change in the -V value (down or up) will respectively raise or lower the average bitrate, and will also raise or lower the quality unless the threshold of transparency has been exceeded. As a very mature, well tuned, quality-oriented encoder, LAME already has the best 'commandline tweaks' and internal optimizations built in to its VBR scale. Any further commandline switches are likely to degrade quality, to waste bits or to provide less bang for your bitrate than simply adjusting the -V value for the equivalent bitrate change. The lure of 'secret expert settings' can be strong, but the advantage of 'commandline tweaks' usually tends to vanish when subjected to ABX testing.


BTW, the Y-switch is mentioned, linked to the definition page. That's probably the one commandline switch which is the exception to commandline switches wasting bits and providing less bang-for-your-bitrate than changing the -V value and the one commandline tweak with any merit.

Should we also address Joint Stereo myths, stating that LAME uses only safe Joint Stereo, not Intensity Stereo, with a link to the Joint Stereo page? The myth of unsafe Joint Stereo seems to stem from earlier Fraunhofer encoders using IS.

Another myth we could debunk is that of re-encoding an already lossy source to a higher quality setting (i.e. we know this as transcoding, but a newbie won't).

It might be worth including a section on Myths and Misinformation Explained, as the LAME page is a very useful resource for newcomers or a search result for getting into the HydrogenAudio site. Another myth is that MP3 is always or frequently audibly distinguishable from the original CD or WAV. As we know, at Lame -V2 or -V3 it's actually rare to tell them apart in proper level-matched double-blind tests (e.g. ABX), even for expert listeners.
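
Since ABX results keep coming up, here is a small sketch of how an ABX score is usually judged: the probability of getting at least that many trials right by pure guessing (a one-sided binomial test with a 50% chance per trial). The function name is made up for illustration, and the choice of significance threshold is left to the reader.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by guessing alone."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail for a fair coin: C(n, k) / 2^n for k = correct..n.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(f"12/16 correct: p = {abx_p_value(12, 16):.4f}")
    print(f" 9/16 correct: p = {abx_p_value(9, 16):.4f}")
```

A low value for a session suggests the listener really heard a difference; a value near 0.5 is what guessing looks like.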
Title: LAME article edits
Post by: mjb2006 on 2012-07-10 23:28:24
Regarding 320 vs VBR, IIRC there simply isn't an abundance of evidence showing one to be better than the other; rather, depending on the sample and listener, either may show superiority when a difference can be demonstrated.

Very well stated. Since it's a recurring topic, I would like to see this addressed. However, I am also wondering if anything not specifically related to LAME really belongs in the LAME article. "-b 320 vs. -V 0" is more generally "CBR 320 vs. highest-quality VBR", which could be in its own article, and we can link to it.

As for Dynamic's proposed text, I would tighten up the second paragraph a bit, but I totally support the idea behind the additional text: steering people away from "advanced settings" that they think can be used to coax higher quality out of the encoder.

Our recommended settings are addressing the users' #1 concern—what settings will give them "maximum quality"—so I do feel we're obligated to define maximum quality in terms of transparency. One thing I'm unsure of re: transparency is whether it depends on the listening conditions being 'optimal'.

Transparency is simply achieved when you can't tell the difference, right? So if the landscaping crew outside is making it such that you can't tell the difference between a -V 8 MP3 and a CD, then -V 0 through -V 8 are effectively all the same; they're all "maximum quality". But naturally, if you're going to also be listening to these same MP3s in your man-cave in the wee hours, headphones on, no noise but your computer's fans and hard drives, then the threshold of transparency is probably a bit higher. Thus, the recommendation would be to encode using the settings that provide transparency under optimal conditions (and you know everyone's going to go one higher out of paranoia). But we shouldn't say that transparency/"maximum quality" is only achieved under those conditions.

In other words, by making transparency contingent on listening conditions being optimal, we imply that only "-V 0 to -V 3" is transparent. It follows that "-V4 to (whatever)" is not transparent. This undermines our recommendations to use those lower settings for portable players/noisy environments. For portable/noisy, the lower settings are highly likely to be transparent.

That said, we are still suggesting that -V 0 is potentially better than -V 1, and that's potentially better than -V 2, and so on. The fact that some of the settings are indistinguishable doesn't really matter to the user who wants maximum quality. They're going to pick the potentially best settings, and they're not interested in a lecture about how, technically, they could just as well choose lesser settings and be completely unable to tell the difference. *sigh*
Title: LAME article edits
Post by: lvqcl on 2012-07-10 23:47:37
Quote
"-b 320 vs. -V 0" is more generally "CBR 320 vs. highest-quality VBR"

I'm not sure. Rather, it is "one high-bitrate setting vs. another high-bitrate setting".

Quote
That said, we are still suggesting that -V 0 is potentially better than -V 1, and that's potentially better than -V 2, and so on.

The higher the bitrate, the more you can do with MP3... I mean cases like this:

I heard a DJ last week who should have used lossless. He was DJing for a kids dancing competition. His CD player failed to read one of the kid's CDs. No problem - he had the same track on his laptop. Problem was, the kid was using the version without vocals (for reasons that will become apparent), and he only had the vocal version. Ah, no problem again - the vocal cut feature in the software would sort that. If only it hadn't been an mp3. Vocal cut only works (sometimes) on the highest quality mp3s, and this one wasn't. The vocal bled through as horrible mp3 artefacts. It was so bad that he gave up and switched the vocal cut off. Just at the point where the lyrics said something like "...and you're no fucking use to me..." - as the five year old girl continued through her dancing routine. I doubt they'll be using that DJ again.

Title: LAME article edits
Post by: greynol on 2012-07-11 00:01:20
The higher the bitrate, the more you can do with MP3...
You have an excellent point, though that specific case may have just as easily benefited from having all frames encoded as stereo if they weren't already.

My $0.02...
This should be considered a truism by our community: given any specific circumstance, if two mp3s deliver transparent results then they are of equal quality regardless of how they were encoded.  So long as this point gets across, it doesn't matter to me whether 320 CBR vs. highest bitrate VBR is addressed specifically.

Thus, the recommendation would be to encode using the settings that provide transparency under optimal conditions (and you know everyone's going to go one higher out of paranoia). But we shouldn't say that transparency/"maximum quality" is only achieved under those conditions.
I agree completely.

The fact that some of the settings are indistinguishable doesn't really matter to the user who wants maximum quality. They're going to pick the potentially best settings, and they're not interested in a lecture about how, technically, they could just as well choose lesser settings and be completely unable to tell the difference. *sigh*
So long as the reader is knowingly discarding sane advice rather than being misled or feeling proselytized to then we've done our part.
Title: LAME article edits
Post by: godrick on 2012-07-11 01:57:51
A suggestion to help people like me who are not completely conversant with how all of the settings of previous versions of LAME relate to the latest version of LAME:

I understand from posts 9 and 10 at http://www.hydrogenaudio.org/forums/index....showtopic=49380 (http://www.hydrogenaudio.org/forums/index.php?showtopic=49380) that the need for -q settings for VBR has changed since 3.97 or thereabouts.  Clarifying the meaning (if any) of -q settings with VBR before and after whatever version marks the breakpoint would be insightful.  If such clarification belongs on a different page, that seems OK as long as there is a reference and link to that page.  Alternatively, if all settings from versions before 3.99.5 are irrelevant and the existing wiki page is comprehensive in everything one must know to understand all the optional settings of the latest version, then stating that would be helpful.  Thanks for updating the wiki!
Title: LAME article edits
Post by: Remedial Sound on 2012-07-11 03:39:07
Firstly thank you mjb for your time and effort!

The only suggestion I'll make is to keep in mention of the mono switch, which can be quite handy in keeping the bitrate down when encoding spoken content & audiobooks (e.g., using the unofficial preset voice: --abr 56 -mm).  I know other codecs do low bitrates more efficiently but for compatibility this performs to quite acceptable levels.
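
As a rough illustration of why a low average bitrate matters for long spoken content, the sketch below estimates file size from average bitrate and duration. It ignores tags and frame overhead, treats the ABR target as the true average, and the function name is hypothetical.

```python
def estimated_size_mb(avg_bitrate_kbps, duration_seconds):
    """Approximate file size in megabytes (10^6 bytes) for a given average bitrate."""
    bytes_per_second = avg_bitrate_kbps * 1000 / 8
    return bytes_per_second * duration_seconds / 1_000_000


if __name__ == "__main__":
    ten_hours = 10 * 60 * 60
    print(f"56 kbps:  {estimated_size_mb(56, ten_hours):.0f} MB")
    print(f"128 kbps: {estimated_size_mb(128, ten_hours):.0f} MB")
```

For a ten-hour audiobook that is about 252 MB at 56 kbps against about 576 MB at 128 kbps.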
Title: LAME article edits
Post by: shadowking on 2012-07-11 10:30:42
I think we should just get rid of any mention of -b 320. I am sure users who desire it know what to do.
Title: LAME article edits
Post by: shadowking on 2012-07-11 11:20:58
I'd also say that the only thing we know for sure scientifically is that V5 is perfect, or close to it, for most people in listening tests. Everything else is conjecture, speculation or isolated test samples by individuals. We also know that VBR below 120k is no good. So: V6 - V0 are usable and V5 is a good starting point. Then it is up to the user to decide the level. I see way too many 320 CBR encodes all over the net for no good reason, even on legit sites selling MP3s, along with the idea that anything less than 256k must be no good.
Title: LAME article edits
Post by: greynol on 2012-07-11 19:28:08
isolated test samples by individuals

Which is exactly what a listening test is.

I could ABX -V5 just about every time I tried, thank you very much.
Title: LAME article edits
Post by: [JAZ] on 2012-07-11 19:41:24
All right, this is how it is now: http://lame.cvs.sourceforge.net/viewvc/lam...ml?revision=1.4 (http://lame.cvs.sourceforge.net/viewvc/lame/lame/doc/html/usage.html?revision=1.4)
Title: LAME article edits
Post by: greynol on 2012-07-11 20:16:55
Great job, thanks!

BTW, how does one actually navigate to that page without a direct link?
Title: LAME article edits
Post by: robert on 2012-07-11 21:31:33
lame.sf.net (http://lame.sf.net) -> Using LAME (http://lame.sourceforge.net/using.php) -> Program documentation (in CVS) (http://lame.cvs.sourceforge.net/viewvc/lame/lame/doc/html/index.html)
Title: LAME article edits
Post by: [JAZ] on 2012-07-11 21:39:30
If you mean the CVS, the root is:

http://lame.cvs.sourceforge.net/viewvc/lame/ (http://lame.cvs.sourceforge.net/viewvc/lame/). Opening the document is done by clicking on the version number instead of the name, and then clicking download.

And to know how to go to the cvs root, that's found going to the sourceforge page of lame, and then in the horizontal menu bar, to "Code" and "CVS Browse".

Edit: From a user point of view, robert's answer is much better..
Title: LAME article edits
Post by: mjb2006 on 2012-07-11 23:11:51
It was pointed out to me that we're inconsistent in whether we put a space between the "-V" and the number which follows. I know conversationally we like to omit the space, but in the docs and recommendations maybe we shouldn't, just for consistency with the other options.
Title: LAME article edits
Post by: greynol on 2012-07-12 00:56:35
In conversation it seems like using the space only adds confusion, but maybe this impression directly stems from the two people who confused flac settings with lame settings just last week.
Title: LAME article edits
Post by: AliceWonder on 2012-07-16 11:25:04
With respect to higher bitrate when you are already beyond transparency, I believe another case in addition to DJ removing vocals is changing of the beats per minute.
Apparently (and I can't back this up with a test), software that detects BPM so that it can be altered is sometimes fooled by a lossy source and is more likely to be accurate with higher-bitrate encodings, even if the human ear can't tell the difference.

I personally have no need to ever do that but I have seen that discussed by people who at least sounded like they had actual experience with it.
Title: LAME article edits
Post by: mjb2006 on 2012-07-16 14:57:27
That sounds reasonable; higher-bitrate MP3s are more likely to preserve high-frequency content, and tempo detection is likely easier if the high-frequency content is preserved; all that's normally up there, aside from noise, is the upper harmonics of percussion instruments. But as far as recommending bitrate settings in LAME, I don't know if we should mention it. Telling people to take into consideration content that's only "audible" to BPM detection software, when MP3 and LAME are designed with human hearing in mind, seems counterproductive.
Title: LAME article edits
Post by: mjb2006 on 2014-09-11 01:18:51
I made more edits to the LAME article today. Aside from minor copy edits and table formatting improvements, there are some substantive changes:

Title: LAME article edits
Post by: Jan S. on 2014-09-11 10:24:36
Thank you very much. Nice work!
Title: Re: LAME article edits
Post by: mjb2006 on 2016-08-24 18:33:24
More edits made today (diff (http://wiki.hydrogenaud.io/index.php?title=LAME&diff=27042&oldid=26650)):

Quote
When the input sample rate is greater than 48 kHz, LAME will resample it to a maximum of 48 kHz (the maximum supported by MP3). In VBR modes 7 to 9.999, and at CBR bitrates below 104 kbps, the input is resampled to 32000, 24000, 22050, 16000, 12000, 11025, or 8000 Hz, depending on the target quality level or bitrate.

Since it is required when resampling, a filter is always applied to remove frequencies above one-half the sample rate. The lowpass info above is indicating whether any additional filtering is done.

LAME's internal resampler is not ideal.[6] (https://hydrogenaud.io/index.php/topic,97479.0.html) If resampling is needed, better results (especially when targeting low bitrates) can be obtained by using a high-quality sample rate converter, such as SoX or SSRC.
Suggestions for improvements welcome.
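
To illustrate the "one-half the sample rate" remark in the quoted text, here is a minimal sketch listing the highest frequency each reduced sample rate can represent. The rates are the ones listed in the quote; which rate LAME picks for a given -V value or bitrate is not modelled here, and the names are illustrative.

```python
# Reduced output sample rates listed in the quoted wiki text, in Hz.
REDUCED_SAMPLE_RATES_HZ = [32000, 24000, 22050, 16000, 12000, 11025, 8000]


def nyquist_hz(sample_rate_hz):
    """Highest representable frequency: one-half the sample rate."""
    return sample_rate_hz / 2


if __name__ == "__main__":
    for rate in REDUCED_SAMPLE_RATES_HZ:
        print(f"{rate:>6} Hz sample rate -> content above {nyquist_hz(rate):.1f} Hz is filtered out")
```

So an encode resampled to 22050 Hz cannot carry anything above 11025 Hz, whatever additional lowpass the encoder applies.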