HydrogenAudio

Lossy Audio Compression => MP3 => MP3 - Tech => Topic started by: windsmurf on 2010-09-22 12:19:44

Title: the gargling soprano
Post by: windsmurf on 2010-09-22 12:19:44
Hi,

I filed a bug in the LAME project but i thought i'd try here too as it affects all apps that use LAME to convert to mp3 on a Mac.

I have here:

- a PCM WAV original
- the command line used for all conversions
- a conversion using LAME 3.96 on a PowerPC Mac (sounds GREAT)
- a conversion using LAME 3.97 on an Intel Mac (gargles)
- a conversion using LAME 3.97 on a PowerPC Mac (gargles)
- a conversion using LAME 3.98 on an Intel Mac (gargles)
- a conversion using LAME 3.98 on a PowerPC Mac (gargles)

Let's call it the gargling soprano problem. Is there a way for me to upload the files here so you guys can listen to them too?

I used

/usr/local/bin/lame -h --cbr -b 128 -m m --nores --ignore-tag-errors "O Mio.wav" "O Mio.mp3"

to make all the conversions.

I noticed the problem because several third-party applications on the Mac use LAME to export to MP3 (Digital Performer, Barbabatch and a lot more), and all of them sound, well, unprofessional when the LAME version is higher than 3.96.
Of course 3.97 is the first version on the Mac that supports Intel processors, so basically everything converted with LAME on modern Macs sounds this way, and indeed I hear the artifact everywhere on the web.

I hope someone can address it.



Title: the gargling soprano
Post by: Slipstreem on 2010-09-22 12:57:36
Welcome aboard!

What happens if you try the latest recommended version, 3.98.4 (http://www.rarewares.org/mp3-lame-bundle.php)?

Also, does it make any difference if you drop the -h, -m, and --nores switches? LAME is tuned to work optimally with no additional quality-related switches, so I wonder if you may be inadvertently breaking it. Ideally, the source file for test encodes needs to be at a samplerate of 44.1kHz as that's where LAME has received the most tuning. Any decent sounding CD track will do.

I've never uploaded any files here, but there is definitely an upload section somewhere if you look around. I can't imagine encodings made with obsolete versions of the encoder attracting a great deal of interest as they're never going to be updated, but there's no harm in uploading them. Please keep any uploads down to a maximum of 30 seconds of play time each to comply with forum 'fair usage' terms.

You may also want to look into VBR encoding unless you plan on using CBR at 320kbps, as LAME is heavily tuned for VBR encoding. CBR at anything less than 320kbps strangles the encoder, in a manner of speaking, as it denies the encoder the opportunity to use 320kbps frames as and when required for high-quality encoding. VBR allows the encoder to choose the bitrate of each frame on the fly to achieve the user's desired target quality level.

There's more information on recommended LAME usage HERE (http://wiki.hydrogenaudio.org/index.php?title=Lame).
Title: the gargling soprano
Post by: DonP on 2010-09-22 13:34:23
- a PCM WAV original
- the command line used for all conversions
- a conversion using LAME 3.96 on a PowerPC Mac (sounds GREAT)
- a conversion using LAME 3.97 on an Intel Mac (gargles)
- a conversion using LAME 3.97 on a PowerPC Mac (gargles)
- a conversion using LAME 3.98 on an Intel Mac (gargles)
- a conversion using LAME 3.98 on a PowerPC Mac (gargles)

Let's call it the gargling soprano problem. Is there a way for me to upload the files here so you guys can listen to them too?


Just to cover all the bases... have you tried the same version and sample encoded on a PC, and have you tried listening to the gargling output on some other device (PC, iPod, ...)?
Title: the gargling soprano
Post by: Alex B on 2010-09-22 14:00:12
Here is a past report of the obvious CBR 128 kbps regression in LAME 3.97: http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=537861

The discussion continues in that thread. I picked a few replies:

http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=537872
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=537905
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=537923
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=538286
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=538295
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=538310 (a link to a sample is in this reply)
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=58385&view=findpost&p=541024

Possibly you are hearing a similar artifact. I wonder whether this issue has been addressed in any way in the later LAME versions.


A forum search also found these threads:
- A report of CBR 128 kbps regression: http://www.hydrogenaudio.org/forums/index.php?showtopic=62953. Unfortunately the OP never provided ABX reports and samples.
- A very old report of low bitrate CBR/ABR regression in 3.96.1: http://www.hydrogenaudio.org/forums/index.php?showtopic=28910
Title: the gargling soprano
Post by: Alex B on 2010-09-22 14:14:42
/usr/local/bin/lame -h --cbr -b 128 -m m --nores --ignore-tag-errors "O Mio.wav" "O Mio.mp3"

I just reread your command line.

You didn't explain why you use -m m. Is the source audio already mono? Usually people prefer to use the mono switch only if they have to encode at about 64 kbps or lower.

In any case you shouldn't use --nores. Disabling the bit reservoir feature decreases quality.
Title: the gargling soprano
Post by: mixminus1 on 2010-09-22 15:00:52
Drop the "-h" - I found at least one sample a while back (search for "herding_calls") where -h with CBR in 3.97 made things worse.

Given that the herding_calls sample is kinda-sorta a female soprano voice, I wouldn't be at all surprised if reducing your command line to just "lame --ignore-tag-errors input.wav output.mp3" improves things (if you don't pass any parameters, LAME's default encoding is CBR 128 with "normal" quality, which is what you should be using).

Be aware, though, that there may still be audible artifacts (it's been my experience that the overall performance of CBR in 3.97 and 3.98 is worse than 3.96.1, at least at lower bitrates such as 128) - unless you must stay at CBR, going to VBR at something like -V5 with the latest version of LAME (3.98.4) is pretty much guaranteed to give you much better results.
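
To make the two suggestions above concrete (plain CBR 128 with no extra quality switches, or VBR at -V5), here is a minimal Python sketch that assembles the corresponding LAME command lines. The helper name `lame_command` and its parameters are hypothetical and exist only for illustration; the `-b` and `-V` switches are the ones mentioned in this thread.

```python
def lame_command(src, dst, vbr_quality=None, cbr_bitrate=None, lame="lame"):
    """Build an argument list for LAME with as few switches as possible.

    Hypothetical helper: pass either vbr_quality (for -V) or cbr_bitrate
    (for -b), or neither to rely entirely on LAME's defaults.
    """
    if vbr_quality is not None and cbr_bitrate is not None:
        raise ValueError("choose either VBR quality or CBR bitrate, not both")

    args = [lame]
    if vbr_quality is not None:
        # VBR: the encoder picks the bitrate of each frame itself.
        args += ["-V", str(vbr_quality)]
    elif cbr_bitrate is not None:
        # CBR: every frame uses the same bitrate.
        args += ["-b", str(cbr_bitrate)]
    # Deliberately no -h, -m or --nores here.
    args += [src, dst]
    return args


print(lame_command("input.wav", "output.mp3", cbr_bitrate=128))
print(lame_command("input.wav", "output.mp3", vbr_quality=5))
```

The resulting list could be handed to `subprocess.run(args, check=True)` on a machine where LAME is installed; the sketch itself does not run the encoder.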
Title: the gargling soprano
Post by: windsmurf on 2010-09-22 15:21:50
Thank you guys, you obviously know a lot. Recap: 3.96 Constant Bit Rate (CBR) encodes beautifully, every later LAME version gargles.

My (CBR) examples will be available for two weeks right here:
https://www.wetransfer.com/dl.php?code=GG9thouE&hash=4808090263f872f1d24c2336a143b3ed94f31e83470331703e67fef2c7a5a45b8c4b512f1cd0b2

I tried again with the --nores flag removed, as suggested; same result: gargling.
But the VBR flag did it. Turn VBR on, and the gargling is gone.

So I am happy, but I do think it is a problem that CBR failed. CBR in LAME 3.96 seemed to work wonderfully, and to my ears better than VBR today. CBR, historically, is the more compatible format. Some hardware cannot play VBR (although that hardware is probably slowly going through its battery nowadays). That type of message sticks in the minds of the pros: stay away from VBR. I guess that's why you still often hear LAME CBR gargling.

The earlier thread you posted from 2007 sums it up nicely, so that was spot on as well:

http://www.hydrogenaudio.org/forums/index....st&p=537861

This has been going on for 3 years now, and it is clearly audible here and there on the web once you have heard the problem as obviously as in my examples.

Listen to the files if you are interested. It is not subtle.

I will try and update my filed bug at the LAME project.

Title: the gargling soprano
Post by: Alex B on 2010-09-22 16:07:00
Thanks for the samples. So the source file is mono and, as you said, the problem is not subtle.

I tried the latest Rarewares LAME 3.98.4 compile for Windows using the plain -b 128 switch and it does not produce this artifact. (http://www.rarewares.org/mp3-lame-bundle.php, the first package)

(Edit: I didn't do a proper ABX test to find out if any subtle quality differences exist, but in a casual listening situation my sample was as good as your 3.96 sample.)

I uploaded the encoded sample here: http://www.hydrogenaudio.org/forums/index.php?showtopic=83826

Could you please upload your samples to the same thread in the uploads forum so that the download link would not expire?
Title: the gargling soprano
Post by: lvqcl on 2010-09-22 16:31:33
I tried the latest Rarewares LAME 3.98.4 compile for Windows using the plain -b 128 switch and it does not produce this artifact.


Add -h and the problem will reappear.
Title: the gargling soprano
Post by: Slipstreem on 2010-09-22 16:50:00
I can't hear the warbling in Alex's no-unnecessary-switches encoding either, but I could clearly hear it in the others. Maybe a kindly moderator might like to edit the subtitle of this thread, as casual visitors may get the impression that LAME 3.98.4 is broken. Further clarification from those with keener ears than mine might be a plus though.
Title: the gargling soprano
Post by: Alex B on 2010-09-22 17:28:02
I can confirm that the -h switch makes the problem appear. Perhaps the LAME developers should investigate the issue. It isn't logical that the -h switch causes this kind of problem. After all, "-q 2" aka "-h" is recommended in the current LAME documentation (switchs.htm from the above linked LAME bundle):

Quote
-q 0..9    algorithm quality selection

Bitrate is of course the main influence on quality. The higher the bitrate, the higher the quality. But for a given bitrate, we have a choice of algorithms to determine the best scalefactors and Huffman encoding (noise shaping).

-q 0: use slowest & best possible version of all algorithms. -q 0 and -q 1 are slow and may not produce significantly higher quality.

-q 2: recommended. Same as -h.

-q 5: default value. Good speed, reasonable quality.

-q 7: same as -f. Very fast, ok quality. (psycho acoustics are used for pre-echo & M/S, but no noise shaping is done.)

-q 9: disables almost all algorithms including psy-model. poor quality.


Regardless of the presence of -h in the command line, the "Bee Gees - New York Mining Disaster 1941" sample is still problematic for LAME 3.98.4 CBR @ 128 kbps. "-b 128" and "-b 128 -h" both produce a warbling artifact. The sample link is here: http://www.hydrogenaudio.org/forums/index.php?showtopic=58385&st=150&p=538310&#entry538310
Title: the gargling soprano
Post by: windsmurf on 2010-09-22 18:06:43
...
I uploaded the encoded sample here: http://www.hydrogenaudio.org/forums/index.php?showtopic=83826

Could you please upload your samples to the same thread in the uploads forum so that the download link would not expire?


I will as soon as I find out how to upload; I have some time later today (although Kees put the samples on his site too somewhere).
I will also try -b 128 (really only that?) on the Mac builds of LAME,
and I will try to find out what happens if I leave out only the -h switch.

Thanks so far.
Title: the gargling soprano
Post by: pdq on 2010-09-22 18:45:30
I suspect that all of the tuning and testing of each version is done without the -h or -q switches, so it is no wonder that bugs can creep in and not be detected. Another good reason to use just the defaults and stay away from other switches.

Title: the gargling soprano
Post by: botface on 2010-09-22 18:55:46
...... the "Bee Gees - New York Mining Disaster 1941" sample is still problematic for LAME 3.98.4 CBR @ 128 kbps. "-b 128" and "-b 128 -h" both produce a warbling ...

Don't The Bee Gees always sound like that?
Title: the gargling soprano
Post by: Slipstreem on 2010-09-22 18:56:41
I will also try -b 128 (really only that?) on the Mac builds of LAME, and I will try to find out what happens if I leave out only the -h switch.

As already suggested, leave everything out before the file in/out parameters apart from -b 128. If the other switches were supposed to be there then they'd be the defaults, which would mean that they wouldn't be there, which is why they shouldn't be... if you see what I mean.
Title: the gargling soprano
Post by: windsmurf on 2010-09-22 19:31:06
...
I uploaded the encoded sample here: http://www.hydrogenaudio.org/forums/index.php?showtopic=83826

Could you please upload your samples to the same thread in the uploads forum so that the download link would not expire?


The relevant ones are now in your new thread, so y'all have a listen there and hear the soprano gargle.

http://www.hydrogenaudio.org/forums/index.php?showtopic=83826&st=0&gopid=723961&#entry723961

(And this is my first day; I picked a cat avatar. All of you cats seem to know what you're talking about, so cats feel tough here. Although rabbits are mighty fine too.)
Title: the gargling soprano
Post by: Alex B on 2010-09-22 19:47:14
If the other switches were supposed to be there then they'd be the defaults, which would mean that they wouldn't be there, which is why they shouldn't be... if you see what I mean.

From switchs.htm:
-q 2: recommended. Same as -h.
-q 5: default value. Good speed, reasonable quality.


This implies that the default -q setting (-q 5) produces compromised quality. The user should be able to trust the documentation.
(I know, the documentation has never been one of LAME's strongest points, or at least it has not been kept up to date, but how is the user supposed to know that?)

Since -b 128 has been the same as --preset cbr 128 for a long time, I think the actual default for -b 128 is no longer -q 5.

In the distant past --preset cbr 128 used to be an easier name for "-h -b 128 --nspsytune -m j --lowpass 17500 --athtype 2 --ns-bass -6 --scale 0.93", but that was many years and several LAME versions ago.
Title: the gargling soprano
Post by: Slipstreem on 2010-09-22 19:56:25
Point taken, Alex. But doesn't it make more sense to recommend that someone unfamiliar with the finer technicalities of the encoder try default behaviour first? It could also be argued that if an ABX test can be passed with defaults then no further manual tuning is required.

I have my doubts that changing from the default -q setting to -q2 would make any audible difference in the vast majority of cases, with the possible exception of high bitrate encodings, but I'm happy to be proven wrong. I also think that it's -q2 now by default anyway, but I'll admit to not having tested for this as I normally stay well clear of CBR encoding given the choice.

We really could do with a major overhaul of the documentation in my opinion. At the very least, it would stop us from confusing each other when trying to advise others. 
Title: the gargling soprano
Post by: Kees de Visser on 2010-09-22 20:23:17
Could you please upload your samples to the same thread in the uploads forum so that the download link would not expire?
I will as soon as i find out how to upload, i have some time later today. (although kees put the samples on his site too somewhere)
I have just put a copy of the samples (garglingsoprano.zip: http://www.galaxyclassics.com/public/garglingsoprano.zip) online to make sure they remain available for future use.
Good to see they have also arrived in the HA uploads forum.
Title: the gargling soprano
Post by: lvqcl on 2010-09-23 15:09:34
If the other switches were supposed to be there then they'd be the defaults, which would mean that they wouldn't be there, which is why they shouldn't be... if you see what I mean.

From switchs.htm:
-q 2: recommended. Same as -h.
-q 5: default value. Good speed, reasonable quality.


This implies that the default -q setting (-q 5) produces compromised quality. The user should be able to trust the documentation.
(I know, the documentation has never been one of LAME's strongest points, or at least it has not been kept up to date, but how is the user supposed to know that?)


BTW -q 3 is default now (and -q 0 for --vbr-new mode)...
Title: the gargling soprano
Post by: Alex B on 2010-09-23 16:08:43
For CBR -q 3 is probably correct. I just tested CBR @ 128 kbps. "-b 128 -q 3" and "-b 128" produced identical files.
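
A check like the one described above (do two sets of switches produce byte-identical output?) can be scripted by comparing file digests. This is only a sketch, not the method used for the test reported here; the function names `digest` and `files_identical` and the example file names are made up for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a block of bytes."""
    return hashlib.sha256(data).hexdigest()


def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large encodes need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def files_identical(path_a, path_b):
    """True if both files contain exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)


# Hypothetical usage, assuming the two encodes already exist on disk:
# print(files_identical("b128.mp3", "b128_q3.mp3"))
```

Identical digests only show that the two command lines gave the same result for that particular input and encoder build; they say nothing about other inputs.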

With the default VBR mode (--vbr-new) the -q switch seems to have only two effective modes: the default mode which is used without a -q switch and with -q 0...4, and another mode which is used with -q 5...9. (I tested -V 5 with all possible -q values.)

The "other" mode seems to be quite similar to the default mode. According to EncSpot Pro it uses the same MP3 encoding features as the default mode. Only the bitrate distribution and the bit reservoir values differ slightly in my test files.

EDIT

The default -q mode:

Code:
Bitrates:
------------------------------------------------------
 32                                               0.0%
 64                                               0.0%
 80                                               0.2%
 96    |||                                        3.8%
112    |||||||||||||||||||||||||||               28.4%
128    ||||||||||||||||||||||||||||||||||||||||  40.6%
160    ||||||||||||||                            14.6%
192    ||||||                                     6.2%
224    |||                                        3.3%
256    ||                                         2.2%
320                                               0.7%
------------------------------------------------------

Type            : mpeg 1 layer III
Bitrate         : 137
Mode            : joint stereo
Frequency       : 44100 Hz
Frames          : 8839
ID3v2 Size      : 2180
First Frame Pos : 2180
Length          : 00:03:50
Max. Reservoir  : 208
Av. Reservoir   : 36
Emphasis        : none
Scalefac        : not used
Bad Last Frame  : no
Encoder         : Lame 3.98

Lame Header:

Quality               : 50
Version String        : Lame 3.98
Tag Revision          : 0
VBR Method            : vbr-mtrh
Lowpass Filter        : 16500
Psycho-acoustic Model : nspsytune
Safe Joint Stereo     : no
nogap (continued)     : no
nogap (continuation)  : no
ATH Type              : 4
ABR Bitrate           : 32
Noise Shaping         : 1
Stereo Mode           : Joint Stereo
Unwise Settings Used  : no
Input Frequency       : 44.1kHz

--[ EncSpot 2.1 ]--

The "other" -q mode:

Code:
Bitrates:
------------------------------------------------------
 40                                               0.0%
 64                                               0.0%
 80                                               0.2%
 96    |||                                        3.9%
112    ||||||||||||||||||||||||                  25.1%
128    ||||||||||||||||||||||||||||||||||||||||  41.4%
160    ||||||||||||||||                          17.0%
192    |||||                                      6.1%
224    |||                                        3.4%
256    ||                                         2.3%
320                                               0.6%
------------------------------------------------------

Type            : mpeg 1 layer III
Bitrate         : 139
Mode            : joint stereo
Frequency       : 44100 Hz
Frames          : 8839
ID3v2 Size      : 2180
First Frame Pos : 2180
Length          : 00:03:50
Max. Reservoir  : 208
Av. Reservoir   : 37
Emphasis        : none
Scalefac        : not used
Bad Last Frame  : no
Encoder         : Lame 3.98

Lame Header:

Quality               : 45
Version String        : Lame 3.98
Tag Revision          : 0
VBR Method            : vbr-mtrh
Lowpass Filter        : 16500
Psycho-acoustic Model : nspsytune
Safe Joint Stereo     : no
nogap (continued)     : no
nogap (continuation)  : no
ATH Type              : 4
ABR Bitrate           : 32
Noise Shaping         : 1
Stereo Mode           : Joint Stereo
Unwise Settings Used  : no
Input Frequency       : 44.1kHz

--[ EncSpot 2.1 ]--
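
As a rough cross-check of the two dumps, the average bitrate can be estimated from each histogram as a weighted mean. The sketch below assumes that the percentages are shares of frames at each bitrate (EncSpot does not say so in the output shown) and copies the rounded figures from the two listings; the names `average_bitrate`, `DEFAULT_Q_MODE` and `OTHER_Q_MODE` are made up for illustration.

```python
def average_bitrate(histogram):
    """Weighted mean bitrate in kbps from a {bitrate: percent_share} mapping."""
    total_share = sum(histogram.values())
    if total_share == 0:
        raise ValueError("histogram has no non-zero shares")
    weighted = sum(rate * share for rate, share in histogram.items())
    return weighted / total_share


# Figures copied from the "default -q mode" listing.
DEFAULT_Q_MODE = {
    32: 0.0, 64: 0.0, 80: 0.2, 96: 3.8, 112: 28.4, 128: 40.6,
    160: 14.6, 192: 6.2, 224: 3.3, 256: 2.2, 320: 0.7,
}

# Figures copied from the "other -q mode" listing.
OTHER_Q_MODE = {
    40: 0.0, 64: 0.0, 80: 0.2, 96: 3.9, 112: 25.1, 128: 41.4,
    160: 17.0, 192: 6.1, 224: 3.4, 256: 2.3, 320: 0.6,
}

print(round(average_bitrate(DEFAULT_Q_MODE), 1))
print(round(average_bitrate(OTHER_Q_MODE), 1))
```

With these rounded percentages the estimates come out at roughly 138 and 139 kbps, close to the 137 and 139 that EncSpot reports; the small gap is presumably down to rounding in the histogram.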
Title: the gargling soprano
Post by: Aleron Ives on 2010-09-23 21:38:08
With the default VBR mode (--vbr-new) the -q switch seems to have only two effective modes: the default mode which is used without a -q switch and with -q 0...4, and another mode which is used with -q 5...9. (I tested -V 5 with all possible -q values.)

If you enable --verbose when encoding in VBR mode, you will see that --vbr-new with -q 0...4 and -q 5...9 differ only in the huffman search method. Using -q 0...4, you get "huffman search: best (outside loop)"; using -q 5...9,  you get "huffman search: normal".
Title: the gargling soprano
Post by: WonderSlug on 2010-09-24 21:41:43
So I am happy, but I do think it is a problem that CBR failed. CBR in LAME 3.96 seemed to work wonderfully, and to my ears better than VBR today. CBR, historically, is the more compatible format. Some hardware cannot play VBR (although that hardware is probably slowly going through its battery nowadays). That type of message sticks in the minds of the pros: stay away from VBR. I guess that's why you still often hear LAME CBR gargling.


What hardware are you using that can't play VBR?

I had one device that was having trouble with VBR, until I realized it was the header that wasn't written properly.  When I used proper Xing VBR headers it played those VBR files just fine.
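
For anyone who wants to check whether a file carries such a header, here is a minimal Python sketch that looks for the "Xing" (or "Info") tag in the first audio frame. It is a simplified illustration and not how any particular player does it: it assumes an MPEG Layer III stream, skips a leading ID3v2 tag without a footer, ignores the optional CRC, and the function names are hypothetical.

```python
def skip_id3v2(data: bytes) -> int:
    """Return the offset just past a leading ID3v2 tag, or 0 if there is none."""
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            # Syncsafe integer: only the low 7 bits of each byte count.
            size = (size << 7) | (byte & 0x7F)
        return 10 + size  # 10-byte tag header plus tag body
    return 0


def find_vbr_tag(data: bytes):
    """Return 'Xing', 'Info' or None for the first frame of an MP3 stream."""
    pos = skip_id3v2(data)
    header = data[pos:pos + 4]
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        return None  # no frame sync where the first frame is expected
    is_mpeg1 = (header[1] >> 3) & 0x03 == 0x03
    is_mono = (header[3] >> 6) & 0x03 == 0x03
    # The tag follows the side information, whose size depends on
    # the MPEG version and the channel mode.
    if is_mpeg1:
        side_info = 17 if is_mono else 32
    else:
        side_info = 9 if is_mono else 17
    start = pos + 4 + side_info
    tag = data[start:start + 4]
    if tag in (b"Xing", b"Info"):
        return tag.decode("ascii")
    return None


# Hypothetical usage: the first few kilobytes are enough for a small ID3v2 tag.
# with open("song.mp3", "rb") as f:
#     print(find_vbr_tag(f.read(8192)))
```

A result of None from this sketch does not prove that the header is missing, since the simplifications above (footer, CRC, very large ID3v2 tags) can all hide it.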