Wavegain vs. MP3Gain

Reply #50
If Dibrom, JohnV, Gabriel etc etc are reading this thread, please see my previous (long!) post for the serious questions which I can't answer - this post is to clear up the simpler stuff (which I can help with!).


john33,

It's surprising how little difference it makes to the bitrate. Maybe some of the noise shaping options cause more of the dither to pop up above the ath - I don't know. With a suitable sample, you could look in spectral view to see. TBH it's hard to know exactly what's happening in the different modes. From a theoretical standpoint, I know that the scale version is best, but what lame is actually doing is anyone's guess. I'm not about to volunteer to ABX the differences!


mrosscook,

Decreasing the amplitude by 18dB (without dither) is essentially dropping the lower 3 bits of the 16-bit signal that comes off a CD. So, there will only be 13 bits of real information in the resulting file.

To explain: -18dB is roughly dividing by 8 (2*2*2). If you did this exactly, you would end up with the 3 most significant bits of the new file being 0, and the 3 least significant bits of the old file being lost - the middle 13 bits are shifted but remain intact. In practice, we're not dividing by 8 exactly, so it won't be this neat: but it gives an idea of the kind of data that's going into the lossless codecs, and hence why they achieve better compression ratios after the 18dB drop.
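
To make the arithmetic concrete, here is a small Python sketch. It is not taken from any of the tools discussed, and the sample value is made up. It shows that -18 dB is close to, but not exactly, a divide-by-8, and what an exact 3-bit shift does to a 16-bit sample.

```python
# Linear amplitude factor for an 18 dB drop: 10^(-18/20)
factor = 10 ** (-18 / 20)
print(f"-18 dB = x{factor:.4f} (1/8 = {1 / 8:.4f})")

# An exact divide-by-8 is a right shift by 3 bits.
sample = 0b0101_1011_0110_1101   # hypothetical 16-bit sample value (23405)
shifted = sample >> 3            # the 3 least significant bits are discarded
print(f"{sample:016b} -> {shifted:016b}")

# Shifting back does not recover the discarded bits.
lost = sample - (shifted << 3)
print("value carried by the lost low bits:", lost)
```

The top three bits of the shifted value are zero and the low three bits of the original are gone, which is the 13-bits-of-real-information picture described above.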

The lossy codecs just look at the sound. If the file is loudish throughout, you haven't actually lost any audible content by dropping the level by 18dB - everything is still way above the -96dB CD noise floor, and the compression ratios show this: there's still just as much "sound" to store before and after wave gaining.
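
As a rough check on the noise floor figures, the usual rule of thumb is about 6 dB of dynamic range per bit. The sketch below only evaluates that formula; the function name is mine, and it ignores dither and noise shaping.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

cd_range = dynamic_range_db(16)        # 16-bit CD audio
reduced_range = dynamic_range_db(13)   # 13 bits of real information left
print(f"16 bits: {cd_range:.1f} dB, 13 bits: {reduced_range:.1f} dB")
```

So a loud track dropped by 18 dB still sits tens of dB above the 16-bit floor, which fits the observation that the lossy encoders see just as much "sound" as before.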


M,

Maybe you're reading too much into it! The one with dither (no noise shaping) is very very close too. The ones with noise shaped dither are not as close, which is to be expected because they don't contain the same audio data: they contain extra high frequency noise, added intentionally.

Re: dither/not dither... In this -18dB example, you're losing 3 bits. There's an OTT example at the bottom of this page where 10 bits are removed. It shows why (generally) dither is a good thing, though in practice there are times where the signal can self dither, and extra dither doesn't help. It rarely hurts though.
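
A minimal sketch of why dither helps, assuming plain TPDF dither (no noise shaping) and a made-up constant input. It is not WaveGain's code. Without dither a quiet value simply rounds away; with dither the individual outputs are noisy, but their average still carries the original level.

```python
import math
import random

def requantize(sample, divisor, rng=None):
    """Divide an integer sample by `divisor` and round to the nearest integer.

    If `rng` is given, TPDF dither (the sum of two uniform random values,
    spanning +/-1 output step) is added before rounding.
    """
    value = sample / divisor
    if rng is not None:
        value += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(value + 0.5)

rng = random.Random(0)              # fixed seed so the run is repeatable
sample, divisor, n = 3, 8, 20000    # hypothetical quiet sample; 3/8 = 0.375

plain = [requantize(sample, divisor) for _ in range(n)]
dithered = [requantize(sample, divisor, rng) for _ in range(n)]

plain_mean = sum(plain) / n
dithered_mean = sum(dithered) / n
print(plain_mean, round(dithered_mean, 2))
```

The undithered version outputs 0 every time, so the information is gone. The dithered version trades that loss for a constant low-level noise.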


Cheers,
David.

 

Reply #51
Quote
From a theoretical stand-point, I know that the scale version is best, but what lame is actually doing is anyone's guess. I'm not about to volunteer to ABX the differences!

The benefit with using --scale is that the scaling is applied to the integer samples after they have been copied into a float buffer so the fractional parts of the results are preserved for the encoder.

This discussion has centred on this issue vis-a-vis LAME. It could, however, equally be applied to the ogg vorbis encoders and, probably, others. Certainly in oggenc/oggdropXPd, the scaling is applied to the input samples after they have been converted to floats but before being fed to the encoder.
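
The difference between the two orders of operation can be sketched like this. The function names, sample values and scale factor are all hypothetical, and this is not the actual LAME or oggenc code, just the principle of scaling after the float conversion.

```python
def scale_then_store_as_int(samples, factor):
    """Scale and round back to integers, as when gain is written to an integer WAV."""
    return [int(round(s * factor)) for s in samples]

def scale_in_float(samples, factor):
    """Convert to float first, then scale; the fractional part survives."""
    return [float(s) * factor for s in samples]

samples = [1, 2, 3, 5]   # hypothetical low-level integer samples
factor = 0.48            # hypothetical scale factor

as_int = scale_then_store_as_int(samples, factor)
as_float = scale_in_float(samples, factor)
print(as_int)
print([round(x, 2) for x in as_float])
```

In the integer path the first sample collapses to 0 and the second and third become indistinguishable; in the float path the encoder still sees four distinct values.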

Reply #52
Thanks john. That much I'd assumed. It's how the level difference affects the psychoacoustics that interests me. If we use fewer bits then either

a) we were using too many to start with, or
b) the quieter version will actually sound worse.

If (a), then in theory we could change the lame psychoacoustics. What would be more valuable would be to use a replay gain calculation to help lame set the ath properly, and see what effect this has.

Cheers,
David.

Reply #53
Quote
Thanks john. That much I'd assumed. It's how the level difference affects the psychoacoustics that interests me. If we use fewer bits then either

a) we were using too many to start with, or
b) the quieter version will actually sound worse.

If (a), then in theory we could change the lame psychoacoustics. What would be more valuable would be to use a replay gain calculation to help lame set the ath properly, and see what effect this has.

Cheers,
David.

This issue has been lurking for a long time. It's probably about time to figure out what might be done, now that replaygain is so well established. However, it may take a few listening tests to decide if (b) is noticeable using the appropriate music replaygained down to 89 dB.

I know that bAdDuDeX once complained about a loss of quality when using --scale to lower the volume before encoding.  His preferred genre was Metal.

ff123

Reply #54
Vorbis should give identical results if scale is applied after float conversion. Would need to check if theory == practice though.

Reply #55
Quote
Vorbis should give identical results if scale is applied after float conversion. Would need to check if theory == practice though.

It does. At least in my very limited test. But with LAME the difference is huge. Would it be possible to use the same ATH adjustment code in LAME as in the vorbis encoder?

Reply #56
Vorbis test:
Code:
X:\Burn\TEST>dir
Volume in drive X is Temp
Volume Serial Number is 604A-42CF

Directory of X:\Burn\TEST

06/24/2003  12:59 PM    <DIR>          .
06/24/2003  12:59 PM    <DIR>          ..
06/24/2003  12:58 PM        84,907,236 wavegain.wav
06/24/2003  12:58 PM        84,907,236 original.wav
              2 File(s)    169,814,472 bytes
              2 Dir(s)   2,666,123,264 bytes free

X:\Burn\TEST>wavegain -y -g 3 -b 5 wavegain.wav

Analyzing...

   Gain   |  Peak  | Scale | New Peak | Track
---------------------------------------------
 -6.43 dB |  32766 |  0.48 |    15629 | wavegain.wav
Applying Gain to file: wavegain.wav
This file 100% done     All files 100% done
WaveGain Processing completed normally


X:\Burn\TEST>oggenc -q 6 wavegain.wav
Skipping chunk of type "fact", length 4
Skipping chunk of type "PEAK", length 24
Opening with wav module: WAV file reader
Encoding "wavegain.wav" to
        "test.ogg"
at quality 6.00
       [100.0%] [ 0m00s remaining] |

Done encoding file "wavegain.ogg"

       File length:  4m 00.0s
       Elapsed time: 0m 24.0s
       Rate:         10.0278
       Average bitrate: 180.9 kb/s


X:\Burn\TEST>oggenc -q 6 original.wav
Opening with wav module: WAV file reader
Encoding "original.wav" to
        "test2.ogg"
at quality 6.00
       [100.0%] [ 0m00s remaining] \

Done encoding file "original.ogg"

       File length:  4m 00.0s
       Elapsed time: 0m 25.0s
       Rate:         9.6267
       Average bitrate: 186.3 kb/s


X:\Burn\TEST>dir
Volume in drive X is Temp
Volume Serial Number is 604A-42CF

Directory of X:\Burn\TEST

06/24/2003  01:01 PM    <DIR>          .
06/24/2003  01:01 PM    <DIR>          ..
06/24/2003  01:01 PM         5,445,824 wavegain.ogg
06/24/2003  01:00 PM        84,907,280 wavegain.wav
06/24/2003  01:02 PM         5,607,989 original.ogg
06/24/2003  12:58 PM        84,907,236 original.wav
              4 File(s)    180,868,329 bytes
              2 Dir(s)   2,655,064,064 bytes free


This test was done with 32bit Floating Point wavs.
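
The Gain, Scale and New Peak columns in the WaveGain output are consistent with the standard dB-to-linear conversion. The sketch below just redoes that arithmetic; whether WaveGain rounds in exactly this way is an assumption.

```python
def gain_to_scale(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)

gain_db = -6.43   # "Gain" column in the log above
peak = 32766      # "Peak" column in the log above

scale = gain_to_scale(gain_db)
new_peak = peak * scale
print(f"scale = {scale:.2f}, new peak = {new_peak:.0f}")
```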

MPC test:

Code:
X:\Burn\TEST>dir
Volume in drive X is Temp
Volume Serial Number is 604A-42CF

Directory of X:\Burn\TEST

06/24/2003  01:09 PM    <DIR>          .
06/24/2003  01:09 PM    <DIR>          ..
06/24/2003  01:08 PM        84,907,236 original.wav
06/24/2003  01:08 PM        84,907,236 wavegain.wav
              2 File(s)    169,814,472 bytes
              2 Dir(s)   2,666,123,264 bytes free

X:\Burn\TEST>wavegain -y -g 3 -b 4 wavegain.wav

Analyzing...

   Gain   |  Peak  | Scale | New Peak | Track
---------------------------------------------
 -6.43 dB |  32766 |  0.48 |    15629 | wavegain.wav
Applying Gain to file: wavegain.wav
This file 100% done     All files 100% done
WaveGain Processing completed normally

X:\Burn\TEST>mppenc --xtreme --xlevel original.wav
MPC Encoder  1.14  -Beta-   (C) 1999-2002 Buschmann/Klemm/Piecha

encoding file 'original.wav'
      to file 'original.mpc'

SV 7.0 + XLevel coding, Profile 'Xtreme'

   %|avg.bitrate| speed|play time (proc/tot)| CPU time (proc/tot)| ETA
100.0  198.1 kbps 12.85x     4:00.6    4:00.6     0:18.7    0:18.7

WARNING:
 There still occured 1 SCF clippings due to a restriction of StreamVersion 7.
 Use the '--scale' method to avoid additional distortions. Note that this
 file already has annoying distortions due to slovenly CD mastering.


X:\Burn\TEST>mppenc --xtreme --xlevel wavegain.wav
MPC Encoder  1.14  -Beta-   (C) 1999-2002 Buschmann/Klemm/Piecha

encoding file 'wavegain.wav'
      to file 'wavegain.mpc'

SV 7.0 + XLevel coding, Profile 'Xtreme'

   %|avg.bitrate| speed|play time (proc/tot)| CPU time (proc/tot)| ETA
100.0  197.4 kbps 12.86x     4:00.6    4:00.6     0:18.7    0:18.7

X:\Burn\TEST>dir
Volume in drive X is Temp
Volume Serial Number is 604A-42CF

Directory of X:\Burn\TEST

06/24/2003  01:11 PM    <DIR>          .
06/24/2003  01:11 PM    <DIR>          ..
06/24/2003  01:11 PM         5,958,708 original.mpc
06/24/2003  01:08 PM        84,907,236 original.wav
06/24/2003  01:11 PM         5,938,340 wavegain.mpc
06/24/2003  01:10 PM        84,907,236 wavegain.wav
              4 File(s)    181,711,520 bytes
              2 Dir(s)   2,654,224,384 bytes free


The MPC test was performed using 32bit fixed point (longs) wavs.
"You have the right to remain silent. Anything you say will be misquoted, then used against you."

Reply #57
Does WaveGain dither?

Edit: the SCF clipping makes me think of another explanation: the psymodel gives identical results but the entropy coders can do better with smaller values

Reply #58
I uploaded a bunch of samples for ABX testing on the other (cleaned-up) thread on the matter. Please post any further information there.

Go Here

Reply #59
My humble experiences with this topic: http://www.hydrogenaudio.org/forums/index....=ST&f=16&t=8046
The main "problem" seems to be the sfb21 (high freq) content and lame VBR.
As I understood it, if the track is very loud, this high freq. content goes above the ATH and causes the bitrate bloat (sometimes dramatically). Correct me if that's wrong please:)
You may try to use -Y switch in your tests, the difference between original and replaygained encoding should be far less then.

Reply #60
Quote
Does WaveGain dither?

Edit: the SCF clipping makes me think of another explanation: the psymodel gives identical results but the entropy coders can do better with smaller values

On the command lines used: no dither, manual gain of +3dB (Why????), output for vorbis as floats and for mpc as 32bit ints.

Reply #61
Quote
My humble experiences with this topic: http://www.hydrogenaudio.org/forums/index....=ST&f=16&t=8046
The main "problem" seems to be the sfb21 (high freq) content and lame VBR.
As I understood it, if the track is very loud, this high freq. content goes above the ATH and causes the bitrate bloat (sometimes dramatically). Correct me if that's wrong please:)
You may try to use -Y switch in your tests, the difference between original and replaygained encoding should be far less then.

It appears Sony666 has identified the problem. It DOES appear to be sfb21 related with LAME, because here are the resulting bitrates for Ministry's "Impossible":

--alt-preset standard                     --> 270kbps
--alt-preset standard --scale 0.2515      --> 230kbps
--alt-preset standard -Y                  --> 181kbps
--alt-preset standard -Y --scale 0.2515   --> 180kbps

When the sfb21 band is removed, --scale makes no damn difference!!
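
For reference, the scale factor used here corresponds to a drop of about 12 dB. The sketch below converts it and recomputes the bitrate savings from the four figures listed above; the dictionary keys are just labels of mine.

```python
import math

scale = 0.2515
scale_db = 20 * math.log10(scale)   # linear factor expressed in dB

bitrates = {   # kbps, as reported above for Ministry's "Impossible"
    "aps": 270,
    "aps_scale": 230,
    "aps_Y": 181,
    "aps_Y_scale": 180,
}

saving_full = bitrates["aps"] - bitrates["aps_scale"]
saving_Y = bitrates["aps_Y"] - bitrates["aps_Y_scale"]
print(f"--scale {scale} = {scale_db:.2f} dB")
print(f"saving without -Y: {saving_full} kbps, with -Y: {saving_Y} kbps")
```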

This suggests to me that LAME is spending too many bits encoding high volume, high frequency sounds with modern overcompressed recordings. Using --scale seems to work around this problem, and -Y is of course a workaround. If Sony666 and I are correct, we have found a serious glitch in LAME that needs addressing!

Reply #62
Quote
My humble experiences with this topic: http://www.hydrogenaudio.org/forums/index....=ST&f=16&t=8046
The main "problem" seems to be the sfb21 (high freq) content and lame VBR.
As I understood it, if the track is very loud, this high freq. content goes above the ATH and causes the bitrate bloat (sometimes dramatically). Correct me if that's wrong please:)
You may try to use -Y switch in your tests, the difference between original and replaygained encoding should be far less then.

But how can the encoder know how much I turned my volume knob when I listen to the song (or if I replaygain them afterwards etc)? The only assumption it should be allowed to make about this is IMO that you don't want to ruin your ears when listening and thus that the peak (up until that point in the file to have a "causal" encoder) is below the threshold of pain. LAME seems to make other assumptions as well according to these tests. Garf, do you know how this works in the vorbis encoder?

Reply #63
Does the -Y --scale version sound worse than --scale only one? (I think it should)

The scale version with volume turned up should sound worse than original too IMHO.
(Some information cut out by ATH)

/EDIT\ Vorbis has problems even with quiet classical music, so... Yes. \EDIT/
ruxvilti'a

Reply #64
I believe the issue was adjusting the ath curve based on the volume (i.e., how far from full-scale) the music is.  Perhaps the change in the ath happened mostly in the upper frequencies.  If so, that might explain why it appears to be related to sfb21.  In any case, I think this behavior was meant to be a *feature*, not a bug.

ff123

Reply #65
ff123, that's a 40kbps change in total, 39kbps of which appears to be sfb21 (judging from the -Y results). That is a lot of extra information just for what are essentially louder squeaks, don't you think? Why only 1kbps difference when sfb21 is eliminated?

I think you are right in that it is an ATH issue, but this sfb21 thing helps explain why other codecs don't deal with this issue quite as drastically as LAME does. Don't you think?

Reply #66
Quote
Quote
Does WaveGain dither?

Edit: the SCF clipping makes me think of another explanation: the psymodel gives identical results but the entropy coders can do better with smaller values

On the command lines used: no dither, manual gain of +3dB (Why????), output for vorbis as floats and for mpc as 32bit ints.

oh.. I use 92dB because my SlimX output is weak.
I used no dither because I thought it wasn't worth using with 32bit wavs.

Reply #67
Quote
I believe the issue was adjusting the ath curve based on the volume (i.e., how far from full-scale) the music is.   Perhaps the change in the ath happened mostly in the upper frequencies.  If so, that might explain why it appears to be related to sfb21.  In any case, I think this behavior was meant to be a *feature*, not a bug.

ff123

The problem as I see it is that volume is not necessarily linked with how far from full scale the digital waveform is. It may, as we see, depend on other things as well - for example replaygain and the volume level on your amplifier.

Reply #68
I guess we just need to roll out some ABX tests.  My question is, how much do you need to amplify the specifically --scale'd mp3 before you can tell a difference? (speaking of 89dB)  Will it solely depend on how loud or compressed the original was?
WARNING:  Changing of advanced parameters might degrade sound quality.  Modify them only if you are experienced in audio compression!

Reply #69
Some answers (but not enough!)...


ErikS,

Replay gain assumes that you're listening quite loud, so that's not a great worry.



Jebus,

The bits aren't going into sfb21 (i.e. higher frequencies). They're going into making the whole frequency range more accurate, so that sfb21 (above 16kHz) can actually get some bits! sfb21 doesn't have a scale factor of its own (stupid mp3 format), so it gets something related to what the others get (I forget the details - it's been discussed to death).

So, the 39kbps don't go to higher frequencies - they go to all frequencies, as this is the only way to give 16kHz+ "sufficient" bits - probably only a few in this case. Otherwise, it gets starved.


ABX tests: well, if it's a 16kHz+ issue, that's me out! You'll need the people who can hear up there, and also can detect ringing up there, which might be an issue. This should be interesting - we might find that those bits are very much needed! (though not for me!)


Cheers,
David.

Reply #70
Finally reached the end of this thread (to date), and glad to see that sfb21 has been found to be the culprit, as I'd been starting to suspect. I agree with 2Bdecided about how the sfb21 problem manifests itself. It's a common misperception that the extra bits are wasted as 16kHz+, when in fact they're wasted on <16 kHz content if certain things happen in the 16kHz+ content that requires the workaround to maintain quality.

This is my educated guess as to what's happening:

By chance, john33 picked a track where the sfb21 issue wasn't causing much bitrate bloat in normal APS (without the -Y), so it wasn't much higher in the version with no --scale applied.

Jebus picked an album where the sfb21 issue was causing bloat in APS at full volume, but wasn't at lower volume. As 2Bdecided said, there is no scalefactor for 16 kHz+ (sfb21) so the global scalefactor has to be used. If the scale required to get fine enough quantization in sfb21 doesn't match the existing global scalefactor, the workaround is to adjust the global scalefactor, which then causes more bits to be used to encode all the other spectral bands (sfb0 to sfb20).  This forces all other bands to be quantized more finely than the masking threshold requires, so wastes bits in encoding detail in the <16 kHz area that is inaudible. (Encoders like Musepack and Vorbis don't have this sfb21 issue, so there's less difference - perhaps a fraction due to the way the adaptive ATH works).

It so happens that the --scale version doesn't need such a big change in the global scalefactor, so it doesn't get forced to waste bits on masked (inaudible) detail in bands 0-20 just to get the quantization noise in sfb21 low enough to go below the masking threshold.
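
The mechanism described above can be illustrated with a toy model. This is emphatically not LAME's algorithm: the band amplitudes, the tolerated step sizes and the bits-per-band estimate (log2 of amplitude over step) are all invented for illustration. The only idea it captures is that bands with their own scalefactor can be quantized more finely than the global step, while sfb21 is stuck with the global step.

```python
import math

def toy_bits(amplitudes, max_steps, use_sfb21=True):
    """Rough bit estimate for one frame in a toy quantizer.

    amplitudes / max_steps hold one value per band; the last entry stands
    for sfb21. max_steps is the coarsest step each band tolerates before
    its quantization noise would become audible.
    """
    low_amps, hi_amp = amplitudes[:-1], amplitudes[-1]
    low_steps, hi_step = max_steps[:-1], max_steps[-1]

    # Bands with their own scalefactor can go finer than the global step,
    # so the global step only has to suit the most tolerant band...
    global_step = max(low_steps)
    if use_sfb21:
        # ...but sfb21 has no scalefactor, so it forces the global step down.
        global_step = min(global_step, hi_step)

    bits = 0.0
    for amp, step in zip(low_amps, low_steps):
        bits += max(0.0, math.log2(amp / min(step, global_step)))
    if use_sfb21:
        bits += max(0.0, math.log2(hi_amp / global_step))
    return bits

amps = [1000.0, 1000.0, 1000.0, 40.0]   # hypothetical band amplitudes
steps = [64.0, 64.0, 64.0, 4.0]         # hypothetical tolerated step sizes

with_sfb21 = toy_bits(amps, steps, use_sfb21=True)
without_sfb21 = toy_bits(amps, steps, use_sfb21=False)   # loosely like -Y
sfb21_own_bits = math.log2(40.0 / 4.0)

wasted = with_sfb21 - without_sfb21 - sfb21_own_bits
print(f"with sfb21: {with_sfb21:.1f}, without: {without_sfb21:.1f}, "
      f"spent below 16 kHz for no audible gain: {wasted:.1f}")
```

In this toy frame most of the extra bits go to the lower bands rather than to sfb21 itself, which matches the point that the bloat is not spent on the 16 kHz+ content.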

Probably an analysis with mp3x graphical frame analyzer could help to verify whether this is true or not.

So, providing the psychoacoustics are correct about the masking threshold for <16 kHz components of the signal, the lower bitrate file should sound just as transparent as the high bitrate file that has been mp3gained (or had RG in Foobar2000 or similar).

So, if I'm correct, then only for cases where sfb21 bitrate bloat is happening, one should obtain just as good transparency with the smaller file (much closer to Musepack --standard --xlevel bitrates and APS -Y bitrates) when compared to the original lossless file with replaygain applied (even if played back on a 24-bit system, where the noise floor doesn't rise under ReplayGain), because LAME isn't forced to encode inaudible details just to get around the lack of an sfb21 scalefactor.

Mind you, can this be right? If so, it implies that one could obtain just as low a bitrate without affecting the original volume by using, for example,  --scale 0.5 to obtain -6.0 dB volume change in the encode step then applying a corrective mp3gain Constant Gain of +6.0 dB after encoding (or use anything else that edits the global gain, like mp3DirectCut) to restore the original volume.

If so, surely Lame APS would already use --scale accompanied with an opposite adjustment of the global gain value, like mp3gain, as a more efficient workaround for the sfb21 issue, and could apply it to individual frames of the MP3 where the bloat occurs.

The only exception is if it doesn't work for changes in 1.5 dB steps, but only happens to work for some values between the steps of the global gain, or if the times when it works can't be predicted by LAME (though LAME is well aware of when the bloat is occurring, because the -Y switch lets it ignore the 16 kHz+ content at those times instead of implementing the bloat-inducing workaround for the lack of an sfb21 scalefactor).

In that case, perhaps some files would become MORE bloated when --scale is applied.
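
The round trip suggested above (--scale 0.5, then +6.0 dB of global gain) can be checked numerically. The sketch assumes that one global gain step changes amplitude by a factor of 2^(1/4), which is where the roughly 1.5 dB step size comes from; it says nothing about whether the bitrate saving survives.

```python
import math

# Assumption: one MP3 global gain step scales amplitude by 2^(1/4).
STEP_FACTOR = 2 ** (1 / 4)
step_db = 20 * math.log10(STEP_FACTOR)

scale = 0.5
scale_db = 20 * math.log10(scale)

# Number of whole global gain steps needed to undo the scale.
steps_to_undo = round(-scale_db / step_db)
restored = scale * STEP_FACTOR ** steps_to_undo

print(f"one step = {step_db:.3f} dB, --scale {scale} = {scale_db:.3f} dB")
print(f"{steps_to_undo} steps restore the level to x{restored:.6f}")
```

A scale of 0.5 happens to land exactly on a whole number of steps. An arbitrary scale value generally would not, which is the "values between the steps" caveat raised above.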

Hmm, I think we need a Lame VBR expert to help here.

Wouldn't it be awesome if full-bandwidth LAME APS could be 20-30 kbps smaller without sounding any less transparent or breaking the MP3 standard! I'm just suspecting there's something to stop it working like that, or it would have been done before.

Reply #71
Okay, so in summary we need people with good ears to ABX my posted samples, and someone who knows or works on LAME to get in here and bloody sort this out!

I PMed Dibrom and Gabriel but they don't seem to be interested,  or at least I haven't had a response.

Reply #72
My opinion is that if you intend to always play your tracks on a specific adjusted level, it would be wise to take this into consideration while encoding.

It seems to me that extraction->album analysis->encoding with proper scale value would be the best choice.

Disabling the ath adjustment would be a bad idea. The ath adjustment takes care (a little) of a possible misadjustment of the track level (that is, when the sound engineer did not use a proper level for the track).
But to my mind, it also takes care of the constant sensitivity adjustment of the middle ear. Disabling this would reduce the quality. (I used "to my mind" because some do not agree about this.)

Using the auto ath adjustment to take into consideration a whole-track level misadjustment is not very efficient. First, the ath adjustment is limited in amplitude, and second it is progressive and not instantaneous.

You can alter the ath base level with --athlower, but I would prefer to use --scale.

Reply #73
If it is purely an sfb21 problem, why does Vorbis have a 6kbps and MPC a 0.5kbps difference in encoding bitrate respectively?
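
Putting the figures reported earlier in the thread side by side gives a sense of scale. Note that they are not directly comparable: the LAME numbers come from a different track with a scale of 0.2515, while the Vorbis and MPC numbers come from a -6.43 dB WaveGain run. The dictionary below is just a transcription of those reported bitrates.

```python
results = {   # codec: (original kbps, gain-reduced kbps), as reported in this thread
    "LAME --alt-preset standard": (270.0, 230.0),
    "Vorbis -q 6": (186.3, 180.9),
    "MPC --xtreme --xlevel": (198.1, 197.4),
}

diffs = {codec: round(orig - reduced, 1) for codec, (orig, reduced) in results.items()}

for codec, (orig, reduced) in results.items():
    percent = 100 * (orig - reduced) / orig
    print(f"{codec}: {diffs[codec]} kbps ({percent:.1f}%)")
```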

Reply #74
Quote
But how can the encoder know how much I turned my volume knob when I listen to the song (or if I replaygain them afterwards etc)? The only assumption it should be allowed to make about this is IMO that you don't want to ruin your ears when listening and thus that the peak (up until that point in the file to have a "causal" encoder) is below the threshold of pain. LAME seems to make other assumptions as well according to these tests. Garf, do you know how this works in the vorbis encoder?

Vorbis assumes that the loudest sound is played back at a level no more than what will blow your ears out, and then takes the most pessimistic assumptions about ATH and masking that are applicable given the above.