vorbisgain/wavgain peak question
Test file foo.ogg has had ReplayGain added via vorbisgain, with the following output:

Code:
   Gain   |  Peak  | Scale | New Peak | Track
----------+--------+-------+----------+------
 -4.55 dB |  35863 |  0.59 |    21240 | foo.ogg

Via mediainfo I get:

Code:
Replay gain                              : -4.55 dB
Replay gain peak                         : 1.094500

When I decode it to WAV and run wavegain, I get:

Code:
    Gain   |  Peak  | Scale | New Peak |Left DC|Right DC| Track
           |        |       |          |Offset | Offset |
 --------------------------------------------------------------
  -4.55 dB |  32767 |  0.59 |    19406 |    0  |     0  | foo.wav

The -4.55 dB is consistent, but in both vorbisgain and wavegain the "Peak" and "New Peak" columns appear to mean something very different from the REPLAY_GAIN_PEAK Vorbis tag. The tag appears to be a real number of the form 1.xxxxxx or 0.xxxxxx, while the peak values reported by vorbisgain and wavegain appear to be integers.

Is there a correspondence between the [vorbis|wave]gain Peak integer and the Replay Gain Peak real number?

I'm not understanding what those numbers are intended to mean or how they correspond to each other.

Re: vorbisgain/wavgain peak question
Reply #1
Okay, when I take the vorbisgain Peak value and divide it by 32768, I get something very close to the 1.094500 that mediainfo reports for the file, close enough that it looks like a rounding difference.

On the WAV file, though (created by decoding foo.ogg with ffmpeg), wavegain reports a peak of 32767, which is almost 32768. That is close enough that it's likely a rounding difference, and with six places of precision I would get 1.000000.

That seems to suggest the original audio before the vorbis compression was peak normalized.

But why then is the vorbis peak different?
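
For anyone following the arithmetic, here is a minimal Python sketch of the relationship described above. It assumes (based only on the numbers in this thread, not on the vorbisgain source) that the integer Peak is on a 16-bit scale where 32768 is full scale, and that the Scale column is the linear factor 10^(gain/20). The function names are made up for illustration.

```python
# Assumption: the integer peak is relative to 16-bit full scale (2**15).
FULL_SCALE_16BIT = 32768


def peak_to_ratio(peak_int):
    """Convert an integer peak to a ratio where 1.0 is digital full scale."""
    return peak_int / FULL_SCALE_16BIT


def gain_db_to_scale(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


vorbis_peak = 35863   # "Peak" column from vorbisgain
gain_db = -4.55       # "Gain" column from vorbisgain

ratio = peak_to_ratio(vorbis_peak)
scale = gain_db_to_scale(gain_db)
new_peak = vorbis_peak * scale

print(f"peak ratio: {ratio:.6f}")
print(f"scale:      {scale:.2f}")
print(f"new peak:   {new_peak:.0f}")
```

With the figures from the tables, the ratio comes out at roughly 1.0945, and the scaled peak lands close to the 21240 that vorbisgain printed.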

  • Case
  • Developer (Donating)
Re: vorbisgain/wavgain peak question
Reply #2
Quote
That seems to suggest the original audio before the vorbis compression was peak normalized.
Pretty much everything released digitally is peak normalized. It makes sense as it maximizes the available dynamic range.

Quote
But why then is the vorbis peak different?
Lossy encoding changes the audio. Inaudible frequencies can be entirely removed and masked sounds get encoded with limited accuracy. You can notice peak changes without lossy encoding too just by using an equalizer to alter the frequency response.

When Vorbis is decoded, the signal is free to exceed regular digital full scale; that's why you saw a peak above 1.0. The WAV you made with ffmpeg is limited to 1.0 because the default 16-bit PCM WAV format you used caused all the peaks to clip. You could decode it to 32-bit floating point WAV and the peaks wouldn't get clipped.
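
To illustrate the clipping, here is a small Python sketch of a float-to-16-bit conversion. The sample values are invented, and the conversion rule (scale by 32768, round, clamp) is an assumption for the example; real decoders may scale or dither differently.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def float_to_int16(sample):
    """Scale a float sample (1.0 = full scale) to 16-bit and clamp it."""
    scaled = round(sample * 32768)
    return max(INT16_MIN, min(INT16_MAX, scaled))


# Hypothetical decoded samples; one of them overshoots full scale.
decoded = [0.25, -0.5, 1.0945, -0.9]
as_int16 = [float_to_int16(s) for s in decoded]

float_peak = max(abs(s) for s in decoded)   # overshoot is preserved
int_peak = max(abs(s) for s in as_int16)    # overshoot is clipped

print(as_int16)
print(float_peak, int_peak)
```

The float peak keeps the overshoot, while the 16-bit version can never report more than 32767, which matches what wavegain showed for the 16-bit WAV.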

Re: vorbisgain/wavgain peak question
Reply #3
Okay so if I decode to 32-bit is there a way to determine from wavegain what the peak would be in 16-bit integer? Would decoding to 32-bit float (or 24-bit integer) but still dividing by 2^15 give me the 1.094500 value? I suppose I can try.

  • DVDdoug
Re: vorbisgain/wavgain peak question
Reply #4
Quote
Okay so if I decode to 32-bit is there a way to determine from wavegain what the peak would be in 16-bit integer? Would decoding to 32-bit float (or 24-bit integer) but still dividing by 2^15 give me the 1.094500 value? I suppose I can try.
You lost me there with the division, but anything above 1.0 has to be derived from a floating-point value. 1.0 = 100% = 0dB = the maximum integer count for the given bit depth.

That 35863 number is "artificial" because you can't count that high with a 16-bit signed integer. And at 24-bits 35000 is somewhere in the ballpark of -50dB, so it's not on a 24-bit scale. ...It can be a 32-bit (or more) integer, but once it's converted to 16-bits it gets clipped.
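
As a rough check of those figures, this Python sketch expresses 35863 in dB relative to full scale at two bit depths. It assumes full scale is 2^(bits-1) for signed integer PCM; the helper name `dbfs` is just for illustration.

```python
import math


def dbfs(value, bits):
    """Level of an integer sample in dB relative to signed full scale."""
    full_scale = 2 ** (bits - 1)
    return 20 * math.log10(abs(value) / full_scale)


peak = 35863
db_16 = dbfs(peak, 16)   # slightly above 0 dB, so not representable in 16-bit
db_24 = dbfs(peak, 24)   # far below full scale on a 24-bit range

print(f"16-bit scale: {db_16:+.2f} dBFS")
print(f"24-bit scale: {db_24:+.2f} dBFS")
```

On the 16-bit scale the value sits a little under 1 dB above full scale, and on the 24-bit scale it is in the high -40s, consistent with the "ballpark of -50dB" estimate.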
  • Last Edit: 27 November, 2017, 06:24:33 PM by DVDdoug

  • saratoga
Re: vorbisgain/wavgain peak question
Reply #5
Quote
Okay so if I decode to 32-bit is there a way to determine from wavegain what the peak would be in 16-bit integer? Would decoding to 32-bit float (or 24-bit integer) but still dividing by 2^15 give me the 1.094500 value? I suppose I can try.

If you decode to 32 bit float (-1 to +1 scale), you should get a floating point peak value (1.094500).  You got an integer peak value before because you were using integers; it is just giving you the actual largest value in the file, whatever format that happens to be.

Re: vorbisgain/wavgain peak question
Reply #6
I couldn't figure out how to decode Vorbis to float, but when decoding Opus there's a --float switch that converts to 32-bit WAV, and running wavegain on the result gives an integer peak, one that is higher than 2^15.

I suspect that dividing the peak from 32-bit wave as reported by wavegain by 2^15 will give me what I'm after but I'm not positive.

-=-

What I'm trying to do is figure out a way to generate accurate replaygain tags to use in Matroska.

I'm hoping that if I decode the lossy file to WAV and use wavegain, I can get tags that are good. It looks like decoding to 32-bit might be necessary for an accurate peak tag, but ffmpeg doesn't seem to like doing it; I've only gotten ffmpeg to produce 16-bit WAV.

Vorbis I don't really care about; I just wanted to see whether wavegain on a decoded Vorbis file would give numbers similar to what vorbisgain produced.

Opus and AAC are the only lossy formats I really care about. Decoding AAC to 32-bit WAV without ffmpeg on a Linux server may require a fancy gstreamer pipeline with a fluendo plugin.

I don't know why I care about this as much as I do.

Re: vorbisgain/wavgain peak question
Reply #7
Interesting results using faad to decode and then looking at the output with wavegain:

Code:
    Gain   |  Peak  | Scale | New Peak |Left DC|Right DC| Track
           |        |       |          |Offset | Offset |
 --------------------------------------------------------------
  -4.63 dB |  32766 |  0.59 |    19228 |    0  |     0  | 16aac.wav
  -6.78 dB |  32767 |  0.46 |    15012 |   25  |    58  | 24aac.wav
  -5.97 dB |  32767 |  0.50 |    16479 |  -40  |   -36  | 32aac.wav
  -4.63 dB |  34124 |  0.59 |    20024 |    0  |     0  | 32faac.wav

The rows are, in order: 16-bit, 24-bit, 32-bit integer, and 32-bit float output.

The first three seem bound by 2^15 but the float one is not.
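
A quick Python sketch of that observation, using the peaks from the table above and assuming (as earlier in the thread) that wavegain reports peaks on a 16-bit scale with 32768 as full scale:

```python
FULL_SCALE = 32768  # assumed 16-bit reference for wavegain's Peak column

# Peaks copied from the wavegain table above.
peaks = {
    "16aac.wav": 32766,
    "24aac.wav": 32767,
    "32aac.wav": 32767,
    "32faac.wav": 34124,
}

# Files whose reported peak exceeds full scale, i.e. was not clipped.
over_full_scale = [name for name, peak in peaks.items() if peak > FULL_SCALE]

for name, peak in peaks.items():
    print(f"{name:12s} {peak / FULL_SCALE:.6f}")
print("above full scale:", over_full_scale)
```

Only the float file reports a ratio above 1.0 (about 1.0414), which is what one would expect if the integer outputs were clipped.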

24-bit integer and 32-bit integer have DC offset but 16-bit integer and 32-bit float do not (same source .m4a)

Playing in VLC the 24-bit and 32-bit integer faad outputs are clearly severely damaged.

Seems faad decoding is dangerous.

  • Case
  • Developer (Donating)
Re: vorbisgain/wavgain peak question
Reply #8
You can decode to float with ffmpeg by using ffmpeg -i <sourcefile> -acodec pcm_f32le <outputfile>.
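
If you want to script that, here is a minimal Python sketch that builds the same ffmpeg command and runs it only when ffmpeg and the input file are present. The file names foo.ogg and foo_float.wav and the helper name are placeholders, not anything from your setup.

```python
import os
import shutil
import subprocess


def build_float_decode_cmd(source, output):
    """Build the ffmpeg command that decodes to 32-bit float PCM WAV."""
    return ["ffmpeg", "-i", source, "-acodec", "pcm_f32le", output]


# Placeholder file names for illustration.
cmd = build_float_decode_cmd("foo.ogg", "foo_float.wav")
print(" ".join(cmd))

# Only attempt the decode if ffmpeg is installed and the input exists.
if shutil.which("ffmpeg") and os.path.exists("foo.ogg"):
    subprocess.run(cmd, check=True)
```

The resulting WAV should keep samples above full scale, so a peak scan of it would not be limited to 32767.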

But if your intention is to just ReplayGain scan and tag Matroska, why don't you do it with foobar2000? It will write accurate peaks and even allows you to write true peak values, if you wish.