Topic: How to calculate the REPLAYGAIN_TRACK_PEAK value?

2008-04-20 16:20:50

Hi all,

I was hoping for some help with calculating the REPLAYGAIN_TRACK_PEAK value.

What I have is an audio buffer of TSmallInt values (-32767..32768), i.e. 16-bit audio, and I am trying to
reproduce the same values that Foobar2000 produces when it calculates REPLAYGAIN_TRACK_PEAK.

So is there a formula that converts audio samples to a peak value?

I did manage with the help of replaygain.dll to reproduce the same values for the
REPLAYGAIN_TRACK_GAIN and REPLAYGAIN_ALBUM_GAIN.

Would be great if I also could get the values for the PEAK.

Thanks for any help!
Eric

Reply #1 – 2008-04-20 16:30:22

I believe finding the peak is just a matter of converting each sample to an absolute value, then finding the largest across the whole track, and finally converting that largest value to a float between 0.0 and 1.0 by dividing by 2^15.

Reply #2 – 2008-04-20 16:31:25

REPLAYGAIN_TRACK_PEAK = max( abs(audiobuffer) )/32768.0

By the way: int16 is (-32768..32767)

Edit: too slow
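
For illustration, the formula in the two replies above could be sketched like this. It is written in Python rather than Delphi purely to keep it short, and the function name `track_peak_int16` is made up for this example:

```python
def track_peak_int16(samples):
    """Return the peak of signed 16-bit samples as a fraction of full scale."""
    # Largest absolute sample value across the whole track (0 for empty input).
    largest = max((abs(s) for s in samples), default=0)
    # 2^15 = 32768 is full scale for signed 16-bit audio.
    return largest / 32768.0


print(track_peak_int16([0, 12000, -32768, 32767]))  # 1.0
print(track_peak_int16([100, -16384, 8000]))        # 0.5
```

With 16-bit integer input the result can never exceed 1.0, because no sample can be larger in magnitude than 32768.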

Reply #3 – 2008-04-20 16:36:28

Hi Guys,

Thanks for the quick replies, very much appreciated.

The problem is that I thought the same, but in your formula the maximum peak would be represented by the value 32767, so the peak value would always be 1 or less.

But Foobar2000 gives a peak value of more than 1 (e.g. 1.156556 or something similar).
So I don't understand how they get to that value.

Thanks
Eric

Reply #4 – 2008-04-20 16:46:45

You should decode mp3 (ogg, aac, ...) to 32-bit float, not to 16-bit int.
So a max. sample value of 34652.45 will result in an RG peak value of 1.057509 (34652.45 / 32768).
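
A minimal sketch of that arithmetic, assuming the decoder hands back floating-point samples on a 16-bit scale (so 32768 is full scale). The function name and the `full_scale` parameter are hypothetical, and the sample values other than 34652.45 are invented:

```python
def track_peak_float(samples, full_scale=1.0):
    """Return the largest absolute sample divided by the full-scale value."""
    largest = max((abs(s) for s in samples), default=0.0)
    return largest / full_scale


# Float decoder output on a 16-bit scale: nothing stops a sample
# from exceeding 32768, so the peak can come out above 1.0.
peak = track_peak_float([1200.0, -34652.45, 30000.5], full_scale=32768.0)
print(f"{peak:.6f}")  # 1.057509
```

If the decoder already normalises its output so that 1.0 is full scale, the default `full_scale=1.0` applies and the peak is simply the largest absolute sample.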

Reply #5 – 2008-04-20 16:48:49

Quote from: RickYZ on 2008-04-20 16:36:28

But Foobar2000 gives a peak value of more than 1 (e.g. 1.156556 or something similar).
So I don't understand how they get to that value.

This is possible because foobar internally works with more than 16 bits AFAIK. So, during the processing stage, the signal may go well above 1.0; it is only "clipped" down during output. Correct me if I'm wrong.

edit: got ninjad

Reply #6 – 2008-04-20 19:17:32

Hi,

Thanks for clarifying that. I've got a zillion questions now, sorry.

Here are some of the basic questions:
1. I thought MP3 was stored in 16 bit? So how can a 24-bit value that is above the 32768 line arise from 16 bit?
2. What is the conversion between 24 bit and 16 bit? Is it simply 32768 = 1.0000?

Now I switched BASS into FLOATMode, and indeed got all kinds of float values back when I read the buffer.
Now I have to convert this buffer into right/left samples that can be used by replaygain.dll, so I guess
I have to convert these floats to a range of 32768 again?

And the same goes for the peak value: currently, with the code below, it ends up with a value of 3.93212^18?
But how does it relate to the 32768? I.e. how can I get to the Foobar value?

Thanks again,
Eric

Code:

type
  TFloat = Double;            // Float_t
  TSample24 = Double;
  TStereoSample24 = record
    Right: TSample24;
    Left: TSample24;
  end;

function TcbAudioFileReader.GetSongSamples(var LeftSamples,
  RightSamples: Array of TFloat; var Peak: TFloat): Integer;
var
  GainBuffer: Array[0..4095] of TStereoSample24; // 2*Double
  i: Integer;
begin
  // Read values from BASS.dll
  Result := Read(GainBuffer, 4096);

  // Split into left/right samples and track the peak
  for i := 0 to Result - 1 do
  begin
    RightSamples[i] := Round(GainBuffer[i].Right);
    LeftSamples[i] := Round(GainBuffer[i].Left);
    Peak := Max(Peak, Abs(RightSamples[i]));
    Peak := Max(Peak, Abs(LeftSamples[i]));
  end;
end;

Reply #7 – 2008-04-20 20:03:00

Quote from: RickYZ on 2008-04-20 19:17:32

1. I thought MP3 was stored in 16 bit? So how can a 24-bit value that is above the 32768 line arise from 16 bit?

No. An MP3 file has no fixed bit depth; the decoder computes the samples and can output them as floating-point numbers.
And forget about 24 bit: Single is a 32-bit type, Double is a 64-bit type.

Quote from: RickYZ on 2008-04-20 19:17:32

Now I have to convert this buffer into right/left samples that can be used by replaygain.dll, so I guess I have to convert these floats to a range of 32768 again?

No.

Quote from: RickYZ on 2008-04-20 19:17:32

And the same goes for the peak value: currently, with the code below, it ends up with a value of 3.93212^18?
But how does it relate to the 32768? I.e. how can I get to the Foobar value?

3.93212^18 seems to be a completely wrong value. I think you should use Single (32 bit) instead of Double (64 bit). And do not use the 'Round' function.
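
As a rough sketch of what that advice amounts to: read the buffer as 32-bit floats and keep them unrounded. This is in Python for brevity and assumes the raw buffer holds interleaved little-endian 32-bit floats, left sample first, with 1.0 as full scale. The layout and the name `split_and_peak` are assumptions for this example, not something taken from BASS or replaygain.dll:

```python
import struct


def split_and_peak(raw, peak=0.0):
    """Split interleaved stereo float32 bytes into channels and update the peak."""
    frames = len(raw) // 8                      # 2 channels x 4 bytes per frame
    values = struct.unpack(f"<{frames * 2}f", raw[:frames * 8])
    left = list(values[0::2])                   # even positions: left channel
    right = list(values[1::2])                  # odd positions: right channel
    for v in values:
        peak = max(peak, abs(v))                # no rounding, no integer conversion
    return left, right, peak


# Three stereo frames; one left sample overshoots full scale.
raw = struct.pack("<6f", 0.25, -0.5, 1.125, 0.0, -0.75, 0.5)
left, right, peak = split_and_peak(raw)
print(left, right, peak)  # [0.25, 1.125, -0.75] [-0.5, 0.0, 0.5] 1.125
```

Passing the previous peak back in as the `peak` argument lets the value accumulate across successive buffers of the same track.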

Reply #8 – 2008-04-21 08:39:03

Hi,

OK, thanks for clarifying that. It turns out that BASS is sending the data in DWORDs.
So I updated the code to read DWORDs, i.e. 2^32 is the maximum, and I receive
nice buffers with indeed these values, no issue there.

Of course the problem is actually still the same: 2^32 = (2*32768)^2, so I
would still be stuck with the equivalent of 32768 as the max value? So I still would
never be able to calculate a peak value of more than 1, would I?

So even if I enable the FLOAT option in BASS I guess I still get some kind of
translated 16-bit buffers, or am I missing it again?

Pfff, this is difficult stuff.
Regards,
Eric

Reply #9 – 2008-04-21 08:52:30

Data in (signed, hopefully) DWORDS: (2^(32-1)-1 to -2^(32-1)) -2,147,483,648 to +2,147,483,647;

Divide BASS output by 2,147,483,648 to get a number in the range +0.9999999995 to -1.

If BASS output is based on 16-bit signed integers left-shifted into a 32-bit signed integer, still divide by 2,147,483,648, but the output will be in the range +0.99996948 to -1.

A workaround might be to take BASS output value, add 0.5, divide by 2,147,483,647.5. This will give you a range of +1 to -1 if the input is a 32bit signed integer.
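
The three cases above can be checked with a few lines. This is only a sketch in Python with made-up function names; it assumes the input really is a signed 32-bit integer:

```python
INT32_SCALE = 2_147_483_648  # 2^31


def normalise_int32(value):
    """Map a signed 32-bit integer onto -1.0 .. just under +1.0."""
    return value / INT32_SCALE


def normalise_int32_symmetric(value):
    """The workaround: shift by 0.5 so both extremes map exactly to +/-1.0."""
    return (value + 0.5) / 2_147_483_647.5


print(normalise_int32(-2_147_483_648))            # -1.0
print(normalise_int32(2_147_483_647) < 1.0)       # True
print(normalise_int32(32767 << 16))               # 0.999969482421875
print(normalise_int32_symmetric(2_147_483_647))   # 1.0
print(normalise_int32_symmetric(-2_147_483_648))  # -1.0
```

Whichever divisor is used, integer input is bounded, so the normalised value cannot go outside the range -1.0 to +1.0.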

Reply #10 – 2008-04-21 09:39:57

Hi Nick,

Thanks for the help,

Indeed BASS sends unsigned values, so I need to convert them to signed, understood.
So now I have converted the BASS data into floats ranging from +0.999 to -1.0.

Now the real problem is still this clipped maximum value that foobar calculates.

Quote

lvqcl Posted Yesterday, 17:46
You should decode mp3 (ogg, aac, ...) to 32-bit float, not to 16-bit int.
So a max. sample value of 34652.45 will result in an RG peak value of 1.057509 (34652.45 / 32768).

I still do not get a value over 32767 or, in my new float data, over 1.0.
How can I retrieve a peak value of 34567, or in the case of a float, 1.123456?
Would a value of 1.123456 not just clip back to within the range of -1.0 .. 0.99..?
So how would I recognise that it is clipped and is actually a value above 1.00
or below -1?

Regards,
Eric

Reply #11 – 2008-04-21 09:43:09

It sounds like a limitation of BASS, in that it will only ever output a value between 0 and 4,294,967,295 in a 32bit unsigned integer.

This will preclude you ever finding a maximum / minimum value which falls outwith the range -1.0 to +1.0.

Reply #12 – 2008-04-21 09:47:25

Hi Nick,

... but I still don't understand: if a 32-bit integer has a max of 4,294,967,295,
how can it then become 5,000,000,000 or even bigger?

Wouldn't that require a larger bit depth?

I think I am just missing the point, sorry.

Regards,
Eric

Reply #13 – 2008-04-21 09:48:33

Sample code:

http://packages.debian.org/vorbisgain

Follow the links.

http://www.sjeng.org/vorbisgain.html

Reply #14 – 2008-04-21 10:32:59

It's 32-bit floating point decoding that you want - it has over a billion distinct values above the normalised peak of 1.0, reaching up to about 3.4 x 10^38.

http://en.wikipedia.org/wiki/Floating_point

Cheers,
David.

Reply #15 – 2008-04-21 11:55:25

Hi All,

Remember I read 32-bit float, so I don't really care about the physical maximum or definition of a floating point. The maximum in this case is set by the fact that I only have 32 bits to store/read the value.

This is the real question still:

So I read a value from the file, I read it as a 32-bit integer, it has the mentioned maximum, how can it become bigger than this maximum? So how can the peak value that I am trying to calculate be above this 32-bit max value as it is shown in foobar or vorbisgain?

Drives me nuts,
Eric

Reply #16 – 2008-04-21 12:05:30

Quote from: RickYZ on 2008-04-21 11:55:25

Hi All,

Remember I read 32-bit float, so I don't really care about the physical maximum or definition of a floating point. The maximum in this case is set by the fact that I only have 32 bits to store/read the value.

This is the real question still:

So I read a value from the file, I read it as a 32-bit integer, it has the mentioned maximum, how can it become bigger than this maximum? So how can the peak value that I am trying to calculate be above this 32-bit max value as it is shown in foobar or vorbisgain?

Drives me nuts,
Eric

Do you have the option to read the values as 32-bit float rather than 32-bit integer? Obviously both will take up exactly the same storage space, however the maxima and minima will differ significantly - i.e. if you read 32-bit float you will get values outwith +2,147,483,647 to -2,147,483,648, albeit with fewer significant digits.
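
To make the difference concrete, here is a small Python sketch (helper names invented for this example) that reinterprets the same 32 bits as either a signed integer or an IEEE 754 single-precision float:

```python
import struct


def float_bits_as_int(x):
    """Return the signed 32-bit integer sharing its bit pattern with float32 x."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


def int_bits_as_float(n):
    """Return the float32 sharing its bit pattern with signed 32-bit integer n."""
    return struct.unpack("<f", struct.pack("<i", n))[0]


print(float_bits_as_int(1.0))         # 1065353216, i.e. 0x3F800000
print(int_bits_as_float(1065353216))  # 1.0
print(int_bits_as_float(0x7F7FFFFF))  # largest finite float32, about 3.4e+38
```

So a float buffer that is read as integers (or the other way round) yields numbers that look plausible in size but have nothing to do with the actual sample values.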

Reply #17 – 2008-04-21 13:13:57

Quote from: RickYZ on 2008-04-21 11:55:25

Hi All,

Remember I read 32-bit float, so I don't really care about the physical maximum or definition of a floating point. The maximum in this case is set by the fact that I only have 32 bits to store/read the value.

This is the real question still:

So I read a value from the file, I read it as a 32-bit integer, it has the mentioned maximum, how can it become bigger than this maximum? So how can the peak value that I am trying to calculate be above this 32-bit max value as it is shown in foobar or vorbisgain?

Drives me nuts,
Eric

Because of how lossy codecs work, a sharp transient often results in 'overshoot' that makes the peak value slightly higher, even though the spectrum looks identical. If you think about phase, this makes a lot of sense - if you combine a handful of sounds, slight changes in each of their phases will result in vastly different waveforms with only small (or non-existent) changes in sound.

Yes, the values over 1.0 will clip when truncated to 24 or 16 bit for output. However, one of the advantages of Replay Gain is that the corrections are predominantly negative. A good decoder will apply the replay gain to the floating point output before the information has been lost to clipping during conversion to integer. For most tracks, the negative gain is sufficient to prevent output clipping entirely. Some players (winamp and foobar, for instance) have options to force clip prevention even if it results in more negative gain than specified.
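
A sketch of how the stored peak can be used for that, under the usual assumption that a gain in dB converts to a linear factor of 10^(dB/20). The function name and the `prevent_clipping` flag are hypothetical and not taken from any particular player:

```python
def replay_gain_scale(gain_db, peak, prevent_clipping=False):
    """Linear scale factor for a gain in dB, optionally limited by the track peak."""
    scale = 10.0 ** (gain_db / 20.0)
    # With clip prevention on, reduce the gain so that peak * scale stays at 1.0.
    if prevent_clipping and scale * peak > 1.0:
        scale = 1.0 / peak
    return scale


peak = 1.156556  # the peak value quoted earlier in the thread

print(peak * replay_gain_scale(-6.0, peak))                        # well below 1.0
print(peak * replay_gain_scale(2.0, peak))                         # above 1.0, would clip
print(peak * replay_gain_scale(2.0, peak, prevent_clipping=True))  # limited to about 1.0
```

The scale factor is applied to the float samples first; only afterwards are they converted to 16 or 24 bit for output.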

Reply #18 – 2008-04-21 13:59:25

Hi All,

Hmmm, you make a nice comment, so the value will be clipped only when truncating the 32-bit value to 24 bit or 16 bit? So the 32-bit value is never clipped (which makes sense).

So I need to calculate a 16-bit value corresponding to the 32-bit values that I have? This could then be bigger than the 32768 value that corresponds to REPLAYGAIN_TRACK_PEAK = 1? And as such this bigger value could give me the above-1 PEAKs?

So what does 1.000000000, the max 32-bit float, correspond to in 16 bit?

Just a simple formula would do for me.....

Regards,
Eric

Reply #19 – 2008-04-21 14:10:24

IMHO a better question would be...

What happens to -1, 0 and 1 (16bit int or 24bit int) when I convert them to 32bit float?

They will be -1, 0 and 1 but the max range will be much wider.

Reply #20 – 2008-04-21 14:12:19

Quote from: RickYZ on 2008-04-21 13:59:25

Hi All,

Hmmm, you make a nice comment, so the value will be clipped only when truncating the 32-bit value to 24 bit or 16 bit? So the 32-bit value is never clipped (which makes sense).

So I need to calculate a 16-bit value corresponding to the 32-bit values that I have? This could then be bigger than the 32768 value that corresponds to REPLAYGAIN_TRACK_PEAK = 1? And as such this bigger value could give me the above-1 PEAKs?

So what does 1.000000000, the max 32-bit float, correspond to in 16 bit?

Just a simple formula would do for me.....

Regards,
Eric

Clipping occurs when truncating the floating point value to any integer representation, whether 16, 24 or 32 bits.

A floating point value of 1.00000 corresponds to 32768 in 16 bit integer, but will be truncated to 32767.
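
As a simple formula, the conversion might look like the sketch below (Python, hypothetical function name). It scales by 32768, rounds, and clamps to the 16-bit range, and also reports whether clamping was needed:

```python
def float_to_int16(sample):
    """Convert a float sample (1.0 = full scale) to 16-bit; flag clipped values."""
    scaled = round(sample * 32768.0)
    clipped = max(-32768, min(32767, scaled))
    return clipped, clipped != scaled


print(float_to_int16(1.0))       # (32767, True): 32768 does not fit
print(float_to_int16(-1.0))      # (-32768, False)
print(float_to_int16(1.057509))  # (32767, True)
print(float_to_int16(0.5))       # (16384, False)
```

Once the value has been clamped, the original overshoot is gone, which is why the peak has to be measured on the float data before this step.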

Reply #21 – 2008-04-21 14:39:32

Reply #22 – 2008-04-21 14:45:16

http://en.wikipedia.org/wiki/IEEE_754-1985

http://en.wikipedia.org/wiki/Floating_Poin...g-point_numbers

Reply #23 – 2008-04-21 16:08:22

Hi Kjoonlee,

Thanks for the links, they really helped, especially your previous mail.

In my brain the maximum value of the 32-bit float corresponded 1:1 to 32767,
but only after realizing that this is not the case (sorry it took me so long)
I understand now that it is the value 1.00000 that corresponds to 32768,
and that the 32-bit float can be much bigger, or smaller.

So I understand how to calculate the PEAK now.
Now I still need BASS to give me floats :-(

Thanks all for your patience and help!!

Regards,
Eric
