HydrogenAudio

Hydrogenaudio Forum => General Audio => Topic started by: 2Bdecided on 2003-11-18 16:04:39

Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-18 16:04:39
Every now and again I wish I had the time to update the ReplayGain website and add some new ideas, and maybe even clarify some old ones. I don't, so this thread will have to do.


Firstly, the format used to store ReplayGain info in files is not documented correctly on the ReplayGain website, and it would be good to "publish" what has emerged as the standard for each format.

Secondly, what is stored is not documented correctly on the ReplayGain website, and I'd like to re-examine what is stored...


One change has already happened, and I think it's a good change:

Forget Radio and Audiophile - Track and Album are much better names.
(that's an open admission of me being wrong, for anyone who discussed this with me previously!)


So, we store:

ReplayGain Track adjustment
ReplayGain Album adjustment
(ReplayGain) Track peak
(ReplayGain) Album peak

(this last one wasn't in the original proposal, but it has been widely used - I've put it in bold to remind me to include it in the update)

That makes sense, and most software supports this. I'd like to formalise some extensions, some of which were there from the start, and others that have cropped up more recently:


1. (ReplayGain) undo adjustment
- this is written when the gain of the file is changed (e.g. by mp3gain, or by decoding with ReplayGain enabled), and is the gain change required to put the file back to where it started.

e.g. If I apply -8dB gain change using mp3gain, then
(ReplayGain) undo adjustment = +8dB

e.g. If I use --scale 0.5 when encoding (for whatever reason?!), then
(ReplayGain) undo adjustment = +6dB

If the gain of an already ReplayGained file is changed, the original four values (Track and Album adjustment and peak) should be updated so that they are correct for the new audio data (see the example in this thread: http://www.hydrogenaudio.org/forums/index.php?showtopic=15412&hl=).

I can't see any argument against defining this field. It would be zero (or absent) if the audio file hasn't been altered. It's useful in all formats because you can always apply wavgain before encoding, and it would be nice to know that this has been done.


2. ReplayGain calculation method

OK - I've had this argument before, but this really is important. ReplayGain can be improved, but you'll never know whether files are tagged using the old or new ReplayGain calculation unless the calculation method (actually a number which corresponds to the method) is stored. This doesn't increase the complexity of players, as they won't care - it just makes it very easy to pick out files that were tagged with the old version, and update them.


3. ReplayGain lossy approximation

This is just a single bit: 0 or 1.

0 = this ReplayGain info has been calculated from the data in this file
1 = this file has been lossily encoded/transcoded since this ReplayGain info was calculated.

What's the point of this? If you have a file with ReplayGain info, you can transcode it and copy the RG info across. It'll be close enough to give you excellent loudness equalisation, and you won't have to re-calculate it. Yet there'll be a label there to tell all you anal retentives that it's not quite right, and should be recalculated if you want to be 100% sure (especially important for peak amplitude).

You could (should?) have one “ReplayGain lossy approximation” bit for each of the four values, which gives you the chance (for example) of re-calculating the peak values (quick, and important - so let's do it), but leaving the ReplayGain values (slow, and unimportant - so let's not do it).


4. ReplayGain user adjustment

Instead of suggesting that users should change the calculated values if they wish, give them a field to enter their own value if they really have to. Players should give the option to read the user value in preference to any others (i.e. let it act as an over-ride), and taggers should give the option of removing the user values from all (downloaded) files.


5. ReplayGain RealLife adjustment

The gain required to give the actual SPL of the original event (in a calibrated system), or a human judged sensible replay level (see the explanation behind the original "Audiophile (http://replaygain.hydrogenaudio.org/faq_radio.html)" level and the work of Bob Katz (http://www.digido.com/) if you think this is an impossible idea). I've found a few DVD-A discs that have this information (it's in the MLP stream), so it would be nice to have somewhere to store it. It's unlikely to get used much, but it would be a useful thing to have. It would be the last link in some of the best recordings out there.



I'd like to come to a consensus on which of these (if any/all) should be included, and then get some specs for how they are/should be stored in each file format (especially APE2.0 tags) finalised and published on-line.

Comments? Suggestions? Offers of help?


btw I've received a couple of suggestions for improving the ReplayGain calculation. One is trivial, and seems like a great idea. I'll post it for testing when the problem of version numbering is solved. If anyone else has slightly or totally re-worked the ReplayGain algorithm/concept, now would be a good time to step forward! We could do listening tests to find the best candidate for "calculation version 2".


Cheers,
David.

Newbie warning: this thread is not for asking questions about ReplayGain that are already answered on www.replaygain.org or in previous threads on HA. (I'm always happy to answer "silly" questions via email – half of them aren't silly at all.)

However, if you do already have some understanding of ReplayGain then this thread is the perfect place for clarifying anything to do with the above proposals which is not clear.
Title: Improving ReplayGain
Post by: Gabriel on 2003-11-18 16:11:37
I think one point that should be clarified/updated is which format to use to store the RG values.

In the current Lame header, it is stored as floating point data. However, this could be a source of problems on some platforms. It appeared that we probably need an integer representation.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-18 16:22:11
Frank heavily criticised my proposal for storing RG info in .wav files because I used a floating-point representation for the peak values in what was basically a fixed-point format.

I agree - there needs to be a resolution of this problem. Frank's idea was to use fixed point 16-bit, representing 0-65535 (i.e. 0-200% peak). It's one solution, with its own advantages and disadvantages: You can't store peaks above 200% (which do happen! Lossy encoding of modern CDs), and you can't do perfectly accurate 100% normalisation or clip prevention on 24-bit decodes (though you can get close enough - depends how anal you are).

A 32-bit INT would offer greater flexibility, but would it bring its own problems? I'm thinking: middle 16-bit as normal, lower 8 bits for increased resolution, upper 8 bits for >100%. Would this be difficult to program?


EDIT: Aren't the RG values themselves fixed point, using that horrible binary format I invented for the task? I don't propose using that binary format anymore, but fixed point would be good.

Cheers,
David.
Title: Improving ReplayGain
Post by: Gabriel on 2003-11-18 16:32:56
You might be interested in:
http://sourceforge.net/mailarchive/forum.php?thread_id=3328808&forum_id=5500
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-18 16:35:24
Moderators:

On this page:
http://replaygain.hydrogenaudio.org//typical_results.html

The downloadable files aren't there. Moved? Deleted? Never there? I can find them if you want them, but it'll take a while.


Developers:

Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-18 16:38:42
Quote
You might be interested in:
http://sourceforge.net/mailarchive/forum.php?thread_id=3328808&forum_id=5500

Thanks - I agree. So - what range? And how?

It certainly needs to hold values above 100%, otherwise it's useless.

Cheers,
David.
Title: Improving ReplayGain
Post by: Digga on 2003-11-18 22:26:57
Quote
(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

I think it's a good idea to store the actual number instead of the adjustment (just in terms of looking good and being clearer; I don't really know about the technical problems involved).
Why too confusing? Because people have got used to how it works now?
People can change. For me, this change would be something nice.
Title: Improving ReplayGain
Post by: saratoga on 2003-11-18 22:49:52
I would just like a way to write ReplayGain info into the gain/volume/whatever-it's-called field in MP4 files. That way I could get some hardware support (iPod).
Title: Improving ReplayGain
Post by: Gabriel on 2003-11-19 14:28:38
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Agree.


Quote
So - what range? And how?

Our problem is only with the peak value. In the Lame tag, it is stored using 32 bits, so we have 32 bits to define a format.

I would suggest just using an unsigned integer. Our needs are:
* being able to have enough precision for the 0-100% range
* being able to store values higher than 100% (btw, how much higher?)

Ideally, 24-bit precision for the 0-100% range would be nice.

First proposal:
Use 0 - 100 000 as 0 - 100% range.
Precision is more than 24 bits (a little more than 26 bits), and this would allow for up to about 4000% (considering that the maximum unsigned int value is 4G). Moreover, it is quite simple, just a linear scale.
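
A linear integer peak of this kind could be sketched in C as below. The function names are invented for illustration and are not from LAME. Note that with 100 000 as 100% the resolution is about 16.6 bits and the ceiling is far above 4000%; the figures quoted above (a little over 26 bits, roughly 4000%) match a scale of 100 000 000 instead, so this sketch takes the scale as a parameter rather than guessing which was meant.

```c
#include <stdint.h>

/* Encode a normalised peak (1.0 = 100% = digital full scale) as an unsigned
 * 32-bit integer on a linear scale where `scale` represents 100%. */
uint32_t peak_encode_linear(double peak, uint32_t scale)
{
    double v = peak * (double)scale + 0.5;     /* round to nearest */
    if (v <= 0.0) return 0;
    if (v >= 4294967295.0) return UINT32_MAX;  /* saturate rather than wrap */
    return (uint32_t)v;
}

/* Decode a stored value back to a normalised peak. */
double peak_decode_linear(uint32_t stored, uint32_t scale)
{
    return (double)stored / (double)scale;
}

/* Largest peak, in percent, that fits in 32 bits for a given scale. */
double peak_max_percent(uint32_t scale)
{
    return 100.0 * 4294967295.0 / (double)scale;
}
```

With a scale of 100 000 000 the largest storable peak comes out at just under 4295%, which is in line with the "about 4000%" estimate.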
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-20 11:43:04
That would be fine.

Or...


Would the following work? It's the same, but using a different linear scale factor, which fits in neatly with 16- and 24-bit data, like this:

Field = 32-bit INT.


For 16-bit audio data, use

00000000xxxxxxxxxxxxxxxx00000000

Where xxxxxxxxxxxxxxxx is the peak value.
(1000000000000000 is the largest possible value for linear 16-bit data, e.g. a .wav file)


For 24-bit audio data

00000000xxxxxxxxxxxxxxxxxxxxxxxx

Where xxxxxxxxxxxxxxxxxxxxxxxx is the peak value.

etc

If the peaks are greater than 200% then obviously the leading 0s would be used to indicate this. So, in the mp3 case, you find the peak using a decoder which allows headroom, and multiply the normalised result by (2^23).

Using (2^23) rather than 100000 (which you suggested) as the scale factor sounds strange, but it means 16 and 24-bit data can simply be pasted into the field just by shifting the bits, which would avoid multiplication and rounding errors.


digital full scale is
00000000100000000000000000000000
i.e. 2^23

You get exactly 24-bit accuracy, and 54dB of headroom (i.e. 51200%, I think!)
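
As a rough sketch of the bit layout described above (the names are made up for illustration; this is not an existing implementation), the 16-bit and 24-bit cases reduce to a shift and a copy in C:

```c
#include <stdint.h>

/* Digital full scale in this layout: 2^23 = 0x00800000. */
#define PEAK_FULL_SCALE UINT32_C(0x00800000)

/* Absolute value of a sample, computed in 64 bits so that the most
 * negative input cannot overflow. */
uint32_t sample_magnitude(int32_t s)
{
    int64_t wide = s;
    return (uint32_t)(wide < 0 ? -wide : wide);
}

/* 16-bit data: the magnitude goes into the middle 16 bits, i.e. shifted up by 8. */
uint32_t peak_from_pcm16(int16_t sample)
{
    return sample_magnitude(sample) << 8;
}

/* 24-bit data (held in an int32_t, range -2^23 .. 2^23-1): copied as is. */
uint32_t peak_from_pcm24(int32_t sample)
{
    return sample_magnitude(sample);
}

/* Stored value as a ratio of full scale (1.0 = 100%). */
double peak_to_ratio(uint32_t stored)
{
    return (double)stored / 8388608.0;
}
```

The largest 32-bit value decodes to just under 512 times full scale, which is where the 54dB and 51200% figures come from (20*log10(512) is about 54.2).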


Would this be easy to program?


Should we change peak values to fixed point in all implementations?

Would it be easy for players to use? I'm thinking this could be a useful convention to employ in all formats, since floating point isn't strictly needed, and is causing rounding confusion.

Or would it be stupid to change to fixed point for the peak value in other formats, because this would break compatibility with old players?

Cheers,
David.
Title: Improving ReplayGain
Post by: Gabriel on 2003-11-20 12:04:59
Seems interesting. It would be nice to hear other opinions.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-21 11:24:32
Do none of the developers have any comments?


Two more issues:

1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

2. MTRH has reminded me that a ReplayGain logo is long overdue. Shall I launch a competition? If so, I'll wait until the HA one is well out of the way.

Cheers,
David.
Title: Improving ReplayGain
Post by: phwip on 2003-11-21 11:31:59
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

Would people really trust replay gain values stored on freedb to be correct?  I use freedb with EAC to get track titles, etc.  But this is because I know I can check these titles against the correct ones on the CD cover and change them where appropriate.  Often there are spelling errors or other issues.

With replay gain values the only way I would know whether they are correct would be to scan the files, and if I'm going to do that I don't need freedb anyway.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-21 13:04:11
Quote
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

Would people really trust replay gain values stored on freedb to be correct?  I use freedb with EAC to get track titles, etc.  But this is because I know I can check these titles against the correct ones on the CD cover and change them where appropriate.  Often there are spelling errors or other issues.

Certainly the information on there has many errors. But these are human errors, and there's no room for human error when calculating and automatically submitting ReplayGain values.

There could be other problems:

1. Different releases of the same CD with different loudnesses.
Hopefully the different mastered versions will have slightly different TOCs. This is usually the case. In which case, they can be detected and catalogued as different versions by freedb.

2. The values are calculated from a different format (e.g. .wav when you have mp3, mp3 when you have mpc etc etc)
That's one reason for suggestion 3 in my first post. See there.

3. Someone has intentionally submitted incorrect values / someone changed the gain of an album before calculating the ReplayGain
Yes - that's a problem. As with other fields, people can correct the data, and/or the server can weed out erroneous entries because they'll be swamped with correct ones.


Quote
With replay gain values the only way I would know whether they are correct would be to scan the files, and if I'm going to do that I don't need freedb anyway.


You would have to calculate the peak values yourself anyway (easier and quicker than the ReplayGains) because they're encoding dependent, so the accuracy of the peak values is not an issue. (freedb should hold the peak values for the lossless versions).

For the actual ReplayGain values (Track and Album), if they make the tracks sound the same loudness as other tracks on playback, then it's doing its job, and that's fine. If they don't, then you'll notice, and you can recalculate them if you want.

But it doesn't matter if they're "correct" to how ever many decimal places, because ReplayGain is just an estimate. What matters is that the ReplayGain values work. If you want to, you can check if they work or not very quickly just by skipping through the album. If it's too loud or too quiet, they're wrong!

So there's no reason to recalculate them all to check their accuracy. If it was me, I'd happily grab all the ReplayGain values I needed from freedb, and only re-tag them myself if I heard a problem.

But maybe that's just me?

Cheers,
David.
Title: Improving ReplayGain
Post by: n68 on 2003-11-21 13:43:51
Quote
Quote


1. Different releases of the same CD with different loudnesses.
Hopefully the different mastered versions will have slightly different TOCs. This is usually the case. In which case, they can be detected and catalogued as different versions by freedb.



gday..

i guess the UPC code will take care of that..
(assuming there is a original rip)


Title: Improving ReplayGain
Post by: robUx4 on 2003-11-23 16:03:08
Just a question from a user's point of view. iTunes has the ability to calculate the replay gain of a track. Is it based on the same method as the ReplayGain values? (I never checked whether iTunes stores the value in the file or not.)
Title: Improving ReplayGain
Post by: robUx4 on 2003-11-23 16:05:13
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.
Title: Improving ReplayGain
Post by: guruboolez on 2003-11-23 16:23:40
Quote
Just a question from a user's point of view. iTunes has the ability to calculate the replay gain of a track. Is it based on the same method as the ReplayGain values? (I never checked whether iTunes stores the value in the file or not.)

I played a short and very quiet sample (part of an orchestral recording) in iTunes: it was very much quieter than with the RG recommendation (+20 dB).
I can't be sure that we can extrapolate from this difference, but I suppose that the iTunes gain system is different (and less accurate too, judging by the calculation speed).
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-23 16:30:13
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

If you have time, please read the entire thread

You'll see I suggest switching back to this method.

However, I don't think it's realistic to switch now, because it would dramatically break compatibility with existing players. This would be a very bad thing, unless someone can see a way around it.

The other additions will not break compatibility with existing players, so it's just a question of whether developers want to implement them.

Cheers,
David.
Title: Improving ReplayGain
Post by: Case on 2003-11-23 16:55:27
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

This would not work. Players would still need to figure how much the gain needs to be changed since playback loudness isn't calibrated in any way. Media Jukebox would calculate volume change need with formula 83dB - 84dB = -1dB when others would calculate it with 89dB - 84dB = +5dB.
PS. your example is incorrect, 82dB + +1dB = 83dB, thus value to store would be 82 and not 84.
Title: Improving ReplayGain
Post by: dev0 on 2003-11-23 17:36:28
3. ReplayGain lossy approximation

Storing this seems pointless to me, since ReplayGain calculations will become inaccurate after transcoding and no tool should be copying ReplayGain values when transcoding.
Title: Improving ReplayGain
Post by: /\/ephaestous on 2003-11-24 05:02:05
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

I RG my discs before burning a backup, so if a friend pops that copy, the TOC will match and give back erroneous RG info.

To solve this we could store the RG info, plus a RG value for, say, the first 30 seconds of the album, so the RG info for that part of the disc is calculated and sent with the query (generated Disc ID). This way one could be sure the RG info is correct if the sent value and the value in the db match (with a +-5% confidence).
Title: Improving ReplayGain
Post by: /\/ephaestous on 2003-11-24 06:01:12
Quote
3. ReplayGain lossy approximation

Storing this seems pointless to me, since ReplayGain calculations will become inaccurate after transcoding and no tool should be copying ReplayGain values when transcoding.

The inaccuracy is small enough to be dismissed. This is a small test I made:

Iron Maiden - [Dance of Death #04] Montségur [5:48]

PCM
-10.54 dB

PCM --> MP3
-10.55 dB

PCM --> MP3 -->Musepack
-10.53 dB

PCM --> MP3 -->Musepack --> Vorbis
-10.57 dB

PCM --> MP3 -->Musepack --> Vorbis --> Wavpack (Lossy)
-10.57 dB

PCM --> MP3 -->Musepack --> Vorbis --> Wavpack (Lossy) --> Nero MP4
-10.54 dB

The biggest difference was -0.03dB, which is a -0.284% difference from the original. I picked this track because it is loud enough to make most lossy encoders go beyond full scale.
Title: Improving ReplayGain
Post by: guruboolez on 2003-11-24 10:18:35
Try lame abr, for example (there's a --scale 0.98 included in the preset). The difference will be higher.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-24 10:50:31
Quote
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

This would not work. Players would still need to figure how much the gain needs to be changed since playback loudness isn't calibrated in any way. Media Jukebox would calculate volume change need with formula 83dB - 84dB = -1dB when others would calculate it with 89dB - 84dB = +5dB.

Sorry Case, but I think you're wrong.

At the moment, people store the gain change needed to match a standard loudness. Most use 89dB as that standard, but some use 83dB. So, there's confusion.

But they all measure the "perceived" loudness of the track the same way. (They're all taking my "pink_ref.wav" file, or whatever it was called, to be 83dB, after SMPTE RP-200 - after a real, and long existing standard). So if you store the "perceived" loudness, there's no confusion.

e.g. perceived loudness of track = 93dB SPL.
Musepack relates this to 89dB, and stores a ReplayGain of -4dB
MediaPlayer relates this to 83dB, and stores a ReplayGain of -10dB
But in both cases, the perceived loudness of the original track is 93dB.

It should be apparent that by just storing 93dB, any player can figure out what to do. (target volume - 93dB = required gain change, e.g. 89-93=-4dB).
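
The worked example above comes down to one subtraction in each direction. A minimal C sketch (function names invented here for illustration):

```c
/* Gain (dB) a player would apply, given the stored perceived loudness of
 * the track and the player's own target level. */
double gain_from_loudness(double target_db, double loudness_db)
{
    return target_db - loudness_db;
}

/* Recover the perceived loudness from an existing gain-style tag. This only
 * works if the reference level the tagger used (83 or 89 dB) is known. */
double loudness_from_gain(double reference_db, double gain_db)
{
    return reference_db - gain_db;
}
```

For the 93dB track, gain_from_loudness(89, 93) gives -4dB and gain_from_loudness(83, 93) gives -10dB, matching the two stored values in the example.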


BUT, though I think it would be nice to do this, I'm not saying we should; it would break compatibility with existing players, wouldn't it? They're expecting the gain change in the tag, and would read it as a +93dB ReplayGain - that's just a bit too loud!

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-24 10:51:24
Quote
3. ReplayGain lossy approximation

Storing this seems pointless to me, since ReplayGain calculations will become inaccurate after transcoding and no tool should be copying ReplayGain values when transcoding.

The ReplayGain values will be close enough. The peak values may not be, but they are much quicker to re-calculate.

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-24 10:58:30
Quote
Try lame abr, for example (there's a --scale 0.98 included in the preset). The difference will be higher.

But in the case where an intentional gain change is applied, it's easy to correct the ReplayGain values.

No software currently "transcodes" the ReplayGain values by default, so nothing is copying over incorrect values.

I'd suggest that any software that does "transcode" the ReplayGain values should
a) set the "lossy" (or whatever it gets called) flag, and
b) correct the values for any known gain change applied during the process

Both are much much quicker and easier than re-calculating the ReplayGain values.



I'd better mention something here: All this should make things a lot easier for the user. Any extra complexity introduced by these additions will go into the software, and be hidden from the user. The result should be that the software is able to do the "right thing" by default. Very simple.

Cheers,
David.
Title: Improving ReplayGain
Post by: guruboolez on 2003-11-24 11:26:26
And what about the idea of a personal track/album gain, set by the user? Purpose:
- useful if there are flaws in the RG calculation model
- useful if an audiophile wants to keep better coherence between different albums. For example, RG makes a gamba, a harpsichord or a flute sound as loud as an orchestra or a heavy metal band. That is the purpose of RG. The idea is nice, but in some cases it doesn't make sense. I've recently bought a CD, an anthology of the best sound recordings of the year. The booklet is clear: some instrumental tracks have to be played much quieter than others in order to maintain high-fidelity principles. I have another disc, with an instrument called the "clavichord" (small harpsichord). The mastering level is very quiet; why? Because the instrument's sound is covered by the human voice. RG will blow up the volume (and the background noise), and ruin the engineer's and artist's intent.

I suppose that RG can't determine if one instrument should be louder than another. Therefore, manual correction (and a software tool for batch correction) is really needed.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-24 11:54:55
See my reply in your other thread, and my suggestion for "user" and "real" ReplayGains in my first post in this thread.

Please reply in this thread.

Cheers,
David.
Title: Improving ReplayGain
Post by: Case on 2003-11-24 17:06:54
Quote
Sorry Case, but I think you're wrong.

Yup, I realized it seconds after posting. I had reference levels and calibrations in my mind and didn't consider the possibility of skipping all that during scanning.
Title: Improving ReplayGain
Post by: 2Bdecided on 2003-11-26 10:02:58
Another suggestion (this isn't fundamental)...

It would be useful to copy over the DialNorm and MixLev values from Dolby Digital (AC-3) data when it's transcoded.

MixLev should go into the new "ReplayGain Real" field, and DialNorm could probably go into the existing Album Gain field (in which case the new field to indicate how the gain was calculated would be useful).

I'll figure out appropriate conversion factors, and maybe seek help from the Doom9 crowd.

Cheers,
David.
Title: Improving ReplayGain
Post by: andyh on 2004-01-02 16:44:53
I'm confused as to how the RealLife level is different from the artist/producer origin code in the id3 proposal. Would it be more consistent to include separate track and album values for this setting? I also don't understand how storing the calculation method would work. Theoretically the track gain could have been calculated by version 1 of the algorithm and the album gain could have been read from the CD. Which value would be stored in the calculation method field? Is it stored separately for the track and the album field?

I think it would be a good idea to keep the id3 tag spec up to date with these suggestions.  Since nobody has implemented it yet, I think that we should not worry about keeping it compatible. 

Since David has said that he doesn't like the format of the gain values, I would like to change those as well. I think that the gain values should be stored as signed integers by simply multiplying the value by ten (or one hundred if the extra precision is useful). Information about which values are set could be stored in a bitfield along with the lossy bit.
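
To make the integer gain idea concrete, here is a small C sketch using the "multiply by one hundred" variant. The function names are illustrative only, and the rounding and clamping rules are assumptions that the proposal does not spell out.

```c
#include <stdint.h>

/* Encode a gain in dB as a signed 16-bit integer in hundredths of a dB,
 * rounding to nearest and clamping to the representable range. */
int16_t gain_db_to_fixed(double gain_db)
{
    double v = gain_db * 100.0;
    v += (v >= 0.0) ? 0.5 : -0.5;   /* round half away from zero */
    if (v > 32767.0)  return INT16_MAX;
    if (v < -32768.0) return INT16_MIN;
    return (int16_t)v;              /* truncation completes the rounding */
}

/* Decode the stored integer back to dB. */
double gain_fixed_to_db(int16_t stored)
{
    return stored / 100.0;
}
```

In hundredths of a dB a 16-bit field covers roughly -327dB to +327dB, far more range than any real gain value needs.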

If the lame header is going to be changed to use an int for the peak value, now would probably be the best time to change the formats of the gain values as well.  It might be nice if they would allocate space for the album peak value as well. 

Here is my proposal for the contents of the id3 frame:

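/* Flags for the bitfield member below: one distinct bit per flag. */
#define LOSSY 0x1
#define HAS_AUTO_TRACK_GAIN 0x2
#define HAS_AUTO_ALBUM_GAIN 0x4
#define HAS_USER_TRACK_GAIN 0x8
#define HAS_USER_ALBUM_GAIN 0x10
#define HAS_PRODUCER_TRACK_GAIN 0x20
#define HAS_PRODUCER_ALBUM_GAIN 0x40

struct replaygain_frame {       /* tag name is illustrative */
  long track_peak;              /* integer peak values */
  long album_peak;
  char calculation_method;      /* number identifying the calculation method */
  short reference_gain;
  short bitfield;               /* combination of the flags above */
  signed short auto_track_gain; /* gains: dB multiplied by 10 (or 100) */
  signed short auto_album_gain;
  signed short user_track_gain;
  signed short user_album_gain;
  signed short producer_track_gain;
  signed short producer_album_gain;
  short right_undo;             /* per-channel undo values */
  short left_undo;
};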

I have included both left and right undo values because mp3gain is storing both in the APE tags. I don't think anybody really scales the channels separately, but I think that it would be good to store the same data in the different tag formats.

I would also like to know whether anyone intends to request that the new frame be added to the id3 spec.  Section 3.3 (http://www.id3.org/id3v2.3.0.html#sec3.3) of the 2.3.0 spec says:

The frame ID made out of the characters capital A-Z and 0-9. Identifiers beginning with "X", "Y" and "Z" are for experimental use and free for everyone to use, without the need to set the experimental bit in the tag header. Have in mind that someone else might have used the same identifier as you. All other identifiers are either used or reserved for future use.

If no one intends to propose adding replaygain to id3, we will need to rename the frame.  Would "XRGA" be acceptable? 

Any comments or suggestions would be welcome.
Title: Improving ReplayGain
Post by: Lear on 2004-01-02 18:10:27
Quote
(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

Interesting...  I suggested changing it like that a year and a half ago, but you weren't too fond of the idea then (see here (http://www.hydrogenaudio.org/forums/index.php?showtopic=1709&view=findpost&p=16589))... 

Quote
At the moment, people store the gain change needed to match a standard loudness. Most use 89dB as that standard, but some use 83dB. So, there's confusion.

But they all measure the "perceived" loudness of the track the same way. (They're all taking my "pink_ref.wav" file, or whatever it was called, to be 83dB, after SMPTE RP-200 - after a real, and long existing standard). So if you store the "perceived" loudness, there's no confusion.

And this is the very reason why I suggested the change! 

(Btw, I must've missed this thread when it started... I should read through it, in case I have any comments.)

(Edit: Added second quote.)
Title: Improving ReplayGain
Post by: Lear on 2004-01-02 21:46:58
Quote
Field = 32-bit INT.


For 16-bit audio data, use

00000000xxxxxxxxxxxxxxxx00000000

Where xxxxxxxxxxxxxxxx is the peak value.
(1000000000000000 is the largest possible value for linear 16-bit data, e.g. a .wav file)


For 24-bit audio data

00000000xxxxxxxxxxxxxxxxxxxxxxxx

Where xxxxxxxxxxxxxxxxxxxxxxxx is the peak value.

One problem is that you can't differentiate "24 bit where the low 8 bits just happen to be 0" from "16 bit". So why not keep it simple, i.e. fixed point, where 1.0 is full scale. 23 bits fraction is enough, but I think 24 bits would be "cleaner" (e.g., 1.0 would then be 0x01000000). Allowing 256 times full scale ought to be enough... 
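
A sketch of that fixed-point layout in C, with 8 integer bits and 24 fraction bits so that 1.0 (full scale) is 0x01000000. The names are made up for illustration, and saturating at the top of the range is an assumption.

```c
#include <stdint.h>

/* 1.0 (digital full scale) in 8.24 fixed point. */
#define PEAK_ONE_Q24 UINT32_C(0x01000000)

/* Convert a normalised peak (1.0 = full scale) to 8.24 fixed point. */
uint32_t peak_to_q24(double peak)
{
    double v = peak * 16777216.0 + 0.5;        /* 2^24, round to nearest */
    if (v <= 0.0) return 0;
    if (v >= 4294967295.0) return UINT32_MAX;  /* saturate just below 256x */
    return (uint32_t)v;
}

/* Convert an 8.24 fixed-point peak back to a normalised value. */
double peak_from_q24(uint32_t stored)
{
    return (double)stored / 16777216.0;
}
```

Because the stored number always means "multiples of full scale", a reader never has to know whether the source was 16-bit or 24-bit.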

Quote
Should we change peak values to fixed point in all implementations?

Would it be easy for players to use, because I'm thinking about this being a useful convention to employ in all formats, since floating point isn't strictly needed, and is causing rounding confusion.

Or would it be stupid to change to fixed point for the peak value in other formats, because this would break compatibility with old players?

Doing it only for consistency isn't that important, IMO. Both are about as easy, I'd say (not that I've done much fixed-point stuff). It could be good to keep the precision about the same though (VorbisGain does that).

If they are stored in human readable format (i.e. Vorbis or APE tags), I'd say floating point is preferable, as it is easier to understand, even if it would require a bit more code on (embedded) systems without an FPU.
Title: Improving ReplayGain
Post by: SamK on 2004-01-04 14:48:04
I think it's the right time to switch to absolute replaygain value (90dB instead of +1dB).
Most of the programs that support replaygain atm are updated frequently, so backward player compatibility shouldn't be a problem for too long.
If some player takes months to update, its users would just have to stick to relative gain values.

Anyway, changing the representation of the number (fixed / float / ..) would break compatibility all the same, wouldn't it? So it's definitely the right time to do both changes at once.

I don't think it's a problem as long as there is backward compatibility for the files themselves, i.e. an old file with a replaygain value should still be supported by players that support the new replaygain.

If both the meaning and the encoding of the value are to be changed, it sounds safer to decide between the old and new meaning from some other piece of data. The presence of the proposed 'method calculation Version' field would be enough to know it's a new gain tag.

I'm for applying all the good changes at once.

If you're really concerned about the risk of someone suing you after playing a new replaygained file with an old player and blowing his ears out due to ludicrous pre-amping, let's just use another name for the gain value. RG2, whatever, and this won't be a risk anymore.

--
SamK
Title: Improving ReplayGain
Post by: knik on 2004-01-05 11:16:16
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

Does the '+92dB' approach use 16-bit min RMS (+-1 samples) as a reference or am I missing something?
I think reference level should be bit depth independent e.g. max RMS.
If the current ref level is some (maxrms - 7dB) then I think it's not bad.

Edit:
After closer look:
83dB = 14125.4 and 16-bit maxrms = 32768, hence 83dB = maxrms - 7.3dB

I would suggest to redefine reference level from 83dB to maxrms-7dB. It would be much less confusing.
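The arithmetic in the edit above can be checked with a short sketch. It assumes, as the post does, that "83dB" is read as an amplitude relative to one 16-bit LSB and that 32768 is 16-bit full scale; whether that reading matches how ReplayGain defines its reference level is a separate question.

```python
import math

REF_DB = 83.0            # reference level, read here as dB relative to 1 LSB
FULL_SCALE_16 = 32768.0  # 16-bit full-scale amplitude used in the post

# Amplitude corresponding to 83 dB on a 20*log10 scale.
ref_amplitude = 10 ** (REF_DB / 20)

# Distance of that amplitude below 16-bit full scale, in dB.
offset_db = 20 * math.log10(FULL_SCALE_16 / ref_amplitude)

print(f"83 dB -> amplitude {ref_amplitude:.1f}")
print(f"full scale is {offset_db:.1f} dB above that")
```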
Title: Improving ReplayGain
Post by: saratoga on 2004-01-06 05:28:27
Stupid question:  Is 0dB relative also 96 dB in 16 bit?  I'm not sure what it means when I set the volume to -89dB.
Title: Improving ReplayGain
Post by: SamK on 2004-01-06 13:52:15
Quote
After closer look:
83dB = 14125.4 and 16-bit maxrms = 32768, hence 83dB = maxrms - 7.3dB

I would suggest to redefine reference level from 83dB to maxrms-7dB. It would be much less confusing.

ah ok, I see what you mean.
Consider a signal as a flow of unitless, infinite-precision numbers.
ReplayGain computes a reference level (the 95th percentile of the RMS values of all 0.05s frames). This unitless number is turned into a dB value; let's call it absRL.

If a signal in [-1, 1] is multiplied by 2^depth, the absRL is shifted:
8 bit  : (max/oldmax)^2 = (2^8)^2  ~= 6.5 * 10^4  ~= 10^4.8  => absRL += 48dB
16 bit : (max/oldmax)^2 = (2^16)^2 ~= 4.3 * 10^9  ~= 10^9.6  => absRL += 96dB
24 bit : (max/oldmax)^2 = (2^24)^2 ~= 2.8 * 10^14 ~= 10^14.4 => absRL += 144dB

so let's call those values:
fullScale_dB(bit_depth) = (bit_depth / 8) * 10*log10(2^16)
(adds 48.165dB every 8 bits..)

If files of varying bit depths were common, someone looking at their absRL values would need to subtract this 48.165 * bit_depth/8 from each in order to know which one sounds louder when played at full volume.

So you're right, it's better to store:

(absRL(song) - fullScale_dB(bit_depth))

which is in fact the absRL of the song if its samples are scaled back to [-1, 1].
It would be the 'absolute normalized Reference Level', ANRL.

btw I think the ANRL can still be positive, due to the filtering done before computing the RMS values - which boosts human-sensitive frequencies and dampens others, so it can produce some samples > 1.0 from a [-1,1]-normalized signal.
(a song in 16 bit can be at absRL=100 dB or even a bit more)
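As a rough illustration of the per-bit-depth offset described above, here is a sketch of fullScale_dB and of the normalized level. The function names are assumptions for illustration; `20 * log10(2^bit_depth)` is used as an equivalent form of the formula in the post.

```python
import math


def full_scale_db(bit_depth: int) -> float:
    """dB shift caused by scaling a [-1, 1] signal up by 2**bit_depth."""
    return 20 * math.log10(2 ** bit_depth)


def normalized_level(abs_rl_db: float, bit_depth: int) -> float:
    """The 'ANRL' from the post: absRL with the bit-depth offset removed."""
    return abs_rl_db - full_scale_db(bit_depth)


for depth in (8, 16, 24):
    print(depth, round(full_scale_db(depth), 3))
```

Under these assumptions, a 16-bit song at absRL = 100 dB would have a normalized level a few dB above zero, matching the remark that the ANRL can be positive.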
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-06 14:12:14
Quote
Quote
(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

Interesting...  I suggested changing it like that a year and a half ago, but you weren't too fond of the idea then (see here (http://www.hydrogenaudio.org/forums/index.php?showtopic=1709&view=findpost&p=16589))... 

Quote
At the moment, people store the gain change needed to match a standard loudness. Most use 89dB as that standard, but some use 83dB. So, there's confusion.

But they all measure the "perceived" loudness of the track the same way. (They're all taking my "pink_ref.wav" file, or whatever it was called, to be 83dB, after SMPTE RP-200 - after a real, and long existing standard). So if you store the "perceived" loudness, there's no confusion.

And this is the very reason why I suggested the change! 

(Btw, I must've missed this thread when it started... I should read through it, in case I have any comments.)

(Edit: Added second quote.)

Hi Lear!

I remember that thread! There was no way I was going to change it back again and confuse everyone again, since the argument was basically about whether or not to add 83dB at the end. I naively assumed that everyone would follow the suggestion, and there would be no confusion. Ha - some chance!     

It's reminded me of something though: I expected people to think that things were too quiet, so suggested the player should default to adding 6dB to the values. What people chose to do instead was to make the calculation add 6dB to the values (if you think about it, the values stored in every file are 6dB greater than I suggested - because they get you to 89dB, not 83dB).

I wonder whether, if I'd stuck with my original thought (what you proposed), there would still have been confusion because someone would get the calculation to add 6dB to the value to have the same effect. Or else they would see that all the players used 89dB as a reference, but their calculator used 83dB as a reference, and change it. Or they'd just take the ref_pink.wav file and boost it by 6dB.

I do, in retrospect, think adding 83dB (and hence storing 92dB instead of -3dB or -9dB) is a better solution. But I have a feeling that someone would still have managed to mess it up!
Title: Improving ReplayGain
Post by: knik on 2004-01-06 17:45:44
Quote
I do, in retrospect, think adding 83dB (and hence storing 92dB instead of -3dB or -9dB) is a better solution. But I have a feeling that someone would still have managed to mess it up!

I really think we should forget about 16-bit dynamic range and use maxrms as a reference otherwise we will always have some confusion.
Title: Improving ReplayGain
Post by: knik on 2004-01-06 19:56:23
Quote
So you're right, it's better to store :

(absRL(song) -  fullScale_dB(bit_depth) )

which is in fact the absRL of the song if its samples are scaled back to [-1, 1].
it would be the 'absolute normalized Reference Level', ANRL.

Yes, that's the point. We should use 1.0 as a reference for [-1,1] samples and we don't need any sample bit-depth assumption here.
It can always be rescaled to the actual output sample depth.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-07 13:07:02
Quote
Quote
So you're right, it's better to store :

(absRL(song) -  fullScale_dB(bit_depth) )

which is in fact the absRL of the song if its samples are scaled back to [-1, 1].
it would be the 'absolute normalized Reference Level', ANRL.

Yes, that's the point. We should use 1.0 as a reference for [-1,1] samples and we don't need any sample bit-depth assumption here.
It can always be rescaled to the actual output sample depth.

knik,

I didn't get around to replying to your (and other people's) posts because I didn't have the time, but I'd better squash this idea before it goes any further.

ReplayGain is referenced to SMPTE RP 200, a calibration by which a -20dB FS RMS pink noise signal will give a real world SPL of 83dB. All RG figures come from this concept, and all ReplayGain values are the gain adjustments needed to make that track (or album) match the perceived loudness of that test signal. (+6dB in most implementations)

The values are not based on bit depth. The notion of "how loud" a full scale sine wave is flows from SMPTE RP 200, and it is not 90dB, 96dB or 144dB. It's frequency dependent, but will be 103dB SPL for 2kHz (IIRC in the calculations I originally proposed).

The exact values depend on the "psychoacoustic" model used to determine the loudness of a given track or album. Different psychoacoustic models can be calibrated to the SMPTE RP 200 standard and used interchangeably (This means people can improve or change the ReplayGain calculation without messing everything up - compatibility and interchangeability is ensured).

Taking a non psychoacoustic standard (i.e. choosing digital full scale to equal some dB value) would make it very difficult to update the psychoacoustic model and calibrate it with previous versions. There are already several incompatible, uncalibrated, and largely unused methods for “correcting the loudness differences between tracks or albums”. I didn’t want to create yet another one!

The common sense approach to calibrating a system which judges perceived loudness is to define a specific test signal, and how loud this signal should be. As the industry has already done this, it made sense to follow this existing calibration.

Hope this helps. Please read http://www.replaygain.org/ for more information.

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-07 13:16:36
Quote
Quote
Field = 32-bit INT.


For 16-bit audio data, use

00000000xxxxxxxxxxxxxxxx00000000

Where xxxxxxxxxxxxxxxx is the peak value.
(1000000000000000 is the largest possible value for linear 16-bit data, e.g. a .wav file)


For 24-bit audio data

00000000xxxxxxxxxxxxxxxxxxxxxxxx

Where xxxxxxxxxxxxxxxxxxxxxxxx is the peak value.

One problem is that you can't differentiate "24 bit where the low 8 bits just happen to be 0" from "16 bit". So why not keep it simple, i.e. fixed point, where 1.0 is full scale. 23 bits fraction is enough, but I think 24 bits would be "cleaner" (e.g., 1.0 would then be 0x01000000). Allowing 256 times full scale ought to be enough... 

But it is fixed point, and I don't see why you'd need to "differentiate" between 24-bits (last 8 bits zero) and 16-bits. Can you explain?


Quote
Quote

Should we change peak values to fixed point in all implementations?

Would it be easy for players to use, because I'm thinking about this being a useful convention to employ in all formats, since floating point isn't strictly needed, and is causing rounding confusion.

Or would it be stupid to change to fixed point for the peak value in other formats, because this would break compatibility with old players?

Doing it only for consistency isn't that important, IMO. Both are about as easy, I'd say (not that I've done much fixed-point stuff). It could be good to keep the precision about the same though (VorbisGain does that).

If they are stored in human readable format (i.e. Vorbis or APE tags), I'd say floating point is preferable, as it is easier to understand, even if it would require a bit more code on (embedded) systems without an FPU.


You might say that, but Frank Klemm simply said "Floating point is a stupid idea" and coded it fixed point, 16-bit, with 6dB headroom above digital full scale. And he did that on the format "MusePack" which has 24-bit encoders and decoders, and can easily peak above 6dB above digital full scale. His argument was that he had 16 bits spare, he didn't want to use floating point, and what he stored should be enough to prevent clipping in all but the most severe situations.

When other people are coding it, you have to try to please them as well as yourself!

Cheers,
David.
Title: Improving ReplayGain
Post by: Gabriel on 2004-01-07 13:23:18
LAME has been using the fixed point representation from David since 3.94b.
Title: Improving ReplayGain
Post by: SamK on 2004-01-07 14:57:49
Quote
ReplayGain is referenced to SMPTE RP 200, a calibration by which a -20dB FS RMS pink noise signal will give a real world SPL of 83dB. All RG figures come from this concept, and all ReplayGain values are the gain adjustments needed to make that track (or album) match the perceived loudness of that test signal. (+6dB in most implementations)

The values are not based on bit depth. The notion of "how loud" a full scale sine wave is flows from SMPTE RP 200, and it is not 90dB, 96dB or 144dB. It's frequency dependent, but will be 103dB SPL for 2kHz (IIRC in the calculations I originally proposed).

ah ok, replaygain is already bit-depth independent. I had read most of the replaygain documents, but this wasn't clearly stated anywhere.
If I had known matlab's wavread function returns an array of numbers in [-1, 1], I would have gotten the clue from the matlab demonstration code..
Maybe you should add a first step to the 4-step "General Concept" at http://replaygain.hydrogenaudio.org/rms_energy.html, like:
0. the signal is converted to floating point numbers, and divided by the full scale of the original format. (which is 2^15 for 16 bit integer encoding)

or something, to ensure everyone gets this point.

To sum up what I understood,
replaygain computations are bit-depth independent from the start,
and the proposal is to store

Vrms = 83 + (replaygain(filename) - ref_Vrms);

(with ref_Vrms being the gain of the standard digital signal corresponding to 83dB SPL:
ref_Vrms = replaygain("pink_ref.wav");  )

instead of the previous:
Vrms = - (replaygain(filename) - ref_Vrms);

Then players would now use the stored value like this:
average_song_Vrms = 89; // user setting
rel_gain = average_song_Vrms - Vrms;
ratio = 10^(rel_gain/20);
// multiplies decoded samples by ratio.
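
The player-side steps above can be sketched in runnable form. This is only an illustration of the proposal as summarized in the post: the function name and the 89 dB default target are assumptions, and the stored value is taken to be the absolute level (e.g. 92 dB) rather than the relative gain that is stored today.

```python
def playback_ratio(stored_level_db: float, target_level_db: float = 89.0) -> float:
    """Linear factor to multiply decoded samples by.

    stored_level_db: absolute level stored in the file (hypothetical field).
    target_level_db: loudness the user wants every track to end up at.
    """
    rel_gain_db = target_level_db - stored_level_db   # dB change needed
    return 10 ** (rel_gain_db / 20)                   # dB -> amplitude ratio


# A track measured at 92 dB played with an 89 dB target is attenuated by 3 dB.
print(playback_ratio(92.0))
```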
Title: Improving ReplayGain
Post by: SamK on 2004-01-07 19:34:21
Reading http://home.earthlink.net/~bobkatz24bit/integrated.html, and about the K-N VU meters, I realized there is no reason why the magic 83dB number from the SMPTE RP200 standard should appear in ReplayGain. The computation is all in the digital domain, so no SPL number should arise.

What SMPTE RP200 brings us is only a standard -20dBFS signal to calibrate measurements on.
The fact that this signal is supposed to actually produce sound at 83dB SPL in a calibrated hi-fi system is of no importance here, as we're only doing things *before* the actual sound system.

In fact, if you have a song with replaygain = +20 dB (relative to the original 83dB reference), it really means it is measured to perceptually sound 10 times louder overall than the reference pink noise signal (which is used as the calibration reference for replaygain = +0dB)

That's all.

The real point of Replaygain is to compute
HR = replaygain - 20
- aka: (AbsReplaygain - 83) - 20
as a good measure of the overall headroom of the song. (ratio between the peak capability of the medium and the "average level").

Indeed, if you take the -20dB FS standard pink noise sound, whose replaygain is exactly 83dB (by definition), HR will be exactly -20dB.
Translate that to any signal, and you get HR to be indicative of the overall headroom.
It will be a slightly negative value for most pop songs (possibly slightly positive for a really loud sound concentrated in frequencies boosted by the psychoacoustic filter in use)
and could be lower than -10dB for classical music or anything with a bit more dynamic range.

So, if it is decided to switch to storing an absolute value, I'm suggesting storing the value HR.
(which is in fact the relative value minus 20 .. )
It gives all the info replaygain has to give, and is independent of bit_depth AND the psychoacoustic filter used, just as current replaygain is.
Plus it only takes from the SMPTE RP200 standard what it really uses: the choice of a reference signal so that different psychoacoustic implementations can calibrate on it.

And its value is much more intuitive, much less confusing than expressing the value in terms of SPL produced by a calibrated system, which does not belong here.

Is it not?
Title: Improving ReplayGain
Post by: Lear on 2004-01-07 19:47:24
Quote
Quote

One problem is that you can't differentiate "24 bit where the low 8 bits just happen to be 0" from "16 bit". So why not keep it simple, i.e. fixed point, where 1.0 is full scale. 23 bits fraction is enough, but I think 24 bits would be "cleaner" (e.g., 1.0 would then be 0x01000000). Allowing 256 times full scale ought to be enough... 

But it is fixed point, and I don't see why you'd need to "differentiate" between 24-bits (last 8 bits zero) and 16-bits. Can you explain?


If you decode the value in the same way, regardless of bit depth, you'll get a kind of rounding error (or whatever it should be called) when dealing with the 16-bit value. E.g., 0x3FFF00 (half scale in 16 bit) is not the same as 0x3FFFFF (half scale in 24 bit). Sure, the error will be small, but it'll be there.  (Of course, if the processing is all done in 16 bits it doesn't matter, as the low bits will be thrown away.)
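
To put a number on how small that error is, here is a quick sketch comparing the two field values from the example. It assumes both are decoded the same way, against a full scale of 0x800000 in the 24-bit field.

```python
import math

FULL_SCALE_24 = 0x800000         # assumed full-scale value of the 24-bit field

half_scale_16 = 0x3FFF << 8      # 16-bit half scale padded with 8 zero bits: 0x3FFF00
half_scale_24 = 0x3FFFFF         # 24-bit half scale

# Decode both the same way, as a fraction of full scale.
peak_16 = half_scale_16 / FULL_SCALE_24
peak_24 = half_scale_24 / FULL_SCALE_24

# Size of the mismatch expressed in dB.
error_db = 20 * math.log10(peak_24 / peak_16)
print(f"{peak_16:.8f} vs {peak_24:.8f}, difference {error_db:.6f} dB")
```

The difference comes out well under a thousandth of a dB, which fits the remark that the error is small but present.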

Quote
You might say that, but Frank Klemm simply said "Floating point is a stupid idea" and coded it fixed point, 16-bit, with 6dB headroom above digital full scale. And he did that on the format "MusePack" which has 24-bit encoders and decoders, and can easily peak above 6dB above digital full scale. His argument was that he had 16 bits spare, he didn't want to use floating point, and what he stored should be enough to prevent clipping in all but the most severe situations.


I'd guess he did it that way because there were 16 bits of reserved space in the file format he could use, so he squeezed in what he could. But that doesn't mean other file formats should do it like that. Still, the actual format in the tag isn't very important, IMO, as long as the necessary resolution is there.
Title: Improving ReplayGain
Post by: knik on 2004-01-07 20:32:22
Thanks for the explanation, 2Bdecided. It really helped.
Now I see RG reference level is well defined.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-08 11:29:40
Quote
Quote
ReplayGain is referenced to SMPTE RP 200, a calibration by which a -20dB FS RMS pink noise signal will give a real world SPL of 83dB. All RG figures come from this concept, and all ReplayGain values are the gain adjustments needed to make that track (or album) match the perceived loudness of that test signal. (+6dB in most implementations)

The values are not based on bit depth. The notion of "how loud" a full scale sine wave is flows from SMPTE RP 200, and it is not 90dB, 96dB or 144dB. It's frequency dependent, but will be 103dB SPL for 2kHz (IIRC in the calculations I originally proposed).

ah ok, replaygain is already bit-depth independent. I had read most of the replaygain documents, but this wasn't clearly stated anywhere.
If I had known matlab's wavread function returns an array of numbers in [-1, 1], I would have gotten the clue from the matlab demonstration code..
Maybe you should add a first step to the 4-step "General Concept" at http://replaygain.hydrogenaudio.org/rms_energy.html, like:
0. the signal is converted to floating point numbers, and divided by the full scale of the original format. (which is 2^15 for 16 bit integer encoding)

or something, to ensure everyone gets this point.

I think, if you follow it through, it doesn't matter whether wavread returns [1,-1] or [-32768,32767] (you're right saying that it returns the former). As long as the value "ref_Vrms" has been calculated by the same method (which is essential anyway), then calibrating to it (i.e. subtracting it at the end) will cancel out whatever scaling or units or whatever are used at the input. That's because both the file in question, and the ref_pink.wav file will be scaled the same on the way in (to [1,-1] or [-32768,32767] or whatever). Subtracting in the logarithmic domain (which dB is) is the same as dividing in the linear domain. So any scaling is cancelled in this last step.
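
The cancellation described above can be demonstrated with a small sketch. A plain RMS level stands in for the real ReplayGain measurement here (an assumption made to keep the example short; the actual calculation also applies an equal-loudness filter and a percentile step), and the two sample lists are random stand-ins for a track and for the reference file.

```python
import math
import random


def rms_db(samples):
    """RMS level of a list of samples, in dB (stand-in for the real measurement)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


random.seed(0)
track = [random.uniform(-0.5, 0.5) for _ in range(1000)]      # stand-in track
reference = [random.uniform(-0.1, 0.1) for _ in range(1000)]  # stand-in reference noise


def relative_level(scale):
    """Track level minus reference level, with both inputs scaled the same way."""
    scaled_track = [s * scale for s in track]
    scaled_reference = [s * scale for s in reference]
    return rms_db(scaled_track) - rms_db(scaled_reference)


# Same result whether samples are in [-1, 1] or in 16-bit integer units.
print(relative_level(1.0), relative_level(32768.0))
```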


Quote
To sum up what I understood,
replaygain computations are bit-depth independent from the start,
and the proposal is to store

Vrms = 83 + (replaygain(filename) - ref_Vrms);

(with ref_Vrms being the gain of the standard digital signal corresponding to 83dB SPL:
ref_Vrms = replaygain("pink_ref.wav");  )

instead of the previous:
Vrms = - (replaygain(filename) - ref_Vrms);

Then players would now use the stored value like this:
average_song_Vrms = 89; // user setting
rel_gain = average_song_Vrms - Vrms;
ratio = 10^(rel_gain/20);
// multiplies decoded samples by ratio.


Yes exactly - though I'm not strongly suggesting we change it. I was saying it's a pity it isn't like this already, but should it be changed now?

I'll answer your other post, and then expand on that point...


EDIT: 1000th post! Should have made it better! 
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-08 12:11:30
Quote
Reading http://home.earthlink.net/~bobkatz24bit/integrated.html, and about the K-N VU meters, I realized there is no reason why the magic 83dB number from the SMPTE RP200 standard should appear in ReplayGain. The computation is all in the digital domain, so no SPL number should arise.

[snip]

And its value is much more intuitive, much less confusing than expressing the value in terms of SPL produced by calibrated system, which does not belong here.

Is it not ?

No, because perceived loudness depends on loudness!

This isn't built into the current psychoacoustic model, but could well be implemented in a future improvement....

If you're listening to a bass heavy track at 60dB, you'll hear much less bass (relatively) than you will at 80dB. This means that increasing the gain on a bass heavy track by 20dB will cause its subjective loudness to be increased more than a 20dB boost to a bass light track. What's more, the perceived loudness increase of that 20dB boost will be different if it's a boost from 40dB to 60dB than if it's a boost from 80dB to 100dB.

If the equal loudness curves were parallel lines, then we wouldn't really have to worry about real world sound pressure. They're not, so it's an issue, and it can only be solved if we make some kind of guess (like the floating ATH in the lame encoder), or calibrate the system properly to a real world loudness - which is what I've chosen to do.

Hope this makes sense.

Cheers,
David.

EDIT: plus see my previous response about how many other schemes exist which are unused because no one knows how they are supposed to be calibrated, or re-calibrated.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-08 12:18:51
I remembered something last night.

When it stored the absolute loudness of the file (e.g. 92dB), it was called Replay Level.

When it changed to storing the relative gain (e.g. -3dB), it was renamed to Replay Gain.


I'm not about to change the name, so it's staying with relative gains, referenced to making the loudness 83dB through a calibrated system, plus 6dB.

(plus, conceptually, saying the file sounds x dB loud at reference playback level doesn't actually tell you how much to change the gain if the non-parallel equal loudness curves are even taken into account (see my previous post). Whereas saying "shift it x dB to make it the reference playback loudness" can include that factor in the calculation).

Cheers,
David.
Title: Improving ReplayGain
Post by: SamK on 2004-01-08 13:15:40
Quote
No, because perceived loudness depends on loudness!

ohhh, you're right...
thanks for the explanations, now I think I understood all there is to understand about replaygain

Quote
If you're listening to a bass heavy track at 60dB, you'll hear much less bass (relatively) than you will at 80dB. This means that increasing the gain on a bass heavy track by 20dB will cause its subjective loudness to be increased more than a 20dB boost to a bass light track.


This frequency-dependent sensitivity is handled in the digital domain, so it doesn't contradict keeping SPL out of replaygain. But then the next point settles it:

Quote
What's more, the perceived loudness increase of that 20dB boost will be different if it's a boost from 40dB to 60dB than if it's a boost from 80dB to 100dB.


okay, if you want to allow for that kind of tuning of the model, you need to make an assumption about how the digital signal will be transformed to SPL, so it's natural to express the results in terms of SPL, since the computation (may) depend on this assumption.
I'm convinced, it's good that the 83dB SPL number appears in replaygain.

Quote
If the equal loudness curves were parallel lines, then we wouldn't really have to worry about real world sound pressure. They're not, so it's an issue, and it can only be solved if we make some kind of guess (like the floating ATH in the lame encoder), or calibrate the system properly to a real world loudness - which is what I've chosen to do.

Hope this makes sense.


Yes it does, and now I knowingly agree with the choices made in replaygain
The fact that those curves are not parallel hadn't struck me as an issue, and calibration to real world loudness seems the best solution to me.

It felt unnatural at first that algorithms in the digital domain would want to guess what amplification will end up being applied to the signal and how loud I would be listening to my music, but now I see it's not. Furthermore, I realize that the loudness at which I usually listen to my music is not as arbitrary as I thought.


Well, now that I understand, I'm even more for storing the absolute replaygain (à la "92dB").

Oh, and if the final loudness were used in the psychoacoustic model, the choice of the right SPL reference when computing the gain would matter.
The difference in results between a reference SPL of 83 and one of 89 (err, in fact the real other reference is rather 83-6=77 dB SPL, i.e. the SPL obtained by playing the pink noise on my machine if I've set the amplification to make pop music sound right) is not simply always 6 dB, because of what you explained.

It makes the whole process of replaygain computation + playback a bit more complex for a purist.
Let's see:
1. you compute the replaygain, choosing as reference the one corresponding to the genre of the music.
(so that replaygain can make the right assumption about the digital-to-SPL correspondence you'll typically be using at playback)
2. when playing, how do you choose the reference?
If you'll be playing files that are all of the same genre (i.e. replaygained with the same reference), it doesn't really matter, as it will just compensate for other amp settings.
If you'll be playing different genres of songs,
a. either you want all of them to sound as loud as each other,
then you pick one reference (that means either classical music will not be allowed to play loud passages as loud as an audiophile would want, or the pop songs will be really annoyingly loud, depending on which reference you pick)
b. or you want each song to be played with the gain suited to its kind.
And then, .. , hmm, what to do..
Storing relative or absolute replaygain doesn't change the issue.
I guess the 'audiophile' gain was supposed to handle this, but it won't if it really is an album gain.
(or you have to assume all albums of a given genre will be at the same album gain, which may be true for classical music, but not at all for pop songs).

It seems to me the real purist replaygain fanatic will want the file to include the reference used at computation. (or some other way to get the reference to use for the genre of that song)
Otherwise, the user will have to change the setting between songs if he wants to play pop and classical songs in one playlist.
So we should store both the replaygain (track and album) and the replaygain reference in each file, or did I miss something?
Title: Improving ReplayGain
Post by: SamK on 2004-01-08 13:27:56
Quote
I'm not about to change the name, so it's staying with relative gains, referenced to making the loudness 83dB through a calibrated system, plus 6dB.

hmm, I think it's in fact 83dB SPL, -6 dB => 77dB SPL.
(or you mean the system is calibrated, then 6dB of attenuation is added.. )
89 dB SPL is the average SPL level if you play pop songs on an SMPTE RP200 calibrated system.
If you play the reference pink noise on a system calibrated to be good to the ear for pop CDs, it will sound 6dB quieter than if it were calibrated for wide-range dynamics, i.e. 77 dB SPL.


Quote
(plus, conceptually, saying the file sounds x dB loud at reference playback level doesn't actually tell you how much to change the gain if the non-parallel equal loudness curves are even taken into account (see my previous post). Whereas saying "shift it x dB to make it the reference playback loudness" can include that factor in the calculation).


See the 'purist replaygain fanatic' paragraph at the end of my previous post; I think relative and absolute replaygain are equally insufficient as a good method for replaying, if the assumed playback loudness impacts the psychoacoustic computations.
That's because each song will be computed against a reference which varies according to its genre (otherwise you wouldn't have had to make the model depend on the playback loudness in the first place!)
so the relative gain itself will still be broken, ie. not uniform among songs of different genres.
Title: Improving ReplayGain
Post by: SamK on 2004-01-08 15:29:47
Quote
so the relative gain itself will still be broken, ie. not uniform among songs of different genres.

That is not exactly right: the relative replaygains can be made uniform across music genres,
but then the problem is that we've lost the recommended-for-this-genre replay level of the file.

So listening to Mozart after Metallica will sound dull.
(and that's not the point of replaygain. Well, it's one of its features, but the point of replaygain is also to make it possible to compensate for differences in loudness among CDs of the same genre caused by CD marketing policies, while still allowing each genre to be replayed at the right gain compared to other genres,
and in general to allow the user to do what he wants)
Title: Improving ReplayGain
Post by: SamK on 2004-01-08 16:07:33
To sum up my opinion,
I am for storing 3 values instead of 2 :

1. assumed_system_gain == a gain relative to an SMPTE RP 200 calibrated, 83 dB SPL, system.
(Think of SMPTE RP 200 as a home stereo on which the volume has been permanently set to a precise level, chosen by a committee so that movies and classical music play loud and clear, but not too loud either.)
This value depends on the genre of the music (it's a genre-dependent reference).
For instance:
->  +0dB for wide dynamic range genres (classical, movies, ...)
->  -6dB for pop songs (simply because pop songs are already digitally loud,
          so you typically lower your amplifier setting compared to when you play classical)

2. replaylevel, in dB SPL == the perceived overall loudness if played on a system with gain=assumed_system_gain. It can range from e.g. 80 dB SPL to 100 or more.

3. album replaylevel  == loudness of the whole album ..

The assumed_system_gain (let's use a more precise name than 'reference gain' if we want to avoid
confusion) might seem uselessly redundant to store in the file.
But it has to be stored if the replay level computation really takes the expected
real-world loudness into account and we want to keep the possibility of playing files of different
genres (and different assumed_system_gain) at the same perceived loudness - see my post
with the 'purist replaygain fanatic' example.
[ And if, after all, the assumed_system_gain is not to impact the psychoacoustic algorithm,
then there's no real need to use SPL numbers at all: the replaylevel is just a
digital-domain computation giving a measure of headroom, and would be better expressed in dB FS.
True, it's computed with a calibration that assumes one correspondence to real-world SPL,
but if that is only assumed and fixed, there's no point in making it appear in the result. ]

Each of those values could also be stored in other forms, but I think this scheme makes them all easy to grasp, and they really mean what they say.
You can, for instance, get the intrinsic loudness as

loudness on RP200 system = (replay_level - assumed_system_gain)
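
As a small worked example of the formula above (a sketch only: the function name and the sample numbers are made up for illustration, and nothing here is part of any ReplayGain implementation):

```python
def loudness_on_rp200(replay_level_db_spl, assumed_system_gain_db):
    """Loudness the file would have on an unadjusted RP 200 system.

    Both arguments are the hypothetical tag values proposed in this post.
    """
    return replay_level_db_spl - assumed_system_gain_db


# Hypothetical pop track: 92 dB SPL on a system turned down by 6 dB.
print(loudness_on_rp200(92.0, -6.0))
# Hypothetical classical track: 85 dB SPL on the unadjusted system.
print(loudness_on_rp200(85.0, 0.0))
```

The subtraction simply undoes the assumed volume-knob setting, so files tagged with different assumed_system_gain values can be compared on one scale.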

Storing the real-world SPL obtained on the assumed system (83 for a classical-tuned system, 77 for a pop-tuned one) seems a bit less intuitive, as it is good to be able to get the total real-world SPL (RP 200 calibrated) of a file by simply adding 2 values.

I also find relative replaygain less intuitive than sound level, because it's confusing to look at a number which varies inversely to the level of the song.

Technically, that makes a complete change of the replaygain values, but I don't see how it would be better to keep the same tag names if the meaning or encoding of the values changes.
In fact, next-generation tag readers will be just as happy either way,
while using a new set of tags makes it possible to support previous-generation tag readers smoothly.
Users would be able to add new tags, possibly with more info, generate compatibility tags if they want, regenerate them later if they want other replaygain settings, etc.

I find this kind of spec update scheme the most comfortable.

PS :
the assumed_system_gain value might be even easier to grasp if we stored -assumed_system_gain and called it
genre_shift, as this value is roughly a measure of the difference between the typical enjoyed loudness of songs of the given genre and the typical enjoyed loudness of wide dynamic range reference material (I think I read they actually used the sound of the Star Wars movie).
'genre_shift' would be a bit easier to grasp, but also a bit further from its precise definition.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-01-08 17:03:04
I think you've got more time to write than I have to reply!


You don't need a genre or system gain.

The assumed system is calibrated to SMPTE RP 200 a la -20dB pink noise = 83dB SPL.

The stored values are the gain adjustments to that system needed to make this track (or album) sound 83 dB SPL loud.

Before we store it, we add 6dB to each gain adjustment value, because that's what everyone has chosen to do. (It should have (optionally) been done in the player, but people have chosen to do it at the tag calculation/writing stage instead - so be it).

Normal players: apply gain change.

Very good players: subtract 6 before applying it

Best players: give the user a slider so they can add or subtract whatever they want. (Discussions on good labelling and 0dB point for slider can be left for another time. I'd suggest 0dB equiv to 83dB or 89dB, with K-20 and K-14 marked too - up/right (depending on position of slider) = louder sound on play back).
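
The three kinds of player described above could be sketched roughly like this. It is only an illustration under the assumptions stated in this post (stored gain = adjustment to 83 dB SPL plus 6 dB, i.e. an effective 89 dB target); the function names and the `target_db_spl` parameter are invented for the example.

```python
TAG_TARGET_DB_SPL = 89.0  # 83 dB reference plus the 6 dB added when tagging


def playback_gain_db(stored_gain_db, target_db_spl=TAG_TARGET_DB_SPL):
    """Gain in dB a player would apply for a chosen target loudness.

    target 89 -> "normal" player, applies the tag as stored
    target 83 -> "very good" player, subtracts the 6 dB again
    any value -> "best" player, driven by a user slider
    """
    return stored_gain_db + (target_db_spl - TAG_TARGET_DB_SPL)


def db_to_linear(gain_db):
    """Convert a gain in dB to the factor each sample is multiplied by."""
    return 10.0 ** (gain_db / 20.0)


stored = -8.0  # hypothetical tag value
for target in (89.0, 83.0, 86.0):
    gain = playback_gain_db(stored, target)
    print(target, gain, round(db_to_linear(gain), 4))
```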


As for the discussion you started with yourself about album gain vs audiophile gain vs different things needing to sound a different loudness - I've been there, hence my suggestion for RealLife RG adjustment (see my very first post in this thread).

Cheers,
David.
Title: Improving ReplayGain
Post by: SamK on 2004-01-08 19:31:23
Quote
I think you've got more time to write than I have to reply!

eh eh

Quote
You don't need a genre or system gain.

Well, personally, I do.
It really is the same thing as the "real life" replay gain you mention, whatever name we call it.
And to me replaygain is half useful without such a thing.

Quote
Before we store it, we add 6dB to each gain adjustment value, because that's what everyone has chosen to do.


And reading Bob Katz, it's not really a surprise. The majority of files come from pop CDs, which have been recorded with 6dB less headroom than the kind of material used for RP 200 calibration. So this 6dB is not arbitrary, it's characteristic of a genre. Choosing a 'genre-shift' for each song on audio-engineering grounds has the same goal as knowing its real-life RG value - except you can have a try at it yourself.
That's why I suggested some kind of genre_shift tag. It could introduce RealLifeRG in the files and at the same time solve the problem of the two distinct references, 83 and 89, in use around.

It seems you want to stick as much as possible to the original replaygain tag specs, and I was not planning on arguing to death for storing a precise set of values with a precise set of names, so introducing a "RealLifeRG" tag is just as good to me.

Quote
As for the discussion you started with yourself about album gain vs audiophile gain vs different things needing to sound a different loudness - I've been there, hence my suggestion for RealLife RG adjustment (see my very first post in this thread).


Well yeah, and I'm all for that tag. But you don't suggest much about it.
Do you suggest it be restricted to real SPL data measured during a real-life event?
Or could a personal estimate of the appropriate real-life SPL go there too?
Do you have something in mind so that it's still possible to tell the latter from the former?

I'm thinking, if I store 89dB there when replaygain-computing my pop songs, 83 for classical and maybe 93 for really heavy genres, I could still distinguish these from measured values like 83.46673 with a very, very good chance, and I could use some degree of the 'reallife RG' feature now and for most of my files, even though I don't yet own a recording with reallife RG measurements.
(That's the 'genre_shift' aspect of the tag I was debating. It's probably more commonly useful than its other, stricter-sense aspect as a real-recording-calibration-data tag.)
Title: Improving ReplayGain
Post by: Iconoclast_a on 2004-07-13 10:03:56
Well, I think the genre dependent adjustment could be a nice extra for some users, but it doesn't have to be part of ReplayGain at all, all that is needed is for the decoding software to check for a genre tag and then adjust according to the user's preference for that genre. This way it would instantly work with all existent ReplayGained files.
Title: Improving ReplayGain
Post by: Kuuenbu on 2004-07-18 00:54:36
There should be an option to turn off the equal loudness contour when making the calculation.  It's useful for determining how loud an album sounds, but it tends to make ReplayGain unreliable for reporting how much headroom the material on a certain disc has.  Two songs with similar arrangements at the same RMS can have very different ReplayGain values depending on their frequency response.
Title: Improving ReplayGain
Post by: Pio2001 on 2004-07-18 08:07:04
I'd still like to see a plugin that would read the album peak value and use it to prevent clipping during playback!
Don't laugh, it doesn't exist! We have the choice between applying replaygain, or letting the files clip...
How do you burn audio CDs from lossy files? RG or clipping? Personally, I apply a -2 dB volume correction, which removes 90% of the clipping, because I'm too lazy to compute how many dB the album peak value is.
Title: Improving ReplayGain
Post by: Ariakis on 2004-07-18 08:43:30
For CDs from lossy files in fb2k, you can clean your DSP to just Advanced Limiter (and resampler if necessary), and then burn with the CD Writer component with ReplayGain and the DSP enabled. Voila: CDs with RG leveling and limiter clip elimination.  That's what I do, at least.  Or is this not what you meant, Pio2001?
Title: Improving ReplayGain
Post by: David Nordin on 2004-07-18 09:10:19
Quote
Quote
I think you've got more time to write than I have to reply!

eh eh

Quote
You don't need a genre or system gain.

Well, personally, I do.
It really is the same thing as the "real life" replay gain you mention, whatever name we call it.
And to me replaygain is half useful without such a thing.

Quote
Before we store it, we add 6dB to each gain adjustment value, because that's what everyone has chosen to do.


And reading Bob Katz, it's not really a surprise. The majority of files come from pop CDs, which have been recorded with 6dB less headroom than the kind of material used for RP 200 calibration. So this 6dB is not arbitrary, it's characteristic of a genre. Choosing a 'genre-shift' for each song on audio-engineering grounds has the same goal as knowing its real-life RG value - except you can have a try at it yourself.
That's why I suggested some kind of genre_shift tag. It could introduce RealLifeRG in the files and at the same time solve the problem of the two distinct references, 83 and 89, in use around.

It seems you want to stick as much as possible to the original replaygain tag specs, and I was not planning on arguing to death for storing a precise set of values with a precise set of names, so introducing a "RealLifeRG" tag is just as good to me.

Quote
As for the discussion you started with yourself about album gain vs audiophile gain vs different things needing to sound a different loudness - I've been there, hence my suggestion for RealLife RG adjustment (see my very first post in this thread).


Well yeah, and I'm all for that tag. But you don't suggest much about it.
Do you suggest it be restricted to real SPL data measured during a real-life event?
Or could a personal estimate of the appropriate real-life SPL go there too?
Do you have something in mind so that it's still possible to tell the latter from the former?

I'm thinking, if I store 89dB there when replaygain-computing my pop songs, 83 for classical and maybe 93 for really heavy genres, I could still distinguish these from measured values like 83.46673 with a very, very good chance, and I could use some degree of the 'reallife RG' feature now and for most of my files, even though I don't yet own a recording with reallife RG measurements.
(That's the 'genre_shift' aspect of the tag I was debating. It's probably more commonly useful than its other, stricter-sense aspect as a real-recording-calibration-data tag.)



Using merely genres to set the playback mode sounds messy; then at least use something like a K-factor metatag that you set: "K-20"/"K-n..."

RealLife SPL is a nice idea, but not so easy to do.

I still want you to send me the material we discussed long ago, 2Bdecided, although you haven't yet.
Title: Improving ReplayGain
Post by: Pio2001 on 2004-07-18 17:32:35
Quote
For CDs from lossy files in fb2k, you can clean your DSP to just Advanced Limiter (and resampler if necessary), and then burn with the CD Writer component with ReplayGain and the DSP enabled. Voila: CDs with RG leveling and limiter clip elimination.  That's what I do, at least.  Or is this not what you meant, Pio2001?


Thanks for answering, but this is not what I need. When I burn a CD, I want it to be as close to the original as possible. I don't want to apply replaygain to it. If I did, there would be no need for the limiter: replaygain features the needed option to prevent decoding clipping.
Title: Improving ReplayGain
Post by: indybrett on 2004-07-18 18:09:53
If it hasn't been brought up yet, I would like to add my one feature request. MP3Gain gives you the option of setting the volume at the "max no-clipping" level, on a per track or per album basis.

I could really benefit from being able to do the same thing with FLAC or Vorbis files. Maybe I can do that already and just don't know it. Perhaps with the album_peak info.
Title: Improving ReplayGain
Post by: snek_one on 2004-07-18 18:52:15
The only thing I have to get off my chest about replaygain is the fact that the MP3s are very soft when burned to CD... I have to normalise with Nero to make it a decent volume. Most bars or discos won't even play replaygained tracks/MP3s burnt to audio CD since they are way off compared to original CDs. This makes it extremely hard for the DJ to determine the correct volume, and generally causes major strain on the equipment used.

Any thoughts on this?
Title: Improving ReplayGain
Post by: Cyaneyes on 2004-07-18 19:30:17
Quote
The only thing I have to get off my chest about replaygain is the fact that the MP3s are very soft when burned to CD...

Any thoughts on this?


http://replaygain.hydrogenaudio.org//faq_quiet.html (http://replaygain.hydrogenaudio.org//faq_quiet.html)

Extremely hard for the DJ to determine the correct volume?  Major strain on equipment?  From what is usually at most a -12 dB adjustment?  Does his equipment not have a volume control?
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-19 12:01:33
Quote
I'd still like to see a plugin that would read the album peak value and use it to prevent clipping during playback!
Don't laugh, it doesn't exist! We have the choice between applying replaygain, or letting the files clip...


The MusePack plug-in for Winamp gave this functionality (for mpcs only, obviously!) two years ago.

I don't know why it's not in foobar. Maybe the foobar forum is the right place to ask?

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-19 12:02:59
Quote
I still want you to send me material we discussed long ago 2Bdecided, altho you haven't.


I couldn't find most of it - I'll forward what I do have.

EDIT: I couldn't find _any_ of it! I believe Glen (mp3gain) had a copy of the revised implementation which someone came up with. If I could search my email more efficiently I might be able to find the person's address. I know it was based on windowing and 50% overlap before FFT, with some optimisations which meant overall there was still a speed increase. IIRC there were no other changes to the algorithm.

Cheers,
David.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-19 12:06:29
Quote
There should be an option to turn off the equal loudness contour when making the calculation.  It's useful for determining how loud an album sounds


Yes, because that's what it's for!!!

Quote
... but it tends to make ReplayGain unreliable for reporting how much headroom the material on a certain disc has.


The peak sample value tells you _exactly_ how much headroom the material has - usually none!

Quote
Two songs with similar arrangements at the same RMS can have very different ReplayGain values depending on their frequency response.


Yes, because they'll sound different!

If you want a pure RMS measurement, then measure the RMS. It's got little to do with judging or matching loudness, so it's not part of ReplayGain. Sorry!

Cheers,
David.
Title: Improving ReplayGain
Post by: Pio2001 on 2004-07-19 20:50:08
Quote
The only thing I have to get off my chest about replaygain is the fact that the MP3s are very soft when burned to CD... I have to normalise with Nero to make it a decent volume.


This is, in an indirect way, the purpose of replaygain: burning the CD softer. Volume adjustments on CD can only be performed downwards, not upwards (look at the link provided by 2BDecided). Normalizing the CD before burning is almost always the same thing (except with rare, weirdly mastered CDs, like old Depeche Mode reissues in 1985) as eliminating the replaygain adjustment!
Title: Improving ReplayGain
Post by: Peter on 2004-07-19 21:08:11
Quote
Quote
I'd still like to see a plugin that would read the album peak value and use it to prevent clipping during playback!
Don't laugh, it doesn't exist! We have the choice between applying replaygain, or letting the files clip...


The MusePack plug-in for Winamp gave this functionality (for mpcs only, obviously!) two years ago.

I don't know why it's not in foobar. Maybe the foobar forum is the right place to ask?

Cheers,
David.

click me (http://foobar2000.org/temp/replaygain_peak.png)
Title: Improving ReplayGain
Post by: indybrett on 2004-07-19 21:51:08
I don't think that is the same as what he is asking for. That scales down a track that still clips after applying replaygain.

Isn't he asking about just keeping the output from clipping, without applying replaygain?
Title: Improving ReplayGain
Post by: dev0 on 2004-07-19 22:01:03
This can be done by raising the Pre-Amp.
Title: Improving ReplayGain
Post by: indybrett on 2004-07-19 22:29:33
Quote
This can be done by raising the Pre-Amp.

How does raising the pre-amp scale down the global volume of a track to keep it from clipping? I'm serious. I really don't know.
Title: Improving ReplayGain
Post by: Pio2001 on 2004-07-19 22:50:05
Quote
Quote
click me (http://foobar2000.org/temp/replaygain_peak.png)


I don't think that is the same as what he is asking for. That scales down a track that still clips after applying replaygain.

Isn't he asking about just keeping the output from clipping, without applying replaygain?


That's right, in your picture, the replaygain mode is "use album gain". However, if the mode is "disabled", when the box is checked, I still get clip warnings.

Quote
This can be done by raising the Pre-Amp.


You mean lowering the preamp? Yes it can. I just have to right-click the file, open its properties, check the album peak, open the Windows calculator in scientific mode, convert the peak value to decibels, and set the preamp to the resulting value.
What I was suggesting is that Foobar would do it by itself. More details in the Foobar2000 forum: http://www.hydrogenaudio.org/forums/index.php?showtopic=20209&view=findpost&p=227318 , where we can, as 2BDecided suggested, go on with this topic.
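
The calculator step described above is a one-liner. A minimal sketch, assuming the stored peak is a linear value where 1.0 is digital full scale (the function name is made up, and this is not how any particular player implements it):

```python
import math


def clip_prevention_preamp_db(album_peak):
    """Preamp in dB that brings album_peak down to full scale (1.0).

    Returns 0.0 when the album does not exceed full scale, since the
    goal here is only to prevent clipping, not to normalise upwards.
    """
    if album_peak <= 0.0:
        raise ValueError("peak must be positive")
    return min(0.0, -20.0 * math.log10(album_peak))


for peak in (0.8, 1.0, 1.25):  # illustrative peak values
    print(peak, round(clip_prevention_preamp_db(peak), 2))
```

Dropping the `min(0.0, ...)` clamp would give the "max no-clipping" behaviour mentioned earlier in the thread, where quiet albums are also raised up to full scale.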
Title: Improving ReplayGain
Post by: Xenno on 2004-07-19 23:01:09
2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle) and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks. Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?

xen-uno
Title: Improving ReplayGain
Post by: SamK on 2004-07-20 00:52:29
Quote
2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle)


In fact, 2+epsilon sample points per cycle are required by the Shannon sampling theorem in the case of real-valued samples, if you look at it closely enough.
(i.e. notice that cos(Wt) = 0.5*(exp(iWt) + exp(-iWt)), and thus its bandwidth is not in  [-W,+W[  )

Quote
and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks.


On a long enough sequence of such a sine wave, some of the sample points will fall very close to the maxima of the continuous wave.
By a quick estimation, 4000 samples are enough to ensure that the discrete peak lies within 100/(4000^2) percent of the continuous wave's real peak for high frequencies up to 22.05/(1+1/4000) = 22.044 kHz.
(I'm using 1-x^2/2 as an estimate of the sine wave near the maxima.)

Even with only 10 samples, you get 1% peak precision for high-frequency sines up to 20.04 kHz.
(i.e. from 2.2kHz to 20.04kHz. Low frequencies are of no interest here, since they don't show much difference between the maxima of the discrete and continuous signals.)

Conclusion : for a sine wave, you don't really have to worry about the difference between the discrete peak and the underlying continuous peak.
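
A quick way to check this kind of estimate numerically is to sample a unit-amplitude sine and see how far the largest sample falls short of 1.0. This is only a sketch (function name and test frequencies are chosen for illustration). Note that frequencies which divide the sample rate exactly, such as 11.025 kHz at 44.1 kHz, hit the same few phases over and over, so for those the result depends entirely on the starting phase.

```python
import math


def discrete_peak(freq_hz, n_samples, sample_rate=44100.0, phase=0.0):
    """Largest |sample| of a unit-amplitude sine sampled n_samples times."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    return max(abs(math.sin(step * n + phase)) for n in range(n_samples))


for freq in (2200.0, 20000.0, 22044.0):
    shortfall = 1.0 - discrete_peak(freq, 4000)
    print(f"{freq:8.0f} Hz: sampled peak falls short by {100.0 * shortfall:.6f} %")
```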

Quote
Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?


my opinion is it doesn't matter, even slightly, though  I only made my point with sine waves and not the general case of just any sampled sound.
Title: Improving ReplayGain
Post by: SamK on 2004-07-20 00:55:17
Quote
my opinion is it doesn't matter, even slightly, though  I only made my point with sine waves and not the general case of just any sampled sound.


On top of that, you can consider that it's not really clipping as long as the digital signal is preserved.
Then, if the DAC chops off the true peaks of the analog signal due to that kind of issue, I'd say it's the DAC's fault.
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-20 13:07:28
The RG peak value is the largest absolute value within the digital data.
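
In code, that definition amounts to something like the following sketch. It assumes 16-bit integer PCM and normalises so that 1.0 means digital full scale; the function name and the `full_scale` parameter are illustrative, not taken from any ReplayGain source.

```python
def sample_peak(samples, full_scale=32768.0):
    """Largest absolute sample value, scaled so full scale equals 1.0."""
    return max((abs(s) for s in samples), default=0) / full_scale


print(sample_peak([120, -32768, 20511]))  # a sample at negative full scale
print(sample_peak([16384, -8192]))        # half of full scale
```

No reconstruction of the waveform is involved: only the stored sample values are inspected.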

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this), but isn't the ReplayGain calculation slow enough already - without taking this into account?

It's worth remembering that the people who master squashed CDs don't take this into account either.


The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale. As ReplayGain drops most over-compressed tracks by 6-12dB, you've got more than enough headroom.

I suppose you could store an "analogue" peak value, and use this for clipping prevention. That's a nice project, if anyone wants it!

However, ReplayGain will keep most music away from clipping. If you don't use ReplayGain, simply dropping the gain by 3-6dB will keep everything away from clipping. What's more, the existing peak value is more than good enough in most cases, and leaving an extra fraction of a dB headroom will make it fine in all but contrived cases.

You've got to wonder: if someone puts a signal onto a CD where the analogue peak is at digital full scale plus 50%, maybe the intention is to make the DAC in your CD player clip? If so, what's the point in de-clipping it?

Cheers,
David.
Title: Improving ReplayGain
Post by: SamK on 2004-07-20 14:08:04
Quote
The RG peak value is the largest absolute value within the digital data.

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this)


Are there any DACs that can reconstruct the analog signal with full peak above full-scale ?
I guess DACs can behave very differently on such digital signals.

Quote
The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale.


Wow, 1.5x is much more than possible with sine waves..
can you tell how to make such a signal ?
or do you get this value from mathematically bounding the reconstructed signal formula ?
Title: Improving ReplayGain
Post by: SamK on 2004-07-20 14:19:22
Quote
isn't the ReplayGain calculation slow enough already - without taking this into account?


It might be possible to take it into account without much more computation.
From the DCT transform, you can bound the analog peak by adding the moduli of the DCT coefficients.
I don't know how imprecise that can be on real music signals, but my guess is it shouldn't be too bad.
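
To make the idea concrete, here is a rough sketch that uses a plain DFT in place of the DCT mentioned above (a deliberate substitution to keep the example short). It bounds the peak of the periodic band-limited interpolation of one block by the sum of the coefficient magnitudes divided by the block length. The function names are invented, the O(N^2) transform is for illustration only, and how tight the bound is on real music is not something this sketch shows.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform, fine for small demo blocks."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, x in enumerate(samples))
            for k in range(n)]


def peak_upper_bound(samples):
    """Upper bound on |x(t)| between samples for a periodic block."""
    return sum(abs(c) for c in dft(samples)) / len(samples)


# Sine at a quarter of the sample rate with a 45 degree phase offset:
# every sample is +/-0.7071, yet the underlying wave reaches 1.0.
a = 0.5 ** 0.5
block = [a, a, -a, -a, a, a, -a, -a]
print(max(abs(s) for s in block), peak_upper_bound(block))
```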
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-20 14:50:48
Quote
Wow, 1.5x is much more than possible with sine waves..
can you tell how to make such a signal ?
or do you get this value from mathematically bounding the reconstructed signal formula ?


OK, it was 1.41, but it's possible with an 11.025kHz sine wave (44.1kHz sampling)...

http://www.hydrogenaudio.org/forums/index.php?act=Attach&type=post&id=818

I'd imagine that's the maximum you can get from a sine wave, but if you drag samples around in Cool Edit you can get bigger peaks between samples. If you drag 2 more samples high in the above example, you can reach 1.78x digital full scale between samples (verified by resampling to 10x the sample rate and checking the middle sample value). The "true" peak will be slightly higher still. I'll leave you to figure out which two samples you have to drag up!
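
The 1.41 figure for the sine case can be reproduced in a few lines. This sketch assumes the worst-case phase of 45 degrees, so that every sample lands at 0.7071 of the wave's true amplitude; the names are made up for the example, and the 1.78x dragged-sample case is not attempted here.

```python
import math


def quarter_rate_sine(n_samples, phase=math.pi / 4):
    """Unit-amplitude sine at exactly a quarter of the sample rate
    (11.025 kHz at 44.1 kHz sampling)."""
    return [math.sin(math.pi * n / 2 + phase) for n in range(n_samples)]


samples = quarter_rate_sine(8)
largest_sample = max(abs(s) for s in samples)

# If the samples are scaled so the largest one sits at digital full
# scale, the continuous wave peaks this far above full scale:
true_peak_ratio = 1.0 / largest_sample
print(round(true_peak_ratio, 4), "x full scale")
print(round(20 * math.log10(true_peak_ratio), 2), "dB over full scale")
```

That is roughly 3 dB, well inside the 6-12dB drop that ReplayGain typically applies to heavily compressed tracks.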

Cheers,
David.
Title: Improving ReplayGain
Post by: dev0 on 2004-07-20 14:59:15
Code: [Select]
replaygain_track_gain = -10.99 dB
replaygain_track_peak = 1.647949


Transcoded from a Musepack --standard encode to AoTuV b2 -q 0.
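
For what these two values imply, here is a small sketch (the parsing and the function name are illustrative assumptions, not taken from any tagger or player): played as-is, the decoded peak is above full scale, while with the track gain applied it ends up well below.

```python
import math

tags = {
    "replaygain_track_gain": "-10.99 dB",
    "replaygain_track_peak": "1.647949",
}


def peak_after_gain(track_peak, gain_db):
    """Linear peak level once a gain in dB has been applied."""
    return track_peak * 10.0 ** (gain_db / 20.0)


gain_db = float(tags["replaygain_track_gain"].split()[0])
peak = float(tags["replaygain_track_peak"])

print("over full scale without RG:", round(20 * math.log10(peak), 2), "dB")
print("peak with RG applied:", round(peak_after_gain(peak, gain_db), 3))
```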
Title: Improving ReplayGain
Post by: 2Bdecided on 2004-07-20 15:13:23
That's different to the latest issue discussed in this thread, because that Peak value is based on actual samples, not inter-sample reconstructed peaks.

However, it illustrates Pio's earlier point very well!

Cheers,
David.
Title: Improving ReplayGain
Post by: Kuuenbu on 2004-07-26 02:18:31
Quote
The peak sample value tells you _exactly_ how much headroom the material has - usually none!
I'm referring more to peak-to-average ratio here.  Headroom is a rather vague term that could mean anything, so I probably shouldn't have used it.

Quote
If you want a pure RMS measurement, then measure the RMS. It's got little to do with judging or matching loudness, so it's not part of ReplayGain. Sorry!
Yes, but current methods of calculating RMS are rather cumbersome.  You have to open up each individual file in a wave editor, run the analysis feature, write the RMS down, and do that over and over again for every track on an album.  Plus the RMS scanners in wave editors don't have the "intelligent" calculation factors that ReplayGain uses; they simply average all the samples in a selection (unless you tell the scanner to ignore everything under a certain level, which already adds work that shouldn't be necessary for the user).  Adding a non-contour feature to ReplayGain would give people a quick and easy way to measure RMS values.
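
For reference, the plain measurement being asked for is short to script. This is a sketch assuming samples already decoded to floats in the range -1.0 to 1.0; it has none of ReplayGain's filtering or statistical processing, and the function names are invented for the example.

```python
import math


def rms_dbfs(samples):
    """Unweighted RMS level in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def crest_factor_db(samples):
    """Peak-to-average ratio in dB (sample peak minus RMS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        raise ValueError("silent input has no crest factor")
    return 20.0 * math.log10(peak) - rms_dbfs(samples)


sine = [math.sin(2.0 * math.pi * n / 100.0) for n in range(100)]
print(round(rms_dbfs(sine), 2), round(crest_factor_db(sine), 2))
```

Looping such a function over the decoded tracks of an album would avoid the open-analyse-write-down routine described above.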
Title: Improving ReplayGain
Post by: danbee on 2004-10-27 17:54:36
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)


Could this be solved by storing the reference level in the file as well as the replaygain?

Edit: Didn't realise I was such a late comer to this thread!
Title: Improving ReplayGain
Post by: jcoalson on 2004-10-27 18:23:32
not a bad idea, e.g.

replaygain_reference_level=90dB

absence of the tag implies 89dB
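
A reader for such a tag might look like the sketch below. The tag name follows the example above and the 89 dB default follows the rule just stated; the parsing, the function names and the plain-dict representation of tags are assumptions made for illustration.

```python
DEFAULT_REFERENCE_DB = 89.0  # implied when the tag is absent


def _parse_db(text):
    """Turn strings like '90dB' or '-3.00 dB' into a float."""
    return float(text.lower().replace("db", "").strip())


def reference_level_db(tags):
    raw = tags.get("replaygain_reference_level")
    return DEFAULT_REFERENCE_DB if raw is None else _parse_db(raw)


def gain_for_target(tags, target_db):
    """Track gain re-expressed for the loudness target the player wants."""
    gain = _parse_db(tags["replaygain_track_gain"])
    return gain + (target_db - reference_level_db(tags))


old_file = {"replaygain_track_gain": "-3.00 dB"}
new_file = {"replaygain_track_gain": "-2.00 dB",
            "replaygain_reference_level": "90dB"}
print(gain_for_target(old_file, 89.0), gain_for_target(new_file, 89.0))
```

With the reference stored, files tagged against different reference levels end up with the same playback gain for a given target.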
Title: Improving ReplayGain
Post by: davygrvy on 2015-06-03 20:09:30
5. ReplayGain RealLife adjustment

The gain required to give the actual SPL of the original event (in a calibrated system), or a human judged sensible replay level (see the explanation behind the original "Audiophile (http://replaygain.hydrogenaudio.org/faq_radio.html)" level and the work of Bob Katz (http://www.digido.com/) if you think this is an impossible idea). I've found a few DVD-A discs that have this information (it's in the MLP stream), so it would be nice to have somewhere to store it. It's unlikely to get used much, but it would be a useful thing to have. It would be the last link in some of the best recordings out there.


Can we have this, please?

Maybe use %REPLAYGAIN_RWTRIM% as the tag.  If it exists, playback for only album gain is affected and added to it.  FWIW, I posted a feature request to FLAC (https://sourceforge.net/p/flac/feature-requests/112/) about this.

Having my Boston Philharmonic play at the same loudness as Pantera is a bit weird while my calibrated home theater system has a listening level of -23dB.  The missing link to SPL referred playback would add that next level of "shiny"