Topic: Improving ReplayGain

Improving ReplayGain

Every now and again I wish I had the time to update the ReplayGain website and add some new ideas, and maybe even clarify some old ones. I don't, so this thread will have to do.


Firstly, the format used to store ReplayGain info in files is not documented correctly on the ReplayGain website, and it would be good to "publish" what has emerged as the standard for each format.

Secondly, what is stored is not documented correctly on the ReplayGain website, and I'd like to re-examine what is stored...


One change has already happened, and I think it's a good change:

Forget Radio and Audiophile - Track and Album are much better names.
(that's an open admission of me being wrong, for anyone who discussed this with me previously!)


So, we store:

ReplayGain Track adjustment
ReplayGain Album adjustment
(ReplayGain) Track peak
(ReplayGain) Album peak

(this last one wasn't in the original proposal, but it has been widely used - I've put it in bold to remind me to include it in the update)

That makes sense, and most software supports this. I'd like to formalise some extensions, some of which were there from the start, and others that have cropped up more recently:


1. (ReplayGain) undo adjustment
- this is written when the gain of the file is changed (e.g. by mp3gain, or by decoding with ReplayGain enabled), and is the gain change required to put the file back to where it started.

e.g. If I apply -8dB gain change using mp3gain, then
(ReplayGain) undo adjustment = +8dB

e.g. If I use --scale 0.5 when encoding (for whatever reason?!), then
(ReplayGain) undo adjustment = +6dB

If the gain of an already ReplayGained file is changed, the original four values (Track and Album adjustment and peak) should be updated so that they are correct for the new audio data. (see an example in this thread: http://www.hydrogenaudio.org/forums/index....topic=15412&hl= )

I can't see any argument against defining this field. It would be zero (or absent) if the audio file hasn't been altered. It's useful in all formats because you can always apply wavgain before encoding, and it would be nice to know that this has been done.
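
To make the bookkeeping concrete, here is a minimal Python sketch of what a tool might do when it changes the gain of an already tagged file. It is only an illustration: the function names and dictionary keys are hypothetical (they are not taken from mp3gain or any other tool), and it assumes peaks are stored as linear values where 1.0 is digital full scale.

```python
import math

def scale_to_db(scale):
    """Convert a linear scale factor (e.g. --scale 0.5) to a gain in dB."""
    return 20.0 * math.log10(scale)

def apply_gain(tags, gain_db):
    """Return updated tags after the audio itself has been changed by gain_db.

    `tags` is a hypothetical dict; gains are in dB, peaks are linear
    (1.0 = digital full scale).
    """
    factor = 10.0 ** (gain_db / 20.0)
    return {
        # The audio is already gain_db louder, so that much less is needed.
        "track_gain": tags["track_gain"] - gain_db,
        "album_gain": tags["album_gain"] - gain_db,
        # Peaks scale linearly with the applied gain.
        "track_peak": tags["track_peak"] * factor,
        "album_peak": tags["album_peak"] * factor,
        # Undo is the gain required to get back to the original audio.
        "undo": tags.get("undo", 0.0) - gain_db,
    }

tags = {"track_gain": -8.0, "album_gain": -7.0,
        "track_peak": 1.0, "album_peak": 1.0}
updated = apply_gain(tags, -8.0)          # e.g. an -8dB change
undo_for_scale = -scale_to_db(0.5)        # roughly +6dB for --scale 0.5
```

After the -8dB change the undo field holds +8dB, the track adjustment becomes 0dB, and both peaks drop to about 0.4 of full scale.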


2. ReplayGain calculation method

OK - I've had this argument before, but this really is important. ReplayGain can be improved, but you'll never know whether files are tagged using the old or new ReplayGain calculation unless the calculation method (actually a number which corresponds to the method) is stored. This doesn't increase the complexity of players, as they won't care - it just makes it very easy to pick out files that were tagged with the old version, and update them.


3. ReplayGain lossy approximation

This is just a single bit: 0 or 1.

0 = this ReplayGain info has been calculated from the data in this file
1 = this file has been lossily encoded/transcoded since this ReplayGain info was calculated.

What's the point of this? If you have a file with ReplayGain info, you can transcode it and copy the RG info across. It'll be close enough to give you excellent loudness equalisation, and you won't have to re-calculate it. Yet there'll be a label there to tell all you anal retentives that it's not quite right, and should be recalculated if you want to be 100% sure (especially important for peak amplitude).

You could (should?) have one “ReplayGain lossy approximation” bit for each of the four values, which gives you the chance (for example) of re-calculating the peak values (quick, and important - so let's do it), but leaving the ReplayGain values (slow, and unimportant - so let's not do it).
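
As a rough illustration of the per-value flags, the following Python sketch copies the RG info across on transcoding, marks everything as approximate, and later clears only the peak flags once the peaks have been re-measured. The class and field names are hypothetical; no existing tag format is implied.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RGInfo:
    track_gain: float   # dB
    album_gain: float   # dB
    track_peak: float   # linear, 1.0 = digital full scale
    album_peak: float   # linear, 1.0 = digital full scale
    # One "lossy approximation" bit per value (hypothetical layout).
    track_gain_approx: bool = False
    album_gain_approx: bool = False
    track_peak_approx: bool = False
    album_peak_approx: bool = False

def copy_for_transcode(info):
    """Copy RG info to a transcoded file, marking every value approximate."""
    return replace(info, track_gain_approx=True, album_gain_approx=True,
                   track_peak_approx=True, album_peak_approx=True)

def rescan_peaks(info, track_peak, album_peak):
    """Store freshly measured peaks and clear only the two peak flags."""
    return replace(info, track_peak=track_peak, album_peak=album_peak,
                   track_peak_approx=False, album_peak_approx=False)

original = RGInfo(-10.54, -9.80, 0.98, 1.00)
transcoded = copy_for_transcode(original)
rescanned = rescan_peaks(transcoded, 1.12, 1.20)
```

In this sketch the gain values stay flagged as approximate after the quick peak rescan, which is exactly the "recalculate the important bit, leave the rest" case described above.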


4. ReplayGain user adjustment

Instead of suggesting that users should change the calculated values if they wish, give them a field to enter their own value if they really have to. Players should give the option to read the user value in preference to any others (i.e. let it act as an over-ride), and taggers should give the option of removing the user values from all (downloaded) files.


5. ReplayGain RealLife adjustment

The gain required to give the actual SPL of the original event (in a calibrated system), or a human judged sensible replay level (see the explanation behind the original "Audiophile" level and the work of Bob Katz if you think this is an impossible idea). I've found a few DVD-A discs that have this information (it's in the MLP stream), so it would be nice to have somewhere to store it. It's unlikely to get used much, but it would be a useful thing to have. It would be the last link in some of the best recordings out there.



I'd like to come to a consensus of which ones of these (if any/all) should be included, and then get some specs as to how they are/should be stored in each file format (especially APE2.0 tags) finalised and published on-line.

Comments? Suggestions? Offers of help?


btw I've received a couple of suggestions for improving the ReplayGain calculation. One is trivial, and seems like a great idea. I'll post it for testing when the problem of version numbering is solved. If anyone else has slightly or totally re-worked the ReplayGain algorithm/concept, now would be a good time to step forward! We could do listening tests to find the best candidate for "calculation version 2".


Cheers,
David.

Newbie warning: this thread is not for asking questions about ReplayGain that are already answered on www.replaygain.org or in previous threads on HA. (I'm always happy to answer "silly" questions via email – half of them aren't silly at all.)

However, if you do already have some understanding of ReplayGain then this thread is the perfect place for clarifying anything to do with the above proposals which is not clear.


Reply #1
I think one point that should be clarified/updated is the format in which the RG values are stored.

In the current Lame header, they are stored as floating point data. However, this could be a source of problems on some platforms. It appears that we probably need an integer representation.


Reply #2
Frank heavily criticised my proposal for storing RG info in .wav files because I used floating-point representation for the peak values; in what was basically a fixed point format.

I agree - there needs to be a resolution of this problem. Frank's idea was to use fixed point 16-bit, representing 0-65535 (i.e. 0-200% peak). It's one solution, with its own advantages and disadvantages: You can't store peaks above 200% (which do happen! Lossy encoding of modern CDs), and you can't do perfectly accurate 100% normalisation or clip prevention on 24-bit decodes (though you can get close enough - depends how anal you are).

A 32-bit INT would offer greater flexibility, but would it bring its own problems? I'm thinking: middle 16-bit as normal, lower 8 bits for increased resolution, upper 8 bits for >100%. Would this be difficult to program?


EDIT: Aren't the RG values themselves fixed point, using that horrible binary format I invented for the task? I don't propose using that binary format anymore, but fixed point would be good.

Cheers,
David.



Reply #4
Moderators:

On this page:
http://replaygain.hydrogenaudio.org//typic...al_results.html

The downloadable files aren't there. Moved? Deleted? Never there? I can find them if you want them, but it'll take a while.


Developers:

Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

Cheers,
David.



Reply #6
Quote
(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)

I think it's a good idea to store the actual number instead of the adjustment (just in terms of looking good and being clearer; I don't really know about the technical problems involved).
Why too confusing? Because people have got used to how it is now?
People can change. For me, this change would be something nice.
Nothing but a Heartache - Since I found my Baby ;)


Reply #7
I would just like a way to write ReplayGain info into the gain/volume (or whatever it's called) field in MP4 files. That way I could get some hardware support (iPod).


Reply #8
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Agree.


Quote
So - what range? And how?

Our problem is only with the peak value. In the Lame tag, it is stored using 32 bits, so we have 32 bits to define a format.

I would suggest just using an unsigned integer. Our needs are:
*being able to have enough precision for the 0-100% range
*being able to store values higher than 100% (btw, how much higher?)

Ideally, 24-bit precision for the 0-100% range would be nice.

First proposal:
Use 0 - 100 000 as 0 - 100% range.
Precision is more than 24 bits (a little more than 26 bits), and this would allow for up to about 4000% (considering that the maximum unsigned int value is 4G). Moreover, it is quite simple, just a linear scale.


Reply #9
That would be fine.

Or...


Would the following work? It's the same, but using a different linear scale factor, which fits in neatly with 16- and 24-bit data, like this:

Field = 32-bit INT.


For 16-bit audio data, use

00000000xxxxxxxxxxxxxxxx00000000

Where xxxxxxxxxxxxxxxx is the peak value.
(1000000000000000 is the largest possible value for linear 16-bit data, e.g. a .wav file)


For 24-bit audio data

00000000xxxxxxxxxxxxxxxxxxxxxxxx

Where xxxxxxxxxxxxxxxxxxxxxxxx is the peak value.

etc

If the peaks are greater than 200% then obviously the leading 0s would be used to indicate this. So, in the mp3 case, you find the peak using a decoder which allows headroom, and multiply the normalised result by (2^23).

Using (2^23) rather than 100000 (which you suggested) as the scale factor sounds strange, but it means 16 and 24-bit data can simply be pasted into the field just by shifting the bits, which would avoid multiplication and rounding errors.


digital full scale is
00000000100000000000000000000000
i.e. 2^23

You get exactly 24-bit accuracy, and 54dB of headroom (i.e. 51200%, I think!)
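
To help answer the "easy to program?" question, here is a small Python sketch of the 2^23 scheme as described above. The function names are made up for illustration, and clamping out-of-range values to the field maximum is an assumption of this sketch rather than part of the proposal.

```python
FULL_SCALE = 1 << 23          # digital full scale in the 32-bit field
FIELD_MAX = (1 << 32) - 1     # largest value an unsigned 32-bit field holds

def peak_from_16bit(sample_peak):
    """Absolute peak of 16-bit data (0..32768): just shift left by 8 bits."""
    return sample_peak << 8

def peak_from_24bit(sample_peak):
    """Absolute peak of 24-bit data (0..8388608): stored unchanged."""
    return sample_peak

def peak_from_float(normalised_peak):
    """Normalised peak (1.0 = full scale), e.g. from a decoder with headroom.

    Values that would overflow the field are clamped (an assumption).
    """
    return min(round(normalised_peak * FULL_SCALE), FIELD_MAX)

def peak_to_float(field):
    """Convert the stored field back to a normalised peak."""
    return field / FULL_SCALE

full_scale_16 = peak_from_16bit(32768)    # equals FULL_SCALE
clipped_mp3 = peak_from_float(2.5)        # a 250% peak fits without trouble
max_ratio = FIELD_MAX / FULL_SCALE        # just under 512, i.e. about 51200%
```

The integer paths need only a shift or a straight copy, so no multiplication or rounding is involved for 16- and 24-bit PCM; only the decoder-with-headroom case needs the multiply by 2^23.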


Would this be easy to program?


Should we change peak values to fixed point in all implementations?

Would it be easy for players to use? I'm thinking this could be a useful convention to employ in all formats, since floating point isn't strictly needed, and is causing rounding confusion.

Or would it be stupid to change to fixed point for the peak value in other formats, because this would break compatibility with old players?

Cheers,
David.


Reply #10
Seems interesting. It would be nice to hear other opinions.


Reply #11
Do none of the developers have any comments?


Two more issues:

1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

2. MTRH has reminded me that a ReplayGain logo is long overdue. Shall I launch a competition? If so, I'll wait until the HA one is well out of the way.

Cheers,
David.


Reply #12
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

Would people really trust replay gain values stored on freedb to be correct?  I use freedb with EAC to get track titles, etc.  But this is because I know I can check these titles against the correct ones on the CD cover and change them where appropriate.  Often there are spelling errors or other issues.

With replay gain values the only way I would know whether they are correct would be to scan the files, and if I'm going to do that I don't need freedb anyway.


Reply #13
Quote
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

Would people really trust replay gain values stored on freedb to be correct?  I use freedb with EAC to get track titles, etc.  But this is because I know I can check these titles against the correct ones on the CD cover and change them where appropriate.  Often there are spelling errors or other issues.

Certainly the information on there has many errors. But these are human errors, and there's no room for human error when calculating and automatically submitting ReplayGain values.

There could be other problems:

1. Different releases of the same CD with different loudnesses.
Hopefully the different mastered versions will have slightly different TOCs. This is usually the case. In which case, they can be detected and catalogued as different versions by freedb.

2. The values are calculated from a different format (e.g. .wav when you have mp3, mp3 when you have mpc etc etc)
That's one reason for suggestion 3 in my first post. See there.

3. Someone has intentionally submitted incorrect values / someone changed the gain of an album before calculating the ReplayGain
Yes - that's a problem. As with other fields, people can correct the data, and/or the server can weed out erroneous entries because they'll be swamped with correct ones.


Quote
With replay gain values the only way I would know whether they are correct would be to scan the files, and if I'm going to do that I don't need freedb anyway.


You would have to calculate the peak values yourself anyway (easier and quicker than the ReplayGains) because they're encoding dependent, so the accuracy of the peak values is not an issue. (freedb should hold the peak values for the lossless versions).

For the actual ReplayGain values (Track and Album), if they make the tracks sound the same loudness as other tracks on playback, then it's doing its job, and that's fine. If they don't, then you'll notice, and you can recalculate them if you want.

But it doesn't matter if they're "correct" to how ever many decimal places, because ReplayGain is just an estimate. What matters is that the ReplayGain values work. If you want to, you can check if they work or not very quickly just by skipping through the album. If it's too loud or too quiet, they're wrong!

So there's no reason to recalculate them all to check their accuracy. If it was me, I'd happily grab all the ReplayGain values I needed from freedb, and only re-tag them myself if I heard a problem.

But maybe that's just me?

Cheers,
David.


Reply #14
Quote
Quote


1. Different releases of the same CD with different loudnesses.
Hopefully the different mastered versions will have slightly different TOCs. This is usually the case. In which case, they can be detected and catalogued as different versions by freedb.



gday..

i guess the UPC code will take care of that..
(assuming there is a original rip)




Reply #15
Just a question from a user's point of view. iTunes has the ability to calculate the replay gain of a track. Is it on the same basis as the ReplayGain values? (I never checked whether iTunes stores the value in the file or not.)


Reply #16
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.


Reply #17
Quote
Just a question from a user's point of view. iTunes has the ability to calculate the replay gain of a track. Is it on the same basis as the ReplayGain values? (I never checked whether iTunes stores the value in the file or not.)

I played a short and very quiet sample (part of an orchestral recording) in iTunes: it was very much quieter than the RG recommendation (+20 dB).
I can't be sure that we can extrapolate from this difference, but I suppose that the iTunes gain system is different (and less accurate too, judging by the calculation speed).
Wavpack Hybrid: one encoder for all scenarios
WavPack -c4.5hx6 (44100Hz & 48000Hz) ≈ 390 kbps + correction file
WavPack -c4hx6 (96000Hz) ≈ 768 kbps + correction file
WavPack -h (SACD & DSD) ≈ 2400 kbps at 2.8224 MHz


Reply #18
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

If you have time, please read the entire thread

You'll see I suggest switching back to this method.

However, I don't think it's realistic to switch now, because it would dramatically break compatibility with existing players. This would be a very bad thing, unless someone can see a way around it.

The other additions will not break compatibility with existing players, so it's just a question of whether developers want to implement them.

Cheers,
David.


Reply #19
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

This would not work. Players would still need to figure out how much the gain needs to be changed, since playback loudness isn't calibrated in any way. Media Jukebox would calculate the volume change needed with the formula 83dB - 84dB = -1dB, while others would calculate it with 89dB - 84dB = +5dB.
PS. Your example is incorrect: 82dB + (+1dB) = 83dB, so the value to store would be 82 and not 84.


Reply #20
3. ReplayGain lossy approximation

Storing this seems pointless to me, since ReplayGain calculations will become inaccurate after transcoding and no tool should be copying ReplayGain values when transcoding.
"To understand me, you'll have to swallow a world." Or maybe your words.


Reply #21
Quote
1. Is there any chance of a service like freedb storing the replay gain values for tracks and albums to save us all a lot of time?

I RG my discs before burning a backup, so if a friend pops that copy, the TOC will match and give back erroneous RG info.

To solve this we could store the RG info, plus a RG value for, say, the first 30 seconds of the album, so the RG info for that part of the disc is calculated and sent with the query (generated Disc ID). This way one could be sure the RG info is correct if the sent value and the value in the db match (with a +-5% confidence).
"You have the right to remain silent. Anything you say will be misquoted, then used against you."


Reply #22
Quote
3. ReplayGain lossy approximation

Storing this seems pointless to me, since ReplayGain calculations will become inaccurate after transcoding and no tool should be copying ReplayGain values when transcoding.

The inaccuracy is small enough to be dismissed. This is a small test I made:

Iron Maiden - [Dance of Death #04] Montségur [5:48]

PCM
-10.54 dB

PCM --> MP3
-10.55 dB

PCM --> MP3 -->Musepack
-10.53 dB

PCM --> MP3 -->Musepack --> Vorbis
-10.57 dB

PCM --> MP3 -->Musepack --> Vorbis --> Wavpack (Lossy)
-10.57 dB

PCM --> MP3 -->Musepack --> Vorbis --> Wavpack (Lossy) --> Nero MP4
-10.54 dB

The biggest difference was -0.03dB, which is a -0.284% difference from the original. I picked this track because it is loud enough to make most lossy encoders go beyond full scale.
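
For anyone who wants to repeat the arithmetic on their own files, this short Python sketch works it out from the numbers listed above. The variable and function names are just for illustration; the last line additionally converts the dB difference into a linear amplitude change, which is not a figure from the test itself.

```python
# ReplayGain value after each stage of the transcoding chain (dB).
measurements = [
    ("PCM", -10.54),
    ("MP3", -10.55),
    ("Musepack", -10.53),
    ("Vorbis", -10.57),
    ("Wavpack (Lossy)", -10.57),
    ("Nero MP4", -10.54),
]

def worst_deviation(values):
    """Return (stage, deviation in dB) furthest from the first entry."""
    reference = values[0][1]
    stage, gain = max(values[1:], key=lambda item: abs(item[1] - reference))
    return stage, gain - reference

stage, deviation = worst_deviation(measurements)
# Deviation as a percentage of the PCM value, as quoted above.
relative = abs(deviation) / abs(measurements[0][1]) * 100
# The same deviation expressed as a change in linear amplitude (percent).
amplitude_change = (10 ** (abs(deviation) / 20) - 1) * 100
```

The worst case first appears at the Vorbis stage, and 0.03dB corresponds to roughly a third of a percent in amplitude.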
"You have the right to remain silent. Anything you say will be misquoted, then used against you."


Reply #23
Try LAME ABR, for example (there's a --scale 0.98 included in the preset). The difference will be higher.

Reply #24
Quote
Quote
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

Instead of storing +1dB compared to a reference (83dB). Why don't you store 84dB directly ? This way anyone can decide for his/her reference playback loudness.

This would not work. Players would still need to figure out how much the gain needs to be changed, since playback loudness isn't calibrated in any way. Media Jukebox would calculate the volume change needed with the formula 83dB - 84dB = -1dB, while others would calculate it with 89dB - 84dB = +5dB.

Sorry Case, but I think you're wrong.

At the moment, people store the gain change needed to match a standard loudness. Most use 89dB as that standard, but some use 83dB. So, there's confusion.

But they all measure the "perceived" loudness of the track the same way. (They're all taking my "pink_ref.wav" file, or whatever it was called, to be 83dB, after SMPTE RP-200 - after a real, and long existing standard). So if you store the "perceived" loudness, there's no confusion.

e.g. perceived loudness of track = 93dB SPL.
Musepack relates this to 89dB, and stores a ReplayGain of -4dB
MediaPlayer relates this to 83dB, and stores a ReplayGain of -10dB
But in both cases, the perceived loudness of the original track is 93dB.

It should be apparent that by just storing 93dB, any player can figure out what to do. (target volume - 93dB = required gain change, e.g. 89-93=-4dB).
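
In code, the player side of this would be a one-liner. A minimal Python sketch, with a hypothetical function name and the 89dB default taken from the reference level discussed in this thread:

```python
def required_gain(perceived_loudness_db, target_db=89.0):
    """Gain change (dB) a player applies to reach its own target level."""
    return target_db - perceived_loudness_db

gain_at_89 = required_gain(93.0)          # -4.0, player targeting 89dB
gain_at_83 = required_gain(93.0, 83.0)    # -10.0, player targeting 83dB
```

Both players read the same stored 93dB and arrive at their own adjustment, so the choice of reference level never has to be written into the file.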


BUT, though I think it would be nice to do this, I'm not saying we should; it would break compatibility with existing players, wouldn't it? They're expecting the gain change in the tag, and would read it as a +93dB ReplayGain - that's just a bit too loud!

Cheers,
David.