HydrogenAudio

Hydrogenaudio Forum => General Audio => Topic started by: Notat on 2010-12-11 23:20:11

Title: Replay Gain specification
Post by: Notat on 2010-12-11 23:20:11
I've taken the first step towards fulfilling my threat to produce an up-to-date edition of the Replay Gain specification.

The working draft is published on the Hydrogen Audio Wiki (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain_specification). As it currently stands, this is a copy-paste from David's (2Bdecided) original proposal (http://replaygain.hydrogenaudio.org). The next steps include copy editing to make it read like a standard, digging through the post-publication discussion on Hydrogen Audio forums (and elsewhere?), and conforming the specification to current practice.

If you would like to make small changes and corrections to the draft, feel free to edit the wiki. If you know of larger changes that need to be made, let's discuss them in this thread first.
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-12 13:38:44
Great work Notat.

Cheers,
David.
Title: Replay Gain specification
Post by: pbelkner on 2010-12-16 11:40:43
In almost all cases RG as currently defined gives very good results. It's a real advantage having RG. However, there are a few exceptions:
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-16 14:07:48
That's not a bad idea, but I think it's best for the wiki to be developed so that it accurately reflects ReplayGain v1 (as widely implemented) before building on it.

The problem at the moment is that the original ReplayGain website is out of date, and a de facto standard exists out there which is based on the original but with several important modifications and improvements. That's what needs to be set in stone here IMO.

Then by all means improve it!

Cheers,
David.
Title: Replay Gain specification
Post by: C.R.Helmrich on 2010-12-16 18:03:21
Thanks a lot for posting this specification! I've been curious about how Replay Gain works for quite some time.

Before I start with suggestions for improvement of RG in general, off to the current text version.



Best,

Chris
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-17 00:26:40
How do I edit this? Who do I ask for permission?

Cheers,
David.
Title: Replay Gain specification
Post by: Notat on 2010-12-17 05:27:44
How do I edit this? Who do I ask for permission?

Cheers,
David.


You have to ask Jan (http://www.hydrogenaudio.org/forums/index.php?showtopic=42543).
Title: Replay Gain specification
Post by: Notat on 2010-12-17 05:46:56
The problem at the moment is that the original ReplayGain website is out of date, and a de facto standard exists out there which is based on the original but with several important modifications and improvements. That's what needs to be set in stone here IMO.


This is indeed the plan. The immediate project is to document current practice. Without that in place, we don't have a stable platform from which to make improvements.

It would probably be best to open separate threads (http://www.hydrogenaudio.org/forums/index.php?showtopic=85614&hl=) to propose and discuss individual improvements.
Title: Replay Gain specification
Post by: Notat on 2010-12-17 17:14:05
I'm working on section 1.4 (Calibration with reference level). 83 dB SPL is mentioned frequently. This strikes me as a red herring. Replay Gain does not endeavor to tell anyone how loud, in absolute terms, they should be listening.

The important point taken by Replay Gain from the SMPTE standard is that -20 dBFS pink noise is the reference to be used for average loudness. In other words, Replay Gain specifies a playback system with 20 dB of headroom to accommodate peaks.

Later in section 3.2 (Pre-amp), a 6 dB boost enabled by default is specified. This has the effect of bringing headroom down to 14 dB.

Does it seem reasonable to remove references to 83 dB SPL and speak in terms of headroom? I think 83 dB is causing confusion. I suspect it has led several players to present user calibration parameters in terms of dB SPL.
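
The headroom arithmetic in this post can be written out as a small sketch. The -20 dBFS reference and the 6 dB default pre-amp are the figures given above; the function and constant names are only illustrative and do not come from the specification.

```python
# Figures taken from the post: -20 dBFS reference, 6 dB default pre-amp.
REFERENCE_DBFS = -20.0
DEFAULT_PREAMP_DB = 6.0


def headroom_db(reference_dbfs, preamp_db=0.0):
    """Distance in dB between the average-loudness target and full scale (0 dBFS)."""
    return 0.0 - (reference_dbfs + preamp_db)


print(headroom_db(REFERENCE_DBFS))                     # 20.0
print(headroom_db(REFERENCE_DBFS, DEFAULT_PREAMP_DB))  # 14.0
```

Nothing here depends on an absolute SPL figure; it only describes where the average level sits relative to digital full scale.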
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-17 18:10:19
No - because you have to assume some listening level to use any psychoacoustics. Talking only about sample values in files with no real world reference is exactly how you create a dead-end standard which no one can ever improve.


There is a major change to make though: what's stored is the 83 dB referenced result, plus an arbitrary 6 dB. That's a de facto change from the original proposal.

Cheers,
David.
Title: Replay Gain specification
Post by: greynol on 2010-12-17 18:57:13
Possibly it is a good idea to let (expert) users overwrite the 95% value at scan time in order to more reflect the character of audio under consideration. The following, including manual post-processing, is not uncommon:

Beatles tracks tend to sound much louder than other stuff in my collection after RG, so I normally bump them downward.

I edited my post to remove the word "much" as it was an unintended exaggeration.  My apologies to everyone.  If it matters any, my Beatles tracks are pre-2009.  I don't know if this is still a noticeable problem with the new remasters.
Title: Replay Gain specification
Post by: Notat on 2010-12-17 19:02:16
No - because you have to assume some listening level to use any psychoacoustics. Talking only about sample values in files with no real world reference is exactly how you create a dead-end standard which no one can ever improve.


There is a major change to make though: what's stored is the 83 dB referenced result, plus an arbitrary 6 dB. That's a de facto change from the original proposal.

Cheers,
David.


An 83 dB SPL listening level assumption contradicts what is (not) said in section 1.1.2 (Required equal loudness filter) - "As we don't know the playback level the listener will choose, and don't want to use a different filter for sounds of differing loudness, a representative average of the above curves will is chosen as the target filter."

It's not entirely unambiguous what a representative average response curve is, but I gather you did not use an 83 dB loudness contour to build the filter.

A simple option is for me to edit out conflicting non-normative detail in both sections. What's normative is the filter design (which I have yet to include in the specification) and the formula for calculating gain (I've got work to do there as well).

I believe the +6 dB is correctly called out in section 3.2 (Pre-amp). The text on replaygain.hydrogenaudio.org says 6 to 12 dB. I removed the 12 dB option in my early edits because I knew 6 dB was current practice.
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-17 19:27:31
I believe the +6 dB is correctly called out in section 3.2 (Pre-amp).
No, that's nothing to do with what is stored.

Quote
The text on replaygain.hydrogenaudio.org says 6 to 12 dB. I removed the 12 dB option in my early edits because I knew 6 dB was current practice.
You know, I haven't read all this through since I wrote it! There were some nuances of meaning that don't seem important now, and others that seem more important.

I will try to contribute, time allowing. Sadly it can't be top of my list. Well, not "sadly" - new house to get ready, new baby on the way, job, Christmas - all good!

Cheers,
David.
Title: Replay Gain specification
Post by: pbelkner on 2010-12-17 20:26:12
I don't know if this is still a noticeable problem with the new remasters.

If RG is restricted to remasters it should be clearly stated in the standard.
Title: Replay Gain specification
Post by: greynol on 2010-12-17 20:34:04
That response seems unnecessary.  It was never my intention to suggest that RG be restricted to remasters.

The reason for my making that comment was to avoid having people come back saying they don't notice; later to find out that they are checking with the remasters.
Title: Replay Gain specification
Post by: twittles on 2010-12-17 21:52:05
I think updating the documentation of RG V1 and structuring a V2 is a great idea.

A suggestion - please clarify for both RG V1 and any V2 whether the use of RG tags (as opposed to applying RG to mp3 data) prevents "bit perfect" playback.  I've seen it posted elsewhere that the use of RG violates bit perfect playback, but I've never seen it clarified what modes of RG use this assumes.  Perhaps I know too much as an engineer and too little in this specific space (my engineering expertise lies elsewhere), but I thought that volume or level changes via RG were communicated digitally to an RG-capable playback device without altering the actual digital bits of the music, and that the playback device altered the level using the RG tag info.

Perhaps clarifying what is meant by "bit perfect" in any clarification will also help.  I think "bit perfect" can mean "no bits are lost, added, or inadvertently altered or distorted".  Assuming this, even if RG alters the actual bits during playback even when using tags, if no bits are lost or added and the only change is intentional, this seems to fit a reasonable definition of "bit perfect with intended modification" or something like that.

Thanks for any clarification on RG and bit perfect playback, stating all assumptions for all judgments.
Title: Replay Gain specification
Post by: Notat on 2010-12-17 23:01:31
If the replay gain is applied in the digital domain, bit transparency is lost. The original proposal included a short discussion of a digitally-controlled analog implementation. For some reason I had not carried that discussion over to the new revision. I have updated the new revision to include it (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain_specification#Hardware_implementation). This demonstrates that a bit-transparent implementation is possible. I'm not aware of any such implementation, however.
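
To illustrate why digital-domain gain cannot be bit transparent, here is a minimal sketch assuming 16-bit integer PCM and a hypothetical gain of -6.3 dB (neither value is taken from the specification). After scaling and rounding, many distinct input values land on the same output value, so the original samples cannot be recovered from the result.

```python
def apply_gain_16bit(samples, gain_db):
    """Scale 16-bit PCM samples by gain_db, rounding and clamping to the 16-bit range."""
    scale = 10.0 ** (gain_db / 20.0)
    out = []
    for s in samples:
        v = int(round(s * scale))
        out.append(max(-32768, min(32767, v)))
    return out


# Every possible 16-bit sample value, attenuated by an example gain.
all_samples = list(range(-32768, 32768))
attenuated = apply_gain_16bit(all_samples, -6.3)

distinct_in = len(set(all_samples))
distinct_out = len(set(attenuated))
print(distinct_in, distinct_out)  # fewer distinct values come out than went in
```

Only a gain of exactly 0 dB leaves the samples untouched in this sketch; whether the loss at other gains is audible is a separate question.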
Title: Replay Gain specification
Post by: Notat on 2010-12-17 23:09:55
I will try to contribute, time allowing. Sadly it can't be top of my list. Well, not "sadly" - new house to get ready, new baby on the way, job, Christmas - all good!

Congratulations on all that!

I've yet to dig carefully through the discussion on RG changes. Hopefully some of these points will iron themselves out as I do.
Title: Replay Gain specification
Post by: twittles on 2010-12-17 23:38:48
If the replay gain is applied in the digital domain, bit transparency is lost. The original proposal included a short discussion of a digitally-controlled analog implementation. For some reason I had not carried that discussion over to the new revision. I have updated the new revision to include it (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain_specification#Hardware_implementation). This demonstrates that a bit-transparent implementation is possible. I'm not aware of any such implementation, however.



Thanks for addressing this.  I'm very impressed that V1 considered this back in 2002.  I think digitally-controlled analog implementation of RG or other bit-transparent RG methods are and will be increasingly important for those creating high quality home theater and whole home audio systems.  My guess is that a bit-transparent RG implementation in a uPnP/DLNA environment will require changes to uPnP or DLNA standards or practices, but I see that as an opportunity to suggest such changes to the relevant bodies that control those specs and practices rather than see that as a permanent barrier.  All things considered, everything evolves, and I'd love to see high quality audio evolve with RG as part of it.
Title: Replay Gain specification
Post by: greynol on 2010-12-17 23:39:45
I'm not aware of any such implementation, however.

I'd like to see a justification of such an implementation based on the results of blind tests with real-world examples.
Title: Replay Gain specification
Post by: twittles on 2010-12-17 23:49:54
I'm not aware of any such implementation, however.

I'd like to see a justification of such an implementation based on the results of blind tests with real-world examples.



I'd like to see what difference this makes as well under various real-world situations.  However, I don't know if I think of it as a "justification" - that implies to me that such an implementation has been proven to be tougher or less desirable to do in some undefined way.  Has it been proven to be tougher or less desirable to implement?  In what way?  What is the standard for "justification"?  What factors do you consider for justification?  Programming cost?  Hardware cost?  Ease of user setup?  Sound quality (double-blind confirmed, of course)?  What relative weights apply to each factor to arrive at a justification?  I'm not trying to be difficult, but as an Electrical Engineer who designed ADCs and DACs long ago this strikes me as just as easy to implement, inertia of current practices notwithstanding.
Title: Replay Gain specification
Post by: greynol on 2010-12-17 23:57:23
Telling your amplifier to adjust the volume by a specific amount?  I would certainly think so.

EDIT: Noting your edit: sound quality.  Back on the topic of difficulty in implementation, perhaps you can explain the mechanism by which this can be accomplished within the current framework of digital transmission.  If it falls outside the current framework, please explain how you would get universal adoption and implementation.
Title: Replay Gain specification
Post by: 2Bdecided on 2010-12-17 23:58:16
Maybe this is a helpful comment, and along the lines of what Notat is already doing...

The "Replay Gain Specification" wiki should be first and foremost an implementation guide. The "fluffy bits" can go. There will still be the ReplayGain HA wiki, which is already doing a very good job explaining it.

However, one thing is important: RG defines a calculation, a way of storing the result of the calculation, and a way of reading and using those values...
1. The most important thing is that you store two gain values and two peak values, with meanings + references/scales as defined.
2. The second most important thing is that a player does something sensible with these - about the most sensible thing I can think of is pretty much what was suggested a decade ago in the original spec, but I'm sure there are variations
3. The third most important thing is the calculation of those values. Only third-most, because you could improve it while remaining completely compatible with the intent of ReplayGain and all players.
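
As a minimal sketch of points 1 and 2, assuming the four stored values are a track gain and album gain in dB plus a track peak and album peak on a linear scale where 1.0 is full scale: the field names, the pre-amp parameter, and the clipping rule below are illustrative assumptions, not wording from the specification.

```python
from collections import namedtuple

# Hypothetical container for the two gain values and two peak values.
ReplayGainTags = namedtuple(
    "ReplayGainTags",
    ["track_gain_db", "track_peak", "album_gain_db", "album_peak"],
)


def playback_scale(tags, use_album=False, preamp_db=0.0, prevent_clipping=True):
    """Return the linear factor a player could multiply decoded samples by."""
    gain_db = tags.album_gain_db if use_album else tags.track_gain_db
    peak = tags.album_peak if use_album else tags.track_peak
    scale = 10.0 ** ((gain_db + preamp_db) / 20.0)
    if prevent_clipping and peak > 0.0:
        # Never scale so far that the stored peak would exceed full scale.
        scale = min(scale, 1.0 / peak)
    return scale


tags = ReplayGainTags(track_gain_db=-6.0, track_peak=0.9,
                      album_gain_db=-4.0, album_peak=0.95)
print(playback_scale(tags))                  # track mode
print(playback_scale(tags, use_album=True))  # album mode
```

Because the player only reads the stored values, the way they were calculated can change without affecting this side, which is the compatibility argument made in point 3.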

Oh, while we're talking about de facto standards, I think ReplayGain is better than Replay Gain. FWIW Google seems to think it's more common.

Cheers,
David.
Title: Replay Gain specification
Post by: twittles on 2010-12-18 01:27:24
Telling your amplifier to adjust the volume by a specific amount?  I would certainly think so.

EDIT: Noting your edit: sound quality.  Back on the topic of difficulty in implementation, perhaps you can explain the mechanism by which this can be accomplished within the current framework of digital transmission.  If it falls outside the current framework, please explain how you would get universal adoption and implementation.



I see you think that inertia of current practices is the obstacle, and I already anticipated that even if I also hold such inertia as more of an excuse than valid reason. 

Hardware perspective:  From a "clean slate" view for a new product I hope you can see that this is trivial.  Even from the inertia perspective, any playback device that reads tag information will already have the RG tag information if there is an RG tag in the song file, and even if it doesn't the change is straightforward.  Adding logic to a playback device to use the tag information for volume control is truly trivial.  I've been designing integrated circuits for many years, so I am confident in this assessment.  The incremental cost of any such implementation approaches zero if the change is made when stepper reticles or masks are changed for other reasons.  The microcode, logic and transistor redesign, layout design, design verification, and testing verification changes are all trivial if done in conjunction with other scheduled changes.  Package or pinout change should be zero.  Been there, done all of that.  Any such changes can be timed for product refreshes (as opposed to just for this one change), as is done across the industry.

If the design change is made using discrete components (trying to use existing integrated circuits that have not designed this in), the component cost goes up a little, but not much.  I can definitely make the change cost look astronomical if I burden it with fixed and other allocated costs, but the true incremental cost for a competent designer is small.  I also have an MBA and I take on bogus business cases at work all the time from those that want to kill a change with nonincremental costs.

From a digital transmission perspective, I hope you were joking.  Surely you aren't proposing that ethernet(TCP/IP), 802.11x, USB, S/PDIF, or other digital transmission standards need to be modified to accommodate this.

I am not an expert on application changes, but I would truly be disappointed if the above can be done fairly easily and still have an application architect or programmer claim it's difficult.  Yes, it is different than today, but that doesn't automatically make it difficult.  Taking a step back, this can't be more difficult than implementing RG from scratch, and that happened years ago with far less capable technology. 

How to get this adopted?  I'd go after the IC makers first for the reasons above and time the changes for an already-scheduled refresh.  I know this is not done most of the time, but I think it's because of a lack of relationships at the IC level.  I'd go after Sigma Designs or TI first.  I'm on the defense side of the biz, so I don't know if these market leaders are hungry or complacent in this consumer market, but if complacent, go after their competitors who may be more open to something that will differentiate their product.  You want fast adoption, go after the IC designers.

I hope we can at least agree that a rigorous sound comparison with well-executed implementations of the two approaches would be very informative.
Title: Replay Gain specification
Post by: greynol on 2010-12-18 05:37:13
I guess I shouldn't have lumped this in with the thread you started about bit-perfect playback then.  Seeing that bit-perfect playback is about passing digital data from your media player unaffected through your soundcard to your external DAC, I don't see RG information being passed downstream to your preamp, integrated amplifier or receiver as a trivial endeavor.

Otherwise what you're suggesting has already been implemented in audio hardware such as the various Squeezebox devices or DAPs enabled with Rockbox or in a comparable but non-RG manner such as the iPod devices via soundcheck.

Anyway, if you haven't already, please read-up on this forum as I don't think any of this is new territory.

If I'm wrong on any of these points, please feel free to correct me provided someone else doesn't beat you to it.
Title: Replay Gain specification
Post by: Notat on 2010-12-18 16:21:06
I'd like to see what difference this makes as well under various real-world situations.

In the commercial designs I've reviewed, the introduction of a digitally-controlled gain stage degrades performance to the extent that you can get about the same or better system performance by not using the top two MSBs of your DAC when you need to attenuate by the amounts we're talking about with RG. If you don't believe me, look up the specs on digital attenuator devices. It is not that the attenuators suck badly; it's that today's DACs are so good.
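
As a rough check on the "top two MSBs" figure: each bit of a linear PCM word corresponds to a factor of two in amplitude, which is about 6.02 dB. The helper names below are illustrative only.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB per bit


def attenuation_db_for_bits(bits):
    """Attenuation equivalent to leaving the top `bits` MSBs unused."""
    return bits * DB_PER_BIT


def bits_for_attenuation_db(attenuation_db):
    """Number of MSBs (possibly fractional) left unused by a given attenuation."""
    return attenuation_db / DB_PER_BIT


print(round(attenuation_db_for_bits(2), 2))    # 12.04
print(round(bits_for_attenuation_db(6.0), 2))  # 1.0
```

So giving up two MSBs covers attenuations up to roughly 12 dB, which is the range the post is talking about.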
Title: Replay Gain specification
Post by: pbelkner on 2010-12-19 10:09:23
Back to the standard itself. There are a few questions coming to my mind regarding the scanner. Please have mercy on me asking these possibly trivial questions, but my knowledge regarding filter construction is limited at best.

Quote
Required equal loudness filter

[...]

[A] representative average of the above curves [i.e. the equal loudness curves as measured by Robinson and Dadson, 1956, and Fletcher and Munson, 1933] will is chosen as the target filter. The desired filter response is shown in Figure 2.

Will the final standard define the "representative average" in more detail than just giving a figure?

Quote
Design of the equal loudness filter

[...]

Feeding the target response into [MATLAB] yulewalk.m, and requesting a 2x10 coefficient IIR filter gives the following response: [referring to fig. 3].

[...]

One solution is to cascade the yulewalk filter with a 2nd order Butterworth high pass filter, with a high pass frequency of 150 Hz. The resulting combined response (Figure 4) is close to our target response, and is used by Replay Level.

Will the resulting coefficients of the yulewalk and the Butterworth filters (cf. http://replaygain.hydrogenaudio.org/equal_loud_coef.txt) be part of the standard?

When is an alternate scanner implementation (e.g. wavegain (http://www.rarewares.org/others.php), using two IIR filters with different coefficients, possibly with the same or similar response as the MATLAB yulewalk and Butterworth filters) called compliant with the standard?
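
For readers unfamiliar with filter design, here is a sketch of just the high-pass stage mentioned in the quoted text: a 2nd order Butterworth high pass at 150 Hz, designed with the standard bilinear-transform biquad formulas. The 44100 Hz sample rate is an assumption, the coefficients produced here are not claimed to match the published equal_loud_coef.txt, and the yulewalk stage is not reproduced.

```python
import cmath
import math


def butterworth_highpass_biquad(cutoff_hz, sample_rate_hz):
    """2nd order Butterworth high pass as a normalised biquad (b, a)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    q = 1.0 / math.sqrt(2.0)  # Q of a 2nd order Butterworth section
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def magnitude_db(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))


b, a = butterworth_highpass_biquad(150.0, 44100.0)
for f in (15.0, 150.0, 1000.0, 10000.0):
    print(f, round(magnitude_db(b, a, f, 44100.0), 2))
```

A comparison like this (response at a set of test frequencies within a stated tolerance) is one possible way to decide whether two scanners with different coefficients behave alike, though the thread does not say that the standard will define compliance this way.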
Title: Replay Gain specification
Post by: C.R.Helmrich on 2010-12-26 20:47:26
Just letting you know that I took the liberty and bumped an old algorithm-related thread on ReplayGain with some updates, as an invitation to continue the development-related discussion there.

http://www.hydrogenaudio.org/forums/index.php?showtopic=69568

Best,

Chris
Title: Replay Gain specification
Post by: grindel on 2010-12-26 22:04:10
Please consider something in the enhanced spec to deal with how HDCD processing and Replaygain should fit together.  Per the thread here http://www.hydrogenaudio.org/forums/index.php?showtopic=79427, it seems that currently one must not use Replaygain processing for any tracks which one wants to decode for HDCD.  It seems that Replaygain is currently applied first, and when this is done it somehow corrupts whatever the HDCD decoder needs to detect and decode properly.

If it can be done, it makes sense to me to have Replaygain either not corrupt whatever is messing up HDCD processing, or perhaps have an option to allow the user to detect for HDCD and, if chosen, have Replaygain combine its processing and whatever HDCD does in a coordinated fashion.  It appears that hardware solutions can decode all aspects of HDCD, but HDCD.exe, which a few applications seem to use, processes only portions of the full HDCD spec - I have no idea what this means for the Replaygain spec, but the person who wrote HDCD.exe seems to be over in Doom9 per http://forum.doom9.org/showthread.php?t=129136

In general, whatever you can do to not force users to pick either Replaygain use or HDCD decoding would be appreciated.  At a very minimum, please make it clear in the current spec that this current tradeoff must occur, if the thread above is accurate.  Thanks for picking up Replaygain spec documentation and development!
Title: Replay Gain specification
Post by: greynol on 2010-12-26 22:51:36
I think a post-processing implementation of RG should be handled by the player, rather than within the RG specification.
Title: Replay Gain specification
Post by: grindel on 2010-12-26 23:05:09
I think a post-processing implementation of RG should be handled by the player, rather than within the RG specification.



That's the way it's implicitly done today, and this approach fails users who want the benefits of both.  The current Replaygain spec is silent on this issue, and since other developers are not mind readers, this issue is not dealt with.  A different result requires a different approach.
Title: Replay Gain specification
Post by: greynol on 2010-12-26 23:12:01
That's the way it's implicitly done today, and this approach fails users who want the benefits of both.  The current Replaygain spec is silent on this issue, and since other developers are not mind readers, this issue is not dealt with.  A different result requires a different approach.

Yes and the approach should be taken up with the people designing the players which actually perform the post-processing rather than simply shifting expectation of omniscience over to the specification.

Think about it: people might want RG to be post-surround sound processing or post-EQ or album gain that is applied to an arbitrary playlist (http://www.hydrogenaudio.org/forums/index.php?showtopic=85614).

Also, consider that HDCD decoding is not an open specification.
Title: Replay Gain specification
Post by: krabapple on 2010-12-27 09:42:37
Yes and the approach should be taken up with the people designing the players which actually perform the post-processing rather than simply shifting expectation of omniscience over to the specification.

Think about it: people might want RG to be post-surround sound processing or post-EQ or album gain that is applied to an arbitrary playlist (http://www.hydrogenaudio.org/forums/index.php?showtopic=85614).

Also, consider that HDCD decoding is not an open specification.



Forget HDCD, I just wish I could apply RG to .dts and .ac3 (or .dtswav/.ac3wav) bitstreams.  I can't see clearly how this would be done, though.  If transmission isn't bit-perfect, such files only yield white noise from my AVR. 


Title: Replay Gain specification
Post by: googlebot on 2010-12-27 12:11:01
Replay Gain is not the missing link for highly individual, particular problems. It has a well defined purpose. Its place within the playback chain is after decoding. HDCD decoding should happen before Replay Gain application, the same is true for dts and ac3 and all other codecs. Replay Gain information requires proper storage space for its records within a metadata stream. Requests regarding bitstreams which do not provide such should be ignored.
Title: Replay Gain specification
Post by: andy o on 2010-12-27 12:32:03
Forget HDCD, I just wish I could apply RG to .dts and .ac3 (or .dtswav/.ac3wav) bitstreams.  I can't see clearly how this would be done, though.  If transmission isn't bit-perfect, such files only yield white noise from my AVR.

How about tweaking the dialnorm tag? Most AVRs at least nowadays allow for dialnorm.
Title: Replay Gain specification
Post by: audyoknut on 2010-12-27 19:08:19
Replay Gain is not the missing link for highly individual, particular problems. It has a well defined purpose. Its place within the playback chain is after decoding. HDCD decoding should happen before Replay Gain application, the same is true for dts and ac3 and all other codecs. Replay Gain information requires proper storage space for its records within a metadata stream. Requests regarding bitstreams which do not provide such should be ignored.


Performing Replaygain actions after decoding anything in the bitstream makes sense.  Isn't the problem a few posts above a result of performing Replaygain before decoding something in the bitstream (in the case above, HDCD)?  If not accurate, when does Foobar perform Replaygain decoding today with respect to ac3, dts, and HDCD?

However players currently process Replaygain, does it make sense for the revised Replaygain spec to address the consideration of when Replaygain is applied, either by stipulating a requirement, making an explicit assumption, or providing for implementation before or after decoding the bitstream?  I fear confusion and suboptimal use of Replaygain if these issues are not dealt with in some way in the spec.
Title: Replay Gain specification
Post by: googlebot on 2010-12-27 19:58:38
I do not agree. It's trivial to understand that Replay Gain modifies gain. And it is trivial to understand, that any codec which requires unmodified data (as HDCD) must be applied before Replay Gain modification.

It's impossible to apply Replay Gain to an AC3 or DTS bitstream without decoding, so it is not necessary to specify that, either. At least DTS has its own loudness control variant, too. That should be preferred instead of hacking Replay Gain into it.
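
The ordering argument can be shown with a toy example. The marker scheme below is purely hypothetical (a bit pattern hidden in the least significant bits of integer samples) and is not the HDCD format; it only stands in for "any decoder that needs unmodified samples". The -6 dB gain is likewise an arbitrary example value.

```python
# Hypothetical marker that a later decoding stage expects to find in the sample LSBs.
MARKER = [1, 0, 1, 1, 0, 0, 1, 0]


def embed_marker(bits):
    """Build toy PCM samples whose least significant bits carry the marker."""
    return [((2000 + 100 * i) & ~1) | bit for i, bit in enumerate(bits)]


def extract_marker(samples):
    """Read the marker back from the least significant bits."""
    return [s & 1 for s in samples]


def apply_gain(samples, gain_db):
    """Scale integer samples by gain_db and round to the nearest integer."""
    scale = 10.0 ** (gain_db / 20.0)
    return [int(round(s * scale)) for s in samples]


decoded_pcm = embed_marker(MARKER)

# Correct order: the stage that needs unmodified samples runs first.
marker_ok_before_gain = extract_marker(decoded_pcm) == MARKER
# Wrong order: gain first, then the marker can no longer be read.
marker_ok_after_gain = extract_marker(apply_gain(decoded_pcm, -6.0)) == MARKER

print(marker_ok_before_gain, marker_ok_after_gain)
```

Once the gain has been applied the hidden pattern is gone, so any stage that relies on it has to run on the untouched samples first.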
Title: Replay Gain specification
Post by: audyoknut on 2010-12-27 20:11:49
I do not agree. It's trivial to understand that Replay Gain modifies gain. And it is trivial to understand, that any codec which requires unmodified data (as HDCD) must be applied before Replay Gain modification.

It's impossible to apply Replay Gain to an AC3 or DTS bitstream without decoding, so it is not necessary to specify that, either.



I think the difference between "easy to understand once explained" and "intuitive such that no mention is warranted" has led to the current confusion.  I think there is a connection between the lack of any mention of such considerations in the current spec, and implementations that have apparently led to applying Replaygain before decoding at least HDCD.  I have a great deal of respect for the developers of Foobar, and they obviously had many, many things to think about during development, so I can see how no mention of this contributed to the current situation.

Some may argue that given that Replaygain can only be used after decoding with certain codecs (for example but not limited to ac3, dts), but can be implemented even if certain other codecs have not been applied (for example but not limited to HDCD), that the situation is not intuitive or even trivial.

If there is so strong a feeling that Foobar is currently processing Replaygain incorrectly (prior to decoding everything in the bitstream), then perhaps some energy should be expended to correct that if only one way of Replaygain implementation will be tolerated.
Title: Replay Gain specification
Post by: googlebot on 2010-12-27 20:34:17
If there is so strong a feeling that Foobar is currently processing Replaygain incorrectly (prior to decoding everything in the bitstream), then perhaps some energy should be expended to correct that if only one way of Replaygain implementation will be tolerated.


Foobar is not doing anything "incorrectly". HDCD is to blame and the issue is exclusive to it. It doesn't fit into a clean architecture where decoding is done by decoders and DSP is done by DSP components. The HDCD component (by a 3rd party developer) is wrapped into a DSP component although it is a decoder. Blame a broken format. The Foobar devs haven't done anything wrong. If the HDCD component's developer wants to support both HDCD and Replay Gain, nothing prevents him from adding Replay Gain support to the component (users would have to disable the integrated solution).

HDCD is a feature without audible benefit and being fed with unmodified samples is its exclusive requirement. IMHO there are no other cases. AC3 and DTS are unrelated.
Title: Replay Gain specification
Post by: greynol on 2010-12-27 20:54:05
If there is so strong a feeling that Foobar is currently processing Replaygain incorrectly (prior to decoding everything in the bitstream), then perhaps some energy should be expended to correct that if only one way of Replaygain implementation will be tolerated.

I'm getting the feeling that neither you, nor the prior poster understand the evolution of ReplayGain, let alone what ReplayGain is.  ReplayGain began as nothing more than an implementation of a proposed standard.  There is no official body granting or denying its usage based on some idea of compliance or anything else for that matter.  ReplayGain began before HDCD became as commonplace as it is today, which is still not very common (which is still a huge overstatement of its ubiquity).

This is not to say that it is impossible for RG to be applied post-processing, just that there is no need to start including exceptions about current and future unforeseen applications; especially considering that this won't change the way RG is calculated.

Oh, besides the other cases that I listed, I just thought of another one: de-emphasis.  Does this mean that RG needs a specific exception for this too?  Of course not.  De-emphasize -> re-encode (lossless/lossy) -> calculate RG.  It's just common sense.
Title: Replay Gain specification
Post by: audyoknut on 2010-12-27 20:57:17
HDCD is a feature without audible benefit...



That's a TOS #8 violation, unless you can provide a valid test to refute the first case evaluated here: http://www.hydrogenaudio.org/forums/index....showtopic=85019 (http://www.hydrogenaudio.org/forums/index.php?showtopic=85019)  I concluded this was a TOS violation because TOS #8 seems to require objective support for any statement concerning subjective sound quality, not just assertions of audible differences.  If my interpretation is in error and claims of no audible differences are allowed, then I guess this is not a TOS violation.

Replaygain exists in a world with many other pre-existing aspects, not in a world by itself.  I think any spec that ignores the world around it does so at the risk of limited usefulness.  Thinking about exceptions, or use cases, or implementation guidance seems fully appropriate for any spec.  I am not comforted that "blame" can be cast elsewhere.  At a minimum, as many of us learned in engineering school, stating all assumptions as explicitly as possible is just part of rigorous technical work.  If the goal is to exclude all such considerations, at least make explicit assumptions to that effect, without casting any blame.
Title: Replay Gain specification
Post by: greynol on 2010-12-27 21:02:35
Yes your interpretation is incorrect; googlebot's claim is not a TOS8 violation.  Feel free to prove him wrong with level-matched and time-synched samples and an ABX log demonstrating a statistically significant result, however.  Since you took the liberty to link the discussion I started, please follow the spirit of the discussion and not waste our time comparing a sample of HDCD not decoded with one that is.

BTW, who told you to calculate RG and write tags to HDCD that has not yet been decoded???  Don't blame the spec because people haven't exercised proper common sense regarding its application!
Title: Replay Gain specification
Post by: GeSomeone on 2010-12-27 21:43:06
The HDCD component is wrapped into a DSP component although it is a decoder.

This is not true (anymore) since foobar2000 1.1 and foo_hdcd 1.5 (http://www.hydrogenaudio.org/forums/index.php?showtopic=79427&view=findpost&p=720688)
Title: Replay Gain specification
Post by: pbelkner on 2010-12-28 07:17:34
It's impossible to apply Replay gain to an AC3 or DTS bitstream without decoding,

I'm not certain about this, but I can imagine that decoding is done using floating point numbers allowing values above FS, and the final step of the decoder is casting these FP numbers into 16 bit integers, possibly causing clipping.

If the final cast of the FP number into an integer is integrated with applying RG, clipping could possibly be avoided.
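
To make the idea concrete, here is a minimal sketch of folding the gain into the final float-to-integer step. It assumes the decoder hands over floats where 1.0 is full scale; the function name `to_int16` and the sample values are made up for illustration and are not taken from any real decoder.

```python
def to_int16(samples, gain_db=0.0):
    """Scale float samples (1.0 = full scale) by gain_db, then quantize to 16 bit."""
    scale = 10.0 ** (gain_db / 20.0)
    out = []
    for s in samples:
        value = round(s * scale * 32768.0)
        # Clamp only as a last resort, after the gain has been applied.
        out.append(max(-32768, min(32767, value)))
    return out

decoded = [0.5, 1.2, -1.1]               # hypothetical decoder output above full scale
late = to_int16(decoded)                 # cast without gain: 1.2 and -1.1 get clamped
early = to_int16(decoded, gain_db=-3.0)  # gain applied before the cast: nothing clamps
```

Applying the same -3 dB to `late` afterwards could not bring back the clamped peaks, which is the whole argument for doing it inside the cast.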
Title: Replay Gain specification
Post by: googlebot on 2010-12-28 08:39:09
That whole thought is so flawed, I don't even know where to start.

The easiest would be to bin this for continued off-topicness. Or we split from post 29 up to this one. I'm going to answer then.
Title: Replay Gain specification
Post by: BlAcKnOiSe on 2010-12-28 10:03:19
My biggest problem with RG always was, that in the end ALL of my albums will have the same RMS level. This is great for "old-quiet vs new-loud" correction, but a disaster in the sense that ambient and classical will sound as loud as heavy metal...
(Not to mention that a lot of ambient drone tracks will have peaks above 0, unless you tell the unit to take that into account.)
Some music is simply intended to be quieter than others. I still have to touch the volume slider to adjust for these differences, so it almost defeats the whole purpose of RG in this sense.

Isn't there a chance that RG can store information about the style of the music, and have an option to further lower the level, e.g. of pop/rock/electro by 3-6dB and of classical/jazz/ambient by 6-12dB, or something similar?
Title: Replay Gain specification
Post by: pbelkner on 2010-12-28 10:57:03
That whole thought is so flawed, I don't even know where to start.

Why?

At least liba52 (http://liba52.sourceforge.net/) provides a way to supply a gain to the decoder. I've just moved applying RG to AC3 streams to the decoder, and it works great. Moreover, I think it's the best place to do it, if possible.

The easiest would be to bin this for continued off-topicness. Or we split from post 29 up to this one. I'm going to answer then.

The OT discussion was not started by me.

Title: Replay Gain specification
Post by: googlebot on 2010-12-28 11:51:40
At least liba52 (http://liba52.sourceforge.net/) provides a way to supply a gain to the decoder. I've just moved applying RG to AC3 streams to the decoder, and it works great. Moreover, I think it's the best place to do it, if possible.


The point is, it wasn't claimed that RG cannot be applied by a decoder. AC3 and DTS bitstreams just don't have a RG metadata field. So you either have to use their private methods for loudness correction instead of RG or apply gain before encoding, which implies prior decoding for an already existing DTS or AC3 stream.
Title: Replay Gain specification
Post by: Notat on 2010-12-28 17:10:41
That whole thought is so flawed, I don't even know where to start.

The easiest would be to bin this for continued off-topicness. Or we split from post 29 up to this one. I'm going to answer then.

It would be helpful to me if off-topic contributors started their own threads every once in a while. Failing that, I will start the new ones (http://www.hydrogenaudio.org/forums/index.php?showtopic=85834).
Title: Replay Gain specification
Post by: Notat on 2010-12-28 17:31:13
Will the final standard define the "representative average" in more detail than just giving a figure?

Will the resulting coefficients of the yulewalk and the Butterworth filters (cf. http://replaygain.hydrogenaudio.org/equal_loud_coef.txt (http://replaygain.hydrogenaudio.org/equal_loud_coef.txt)) be part of the standard?

When is an alternate scanner implementation (as e.g. wavegain (http://www.rarewares.org/others.php) using two IIR filters with different coefficients, possibly with the same or similar response as the MATLAB yulewalk and Butterworth filters) called compliant with the standard?

What I have posted so far is a copy of the original RG proposal. Much of the discussion and background is interesting and potentially useful but is not a normative part of the standard. The normative piece for the filter is currently found in the MATLAB files available at the bottom of this page (http://replaygain.hydrogenaudio.org/calculating_rg.html). I still need to convert that from MATLAB to a normative description.

An alternative filter design with the same response would potentially be compliant. An alternative filter design with a merely similar response would not be compliant.

Going forward, I would like to discuss incorporation of loudness standards such as ITU-R BS.1770 into an updated version of RG. This is a loudness measurement system that gives results very similar to RG, is slightly less computationally intensive and is starting to be used extensively in broadcast applications. The recently signed CALM act mandates ITU-R BS.1770 for loudness measurement.
Title: Replay Gain specification
Post by: Notat on 2010-12-28 17:33:11
Just letting you know that I took the liberty and bumped an old algorithm-related thread on ReplayGain with some updates, as an invitation to continue the development related discussion there.

http://www.hydrogenaudio.org/forums/index....showtopic=69568 (http://www.hydrogenaudio.org/forums/index.php?showtopic=69568)

Thank you very much for bringing this to my attention. I have bookmarked it and will refer to it as I move forward.
Title: Replay Gain specification
Post by: Notat on 2010-12-28 17:39:30
My biggest problem with RG always was, that in the end ALL of my albums will have the same RMS level. This is great for "old-quiet vs new-loud" correction, but a disaster in the sense that ambient and classical will sound as loud as heavy metal...
(Not to mention that a lot of ambient drone tracks will have peaks above 0, unless you tell the unit to take that into account.)
Some music is simply intended to be quieter than others. I still have to touch the volume slider to adjust for these differences, so it almost defeats the whole purpose of RG in this sense.

Isn't there a chance that RG can store information about the style of the music, and have an option to further lower the level, e.g. of pop/rock/electro by 3-6dB and of classical/jazz/ambient by 6-12dB, or something similar?

Many media players give you the ability to adjust level on a per-track basis. I believe this adjustment is above and beyond what loudness normalization like RG does for you. You should be able to select files by genre and tweak the level manually en masse as you've specified. I don't think we want to make RG smart enough to do this for listeners. In this department, what works for you will not necessarily work for others.
Title: Replay Gain specification
Post by: krabapple on 2010-12-28 20:54:27
The point is, it wasn't claimed that RG cannot be applied by a decoder. AC3 and DTS bitstreams just don't have a RG metadata field. So you either have to use their private methods for loudness correction instead of RG or apply gain before encoding, which implies prior decoding for an already existing DTS or AC3 stream.



Bear with my ignorance a moment please and tell me if I'm right or wrong --

My ac3 and dts files are all in WAV wrappers now (i.e., dtswav, ac3wav).  That made it possible to convert them to FLAC, which I have also done.  These FLAC files are actually the ones I stream via toslink from my laptop (foobar2k with foo_spdif and WASAPI output) to my AVR, which decodes them into surround sound.  This particular combination of file format and software settings makes it possible to have F2K playlists containing both 2channel and 5.1 channel tracks, all with proper tagging and album art....a wonderful thing. 

I certainly can add replaygain meta info to my flac'ed dts and ac3 files -- thanks to the RG metadata field.  But they don't play if I do that, I'm guessing because foobar is trying to change the gain of an undecoded bitstream.  If I understand correctly, the solution would be for F2K to leave them alone, and have something in the *AVR* that recognizes the RG field of the incoming stream, and adjusts decoded output accordingly.

[EDIT: ooops, if this belongs in the new thread, I'd be grateful if the mods moved it]
Title: Replay Gain specification
Post by: [JAZ] on 2010-12-29 18:44:10
Seems you are trying to solve a problem without understanding why it fails.


spdif is just a stream of data. This data can be raw, or can be a codec.

Now, if the file is stored not only inside a .wav file, but in fact, a lossless codec, why would the program know that it has an encoded file?

The problem is not about Replaygain, think about it: Any DSP would fail (equalizer, reverb, stereo expander, resampler...). The only way for this to not cause a problem is for the program to know in some way that it is streaming a signal, and not playing audio.
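
One way a program could "know" is to look for the sync words that mark an encoded burst before touching the samples. The sketch below is only an illustration under stated assumptions: it assumes 16-bit little-endian PCM and the IEC 61937 preamble words 0xF872 and 0x4E1F, and the function names are hypothetical. Files that wrap a raw DTS or AC3 stream without that preamble would need a different check, and ordinary audio could in principle contain the same two words by coincidence.

```python
import struct

IEC61937_PA = 0xF872  # first sync word of an IEC 61937 data burst
IEC61937_PB = 0x4E1F  # second sync word

def looks_like_bitstream(pcm_bytes):
    """Return True if 16-bit little-endian PCM contains an IEC 61937 preamble."""
    count = len(pcm_bytes) // 2
    words = struct.unpack("<%dH" % count, pcm_bytes[:count * 2])
    return any(a == IEC61937_PA and b == IEC61937_PB
               for a, b in zip(words, words[1:]))

def apply_gain_if_audio(pcm_bytes, gain_db):
    """Apply gain to real audio, but pass an encoded stream through bit-exact."""
    if looks_like_bitstream(pcm_bytes):
        return pcm_bytes
    scale = 10.0 ** (gain_db / 20.0)
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    scaled = [max(-32768, min(32767, round(s * scale))) for s in samples]
    return struct.pack("<%dh" % count, *scaled)
```

The same guard would have to sit in front of every other DSP as well, which is exactly the point made above.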
Title: Replay Gain specification
Post by: googlebot on 2010-12-29 20:28:10
[JAZ] is right. Foobar doesn't care what's inside a sample and will just send it to the audio output. This works so well that you can put totally different formats such as AC3 and DTS into PCM samples, and they get passed along just fine. As is the case for analog audio, metadata is not part of the output stream just because you choose to output over S/PDIF. Foobar would have to modify the DTS or AC3 stream directly to encode that kind of information, either by reencoding the content or by tapping into those formats' respective methods for loudness control, which is IMHO preferable.
Title: Replay Gain specification
Post by: greynol on 2010-12-29 20:34:51
As is the case for analog audio, metadata is not part of the output stream just because you choose to output over S/PDIF.

This was precisely what I was getting at in my earlier discussion with twittles.
Title: Replay Gain specification
Post by: krabapple on 2010-12-29 20:39:10
As is the case for analog audio, metadata is not part of the output stream just because you choose to output over S/PDIF.

This was precisely what I was getting at in my earlier discussion with twittles.


By george I think I've got it now.  Thanks guys.

(Btw, I never thought the problem was about Replaygain.)
Title: Replay Gain specification
Post by: Axon on 2011-01-04 01:00:43
A couple ideas, please comment if they sound worthwhile:
Title: Replay Gain specification
Post by: Notat on 2011-01-04 04:05:33
I will add a brief discussion of reference program level to the specification.

It is reported (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=69568&view=findpost&p=736857) that this paper (http://www.aes.org/e-lib/browse.cfm?elib=15341) compares RG to BS.1770. BS.1770 apparently is slightly more accurate. Apple does not publish technical information about Sound Check but reports I've heard indicate that it is less accurate than RG or BS.1770.

I have done some more updates to the specification (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain_specification) tonight. The loudness measurement filter is now specified in terms of topology and filter coefficients. I use the same presentation approach used in the BS.1770 standards document.
Title: Replay Gain specification
Post by: C.R.Helmrich on 2011-01-04 22:31:21
It is reported (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=69568&view=findpost&p=736857) that this paper (http://www.aes.org/e-lib/browse.cfm?elib=15341) compares RG to BS.1770.

Not to forget the Master's thesis I linked to, which comes to the same conclusion.

Quote
ReplayGain ought to be compared with SoundCheck and BS.1770 (now that those standards exist)

I'd rather compare RG to the state of the art, meaning the "integrated" EBU mode meter according to R 128, which is BS.1770 with a loudness gate. See my post in the thread Notat mentioned. My own experiments with high-dynamic-range songs like Phil Collins' "In The Air Tonight" seem to confirm the necessity of gating (otherwise the loud passage of the song ends up too loud).
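
For readers who have not met gating before, here is a rough sketch of the principle only. It works on plain per-block mean-square values and leaves out the frequency weighting and channel weighting of BS.1770 entirely. The two thresholds (-70 dB absolute, -10 dB relative) are the values I recall from later revisions of the recommendation; treat them as assumptions and check the current documents before relying on them. The function name is hypothetical.

```python
import math

def integrated_level_db(block_powers, abs_gate_db=-70.0, rel_gate_db=-10.0):
    """Gated mean level in dB from per-block mean-square values (1.0 = full scale)."""
    def to_db(power):
        return 10.0 * math.log10(power) if power > 0 else float("-inf")

    def mean(values):
        return sum(values) / len(values)

    # Stage 1: drop blocks that are essentially silence.
    audible = [p for p in block_powers if to_db(p) > abs_gate_db]
    if not audible:
        return float("-inf")
    # Stage 2: drop blocks well below the average of what is left.
    threshold = to_db(mean(audible)) + rel_gate_db
    gated = [p for p in audible if to_db(p) > threshold]
    return to_db(mean(gated))

# A song with a long quiet part (-40 dB) and a short loud part (-10 dB).
powers = [1e-4] * 90 + [0.1] * 10
ungated = 10.0 * math.log10(sum(powers) / len(powers))  # roughly -20 dB
gated = integrated_level_db(powers)                     # roughly -10 dB
```

Without the gate the quiet section drags the measurement down by about 10 dB, so normalization would turn the track up and the loud passage would end up too loud, which matches the observation above.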

Chris


Edit: fixed some typos
Title: Replay Gain specification
Post by: Notat on 2011-01-04 23:14:52
It is reported (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=69568&view=findpost&p=736857) that this paper (http://www.aes.org/e-lib/browse.cfm?elib=15341) compares RG to BS.1770.

Not to forget the Master's thesis I linked to, which comes to the same conclusion.

Just finished reading the Dolby paper. It does not actually contain a direct performance comparison of RG and BS.1770.

Next up on my reading list is the Swedish Radio thesis (http://www.speech.kth.se/prod/publications/files/3319.pdf).

Thanks again for bringing this all to my attention.
Title: Replay Gain specification
Post by: Axon on 2011-01-05 00:22:48
OT: "Swedish Radio Thesis" is a fine band name.
Title: Replay Gain specification
Post by: pbelkner on 2011-01-05 18:46:43
Quote
ReplayGain ought to be compared with SoundCheck and BS.1770 (now that those standards exist)

I'd rather compare RG to the state of the art, meaning the "integrated" EBU mode meter according to R 128, which is BS.1770 with a loudness gate. See my post in the thread Notat mentioned. My own experiments with high-dynamic-range songs like Phil Collins' "In The Air Tonight" seem to confirm the necessity of gating (otherwise the loud passage of the song ends up too loud).

Many thanks for pointing me to that.

As far as I understand matters, BS.1770 is a loudness measuring method. A standard comparable to RG and residing on top of BS 1770 is EBU R128 (cf. http://tech.ebu.ch/loudness) (http://tech.ebu.ch/loudness)).

I've just started a corresponding topic (http://www.hydrogenaudio.org/forums/index.php?showtopic=85978&hl=).
Title: Replay Gain specification
Post by: krabapple on 2011-01-05 20:49:16
As far as I understand matters, BS.1770 is a loudness measuring method. A standard comparable to RG and residing on top of BS 1770 is EBU R128 (cf. http://tech.ebu.ch/loudness) (http://tech.ebu.ch/loudness)).



that link is broken for me.

but I found this direct link to the pdf of EBU R128 (http://tech.ebu.ch/docs/r/r128.pdf) "Loudness normalisation and permitted maximum level of audio signals"
Title: Replay Gain specification
Post by: pbelkner on 2011-01-05 21:13:25
As far as I understand matters, BS.1770 is a loudness measuring method. A standard comparable to RG and residing on top of BS 1770 is EBU R128 (cf. http://tech.ebu.ch/loudness) (http://tech.ebu.ch/loudness)).



that link is broken for me.

Sorry for including the right parentheses into the link. The correct link is
http://tech.ebu.ch/loudness (http://tech.ebu.ch/loudness)
Title: Replay Gain specification
Post by: krabapple on 2011-01-06 03:19:29
As far as I understand matters, BS.1770 is a loudness measuring method. A standard comparable to RG and residing on top of BS 1770 is EBU R128 (cf. http://tech.ebu.ch/loudness) (http://tech.ebu.ch/loudness)).



that link is broken for me.

Sorry for including the right parentheses into the link. The correct link is
http://tech.ebu.ch/loudness (http://tech.ebu.ch/loudness)



Thanks.
Title: Replay Gain specification
Post by: singlethread on 2011-01-07 22:17:19
There is a major change to make though: what's stored is the 83dB referenced result, plus an arbitrary 6dB. That's a defacto change from the original proposal.

I believe the +6 dB is correctly called out in section 3.2 (Pre-amp). The text on replaygain.hydrogenaudio.org says 6 to 12 dB. I removed the 12 dB option in my early edits because I knew 6 dB was current practice.

Notat, I don't think this is right.

The original RG spec was calibrated to -20 dB, and RG analyzers were supposed to store the corresponding level change. However, it quickly became common among RG analyzers to add an extra +6 dB before storing the level change. Effectively, these analyzers are storing the required level change for -14 dB, not -20 dB. The analyzers I have studied (mp3gain, lame, vorbisgain) all use this modified level.

This is quite different from pre-amp. Pre-amp is an optional correction which is supposed to be applied by the decoder, not the RG analyzer.

The addition of +6 dB, as implemented by RG analyzers, was accepted as the new standard by David (here: index.php?act=findpost&pid=154605). I believe that is what he refers to above. Unfortunately that change was never added to the replaygain website. Can somebody confirm that current implementations indeed use -14 dB instead of -20 dB?

The updated RG spec still talks about -20 dB, same as the original spec. This should be changed to -14 dB, according to the modified standard.
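
A small worked example of the difference, as a sketch only. It assumes the analyzer has already produced a loudness figure in dB relative to full scale (the filtering and statistics that produce that figure are skipped), and the names `stored_gain_db` and `measured_loudness_db` are hypothetical.

```python
ORIGINAL_TARGET_DB = -20.0  # reference level in the original proposal
DE_FACTO_OFFSET_DB = 6.0    # extra gain that analyzers add before storing

def stored_gain_db(measured_loudness_db, de_facto=True):
    """Gain an analyzer would store for a track with the given measured loudness."""
    target = ORIGINAL_TARGET_DB + (DE_FACTO_OFFSET_DB if de_facto else 0.0)
    return target - measured_loudness_db

# A loud modern track measured at -12 dB:
original = stored_gain_db(-12.0, de_facto=False)  # -8.0 dB under the original proposal
current = stored_gain_db(-12.0)                   # -2.0 dB under the -14 dB practice
```

The stored values differ by exactly 6 dB, so a player that also adds a 6 dB pre-amp on top of a tag written by such an analyzer would apply the offset twice.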

Sorry for jumping on this so late. I was just reading through the current version of Notat's updated spec when I noticed this.
Title: Replay Gain specification
Post by: 2Bdecided on 2011-01-07 22:38:34
^^^^ this is correct.
Title: Replay Gain specification
Post by: Notat on 2011-01-08 17:18:55
The "Replay Gain Specification" wiki should be first and foremost an implementation guide. The "fluffy bits" can go. There will still be the ReplayGain HA wiki, which is already doing a very good job explaining it.

I am working my way through the document from top to bottom. I have thus far removed "fluffy bits" from beginning through section 1.3. Section 1.4 is next and in addition to removing fluffy bits, I will update the reference level as discussed in this thread and other references.
Title: Replay Gain specification
Post by: Jean Tourrilhes on 2011-01-10 07:33:44
I've taken the first step towards fulfilling my threat to produce an up-to-date edition of the Replay Gain specification.

The working draft is published on the Hydrogen Audio Wiki (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain_specification).


I may be totally clueless, but the writeup seems to be very specific to ID3 and MP3, especially the part about bit format. I'm using FooBar2000 to RG tag my FLAC files, which I listen to on a SqueezeBox. I don't have or care for MP3 files. As far as I am aware, I'm using ReplayGain, but the VorbisComments are quite different from what you list on your page.

It seems that I have 4 values (and not 3), and they all seem to be textual:
replaygain_album_gain -1.99 dB
replaygain_album_peak 1.000000
replaygain_track_gain -2.60 dB
replaygain_track_peak 1.000000
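
For what it's worth, reading these textual values back is straightforward. The following is a sketch that assumes the tags have already been extracted into a dictionary of strings by whatever tag library is in use; the function name `parse_replaygain` is hypothetical, and the tolerance for a missing space or a leading "+" is my own choice, not something a spec is known to require.

```python
def parse_replaygain(tags):
    """Parse textual ReplayGain tags into floats (gains in dB, peaks linear)."""
    result = {}
    for key, value in tags.items():
        name = key.lower()
        if not name.startswith("replaygain_"):
            continue
        text = value.strip()
        if name.endswith("_gain"):
            # Accept "-1.99 dB", "-1.99dB" and "+1.99 dB" alike.
            if text.lower().endswith("db"):
                text = text[:-2]
            result[name] = float(text)
        elif name.endswith("_peak"):
            result[name] = float(text)
    return result

tags = {
    "replaygain_album_gain": "-1.99 dB",
    "replaygain_album_peak": "1.000000",
    "replaygain_track_gain": "-2.60 dB",
    "replaygain_track_peak": "1.000000",
}
parsed = parse_replaygain(tags)
```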

Good luck...

Jean

Title: Replay Gain specification
Post by: Notat on 2011-01-10 23:22:28
This is work in progress. Vorbis comment documentation is not in there yet but it is coming.

I have just finished the updated description of reference level and gain calculation. I've expressed things in terms of headroom which is how professional audio engineers, R128 and BS.1770 tend to look at things. The SPL reference levels are retained and explained as well.
Title: Replay Gain specification
Post by: mjb2006 on 2011-01-11 02:53:03
The addition of +6 dB, as implemented by RG analyzers, was accepted as the new standard by David (here: index.php?act=findpost&pid=154605).

I used a later statement by David (index.php?act=findpost&pid=475019) to justify an explanation in the RG informational article in the HA wiki (http://wiki.hydrogenaudio.org/index.php?title=Replay_Gain) that the spec calls for a reference level "of 83dB (with the expectation that players would add 6dB at playback time). However, the de facto standard has been for the reference level to simply be 89dB and for players not to add anything extra."

So if we're going to replace -20 dB SPL with -14 dB SPL in the spec, then shouldn't we also change the default pre-amp from +6 dB to zero? Or am I not understanding some nuance here?

Also, the new draft was using "ReplayGain" and "Replay Gain" inconsistently. I changed it all back to "Replay Gain" for now until we were sure it should be changed. "ReplayGain" seems to be how many people talk about the spec and the technology in general ("instead of normalizing your audio data, just add ReplayGain tags and configure your ReplayGain-aware media player according to your listening preferences"). I personally feel comfortable with using it as a compound term like that, and I think the spec should acknowledge it as a permissible spelling.

But in its prose, the spec talks of "the appropriate replay gain", "the Replay Gain adjustment", "the replay level", and so on, where "replay" is very deliberately an adjective, clarifying which gain/level/adjustment, out of several being discussed. So in that context, it seems more appropriate to keep the words separate. I'm not sure how to reconcile these two ways of talking about RG. Maybe just keep it as-is, and add a footnote to its first use saying that the technology in general is often called either ReplayGain or Replay Gain, but when referring to a specific type of gain in prose, the latter is preferable?
Title: Replay Gain specification
Post by: greynol on 2011-01-11 03:02:17
FWIW:
http://www.hydrogenaudio.org/forums/index....st&p=736023 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85536&view=findpost&p=736023)

Of course I have already objected (and still object!) to the "because that is what David did" argument regarding "preferred" ID3 tagging, so take the link with a grain of salt.
Title: Replay Gain specification
Post by: jdoering on 2011-01-11 07:11:44
Seems simple to me. The specific technology/standard has a brand name for which David suggested "ReplayGain" has an edge. In standard prose, when you're talking about replay being used as a normal adjective, it's a generic usage of the term rather than a reference to the brand name. It should neither be capitalized nor concatenated in this case as it's just normal English. If the usage is overly confusing in a particular context, then reword (e.g. "the gain used during playback" instead of "the replay gain") assuming that is less awkward in a particular situation.

Of course I haven't looked through all of the situations you've reviewed; so maybe it's not so easy in practice...

-Jeff
Title: Replay Gain specification
Post by: googlebot on 2011-01-11 10:42:59
+replaygain has about twice as many Google results as +"replay gain".
Title: Replay Gain specification
Post by: 2Bdecided on 2011-01-11 11:41:47
Of course I have already objected (and still object!) to the "because that is what David did" argument regarding "preferred" ID3 tagging, so take the link with a grain of salt.
I'm not following this properly, so haven't seen this, but if you're referring to the sections in the original proposed spec describing how to store the ReplayGain values, then if there's a different implementation out there that's far more common than the one I originally suggested, then obviously that common implementation is the defacto standard, and that is what is supposed to be captured in this wiki spec (IMO!).

AFAICT for mp3 it's a bit of a mess (not a huge mess, but not ideal!) and the most useful thing to do might be to document the multiple approaches that are out there and what uses/supports each approach. A presumption could be that the widest supported approach(es) is/are the defacto standard, and anything with little or no support could be deprecated (even if it's what's in the original proposed standard). I doubt there can be one single standard for RG tags in mp3, since there isn't one single standard for tagging mp3.

Cheers,
David.
Title: Replay Gain specification
Post by: Notat on 2011-01-11 14:57:02
So if we're going to replace -20 dB SPL with -14 dB SPL in the spec, then shouldn't we also change the default pre-amp from +6 dB to zero? Or am I not understanding some nuance here?

You are correct. I am working my way from top to bottom to bring the specification up to date with current practice. Last night I finished "Reference level" and "Gain calculation" sections. Next is documenting the metadata. When I get to the player recommendations section I expect it will be revised to specify a 0 dB default preamp gain.

BTW: For these references, it is -20 dBFS (http://en.wikipedia.org/wiki/DBFS) not -20 dB SPL (http://en.wikipedia.org/wiki/DBSPL#Sound_pressure_level). I have chosen not to use "dBFS" in the specification because there is some ambiguity whether the reference for dBFS is a full-scale square wave (peak reference) or a sine wave (RMS reference).
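
The size of that ambiguity is easy to check numerically. This is just a sanity check, not part of the specification; the 1 kHz tone and 48 kHz rate are arbitrary choices.

```python
import math

def rms(samples):
    """Root mean square of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

n = 48000  # one second at 48 kHz, a whole number of 1 kHz cycles
sine = [math.sin(2 * math.pi * 1000 * i / n) for i in range(n)]
square = [1.0 if s >= 0 else -1.0 for s in sine]

sine_rms_db = 20 * math.log10(rms(sine))      # about -3.01 dB
square_rms_db = 20 * math.log10(rms(square))  # 0 dB
```

So the same signal reads about 3 dB differently depending on which convention a meter uses, which is why the draft spells out "relative to a full-scale sinusoid".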
Title: Replay Gain specification
Post by: Notat on 2011-01-11 15:01:31
+replaygain has about twice as many Google results as +"replay gain".

David has said that he now prefers ReplayGain. I can make the change if there are no strong objections.

It would also be nice if RG had a logo.
Title: Replay Gain specification
Post by: godrick on 2011-01-12 15:22:29
You may be aware of this already, but:

- XBMC uses ReplayGain
- XBMC is going to be supported in hardware in some fashion shortly per http://www.sigmadesigns.com/uploads/librar...ases/110104.pdf (http://www.sigmadesigns.com/uploads/library/press_releases/110104.pdf)

So that seems to mean that ReplayGain will be supported in hardware to some extent, along with the rest of XBMC.  If you are not doing so already, maybe it makes sense to reach out to the XBMC developers to inform them of your updates, and see if they can implement any enhancements as part of their hardware porting.
Title: Replay Gain specification
Post by: Jean Tourrilhes on 2011-01-12 18:37:36
This is work in progress. Vorbis comment documentation is not in there yet but it is coming.


ID3 is the main reason why I'm using Vorbis on my wife's CLIP+ rather than MP3. On the SqueezeBox and on the PC, FLAC all the way. Converting from FLAC to Vorbis guarantees that there is no tag lossage. Documenting existing VorbisComment practices would probably take a minimal amount of time and would give users a working solution while the ID3 mess gets very slowly argued and resolved.

By the way, the CLIP+ is one of the few players supporting Replay Gain on the original firmware, but ID3 support is brittle and picky, as you would expect. They only accept the MediaMonkey encoding, and have problems with the Foobar2000 encoding. In your spec, you may want to make sure to specify the number of spaces and if '+' is included, which seems to be an issue :-(.

http://forums.sandisk.com/t5/Clip-Clip/Rep...nal/td-p/132019 (http://forums.sandisk.com/t5/Clip-Clip/Replay-Gain-A-how-to-informational/td-p/132019)

Another comment. I don't see how you can have working clipping prevention if you don't have both the track peak and the album peak. If you only store track peak, you may be forced to adjust gain between tracks. If you store only album peak, you may adjust down a track unnecessarily in shuffle mode.

Regards,

Jean
Title: Replay Gain specification
Post by: Notat on 2011-01-13 02:45:00
Another comment. I don't see how you can have working clipping prevention if you don't have both the track peak and the album peak. If you only store track peak, you may be forced to adjust gain between tracks. If you store only album peak, you may adjust down a track unnecessarily in shuffle mode.

It depends what you mean by "working clipping prevention". You certainly can prevent clipping with just the track peak metric. It will have the behavior you describe. I'm personally not convinced that this is any less obnoxious than lowering the gain below target for the entire album.

In the end, this sort of clipping prevention is a third-rate solution. The first-rate solution is to set your player up with enough dynamic range so that you never need to boost anything. Second-rate is to use a nice digital peak limiter after the boost.

Nevertheless, I do plan to add discussion in the RG specification of how album peak can be used to prevent clipping when album gain mode is used.
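
Roughly, the peak-based approach amounts to the following. This is a sketch of the general idea, not text from the specification; the function name `clip_safe_gain_db` is hypothetical and the peak is assumed to be stored as a linear value where 1.0 is full scale.

```python
import math

def clip_safe_gain_db(gain_db, peak, preamp_db=0.0):
    """Largest gain up to gain_db + preamp_db that keeps the peak at or below full scale."""
    wanted = gain_db + preamp_db
    if peak <= 0.0:
        return wanted  # silent or unknown peak: nothing to limit against
    headroom_db = -20.0 * math.log10(peak)  # gain that would bring the peak exactly to 1.0
    return min(wanted, headroom_db)

# Track mode: pass the track gain with the track peak.
track = clip_safe_gain_db(3.0, 0.5)
# Album mode: pass the album gain with the album peak, so every track
# on the album gets the same (possibly reduced) gain.
album = clip_safe_gain_db(3.0, 0.9)
```

With a peak of 0.5 the requested +3 dB fits; with a peak of 0.9 only a little under 1 dB of boost is available, so the gain is reduced for the whole album rather than changing from track to track.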
Title: Replay Gain specification
Post by: Notat on 2011-01-13 02:53:04
So that seems to mean that ReplayGain will be supported in hardware to some extent, along with the rest of XBMC.  If you are not doing so already, maybe it makes sense to reach out to the XBMC developers to inform them of your updates, and see if they can implement any enhancements as part of their hardware porting.

In theory there will be no enhancements in the first version of the specification. It's all about documenting current practice in more prescriptive language. This first version is half complete at this point and therefore not yet self-consistent. Anyone looking at it now will get confused, so it is not a good idea to share it beyond people following it here. Hopefully it will be finished in a few more weeks.

Thanks for all the comments everyone. They have really helped me up the learning curve.
Title: Replay Gain specification
Post by: mjb2006 on 2011-01-14 04:01:06
I have chosen not to use "dBFS" in the specification because there is some ambiguity whether the reference for dBFS is a full-scale square wave (peak reference) or a sine wave (RMS reference).

I went ahead and added a note to that effect at the first use of "dB relative to a full-scale sinusoid", but that phrase is only used twice, and I think it needs to be used a few more times. I think the spec would be more clear if we just used dBFS everywhere it's needed and had a note on the first use saying that for our purposes, dBFS is dB relative to a full-scale sinusoid. I was about to do that, then backpedaled a bit. I'd rather not be the one responsible
Title: Replay Gain specification
Post by: Rescator on 2011-01-16 03:41:57
I think that this v2.0 effort of the ReplayGain spec should also include the following:

Tag: Peak_Track
Type: 32bit float normalized

Tag: Peak_Album
Type: 32bit float normalized

Tag: RMS_Track
Type: 32bit float normalized

Tag: RMS_Album
Type: 32bit float normalized

Peak_Track and Peak_Album are hopefully obvious enough.
The reason for 32bit float is that it's more precise than dB (which has conversion and rounding issues to/from float).
A float peak for the track and album can be used directly in the audio processing chain (after conversion from text first obviously, unless the metatag format allows binary float that is).
0.0 is silence and 1.0 is full digital, this is pretty much as industry standard as you can get, and it's likely that 1.0 will remain full digital for the foreseeable future, regardless of whether dB or LU is "popular".
And if the tag is textual the float could even be 64bit, as a player could simply truncate the text. (Some text to binary float routines already do this.)

Please note I wrote Peak_ rather than say ReplayGain_Peak_Track etc, as I believe track and album peak is important enough to be available that it should not be "name" limited to just the ReplayGain standard,
in fact peak should be fully independent and just part of the standard audio format meta tag specs instead.

Which brings me to the RMS_Track and RMS_Album tags.
Again it's 32bit float (but again implementations should handle a 64bit by truncating if the tag is text obviously),
and if a track had a "RMS_Track=0.1" then that would equal -20dBFS (which is exactly float value 0.1).
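
The relationship between the float value and dBFS used here can be written out as a small sketch. It assumes, as the post does, that a linear value of 1.0 corresponds to 0 dBFS; the function names are only illustrative.

```python
import math


def linear_to_dbfs(value: float) -> float:
    """Convert a linear amplitude (1.0 = full scale, assumed) to dBFS."""
    return 20.0 * math.log10(value)


def dbfs_to_linear(dbfs: float) -> float:
    """Convert a dBFS level back to a linear amplitude."""
    return 10.0 ** (dbfs / 20.0)
```

With these, `dbfs_to_linear(-20.0)` is 0.1 (up to floating-point rounding), and `linear_to_dbfs(0.2)` is about -13.98.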

The calculation of the RMS should be (and I believe this is again the industry standard way):
50ms window (sine I believe?), each channel is squared and the means of the channels are added and then the root is calculated. (If I recall correctly, ReplayGain actually does this at one of its stages?)
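
A minimal sketch of the windowed RMS described above, under a few assumptions that are not in the post: the window is rectangular and non-overlapping, samples are floats in the range -1.0 to 1.0, and a trailing partial window is dropped. The name `window_rms` is hypothetical. The post says the channel means are added, so that is what this does; an implementation that averages the channels instead would divide `total` by the number of channels before taking the root.

```python
import math


def window_rms(channels, sample_rate, window_s=0.05):
    """Return one RMS value per 50 ms window.

    channels:    list of equal-length sample lists, one per channel.
    sample_rate: samples per second.
    """
    n = max(1, round(sample_rate * window_s))  # samples per window
    length = len(channels[0])
    results = []
    for start in range(0, length - n + 1, n):
        total = 0.0
        for ch in channels:
            block = ch[start:start + n]
            # Mean square of this channel over the window.
            total += sum(s * s for s in block) / n
        # Channel means are added (as in the post), then the root is taken.
        results.append(math.sqrt(total))
    return results
```

How the per-window values are reduced to a single track or album figure is not stated in the post, so that step is left out here.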

Again RMS_Track and RMS_Album are named such as they like Peak_ should be part of the audio format meta spec.

The benefit is threefold:
As implementations like ReplayGain improve, adopt new loudness curves, or adjust their top/bottom percentage cutoffs, the RMS remains constant (Z filter, aka flat filter, or neutral if you will).
Artists could theoretically hand tag the tracks they publish, getting the RMS from say Adobe Audition or similar tools; almost every "studio" audio tool lets you do RMS (provided it also shows the float value in addition to the dBFS).
Users can simply tell their audio player that they'd prefer the music loudness adjusted to -20dBFS.

Implementation is thus rather easy in the player.
1. The user specified -20dBFS as preferred loudness (I advise this to be the default value too, as it matches the K-20 of Bob Katz's "K-System", which in turn matches the SMPTE standard).
2. The player translates -20 dBFS (in this case) to 0.1 float.
3. The player finds the RMS_Track tag, which let's say is 0.2 (around -13.98 dBFS, aka K-14)
4. So the math is 0.1 / 0.2 = 0.5, meaning that the player simply needs to do: (SMPL * 0.5)
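
The four steps above can be sketched in a few lines. `RMS_Track` is the tag proposed in this post, not an existing standard tag, and the function names are made up for illustration; 1.0 is assumed to be 0 dBFS.

```python
def rms_scale_factor(target_dbfs: float, rms_track: float) -> float:
    """Steps 2 to 4: turn the preferred level and the tagged RMS into a multiplier."""
    target_linear = 10.0 ** (target_dbfs / 20.0)  # -20 dBFS -> 0.1
    return target_linear / rms_track              # 0.1 / 0.2 -> 0.5


def apply_gain(samples, factor):
    """Scale every sample by the factor (SMPL * factor)."""
    return [s * factor for s in samples]
```

For example, `rms_scale_factor(-20.0, 0.2)` gives 0.5 (up to floating-point rounding), so `apply_gain([0.4, -0.8], 0.5)` yields `[0.2, -0.4]`.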

This opens up for a lowest common denominator, ensures that even if ReplayGain or R128 is replaced by something new that playback is consistent.
With Peak_Track, Peak_Album, RMS_Track and RMS_Album as a minimum common denominator then the likes of ReplayGain and R128 are optional.
A user can choose to use RMS only, or RMS with ReplayGain preferred, or use R128 but fall back to ReplayGain if it is missing, and fall back to RMS if that is missing too.
ReplayGain or R128 may be obsolete one day, but RMS will not as it is the base of almost all these loudness algos out there.
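
A sketch of how such a preference order with fallback might look in a player. Everything here is an assumption made for illustration: tags are a plain dict of strings, `RMS_Track` is the tag proposed in this post, and the reader functions are hypothetical. A reader for R128 values could be added to the list in the same way.

```python
import math


def gain_from_replaygain(tags):
    """Read a gain in dB from a REPLAYGAIN_TRACK_GAIN value such as '-1.76 dB'."""
    text = tags.get("REPLAYGAIN_TRACK_GAIN")
    if text is None:
        return None
    return float(text.replace("dB", "").strip())


def gain_from_rms(tags, target_dbfs=-20.0):
    """Derive a gain in dB from the proposed RMS_Track tag (linear, 1.0 = full scale)."""
    text = tags.get("RMS_Track")
    if text is None:
        return None
    rms = float(text)
    if rms <= 0.0:
        return None
    return target_dbfs - 20.0 * math.log10(rms)


def pick_gain_db(tags, readers):
    """Try each reader in the user's preferred order; use the first that has a value."""
    for reader in readers:
        gain = reader(tags)
        if gain is not None:
            return gain
    return 0.0  # no usable tag: leave the track untouched
```

Calling `pick_gain_db(tags, [gain_from_replaygain, gain_from_rms])` prefers ReplayGain and falls back to RMS; reversing the list gives "RMS with ReplayGain as fallback".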

It also shouldn't be so hard for the ReplayGain code to also spit out the RMS tag as well as the ReplayGain tag.


Some of you may say that RMS is not as accurate as ReplayGain etc.
But I do not agree with that, as most music is mixed by someone listening, anything that shouldn't be there is filtered out.
This means that a low frequency or high frequency sound that is present is 99% intentionally left there on purpose.

I have in many cases reacted to ReplayGain adjusting songs wrongly.
Like when a track with a very bassy sound is suddenly tagged (by the foobar2000 RG scanner) as needing a +6 dB gain when it already has peaks at 0dBFS (or sometimes above with lossy formats),
while a track with little bass is tagged with -3dB, yet I can clearly hear that the bass-modest track should be twice as loud while the bass-heavy one should have been at least twice as soft to match my -20dBFS preferred average.
And trust me, I double checked and re-scanned, and the funny thing is that when I ran the same tracks through my own plain simple RMS (done as described earlier, which should be the standard way to do RMS AFAIK),
the RMS results actually matched what I heard.

And here is the even more amusing thing. Last time I did a full scan of a large part of my music collection, it turned out that overall (summed average of all loudness values)
the plain RMS only deviated +/- 1dB versus ReplayGain. (We're talking thousands of tracks, from metal, to techno, to pop, to standup, to film scores and classical, and "chip" computer music.)
But always when I hear that RG is wrong vs what I hear, and I check with my RMS tool I find that the RMS value is always correct.

I always wondered why, but the answer is simple. It is the loudness curve combined with the percentile selection, both "ignore" parts of the sound.
While RMS takes it all. Sure, a straight RMS scan may get biased by noise. Then again...a lot of music is noise these days.
Loudness curves work best with tones, while RMS works "ok" with anything, from tones to static. And it's never that much off from what you hear vs the RMS values.

So I think that adding generic Track/Album tags for Peak and RMS as a minimum requirement, is a must before evolving ReplayGain further and adding loudness versions,
or R128 or ITU whatever into this mess.
And while RMS is not perfect, it has certainly never been wrong that I can recall hearing.

Also, RMS does not have issues with which loudness curve or which percentile range should be used.
Nor does it have the confusion of the 89 vs 83 dB SPL reference level that ReplayGain has. (I find it amusing that ReplayGain itself fell prey to the louder-is-better trap: people complained their pop tracks were not loud enough, so voila, there's the +6dB change.) Now ReplayGain is "trapped" at 89dB SPL (which is -14dBFS aka K-14) while it should have been K-20.
Maybe v2.0 of ReplayGain may mitigate that, or the R128 proposal etc.
But in the meantime my Peak and RMS suggestions above could be rolled out right now, both are established standards, easy to write and read, and would not conflict with ReplayGain nor any future similar standards.

I'm a programmer and a musician in case anyone is wondering. I've released 3 albums. And as an experiment I made sure the RMS of them was at -20dBFS (aka floating point at around 0.1).
ReplayGain insisted on making some tracks louder while others softer, again it was the loudness curve (or was it the percentile selection or a combo of both?) that was way off base.
As the artist that made those tracks I knew darn well not only how loud the tracks were overall, but also how loud they are intended to be at certain points.
And again my own RMS tool (named K20RMS for obvious reasons) showed me what I heard, that the loudness algo was nuts.
When a track with a very strong bass sound that is already loud enough is supposed to be amplified by +10 dB, then something is clearly not right; it should have been approx +0 dB instead.

The perfect loudness filter will never be possible. Everyone hears things differently, due to race, age, emotion, attention, environment or hearing damage, and due to what is used to emit the sounds.
And every piece of audio sounds different as far as loudness goes, based on environment, race, age, emotion (do I need to go on?)
The only way to get a perfect loudness filter is to play all recordings ever made to all people that can hear....good luck with that, and the extreme cases will still be upset due to the averaging of such filters.

But RMS on the other hand is unbiased, so if -20dBFS sounds ok for someone then it probably does. While others may like it at -18dBFS, while some may like it at -23dBFS. Let the user set the preferred RMS loudness, and optionally let them choose which method they'd like to take precedence over which others. In my case I'd probably choose RMS only but... *shrug*

Users get confused with the Preamp alone. Add to that the still existing confusion of ReplayGain's 83 and 89db SPL, the upcoming ReplayGain "revision", and the new R128 and so on, and we aren't exactly making the fight against the loudness war any easier are we?

Another thing to remember is that RMS calculation is darn cheap (2-3 lines of code added to a normal DSP loop maybe) compared to loudness filters, can be done faster than realtime, and no patent or license issues at all, and RMS is not tied to any SPL, as RMS is fully unbiased.
RMS would also act as a fallback in cases where the user has enabled clipping prevention but ReplayGain or R128 etc insists on adding so much gain that the track would clip. The player could in that case look at the RMS and most likely see that the audio is already at the user's exact preferred level in the first place, for example.

Some audio formats do have Peak support, but for track only, so no album peak, and none have so far had either track or album RMS. These four should be a bare minimum in any format these days.
Heck it would take Apple hardly any time at all to do a Peak and RMS scan of the entire iTunes catalog and update the tags, they might even be willing to do so if asked nicely, but good luck trying to get them to do a ReplayGain scan or R128 scan etc...


Sorry if this came out as a rant against ReplayGain, it really isn't. (at least not directly) I'm an avid fan of the efforts of the likes of Bob Katz and his K-System (especially K-20).
And ReplayGain is used on all the tracks in my music collection, and foobar2000 is set to gain+clipping prevention and track mode. It just nags me that every now and again a track comes along that RG has been way off base with.
Sure I could manually tweak the tag. But when a similar RMS tag would not need any manual tweaking I'd rather want to set foobar2000 to use -20dBFS preferred and use RMS tags even if ReplayGain tags are present.

I may just end up writing a proper tool one of these days which makes Peak_Track and Peak_Album and RMS_Track and RMS_Album tags. (+ ReplayGain tags with a REPLAYGAIN_ALGORITHM=K20RMS for compatibility reasons, which would not be needed if Peak and RMS was already supported, that is...)
Title: Replay Gain specification
Post by: jdoering on 2011-01-16 07:18:00
Rescator you totally confused me. You're talking about additions to the ReplayGain spec but then it really sounds like you gave a lot of examples of what you perceive as failings of the current ReplayGain algorithm.

Why would the current spec which presumes use of the current algorithm (and therefore also assumes that one sees value in using the ReplayGain solution despite lack of a perfect algorithm) add requirements to support a completely separate set of parallel RMS tags? Do the values have a use for a player that want to apply ReplayGain as intended? If you don't want to apply ReplayGain (because it doesn't work or whatever) then why would the ReplayGain spec have anything to do with a completely alternative solution?

-Jeff
Title: Replay Gain specification
Post by: Rescator on 2011-01-16 13:46:47
Because "ReplayGain" is becoming an umbrella standard.
It is moving towards a v2.0

Currently there are (let's call them this for simplicity's sake):
RG83 and RG89
Then there will most likely be a RG2.0
plus there already IS a R128 (it ids itself with the tag REPLAYGAIN_ALGORITHM: EBU R128) http://www.hydrogenaudio.org/forums/index....st&p=739665 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85978&view=findpost&p=739665)

So that's RG83, RG89, R128, and probably RG2.0 down the road.
All I'm saying is to add RMS as well.

ReplayGain is both a specification and an algorithm,
I like the specification but not the algo (as I pointed out).
But I also see no point in say the peak and rms having to be in REPLAYGAIN_TRACK_RMS tags when just RMS_Track, RMS_Album and Peak_Track, Peak_Album would do.
Thus the REPLAYGAIN_TRACK_PEAK and REPLAYGAIN_ALBUM_PEAK would just be phased out slowly.
As I said Peak and (IMO) RMS are so basic all formats should have them, I do not see a need to plonk REPLAYGAIN_ in front of them.
But that doesn't mean that the ReplayGain specification can't standardize what I explained about Peak and RMS in the post above.
Title: Replay Gain specification
Post by: Notat on 2011-01-16 20:37:50
Excuse me if I missed this point in your long post. It seems to me that your personal findings with respect to using RMS as a predictor of perceived loudness are in conflict with what the literature says. How do you explain this discrepancy?
Title: Replay Gain specification
Post by: 2Bdecided on 2011-01-17 11:14:59
More importantly, please keep this thread for discussing the wiki ReplayGain spec - which is being authored to reflect current practice.

Ideas for improving ReplayGain are always welcome, but belong in their own threads.

Cheers,
David.
Title: Replay Gain specification
Post by: Notat on 2011-01-19 20:29:52
David has said that he now prefers ReplayGain. I can make the change if there are no strong objections.

There were no objections. I've made the edits. We're now officially "ReplayGain".
Title: Replay Gain specification
Post by: Notat on 2011-07-27 18:37:40
I've cleaned up the HydrogenAudio wiki (http://wiki.hydrogenaudio.org/index.php?title=ReplayGain) and Wikipedia (http://en.wikipedia.org/wiki/ReplayGain) articles to reflect the camel case name: ReplayGain.
Title: Replay Gain specification
Post by: Jebus on 2011-07-27 23:17:04
Would it be possible to provide links in the spec to pink noise reference samples? I know there is a link somewhere, but even that file targets the old 83dB reference. Also, mono and stereo versions would be nice.

FYI, I've finished implementing ReplayGain in native C# by following the revised specifications. It was pretty easy to understand, even for a layman developer. Good work!
Title: Replay Gain specification
Post by: Jebus on 2011-07-28 00:35:37
One thing that was a tad confusing is how 2 channels should be analyzed together. The WIKI says:

Quote
SMPTE cinema calibration calls for a single channel of pink noise reproduced through a single loudspeaker. In music applications, the ideal level of the music is actually the loudness when both speakers are in use. So, ReplayGain is calibrated to two channels of pink noise


This suggests to me that the left and right channels should be summed (or doubled for mono), then RMS should be calculated on these values. The "reference" implementation from replaygain.dll however calculates the RMS using all samples divided by the number of channels, which sounds more like it's calculating the average of the two channels for each 50ms window.

Does that make sense? I assume that the reference implementation is correct and that the wording is confusing, or that my understanding is simply muddled. Which is it?
Title: Replay Gain specification
Post by: Notat on 2011-07-28 04:28:30
Because it is all defined in terms of the reference, I don't think it matters how you combine the two channels so long as it is balanced and you do it the same way for the reference signal and the test signal.
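
To make that argument concrete, here is a small sketch comparing two ways of combining channels. The sample data is made up, the function names are hypothetical, and both signals are assumed to have the same number of channels. Summing and averaging the per-channel mean squares differ only by a constant factor, which drops out once the test signal is measured relative to the reference.

```python
import math


def level_db(channels, combine):
    """Level in dB from per-channel mean squares, combined by the given function."""
    mean_squares = [sum(s * s for s in ch) / len(ch) for ch in channels]
    return 10.0 * math.log10(combine(mean_squares))


def summed(mean_squares):
    return sum(mean_squares)


def averaged(mean_squares):
    return sum(mean_squares) / len(mean_squares)


def relative_gain_db(reference, test, combine):
    """Gain needed to bring the test signal to the reference level."""
    return level_db(reference, combine) - level_db(test, combine)


# Made-up two-channel signals, only for illustration.
reference = [[0.2, -0.2, 0.2, -0.2], [0.2, -0.2, 0.2, -0.2]]
test = [[0.5, -0.5, 0.5, -0.5], [0.3, -0.3, 0.3, -0.3]]

gain_summed = relative_gain_db(reference, test, summed)
gain_averaged = relative_gain_db(reference, test, averaged)
```

Both `gain_summed` and `gain_averaged` come out the same apart from floating-point rounding, which is the point being made: the choice only matters if the reference and the test signal are combined differently.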

It would be confusing to link to the SMPTE -20 dB reference signal. Does someone want to volunteer to generate a -14 dB version for us?
Title: Re: Replay Gain specification
Post by: Xerus on 2018-02-20 22:44:54
What happened to all of this? I wanted to add ReplayGain to an application of mine, but I stumbled upon so much stuff and so few actually useful things like libraries, since I prefer to hand over the analysis and calculation to someone who knows the stuff.
It would be good to have a reference analyser implementation in major programming languages separate from LAME so that it can be used directly by application developers and thus spread faster.

That said, I liked Rescator's proposal to create standard Peak_Track and Peak_Album tags, since I think that the current keys are a bit long and verbose, and since peak isn't anything inherently dependent on RG itself. If that gets recognised, the other two could also be shortened to RG_Track and RG_Album.
I also like the idea of 0.0 to 1.0 represented by a simple float. Currently I have to parse the tags as strings from values like -1.76 dB, which is quite ugly in my opinion.
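
For what it's worth, the string parsing mentioned above can be kept fairly small. This is only a sketch: it assumes gain values look like `-1.76 dB` (optional sign, decimal number, `dB` suffix), and the function names are made up.

```python
import re

# Optional sign, digits with optional fraction, then "dB" (case-insensitive).
_GAIN_RE = re.compile(r"^\s*([+-]?\d+(?:\.\d+)?)\s*dB\s*$", re.IGNORECASE)


def parse_gain_db(text: str) -> float:
    """Parse a gain tag value such as '-1.76 dB' into a float number of dB."""
    match = _GAIN_RE.match(text)
    if match is None:
        raise ValueError(f"not a gain value: {text!r}")
    return float(match.group(1))


def gain_db_to_factor(gain_db: float) -> float:
    """Convert a gain in dB to the linear multiplier applied to samples."""
    return 10.0 ** (gain_db / 20.0)
```

`parse_gain_db("-1.76 dB")` returns -1.76, and `gain_db_to_factor` of that is roughly 0.82, which is the plain float a player actually multiplies samples by.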