My Personal Guide to mp3Gain

As a newbie to the world of mp3, I have read the various FAQ, tutorials, and other HA threads concerning mp3gain, hoping to learn as much as possible about how it works, so that I can use it most effectively.  While I think I have learned a lot, I know that there are some serious gaps in my understanding, and I hope others in this forum can fill them in for me.  I don’t want to discuss the “track” vs “album” issue, as that certainly seems to have been explained very clearly.  But other pieces of the puzzle are still fuzzy to me.  Any help you can provide in correcting my terminology, fixing my errors, or filling in the gaps would be greatly appreciated – even if you only have time to comment on a small portion of this summary.

Here is what I have so far:

[1] In analysis mode, mp3gain analyzes an mp3 file, and  - using a psychoacoustic model – determines how much gain to apply to the file in order to bring its perceived loudness up (or down) to some pre-set reference level (current default = 89.0 dB).  It displays this gain value, and it also tells you whether or not (Y or N) the mp3 file will “clip” if played as-is (without applying the calculated gain adjustment).  You can now use the displayed gain values in making your own adjustments elsewhere, or you can tell mp3gain to apply the adjustment to the file now.

[2] If you tell mp3gain to apply the gain adjustment, it does a few different things.  First, it writes to the gain field in the header for each individual frame in the file, indicating how loudly that frame should be played in order to achieve the prescribed reference perceived loudness for the file. The same adjustment is made to every frame (they are all boosted or reduced by the same amount).  The adjustment comes in multiples of 1.5 dB (apparently because of the level of precision built into the gain field of the header).  Fortunately, a single 1.5 dB step is indistinguishable to the typical human ear.  Because this adjustment has been written to the actual gain field for each frame in the file, it will be applied by any player used in playing the file (whether it is ReplayGain-enabled or not).  And yet, the gain adjustment is reversible, because these frame headers can be readjusted by mp3gain at any time in the future.

[3] Besides adjusting the headers of all the frames, mp3Gain reportedly also writes a ReplayGain header that applies to the entire file as a whole, indicating a further fine-tuning of the file’s playback volume.  Because it is not limited by the 1.5 dB precision of the individual frame headers, this adjustment can be a very fine tweaking.  As I understand it, this ReplayGain value is something that can be read by a player that is ReplayGain-enabled, and allows such a player to perform this fine-tuning of the playback volume.  The bulk of the volume adjustment, though, will have already come from the changes made to the headers of the individual frames in the file.  If your player is not ReplayGain-enabled, this ReplayGain fine-tuning will not be applied.  However, this is apparently no big deal, since the 1.5 dB precision in the individual frame headers is precise enough for virtually all listeners.  The ReplayGain fine-tuning adjustment is inherently reversible, and can be undone or altered using mp3gain or another gain-adjusting program.

[4] Besides making the two types of gain adjustments described above, mp3gain also writes some information to an APEv2 tag.  This tag is essentially a mini-log of the mp3gain analysis, summarizing its results.  It allows you to – at a future date – readjust the perceived loudness of the file without having to run a new analysis.  It stores info on the file’s loudness range, and how much gain should be subtracted in order to “undo” the changes made by mp3Gain previously. 

The first field in the tag is "mp3gain_minmax" - which I have yet to figure out, but am guessing has something to do with the loudness range of the file.  An example: "mp3gain_minmax = 146,175."  Perhaps someone can enlighten me on what these numbers mean.

Next comes "mp3gain_undo" - which indicates how much gain must be removed (or added) from each of the global frame headers in order to restore the file to its original (pre-mp3gain) perceived loudness.  The values are given in multiples of 1.5dB, and there is one value for the left channel, and one for the right.  An example: "mp3gain_undo = +001, +001, N."  This says that you will have to add 1.5 dB to each channel to restore the file to its original loudness.  I have no idea what the “N” stands for, and am hoping someone can explain what that piece of info means, and what values it can have.

The next field in the tag is "replaygain_track_gain" - showing (I presume) how much fine-tuning adjustment was written to the ReplayGain header for the entire file – to be used in the case of a ReplayGain-enabled player.  An example: "replaygain_track_gain = +0.3550 dB."  I presume that this is the actual applied value, rather than an “undo” (negative) value.  I also presume that – if I had chosen to apply album gain rather than track gain – I would see something like: "replaygain_album_gain = some amount of dB."  Right?

And then comes "replaygain_track_peak" - which is a single value, with no units.  I don’t know what it means.  An example: "replaygain_track_peak = 0.3985."  Can someone explain this one?

As I understand it, the values in mp3Gain’s APEv2 tag are mainly informational, and for allowing mp3gain to make quick future adjustments without having to reanalyze the file. The tag itself does not make any adjustment to playback volume.  If the tag causes you any kind of problems (such as interfering with the metadata display on your player), you can tell mp3gain to not write the tag.  However, if you do not write the tag, you will have to re-analyze the file if you want to re-adjust the volume in the future.

[5] After you have mp3Gain apply its gain change to a file, mp3Gain will now show you a new value for the average loudness of the file (which should be very close to your target reference value), and show you that the required gain adjustment is now zero (since the gain has already been adjusted).  It will also hopefully show you that the file will not experience clipping during playback.  However, it is possible that clipping is still indicated, especially if you have chosen a reference loudness level significantly higher than the current 89.0 dB default.  In that case, you need to adjust your reference loudness down.  If you do this just for this track alone, you will eliminate clipping, but this track’s overall perceived loudness will be reduced with respect to other tracks.  I suppose that is the reason behind choosing a conservative reference level like 89.0 dB - - so that this problem won’t happen very often.

Reply #1
Quote
[3] <snip> If your player is not ReplayGain-enabled, this ReplayGain fine-tuning will not be applied. However, this is apparently no big deal, since the 1.5 dB precision in the individual frame headers is precise enough for virtually all listeners. The ReplayGain fine-tuning adjustment is inherently reversible, and can be undone or altered using mp3gain or another gain-adjusting program.


I think the part about "is precise enough for virtually all listeners" is very debatable.  I believe there have been threads about there still being noticeable volume differences between some songs after applying ReplayGain/MP3Gain.  I'm not sure whether this is because of the 1.5 dB precision or a limitation of the loudness estimation used, but either way it is probably better to leave out such a generalization.


Quote
[4] Besides making the two types of gain adjustments described above, mp3gain also writes some information to an APEv2 tag. This tag is essentially a mini-log of the mp3gain analysis, summarizing its results.


I'm not going to quote the entire section because it is so long.  The tag that is written does include some information that is specific to MP3Gain, like mp3gain_minmax.  Some of the fields are a standard way of storing the Replaygain information.  (Most of those fields begin with "replaygain" I think.)

"mp3gain_minmax" is the minimum and maximum gain values found in the frame headers of the file.  There can also be "mp3gain_album_minmax" which is the minimum and maximum gain values found in the frame headers of all the files that are analyzed as an album.  I'm not sure what these values are used for.  (Perhaps they are stored to easily check if an adjustment will cause any of the gain values to go above or below the normal limits of 0-255....not sure.)
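
If the guess above is right, the stored pair would let a tool test a proposed change without rescanning every frame. Here is a minimal Python sketch of that check. It assumes the tag value is two comma-separated integers (as in the "146,175" example from the first post) and that a change is expressed in whole gain steps. Both function names are hypothetical helpers for illustration, not part of mp3gain.

```python
def parse_minmax(value):
    """Parse an 'mp3gain_minmax' style value such as '146,175' into (min, max)."""
    low, high = (int(part) for part in value.split(","))
    return low, high


def fits_without_wrapping(minmax, step_change):
    """Return True if shifting every frame's gain by step_change stays in 0..255."""
    low, high = minmax
    return 0 <= low + step_change and high + step_change <= 255


minmax = parse_minmax("146,175")
print(fits_without_wrapping(minmax, +4))    # 150..179 is inside 0..255
print(fits_without_wrapping(minmax, +100))  # 246..275 would exceed 255
```

Whether mp3gain itself uses the field this way is not confirmed in this thread; the sketch only shows that the check would be cheap if it did.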

As for the letter part of the "mp3gain_undo" field, it reflects whether or not "wrapping" was used when adjusting the gain.  There is an option to "wrap" the gain values when they are changed...so that if a positive change was applied and the new value is above 255, it would "wrap around" to start at 0.  I believe "W" indicates wrap was used and "N" indicates no wrapping.
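
To make the "wrap around" idea concrete, here is a small Python sketch. The field layout (two signed step counts and a letter) is taken from the "+001, +001, N" example in the first post, and the W/N meaning is my belief as stated above, so treat both as assumptions. `parse_undo` and `wrap_gain` are made-up names for illustration.

```python
def parse_undo(value):
    """Split an 'mp3gain_undo' style value like '+001, +001, N' into its parts."""
    left, right, flag = (part.strip() for part in value.split(","))
    # Assumption: 'W' means wrapping was used, anything else means it was not.
    return int(left), int(right), flag == "W"


def wrap_gain(frame_gain, step_change):
    """Apply a step change to an 8-bit gain value, wrapping around past 0 or 255."""
    return (frame_gain + step_change) % 256


left, right, wrapped = parse_undo("+001, +001, N")
print(left, right, wrapped)   # 1 1 False
print(wrap_gain(250, 10))     # 4
```

The second call shows why wrapping needs to be recorded: a frame that ends up at 4 after a +10 change started at 250, not at -6, so an undo has to wrap back the other way.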

Your understanding of the "replaygain_track_gain" and "replaygain_album_gain" fields is a little bit off.  These fields are used to store the adjustment necessary to bring the file up or down to the desired level.  (One of the criticisms of this is that there is no way to know what base level was used to establish this value.  It could have been the default of 89 dB or it might have been something else but that's another topic.)  Anyway, "replaygain_track_gain" is the value for track-based adjustment and "replaygain_album_gain" is the value for album-based adjustment.  When you apply a gain change, the values stored in these fields are adjusted to reflect that change, leaving what you called the "fine tuning".  For example, if the track and album values were -6.5 dB and -6.8 dB respectively and you applied a gain change of -6 dB, the values in the two fields would become -0.5 dB and -0.8 dB.  So these two tags are affected when you apply a gain change but they are how the results of the analysis are stored.
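
The arithmetic in that example can be sketched in Python as follows. It assumes the 1.5 dB step size quoted in this thread, rounding of the track gain to the nearest whole step, and that the same applied change is subtracted from both stored values. `split_gain` is a hypothetical helper, not an mp3gain function.

```python
STEP_DB = 1.5  # step size as quoted in this thread (an approximation)


def split_gain(track_gain_db, album_gain_db):
    """Split a track gain into whole frame-header steps plus the leftover tag values."""
    steps = round(track_gain_db / STEP_DB)
    applied_db = steps * STEP_DB
    return steps, applied_db, track_gain_db - applied_db, album_gain_db - applied_db


steps, applied, track_left, album_left = split_gain(-6.5, -6.8)
print(steps, applied, round(track_left, 2), round(album_left, 2))  # -4 -6.0 -0.5 -0.8
```

So four steps of 1.5 dB account for the -6 dB change, and the two tag fields keep only the remainders.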

The two fields, "replaygain_track_peak" and "replaygain_album_peak", store the peak values in the decoded file and the album.  These values are the fraction of full scale...for example, for 16 bit audio, multiply the number by 32768 and you get the absolute value of the peak when the file is decoded.  So anything above 1.0 means that some values will need to be clipped/limited.
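
As a quick worked example in Python, using the 0.3985 value from the first post: the sketch converts the fraction to a 16-bit sample value and works out how many dB could be added before the peak reaches full scale. The function names are made up for illustration.

```python
import math


def peak_sample(peak_fraction, bits=16):
    """Convert a fraction-of-full-scale peak to an absolute sample value."""
    return peak_fraction * 2 ** (bits - 1)


def headroom_db(peak_fraction):
    """Gain in dB that could be added before the peak reaches full scale (1.0)."""
    return -20.0 * math.log10(peak_fraction)


peak = 0.3985  # example value from the first post
print(peak_sample(peak))            # about 13058 out of 32768
print(round(headroom_db(peak), 2))  # roughly 8 dB of headroom
print(peak > 1.0)                   # False: no clipping as-is
```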

Writing any of the tag fields is optional.  However, while some are used as informational fields for MP3Gain, some are a standard part of Replaygain.  Some other media players and apps look for these fields to be able to adjust the playback.  (These are usually the 4 fields that begin with "replaygain".)

Quote
[5] After you have mp3Gain apply its gain change to a file, mp3Gain will now show you a new value for the average loudness of the file (which should be very close to your target reference value), and show you that the required gain adjustment is now zero (since the gain has already been adjusted). It will also hopefully show you that the file will not experience clipping during playback.


Yes, after applying a gain change hopefully there won't be any files that clip.  But you should keep in mind that just because it says there is still clipping present, it might not be audible.  You might want to try listening to the song to see if you can hear any clipping before you adjust the gain.


This is all from my own understanding of Replaygain and MP3Gain.  If I made any mistakes, hopefully someone else will correct them.

Reply #2
Thanks, kmart, for your observations.  That's what I was hoping for - - some people have figured out various parts of the big picture, and can help me piece them together for myself, so I can understand what the heck I'm doing.

There's little doubt that listening to your results is the best measure of how well you're doing; but it's still nice to understand how the software works - - - because that could reduce your number of needed listening tests from thousands to dozens.  That's why I really appreciate your taking time to share your knowledge on mp3Gain.

I hope others who have additional info can make time to make further corrections and clarifications.

Thanks, again.

Reply #3
Thanks to both of you guys for sharing your knowledge! I'm new to mp3gain and learned a lot from this thread. This seems like great info to be part of a sticky or wiki on mp3gain. I think the existing guides on EAC and LAME are incredibly helpful. Has any thought been given to doing something like that for mp3gain?

Reply #4
Good post Iggy64, you echoed my own understandings and questions exactly.

I'm still thinking about it a little, after a post (post #6) here:
<http://www.hydrogenaudio.org/forums/index.php?showtopic=52232>

I got some interesting info (a little bit by accident on a sort of related matter); it may be of interest to you as well.

Reply #5
Quote
[1] In analysis mode, mp3gain analyzes an mp3 file, and  - using a psychoacoustic model – determines how much gain to apply to the file in order to bring its perceived loudness up (or down) to some pre-set reference level (current default = 89.0 dB).  It displays this gain value, and it also tells you whether or not (Y or N) the mp3 file will “clip” if played as-is (without applying the calculated gain adjustment).  You can now use the displayed gain values in making your own adjustments elsewhere, or you can tell mp3gain to apply the adjustment to the file now.

MP3Gain does not use a psychoacoustic model. It just determines the average volume and the highest peak of each MP3.

Reply #6
Quote
MP3Gain does not use a psychoacoustic model. It just determines the average volume and the highest peak of each MP3.


Regarding "just determining the average volume," I think it is accurate to describe as a psychoacoustic model the following system:
  • the inverse Fletcher-Munson equal-loudness curves (or the approximation used by Replaygain)
  • the 50ms or so averaging time for each instant whose loudness you measure, and
  • the choice of the 95th-percentile instant as representative of the perceived overall loudness of the entire piece of music
... a model that really does very well at removing gross perceptual differences in loudness over the vast spectrum of music that's out there.
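
Here is a rough Python sketch of the second and third ingredients, just to show the shape of the calculation. It is not the ReplayGain reference code: the equal-loudness filter is left out entirely, the block length is the "50ms or so" mentioned above, and the function names and the test signal are made up.

```python
import math


def block_loudness_db(samples, sample_rate, block_ms=50):
    """RMS level in dB of consecutive blocks (equal-loudness filtering omitted)."""
    block_len = max(1, int(sample_rate * block_ms / 1000))
    levels = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        mean_square = sum(x * x for x in block) / block_len
        # Small offset avoids log10(0) on digital silence.
        levels.append(10.0 * math.log10(mean_square + 1e-10))
    return levels


def percentile_95(levels):
    """Pick the block level that 95% of the blocks do not exceed."""
    ordered = sorted(levels)
    index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return ordered[index]


# One second of a quiet tone followed by one second of a louder tone.
rate = 8000
quiet = [0.1 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
loud = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
levels = block_loudness_db(quiet + loud, rate)
print(len(levels), round(percentile_95(levels), 1))
```

Note how the result follows the louder half of the signal rather than the plain average of the two halves, which is the point of picking a high percentile.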

For sure, it's a very different kind of psychoacoustic model to any lossy encoder, and the highest peak calculation doesn't involve any psychoacoustic model - just an MP3 decoder modified to calculate values beyond full-scale.
Dynamic – the artist formerly known as DickD

Reply #7
Thanks, w1L50n, for your tip.

And thanks to you others who added new corrections/improvements to my understanding.

I am now beginning to encode and mp3Gain my audio collection, and am learning a little bit more about mp3gain in each session.  If I can add anything more of substance to this thread, I will certainly come back here and do so.  I have been very pleased with the results I've gotten with mp3Gain so far.  My collection spans generations of recordings, mixing older (less-compressed and less-boosted) albums with newer ones.  This gave me a lot of problems with playback levels.  This has been largely solved with mp3Gain.

After playing around with my target gain level, I eventually gravitated to the current default of 89.0.  It turns out that this eliminates clipping on virtually all of my tracks, without going ridiculously low.  I tried sneaking up to 91 - 92, but I definitely got into clipping with quite a few (perhaps 5-10%) of my tracks.  So I guess I'll stick with 89.0 for now.  The nice thing is, with the gain values stored in the APE tag, I can readjust all these gains later (if I feel the need) without having to reanalyze any of the tracks.  Nice design idea!
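
For anyone curious why no re-analysis is needed, here is a minimal Python sketch of the arithmetic as I understand it. It assumes the stored gain is relative to the 89.0 dB default (as noted earlier in the thread, the tag itself does not record which base level was used), and the gain and peak values are illustrative, not taken from a real file. The function names are made up.

```python
DEFAULT_REFERENCE_DB = 89.0  # default target mentioned in this thread


def gain_for_target(stored_gain_db, target_db):
    """Shift a gain computed for the 89 dB default to a different target level."""
    return stored_gain_db + (target_db - DEFAULT_REFERENCE_DB)


def clips_after_gain(peak_fraction, gain_db):
    """True if the scaled peak would exceed full scale (1.0)."""
    return peak_fraction * 10 ** (gain_db / 20.0) > 1.0


stored_gain, peak = -1.0, 0.95  # illustrative values, not from a real file
for target in (89.0, 92.0):
    gain = gain_for_target(stored_gain, target)
    print(target, gain, clips_after_gain(peak, gain))
```

With these made-up numbers the track is fine at 89 but clips at 92, which is the same pattern I saw on some of my own tracks.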

Reply #8
Double check [1]. In mp3gain, analysis not only does the calculation, it also writes the ReplayGain tag! I can't remember exactly for the GUI version, but I think you can disable this. With the command-line program you have to add an option. If you later do [2], it will modify the MP3 gain value (in its 1.5 dB increments) and update the tags for fine-tuning beyond the 1.5 dB increments, so that ReplayGain-aware players can benefit from it. I think it would be better to talk about the ReplayGain calculation rather than a psychoacoustic model.
She is waiting in the air

Reply #9
Quote
Double check [1]. In mp3gain, analysis not only does the calculation, it also writes the ReplayGain tag! I can't remember exactly for the GUI version, but I think you can disable this. With the command-line program you have to add an option. If you later do [2], it will modify the MP3 gain value (in its 1.5 dB increments) and update the tags for fine-tuning beyond the 1.5 dB increments, so that ReplayGain-aware players can benefit from it. I think it would be better to talk about the ReplayGain calculation rather than a psychoacoustic model.


Thanks for the input.  I will definitely check out these points.  I'll run a few tests to see if the "Analysis" mode of mp3Gain -- by itself -- creates an APE2 tag on the file.  Oddly, I don't think I ever checked on that.  I'll let you know what I find out.

Concerning the actual calculation of the gain value: As I am not a professional in the audio field, I probably should not have casually applied the term "psychoacoustic model" to the gain calculation.  I had read that the mp3Gain value is calculated in four steps: Equal Loudness Filter, RMS Energy Calculation, Statistical Processing, and Calibration with a Reference Level.  Because this process takes into account the way the human hearing system perceives loudness at various frequencies, I boldly applied the term "psychoacoustic," which means "the way the mind hears."  I imagine that audio professionals have a specific use for the term that an amateur like me would not know.  So you are correct - - I should probably just talk about the "Gain Calculation" and not misapply terminology that belongs to the audio experts.

Thanks for all the advice.  As I said, I'll check on the APE2 tag issue and report back.

Reply #10
Quote
As I said, I'll check on the APE2 tag issue and report back.

No need to check. Artemis3 is absolutely correct that the APEv2 RG tags are set during analysis, unless of course tagging is disabled.

Btw, in case it has escaped anyone's attention, Tycho has made a great command-line tool called metamp3, which among other things incorporates the mp3gain sources for RG scanning/applying. He has made the RG/UNDO tags be written as ID3v2.3 tags instead of APEv2 (and any APEv2 RG/UNDO tags found are automatically transferred to ID3v2.3). It has other great features too and is simply an awesome app that I can fully recommend others check out for themselves:

Features
--------
- Write (all) Text, URL and Picture frame tags, with description.
- Compute replay-gain values and set them as ID3v2.3 tags.
- Apply and Undo volume gain (as mp3gain).
- Extract pictures from mp3 files.
- Inspect ID3 v1.1, v2.3, and v2.4 tags.
- Inspect detailed info on mp3 files, including lametag data.

http://www.hydrogenaudio.org/forums/index....showtopic=49751