I'm programming a music player for my touchscreen Windows UMPC. I use Qt/QML for the UI, so I decided to use Phonon as the playback backend, which on Windows uses DirectShow to decode audio. Because Phonon/DirectShow doesn't support ReplayGain, I decided to implement it myself.
I scan all the tracks into a database, including their RG tags (the files are tagged by foobar2000). Then, when a new track starts playing, I change the volume using the setVolumeDecibel (http://doc.qt.nokia.com/latest/phonon-audiooutput.html#volumeDecibel-prop) function. Everything seems to work fine, but the result is simply wrong: the tracks that should be made quieter are too quiet, and the amplified tracks are too loud.
I'm still quite lost in understanding what a decibel value means. As far as I know, an increase of 10 dB (some sources say 6 dB) means twice as loud (as perceived by human ears) and 10 times higher real output. Some of my tracks have RG values close to -10 dB and others up to 6 dB. This indicates a huge difference in volume between the tracks, but I think this is normal.
Anyway, could someone please explain to me why applying RG changes the volume too much? The practical solution would probably be to divide the RG values by 2, but I want to know why there is a problem and why it doesn't work as I expected. In other players ReplayGain works OK and all my tracks sound equally loud.
I don't see anything obviously wrong with your approach. The RG metadata tells you how much gain (positive values) or attenuation (negative values) a track needs in order to sound as loud as the others. The function you're using appears to make the necessary adjustment.
* dB to amplitude:
gain = pow(10, dbValue / 20);
* Amplitude to dB:
db = 20 * log10(gain);
Gain and dB range:
1.0f = no amplification (0.0 dB)
0.5f = half amplitude (-6.02 dB)
2.0f = double amplitude (+6.02 dB)
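The two conversions above can be written as a pair of small C++ helpers. This is only a minimal sketch: the names dbToAmplitude and amplitudeToDb are made up for illustration and are not part of Phonon or Qt.

```cpp
#include <cmath>

// Hypothetical helper: convert a level change in decibels
// to a linear amplitude factor (0 dB -> 1.0).
double dbToAmplitude(double db) {
    return std::pow(10.0, db / 20.0);
}

// Hypothetical helper: convert a linear amplitude factor to decibels.
// The factor must be greater than zero.
double amplitudeToDb(double gain) {
    return 20.0 * std::log10(gain);
}
```

For instance, dbToAmplitude(-6.02) comes out at roughly 0.5, and amplitudeToDb(2.0) at roughly +6.02, matching the table above.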
The ReplayGain value is stored as the difference between the ReplayGain target level (the standard defines 89 dB) and the measured level of the song.
For example, if a song has a ReplayGain value of -6 dB, its measured level is 95 dB, so you want to multiply the amplitude by about 0.5 (see above).
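As a rough sketch of that arithmetic in C++ (the constant and function names here are hypothetical, and the 89 dB figure is simply the target level mentioned above):

```cpp
#include <cmath>

// ReplayGain target level in dB, as stated above.
const double kReferenceLevelDb = 89.0;

// Hypothetical helper: the measured level of a track implied by its
// stored ReplayGain value (value = target - measured level).
double measuredLevelDb(double replayGainDb) {
    return kReferenceLevelDb - replayGainDb;
}

// Hypothetical helper: linear amplitude factor that brings the track
// to the target level.
double replayGainToFactor(double replayGainDb) {
    return std::pow(10.0, replayGainDb / 20.0);
}
```

With a stored value of -6 dB, measuredLevelDb gives 95 and replayGainToFactor gives roughly 0.5. A stored value of 0 dB leaves the track unchanged (factor 1.0).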
Edit: by the way, the value of 6 dB for a doubling applies to amplitude (where the formula uses a factor of 20). The value of 3 dB applies to power (where the formula uses a factor of 10). You need four times the power to double the amplitude. (You can read about these subjects on the Internet; this is one such link: http://www.iu.edu/~emusic/acoustics/amplitude.htm )
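To make the amplitude/power distinction concrete, here is a small C++ sketch with illustrative function names (not taken from any library). The only difference between the two is the factor of 20 versus 10.

```cpp
#include <cmath>

// Hypothetical helper: decibels for a ratio of amplitudes (factor of 20).
double amplitudeRatioToDb(double ratio) {
    return 20.0 * std::log10(ratio);
}

// Hypothetical helper: decibels for a ratio of powers (factor of 10).
double powerRatioToDb(double ratio) {
    return 10.0 * std::log10(ratio);
}
```

Doubling the amplitude gives about 6.02 dB and doubling the power gives about 3.01 dB. Four times the power gives the same 6.02 dB as double the amplitude, because power is proportional to the square of the amplitude.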
[JAZ]> I think I understand this. I spent a lot of time studying different Internet sources on psychoacoustics, but I wasn't really sure whether I had understood everything correctly. It seems to me that I am doing it right. But do you have any explanation for my problems with Phonon? I even took the same track at different volumes and scanned each version with foobar2000, and the result was just as I wrote in the first post: the originally loudest version ended up the quietest and vice versa.
If I change the volume to -6 dB, I can see that the relative volume is approximately 0.5. This seems right. There seems to be something wrong in the volume modification algorithm inside Phonon.
Anyway, there is one more thing I don't understand. Usually the tracks are normalized to 89 dB. But dB is a noise level or an amplification ratio. What does 89 dB mean here? It doesn't make sense to me, because the recording is just digital data, and the volume depends on the amplifier I use.
The explanation of the 89 dB reference level is in the ReplayGain specification (http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_specification). In short, there's a SMPTE standard which says 89 dB SPL is, for calibration purposes, an ideal real-world volume-level of a pink noise signal with an RMS of -14 dBFS. So we call the loudness of that signal, as measured by ReplayGain's fancy filtered RMS technique, "89 dB", even though it really can be played at any volume.