MP3 vs. MPC vs. Ogg: Low volume test
Reply #11 – 2003-02-09 16:42:52
@Gabriel: Here's the requested MP3 test:

Original test sample: "real" music (salsa): o.wav
length: 42.667 sec.
peak amplitude: -4.54dB = 19248 max sample value
max RMS power: -10.48dB
min RMS power: -48.4dB
average RMS power: -21.85dB
total RMS power: -21.0dB
(Values taken from CEP Waveform Statistics)

[span style='font-size:7pt;line-height:100%']
1. Creating the amplified "original" sample o_a_4b.wav:
- converting to 32bit
- logarithmic fadeout 0 -> -150dB
- saving as 4 byte PCM (type 1, 32bit), as Lame can't handle the default CEP format "32 bit Normalized float (type 3)" properly.
2. For comparing noise, created an empty file with the same samplerate, same bit-depth and same length: n_a_4b.wav
3. Encoding
o_a_4b.wav:
- lame 3.90.2 --alt-preset extreme
- lame 3.90.2 --alt-preset insane
- lame 3.94a11 --preset extreme
- lame 3.94a11 --preset extreme -F
n_a_4b.wav:
- lame 3.90.2 --alt-preset insane
4. Decoding with foobar2000 0.5beta16 Diskwriter, Output WAV (PCM 24bit dithered), DSP used, Resampling to 96kHz (Fast mode disabled):
- all encoded files
- o_a_4b.wav
- n_a_4b.wav
5. Processing all files made in step 4 like this:
- apply logarithmic fadein 0 -> +150dB
- downsample to 16bit/48kHz (dithered) and save to xxx_fadein.wav
6. Having a look at the files from step 4 with Encspot
7. Listening using WinABX
[/span]

[span style='font-size:9pt;line-height:100%']Results/Conclusions

Step 6:
AFAIK --alt-preset extreme (3.94a11: --preset extreme) only uses 32kbps frames when it "notices" silence. So for the mp3 files encoded with "extreme" it could be interesting to know where the first 32kbps frame and the last 128kbps frame occur.

lame 3.90.2 --alt-preset extreme:
- first 32kbps frame: 22.7 sec.
-> Amplification at this position: -80dB; RMS Power of o.wav at this position: -25dB -> power of encoded signal: -105dB
- last 128kbps frame: 26.5 sec.
-> Amplification: -90dB; RMS Power of o.wav: -17dB -> power of encoded signal: -107dB

lame 3.94a11 --preset extreme:
- first 32kbps frame: 23.1 sec.
-> Amplification: -81dB; RMS Power of o.wav: -30dB -> power of encoded signal: -111dB
- last 128kbps frame: 27.2 sec.
-> Amplification: -95dB; RMS Power of o.wav: -19dB -> power of encoded signal: -114dB

Maybe the maths I'm doing here is a sign of faulty reasoning, but as I did the same for both versions, the results are comparable. So it seems that lame 3.94 keeps encoding low volume signals where lame 3.90.2 has already stopped, and the difference is somewhere around 6-7dB.

Step 7: ABX tests
(The volume of my system (AC'97 onboard sound, amp, HD 530 headphones) is set to the level I normally use for ABXing music samples = "hearing-damage proof"; all ABX tests 4/4 or until guessing probability < 5%):

- n_a_4b.wav vs. n_a_4b_fadein.wav, to find out where I start noticing noise ("equipment/ears test").
-> 21 sec. = -83dB RMS power
- o.wav vs. o_a_4b_fadein.wav (Where does the noise added to the music start to be noticeable?)
-> 24.0-25.0 sec. range; min RMS power of that range: -33dB; RMS power of n_a_4b_fadein.wav (=noise) at this point: -71dB => "SNR"=38dB
- o_a_4b_fadein.wav vs. o_a_4b_3.90.2_insane_fadein.wav: The *music* sounds exactly the same to me. Increasing noise starting at 24 sec., starting to sound distorted (clipping) at 39 sec. If I focus on the noise I can ABX at some points (34-36 sec. = noise RMS -35 to -25dB), but it's questionable whether the difference is caused by lame or by massive amplification of dither noise.
- o_a_4b_fadein.wav vs. o_a_4b_3.90.2_extreme_fadein.wav: ABXable differences start at 13.0-13.9 sec. (ringing, harshness of "s", later "underwater sound")
-> Amplification -47dB, RMS power -21dB => RMS power of signal passed to lame: -68dB
- o_a_4b_fadein.wav vs. o_a_4b_3.94a11_extreme_fadein.wav: 13.0-13.9 sec. (Same as 3.90.2; 3.90.2 vs. 3.94 isn't ABXable for me with this sample. Well, I didn't bother finding differences at 20 ...
30 seconds, where the music is totally replaced by artifacts.)
- 3.94 --preset extreme -F: The same.

It seems the "guilty" party is the VBR mode, not the minimum bitrate of 32kbps, since 3.90.2 api (--alt-preset insane) sounds fine over the whole range.[/span]
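
For anyone who wants to re-check the level arithmetic above, here is a minimal Python sketch. It rests on one assumption that is not stated in the post: that the "logarithmic" fade changes the gain linearly in dB over the whole 42.667 sec. file, so the gain at time t is -150dB * t / length. The function names are made up for illustration. The level passed to lame is then just the fade gain plus the RMS power of o.wav at that position.

```python
# Sketch only: assumes the CEP "logarithmic" fade is linear in dB over time.
DURATION_S = 42.667    # length of o.wav (from the post)
FADE_DEPTH_DB = 150.0  # fadeout range 0 -> -150dB (from the post)


def fade_gain_db(t_s, duration_s=DURATION_S, depth_db=FADE_DEPTH_DB):
    """Fade gain in dB at time t_s (hypothetical helper, linear-in-dB assumption)."""
    return -depth_db * t_s / duration_s


def encoded_level_db(t_s, source_rms_db):
    """RMS level of the faded signal handed to the encoder: gain + source RMS."""
    return fade_gain_db(t_s) + source_rms_db


if __name__ == "__main__":
    # (label, position in sec., RMS power of o.wav at that position in dB)
    points = [
        ("3.90.2 first 32kbps frame", 22.7, -25.0),
        ("3.94a11 first 32kbps frame", 23.1, -30.0),
    ]
    for label, t_s, rms_db in points:
        print(f"{label}: gain {fade_gain_db(t_s):.1f}dB, "
              f"encoded level {encoded_level_db(t_s, rms_db):.1f}dB")
```

Under that assumption the gain comes out at about -80dB at 22.7 sec. and about -81dB at 23.1 sec., which matches the amplification values quoted for the first 32kbps frames.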