What is this low-frequency content?
Reply #5 – 2012-05-04 14:57:15
Seems like there's no decoder clipping. Having now seen the file, it only reaches full scale once, in one channel, and has plenty of headroom the rest of the time.

I made an encoding using LAME 3.98.4 (not the latest version, I know):

lame -V5 filename.wav

This created filename.wav.mp3, which averaged about 144 kbps of variable-bitrate MP3. VBR is likely to use a significantly higher bitrate at certain times (often transients) and lower bitrates at others. With so many picking sounds on the guitar strings, and possibly some timing differences in when those transients reach the left and right channels, I dare say this sample is more demanding of bitrate than most.

I then decoded it using

lame --decode filename.wav.mp3

to create filename.wav.mp3.wav. LAME's decoder also removes the timing offset introduced by the encode-decode process, which most MP3 encoder-decoder pairs don't do.

I used an old version of Cool Edit 96 (predecessor to Adobe Audition) to view the spectrogram, mainly because I remembered where to find the menu to change the spectral resolution to 2048 bands with a Blackman window. As you can see in my capture image (not embeddable, so click the link), neither version exhibits significant content below 40 Hz (the capture shows the whole file but only the lower portion of the frequency spectrum):

http://www.mediafire.com/i/?56sa66550vsxbjk

I then repeated the encode using LAME without the -V5 option, so it encodes to CBR 128 kbps (same as the -b 128 option), and found the same thing you did. None of the areas involved were clipping (except maybe one towards the end of the sample). With so much picking throughout the piece, I'd imagine there are significant transients everywhere, so it's plausible that the encoder switches to short blocks (with poorer frequency resolution), and that at CBR 128 it doesn't have the bits available to encode accurately enough to avoid bleed-through, so it concentrates the available bits on encoding the most important parts of the spectrum more accurately.
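If anyone wants to reproduce the comparison, the steps above amount to something like this. A sketch only: it assumes LAME is on your PATH and the source file is filename.wav (the cbr128 output names are just my own choice, not anything LAME produces by default):

```shell
# Sketch of the encode/decode steps described above.
# Assumes LAME is installed and the source is filename.wav.

lame -V5 filename.wav                 # VBR encode -> filename.wav.mp3 (~144 kbps on this sample)
lame --decode filename.wav.mp3        # decode -> filename.wav.mp3.wav, timing offset removed

lame -b 128 filename.wav cbr128.mp3   # CBR 128 kbps encode for comparison (hypothetical name)
lame --decode cbr128.mp3 cbr128.wav   # decode the CBR version
```

Then load the original and the two decoded WAVs into any editor with a spectrogram view and compare the region below 40 Hz.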
But this is a very hand-wavy guess at the causes. The important thing is not whether the spectrum looks identical but whether or not it sounds identical. Transients introduce temporal masking, such that more distortion can go unnoticed for a short time after a transient (and a shorter time before it), so it might be that you can't hear the difference. The best way to find out is to run an ABX test comparing the decoded MP3 or AAC to the original WAV. A spectrogram that looks great can sound awful (try the old BLADE encoder), and one that looks a poor match can sound indistinguishable (often LAME -V5 or so will look lacking in the treble area but sound perfect thanks to its very well tuned psychoacoustic model, which applies to the VBR modes but not CBR). If you want to find out whether it's significant, forget 'measurement' and rule-of-thumb engineering specs (like the 20 Hz to 20 kHz range of human hearing) and use ABX to see how it sounds to a human being. That's absolutely necessary with psychoacoustic encoders and real music rather than test tones. [edit: minor typos]
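On scoring the ABX test: significance is just a one-sided binomial tail against the p = 0.5 guessing rate. As a sketch (the 13-of-16 figures are purely illustrative, not from this thread), an awk one-liner computes the chance of scoring at least that well by guessing:

```shell
# Probability of getting >= 13 of 16 ABX trials right by pure guessing
# (one-sided binomial tail at p = 0.5). Illustrative numbers only.
awk 'BEGIN {
  n = 16; k = 13; tail = 0
  for (i = k; i <= n; i++) {
    c = 1
    for (j = 0; j < i; j++) c = c * (n - j) / (j + 1)   # binomial coefficient C(n, i)
    tail += c
  }
  printf "p = %.4f\n", tail / 2^n    # prints p = 0.0106
}'
```

By the usual p < 0.05 convention, 13/16 (p ≈ 0.011) or even 12/16 (p ≈ 0.038) would suggest you're genuinely hearing a difference rather than guessing.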