Encoder clipping prevention...
Reply #1 – 2007-03-21 03:34:39
I'm not an expert in codec design by any means, but your project sounds interesting and quite promising, especially if you've managed to bring the relative simplicity of tuning the psymodel from MPC (from what I've read of Frank and Andre's websites) into a VBR-oriented transform codec that presumably has few of the inefficiencies and inflexibilities imposed by the MP3 format.

Anyway, to my point: your post mentions horrendous numbers of clipped samples many times, but at no point do you say whether the clipping is audible. In my limited experience and hearing ability, you can actually get away with a surprising amount of clipping (often a few consecutive samples at 44100 samples/s) without it being audible. I believe all of Dibrom's tuning of LAME --alt-preset standard (now -V2) was done with no volume reduction and certainly no ReplayGain, and was done with clipped decoded peaks, albeit on audio that was less distorted and compressed on the original CD than today's CDs are. It was considered transparent on all but a very few problem samples, and I don't believe clipping was considered to be to blame for the lack of transparency.

However, many of today's peak-distorted CDs might sound no worse (or even no different) with a modicum of clipping. Some are already clipped on the CD in ways that have introduced only modest levels of harmonic distortion. If those distortion harmonics are small enough to be masked already, the encoder may remove them, which "restores" the peak to an unclipped appearance. On decoding without ReplayGain, though, the peaks may well be clipped once again, producing a very similar range of distortion harmonics (usually odd-order harmonics) to the original. Those would be masked to about the same degree, and hence would not be perceived. If ReplayGain is used and the decoded peaks remain unclipped, the absence of those clipping harmonics should be inaudible according to the originally calculated masking thresholds that led to their removal.
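As a rough illustration of the "odd-order harmonics" remark, here is a minimal Python sketch that hard-clips a pure sine and looks at the first few harmonics with a plain DFT. Everything in it is an assumption for demonstration only: the 1.5x overshoot (about 3.5 dB over full scale), the symmetric hard clip at 1.0, and the helper names `hard_clip` and `harmonic_magnitude`. It says nothing about audibility or masking, only about which harmonics symmetric clipping creates.

```python
import math

def hard_clip(x, limit=1.0):
    """Symmetric hard clip, as a decoder writing fixed-point output would do."""
    return max(-limit, min(limit, x))

def harmonic_magnitude(samples, k):
    """Magnitude of DFT bin k, scaled so a unit-amplitude sine reads 1.0."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

N = 1024      # analysis length in samples (assumed)
CYCLES = 8    # whole number of cycles, so each harmonic lands exactly on a bin
OVERSHOOT = 1.5  # sine peak relative to full scale (assumed)

sine = [OVERSHOOT * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
clipped = [hard_clip(s) for s in sine]

# Harmonic h of the fundamental sits in bin h * CYCLES.
harmonics = {h: harmonic_magnitude(clipped, h * CYCLES) for h in range(1, 8)}

for h, mag in harmonics.items():
    kind = "odd" if h % 2 else "even"
    print(f"harmonic {h} ({kind}): {mag:.6f}")
```

The even harmonics come out at numerical zero because the clipped waveform keeps its half-wave symmetry; an asymmetric clip (or a DC offset before clipping) would add even-order products as well.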
In other cases, some of the clipping distortion harmonics present on the original CD would be unmasked and audible, so the encoder should retain them. This is quite likely to produce a decoded peak that isn't clipped, or is only slightly clipped (to a degree that will be inaudible - though this is the point which needs ABXing). If none of this clipping turns out to be audible, with or without ReplayGain, your principal concern over comparative performance might be only this: whether any of the more dynamic and well-mastered recordings (which happen to be those that require little ReplayGain adjustment) would exhibit decoded peaks so unnaturally extreme that they force RG with Clipping Prevention to lower the volume to well below the Target Volume of 89 dB, in situations where other codecs (and/or the original CD) manage to reach the Target Volume without clipping.
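To make that last scenario concrete, here is a small Python sketch of how clipping prevention can pull a track below the target. It assumes the common approach of capping the ReplayGain adjustment at whatever gain brings the decoded peak to exactly full scale; the function names and the example gain and peak values are hypothetical, not taken from any particular player or codec.

```python
import math

TARGET_DB = 89.0  # ReplayGain Target Volume mentioned above

def clip_safe_gain_db(rg_gain_db, decoded_peak):
    """Largest gain not exceeding rg_gain_db that keeps the peak at or below 1.0.

    decoded_peak is the absolute peak of the decoded signal, where 1.0 is full scale.
    """
    if decoded_peak <= 0.0:
        return rg_gain_db  # silence: nothing to protect
    max_gain_db = -20.0 * math.log10(decoded_peak)
    return min(rg_gain_db, max_gain_db)

def effective_level_db(rg_gain_db, decoded_peak):
    """Playback level reached after clipping prevention has had its say."""
    applied = clip_safe_gain_db(rg_gain_db, decoded_peak)
    return TARGET_DB - (rg_gain_db - applied)

# Hypothetical dynamic, well-mastered track needing only -1 dB of ReplayGain.
rg_gain = -1.0
cd_level = effective_level_db(rg_gain, decoded_peak=1.0)     # original CD peaks at full scale
codec_level = effective_level_db(rg_gain, decoded_peak=1.3)  # assumed codec overshoot

print(f"original CD: {cd_level:.2f} dB")
print(f"codec with 1.3 peak: {codec_level:.2f} dB")
```

With these made-up numbers the CD reaches the 89 dB target, while the overshooting decode is held roughly 1.3 dB below it. Whether a shortfall of that size matters in practice is a separate listening question.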