Pro Logic's Center Channel Extraction
Reply #14 – 2011-08-05 11:54:42
Sorry to resurrect a slightly old thread, but I'd have thought that if you want to encode 5.0 sound in a new end-to-end format onto a CD that still plays as standard 2.0 L+R Red Book audio without a decoder, you could use some digital tricks to get far superior results with no steering problems, assuming your decoder has access to the digital stream and not just an analogue CD player output. Just for example, according to many listening tests a 16 kHz lowpass filter is essentially transparent for real musical signals (as opposed to test tones), virtually regardless of the listener's age.

My initial thought was that it's quite feasible to encode the side channels at much reduced volume and somewhat less bandwidth, shifted (possibly reflected) into the 16-22 kHz region of the normal channels (so 0 Hz gets reflected up to 22.05 kHz and 6 kHz lands at 16.05 kHz). Reduced amplitude is necessary to safeguard tweeters and listeners' pets, but the reduced effective bit depth could be compensated using mu-law, A-law or ADPCM types of technique. If 8 kHz bandwidth is fine for Pro Logic, I guess 6 kHz isn't too bad, or you could possibly encode around 12 kHz of bandwidth by using the high-frequency regions of the L and R channels together as a single channel in some way, while encoding a steering signal into, or instead of, the LSB(s) of the CD audio.

Then I had a potentially better thought. 14-bit quantization is enough to get transparent PCM audio, even without spectrally-flat dither (and 14 bits was Philips' original proposal for the CD standard), so you could definitely replace the 2 LSBs of your music, possibly more, with some other encoding that would get lost below the noise floor of even a super-quiet listening room.
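To make the LSB-replacement idea concrete, here is a minimal Python sketch, not a real encoder. The function names `embed_lsbs` and `extract_lsbs` are made up for illustration, samples are assumed to be signed 16-bit integers, and the top 14 bits are simply kept as they are (no dithered requantization).

```python
def embed_lsbs(samples, payload_bits, n_lsb=2):
    """Replace the n_lsb lowest bits of each signed 16-bit sample with payload bits.

    Hypothetical helper: payload_bits is an iterable of 0/1 values; if it runs
    out, the remaining LSBs are padded with zeros.
    """
    mask = (1 << n_lsb) - 1
    bit_iter = iter(payload_bits)
    out = []
    for s in samples:
        u = s & 0xFFFF  # view the sample as unsigned 16-bit two's complement
        chunk = 0
        for _ in range(n_lsb):
            chunk = (chunk << 1) | next(bit_iter, 0)
        u = (u & ~mask & 0xFFFF) | chunk  # keep the top bits, overwrite the LSBs
        out.append(u - 0x10000 if u & 0x8000 else u)  # back to signed
    return out


def extract_lsbs(samples, n_lsb=2):
    """Recover the payload bits from the n_lsb lowest bits of each sample."""
    mask = (1 << n_lsb) - 1
    bits = []
    for s in samples:
        chunk = s & mask
        for shift in range(n_lsb - 1, -1, -1):
            bits.append((chunk >> shift) & 1)
    return bits
```

With `n_lsb=2` each sample changes by at most 3 steps out of 65536, which is the "lost below the noise floor" part of the argument. A plain Red Book player just plays the modified samples; a decoder would call something like `extract_lsbs` on the digital stream.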
Just the 2 LSBs at 44.1 kSa/s x 2 channels = 176.4 kbps, so even a steering signal to re-steer the left and right channels to completely different locations, with or without specified delays, could be arbitrarily accurate, dependent on the time constant you allow. So you could get, say, a 12 kHz channel as above, then steer it as you wish. Instead of that, though, potentially the best idea is that with such a bitrate you could even encode to a low-latency music-compatible codec like the superb CELT (in 5 ms latency mode the delay is negligible, but could be compensated with a delay to the 14-bit PCM front channels if you could be bothered, and at 20 ms+ latency, which is still low enough, the quality per bitrate is even better). CELT is even remarkably resilient to bit errors, so even burst errors in a Red Book CD bit stream (as in a standard audio CD transport) probably wouldn't sound bad, especially if C2 error detection is known to your decoder. There's enough bitrate to allow you to encode in whichever way suits you best (e.g. you could derive the Centre from the L and R PCM, then use CELT for LS and RS) and include some kind of signature so compatible decoders know there's a valid signal in the LSBs. For easier integration into industry-standard chipsets, you could also use codecs like MP3 at up to 160 kbps CBR, which would also provide very good quality stereo. You could potentially offset the MP3 stream relative to the PCM to account for the higher decoder latency. End-to-end latency isn't too much of a concern as it's not encoded live; it just requires reasonable synchronization on decode, though it happens that CELT is extremely good and still provides low latency (and is intended to be patent-free also).
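To put numbers on the bitrate figures above, here is a small Python sketch of the arithmetic. The helper names are made up, and the per-frame byte budget assumes the whole LSB stream goes to the codec, with nothing set aside for a signature or framing overhead.

```python
SAMPLE_RATE_HZ = 44_100  # Red Book sample rate
CHANNELS = 2             # L and R


def lsb_bitrate_bps(n_lsb):
    """Raw bitrate available when n_lsb bits per sample are given up."""
    return n_lsb * SAMPLE_RATE_HZ * CHANNELS


def bytes_per_frame(n_lsb, frame_ms):
    """Whole bytes of hidden payload available per codec frame of frame_ms."""
    bits = lsb_bitrate_bps(n_lsb) * frame_ms // 1000
    return bits // 8


print(lsb_bitrate_bps(2))      # 176400 bits/s, i.e. 176.4 kbps
print(bytes_per_frame(2, 5))   # 110 bytes per 5 ms frame
print(bytes_per_frame(2, 20))  # 441 bytes per 20 ms frame
```

So a 160 kbps stream fits inside the 2-LSB budget with about 16 kbps to spare for a signature or padding.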
Whatever you encode there, but especially a lossy codec like CELT (presumably padded out with null data where it doesn't use all 176.4 kbps), would present a very noise-like PCM signal, so that even if turned up during fadeouts it would sound much like white-noise hiss or dither when played back on a regular Red Book CD player. Better still, you can dither your L and R PCM audio to 14 bits with any form of dither you like, to preserve a high (technically infinite) dynamic range and a relatively low perceived noise floor. You could also consider using 15-bit PCM for L and R, with 88.2 kbps left over for surround encoding (CELT is still remarkably good at 64 kbps in stereo, and thus at 88.2 kbps also, but probably not quite transparent), or other trade-offs of PCM bit depth versus encoded bitrate as required. Encoder artifacts in LS and RS alone are likely to be harder to spot in the presence of audio from the PCM front channels. During fadeouts and other quiet passages with no surround channel requirements, you could also consider switching off the LSB encoding and reverting to 16-bit PCM or allowing digital silence, should that be seen as desirable for the Red Book audio for any reason (not that I can think of any).
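As a rough sketch of the "dither to 14 bits" step, the Python below requantizes signed 16-bit samples to a 14-bit grid using TPDF dither, which is just one common choice out of "any form of dither you like". The function name `dither_to_14_bits` is hypothetical, and no noise shaping is attempted.

```python
import random


def dither_to_14_bits(samples, rng=None):
    """Requantize signed 16-bit samples to 14 bits with TPDF dither.

    The result is still expressed in 16-bit units, so every output value is a
    multiple of 4 and its 2 LSBs are zero, free to carry other data.
    """
    rng = rng or random.Random()
    step = 4  # one 14-bit LSB measured in 16-bit steps
    out = []
    for s in samples:
        # Difference of two uniform values gives triangular noise of +/- 1 step.
        d = (rng.random() - rng.random()) * step
        q = round((s + d) / step) * step
        # Clamp to the 16-bit range, staying on the 14-bit grid.
        out.append(max(-32768, min(32764, q)))
    return out
```

The cleared 2 LSBs of each output sample would then be overwritten with the encoded surround bitstream before the audio is written to disc.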