Lossless : stop the madness.
Reply #27 – 2002-08-07 14:10:12
What I am going to make here is quite a controversial statement; many educated people might come and "prove me wrong", thinking that I don't understand what I'm talking about. But I personally believe, philosophically, that lossless compression is only at its beginning.

Audio data is peculiar; it's different from "random-like" data (like socio-political statistics, which, when analysed correctly, might as well not be considered so "random-like"). Without being redundant in itself, musical or sonic data possesses specific patterns which can be exploited to describe it concisely. It's all about detecting those patterns and analysing the nature of the data.

At the moment there are mainly two approaches to audio compression (or encoding, depending on how we do it):

1. Lossy, psychoacoustic encoding, reaching at best "transparent" compression at 1:8 ratios and below (MPC and AAC). In each frame it removes the unnecessary data, corresponding to the frequencies (or partials) that are masked by others, etc.

2. Lossless, information-theory-based encoding, mainly exploiting the redundancy between the L and R channels, then predicting the next sample and keeping only the prediction error. Finally, the leftover is Huffman (run-length) compressed, achieving completely transparent results around a 1:2 ratio.

But something is annoyingly missing from both of them: taking into account the deep relationship that exists between each note played by the same instrument. Imagine a codec "intelligent" enough to determine which instrument is playing, analysing it precisely: the nature of its timbre, all the harmonics (the partials) it generates. It could then describe the instrument in a concise way. What's left to do?
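To make approach 2 concrete, here is a minimal sketch of that pipeline: mid/side decorrelation of the stereo channels, a first-order predictor, and the small residual that a real codec (FLAC, for instance) would then entropy-code. All sample values and function names here are made up for illustration; real codecs use higher-order adaptive predictors and Rice codes rather than plain Huffman.

```python
def encode_lossless(left, right):
    # Channel decorrelation: store mid/side instead of raw L/R,
    # since the two channels are usually highly correlated.
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]

    def residual(samples):
        # First-order predictor: guess each sample equals the previous
        # one, and keep only the (hopefully small) prediction error.
        return [samples[0]] + [samples[i] - samples[i - 1]
                               for i in range(1, len(samples))]

    return residual(mid), residual(side)

def decode_lossless(res_mid, res_side):
    def undo(res):
        out = [res[0]]
        for r in res[1:]:
            out.append(out[-1] + r)
        return out

    mid, side = undo(res_mid), undo(res_side)
    # Invert mid/side; the parity of `side` recovers the bit lost by >> 1.
    left = [m + ((s + (s & 1)) >> 1) for m, s in zip(mid, side)]
    right = [l - s for l, s in zip(left, side)]
    return left, right
```

The residuals are much smaller in magnitude than the raw 16-bit samples, which is exactly what makes the final entropy-coding stage pay off; decoding reverses each step exactly, so the round trip is bit-for-bit lossless.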
It would note precisely when the instrument comes on and off, what modulation the player imposes on it (modifying the timbre from the original model), what note is played, and how that note modifies the formants of the sound (the part of the sound that "doesn't" change, whatever note is played). It could then reproduce the sound transparently using far less bandwidth than WAV, FLAC or MPC. You can see this approach as a beefed-up MIDI file, with an incredibly good synthesizer and excellent hearing. To ensure losslessness, it can then subtract the "predicted" (encoded) audio from the original and Huffman-encode the slight difference that may be left.

Scenario: the encoder detects "Oh! This is a piano playing." Then, basing itself on a generic piano model, it analyses the specific harmonics generated by this particular piano and works out some kind of physical model that would react about the same way the piano does. It then tries to reproduce the exact same sound with the synth, and if losslessness is really desired, it subtracts the synthetic signal from the real one and Huffman-encodes the delta (the remainder). It does this for every instrument it detects in the piece.

The encoder would be useless for white noise, which can only be reproduced losslessly using at least the same number of bytes. The trick would work precisely because we are NOT listening to white noise all the time (hopefully).

Anyway, don't believe that WAV is such a perfect way of reproducing sound. It introduces some artefacts too (though they're nearly totally inaudible): any frequency above half the Nyquist frequency (11.025 kHz, for CD audio) is quite distorted. A better approach would be to encode a 192 kHz, 32-bit source with some kind of MPC or AAC, to obtain a final product at 1411.2 kbps (like CD audio).
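The subtractive scheme above can be sketched in a few lines. This is a toy, not a real codec: the "synthesis model" here is just a sine tone standing in for the hypothetical piano model, and the "original" deviates from it by a few counts. The point is only that storing the model plus a small delta reconstructs the original bit-exactly.

```python
import math
import random

random.seed(0)

# The (hypothetical) synth model's rendition: a plain 440 Hz tone
model = [round(1000 * math.sin(2 * math.pi * 440 * n / 44100))
         for n in range(256)]

# The "real" recording: the model plus small deviations it fails to capture
original = [m + random.randint(-3, 3) for m in model]

# Encoder: keep only the delta between reality and the model
delta = [o - m for o, m in zip(original, model)]

# Decoder: re-run the same model and add the stored delta back
reconstructed = [m + d for m, d in zip(model, delta)]

assert reconstructed == original  # bit-exact, i.e. lossless
```

Because every delta value lies in a tiny range (here |d| <= 3), it entropy-codes far more compactly than the raw samples would; and, as the post notes, the scheme collapses for white noise, where the model predicts nothing and the delta is as large as the signal itself.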
It would be a "lossy" method, but it would sound a lot better than 16-bit WAV (more dynamic range available, among other things, and less distortion from the original sound when played back at half speed). Just my two cents... Now... Beat me up