Topic: A Quasi-Lossy Codec

A Quasi-Lossy Codec

I've been doing a lot of work with lossless formats lately, and I had an idea for a codec...  I'm not sure if this makes any sense at all.

Is there any point in developing some kind of codec that removes the least essential components of an audio file, such that it can later be transcoded to other formats and sound indistinguishable from an encode of the original?  I'm not sure how this would work exactly, but perhaps such a format could discard the parts of the audio that perceptual encoders typically throw away anyway.  Then one could later choose to transcode to a lossy AAC, Ogg, or MP3 file, etc., and the quality would be as good as an original encode.

Perhaps such a codec could compress the audio down to a quarter or less of its original size, which normal lossless encoders can only do with speech and some forms of classical music (or perhaps something relatively quiet without a lot of high frequencies).

Does this make any sense to anyone who knows more about the technicalities of perceptual lossy encoding?  Maybe MPC is almost like this already, but I'm fairly ignorant when it comes to how the actual algorithms work (not that I wouldn't like to learn, however).

A Quasi-Lossy Codec

Reply #1
What would be the difference between this and a normal codec operating at high bitrates?

A Quasi-Lossy Codec

Reply #2
The closest thing to what you're describing is nonperceptual lossy codecs like WavPack hybrid mode or OptimFROG DualStream...
The object of mankind lies in its highest individuals.
One must have chaos in oneself to be able to give birth to a dancing star.

A Quasi-Lossy Codec

Reply #3
Quote
What would be the difference between this and a normal codec operating at high bitrates?
[a href="index.php?act=findpost&pid=284978"][{POST_SNAPBACK}][/a]


Hmm... good question.  I guess the difference would be that, rather than the codec being perceptually transparent to a human, it is transparent to an encoder.

I'm not sure if this is possible without a tremendous amount of effort, but the basic idea is to make an encoded file that, when transcoded, produces results almost identical to files encoded from the original source material.

So, let's call my new imaginary codec QLC (Quasi-lossy codec).  You rip your CD to wavs, then make QLC copies afterwards. 

Now we encode to various lossy formats.  We have Ogg, MP3, and AAC files encoded from both the QLC copies and the original WAVs, and each pair is nearly bit-for-bit identical, even though one of them came from QLC rather than from WAV.

I guess this makes it a little clearer in my head, anyway.  The idea came about when I was considering that the main reason I (and probably most people) want lossless audio is that we want the option to transcode to perceptual lossy codecs at a later time.  There isn't an inherent improvement in the listening experience compared to, say, high-bitrate MPC, but it gives more flexibility down the road.


@atici: I'm not sure what a nonperceptual lossy codec is... what use would you have for a lossy codec other than perceiving it?  If I recall correctly, these formats are structured so that you can separate out the lossy part of the file and listen to it independently of the whole thing, right?  Personally, I'm not too sure how that makes sense, since one can fairly easily encode to another lossy format, but I guess the idea is that it simplifies the process, right?

A Quasi-Lossy Codec

Reply #4
The lossy codecs I mentioned earlier are exactly suited for the purpose you have in mind: better efficiency for transcoding. Nonperceptual means these lossy codecs do not have an ear model the way perceptual lossy codecs do, i.e. they aim to approximate the original with no reference to human hearing. Therefore there are audible artifacts at low bitrates; however, they're known to be better suited for transcoding purposes. den has been looking into this for a while. Here are the relevant threads: thread 1, thread 2, thread 3.

You might also be interested in bitrate peeling. Ogg Vorbis has promised bitrate peeling in the future, but I don't think it's coming anytime soon. It does almost what you want: you encode your files as high-quality Ogg Vorbis, and when/if you need lower-quality files, you peel the high-quality files without transcoding, so the low-quality files are created as if they came directly from the original WAV source. Relevant threads: thread 1, thread 2.

Otherwise, you cannot have a general lossy codec such that no matter what lossy codec you transcode it into, the result is as if it came directly from the original. If you want to achieve minimum loss no matter which lossy codec you transcode to, you need to go with WavPack/OptimFROG hybrid. If you know you'll use Vorbis only (not any arbitrary lossy codec), then you can hope that bitrate peeling will be implemented in the future.
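
To make the hybrid idea concrete, here is a toy sketch in Python (my own illustration, not how WavPack or OptimFROG actually structure their streams): the lossy part is a coarsely quantized version of the signal with no ear model, and the correction file stores whatever the quantization threw away, so lossy + correction reconstructs the original exactly.

```python
import numpy as np

def hybrid_split(samples: np.ndarray, step: int = 16):
    """Toy hybrid coding: split integer PCM samples into a coarse (lossy)
    part and a correction part such that lossy + correction == original."""
    lossy = (samples // step) * step        # blunt quantization, no ear model
    correction = samples - lossy            # what a correction file would hold
    return lossy, correction

# Round-trip check on some fake 16-bit samples.
rng = np.random.default_rng(0)
original = rng.integers(-32768, 32768, size=1000, dtype=np.int32)
lossy, correction = hybrid_split(original)
assert np.array_equal(lossy + correction, original)  # lossless when both parts are kept
```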
The object of mankind lies in its highest individuals.
One must have chaos in oneself to be able to give birth to a dancing star.

A Quasi-Lossy Codec

Reply #5
Neat... I should learn more about WavPack.  I've not yet used it.  Thanks for the tip.

A Quasi-Lossy Codec

Reply #6
What you wrote reminds me somewhat of this in OptimFrog. I find the option intriguing, even if I would be apprehensive about using it.
Quote
OptimFROG - IEEE Float
[...]
There is a supplementary option, named --mantissabits intended to
reduce the effective mantissa bits before the actual compression. The
23 mantissa bits can be reduced to 22 bits, up to 7 bits. This leads
to significant compression increase. The process is free from any
quantization distortion and any dynamic range reduction.
I suggest you may (always) use the --mantissabits 15 value, as it
gives around 25% compression improvement and it is indistinguishable
for any purposes from the original file.
There is also a big advantage of this process - decoding the file
and reencoding it (or any fragments of it) with the same or bigger
preprocessing option value does not introduce any additional changes
to the data.
[...]
full page here
A bit more information: Significand (or, more informally, mantissa)
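
To make the quoted option a bit more concrete, here is a minimal sketch of what zeroing out low-order mantissa bits of 32-bit IEEE floats looks like, and why re-applying it with the same or a larger bit count changes nothing (assuming NumPy float32 samples; the function name is mine, not OptimFROG's):

```python
import numpy as np

def truncate_mantissa(samples: np.ndarray, keep_bits: int = 15) -> np.ndarray:
    """Zero the low-order mantissa bits of IEEE 754 float32 samples.

    float32 has a 23-bit mantissa; keeping 15 bits zeroes the lowest 8.
    Running this again with the same or a larger keep_bits is a no-op,
    because those bits are already zero."""
    drop = 23 - keep_bits
    mask = np.uint32(0xFFFFFFFF ^ ((1 << drop) - 1))   # clear the lowest `drop` bits
    bits = samples.astype(np.float32).view(np.uint32)
    return (bits & mask).view(np.float32)

x = np.random.randn(8).astype(np.float32)
once = truncate_mantissa(x, 15)
twice = truncate_mantissa(once, 15)
assert np.array_equal(once, twice)   # idempotent, as the OptimFROG notes describe
```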

Or, in a different manner, MPEG-4 SLS 'Scalable Lossless Coding'. I found a rough introduction here: Advanced Audio Zip (.pdf file)
And here is a thread discussing SLS on HA.

I hope this helps (it ended up taking a while), tec

A Quasi-Lossy Codec

Reply #7
About Ogg Vorbis bitrate peeling, I've heard comments that it would be a huge technical challenge and might not retain backward compatibility.  WavPack hybrid seems to be a similar idea, but to me, at least, it makes more sense.

With the bitrate peeling idea, would you have to set an alternate bitrate in advance when you encode?  Like, you encode at Q6/Q0 or something, so that you can recode to Q0 for your MP3 player but use the Q6 files on your computer, yet you can't actually "recode" to anything else?

I guess it's a new thing, and maybe the theory is in the making.

BTW, am I using the correct terminology?  What do you call peeling the bitrates, or, uh... re-encoding without transcoding?

A Quasi-Lossy Codec

Reply #8
Also, MPC scored fantastically, with a limited number of samples, in the recent re-encoding listening test by guruboolez.

- Lyx
I am arrogant and I can afford it because I deliver.

A Quasi-Lossy Codec

Reply #9
Quote
Also, MPC scored fantastically, with a limited number of samples, in the recent re-encoding listening test by guruboolez.

- Lyx

But it failed at ~330 kbps compared to WavPack lossy at the same bitrate, at least on the only sample tested at this bitrate.
Source: http://www.hydrogenaudio.org/forums/index.php?showtopic=32440&view=findpost&p=283557
ABX log


Bitrate peeling doesn't sound realistic. It's a very old feature, often mentioned but never implemented; some people call this "vaporware". Is anyone still working on peeling? A few samples were uploaded here two or three years ago, and the quality was absolutely awful (-q0 peeled from a higher bitrate compared to -q0 re-encoded from a higher bitrate).
Wavpack Hybrid: one encoder, one encoding for all scenarios
WavPack -c4.5hx6 (44100Hz & 48000Hz) ≈ 390 kbps + correction file
WavPack -c4hx6 (96000Hz) ≈ 768 kbps + correction file
WavPack -h (SACD & DSD) ≈ 2400 kbps at 2.8224 MHz

A Quasi-Lossy Codec

Reply #10
Lowpass @ 20kHz, then encode with a lossless encoder: this is your "quasi-lossy" encoder.
(although something is either lossy or lossless; it can't be "quasi-lossy")
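
For what it's worth, a rough sketch of that preprocessing step in Python (assuming SciPy and the soundfile library are available; the file names are placeholders, and a real pipeline might just use SoX instead):

```python
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

# Read 44.1 kHz PCM audio, apply a steep low-pass near 20 kHz, and write the
# result; any lossless encoder (FLAC, WavPack, APE, ...) then compresses the
# band-limited signal somewhat better.
data, rate = sf.read("input.wav")
sos = butter(10, 20000, btype="low", fs=rate, output="sos")   # 10th-order Butterworth
filtered = sosfiltfilt(sos, data, axis=0)                     # zero-phase filtering
sf.write("lowpassed.flac", filtered, rate)                    # soundfile writes FLAC directly
```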

A Quasi-Lossy Codec

Reply #11
@Tec9SD:
Looks like some interesting reading.  Thanks!  That's probably very close to what I've been looking for. Edit: Is this only for 24-bit audio?  I don't know what IEEE float is...

@Gabriel:
Is there actually a lowpass filter in any of (all of?) the lossless encoders?

I guess you'd know how that affects perception of audio, being the LAME developer, but are there not situations where that would make an audible difference... or some kind of difference in fidelity to a person with extremely good hearing and equipment? (at least something like a 20 kHz tone at very high volumes, or something)

A Quasi-Lossy Codec

Reply #12
Quote
@Gabriel:
Is there actually a lowpass filter in any of (all of?) the lossless encoders?

I guess you'd know how that affects perception of audio, being the LAME developer, but are there not situations where that would make an audible difference... or some kind of difference in fidelity to a person with extremely good hearing and equipment? (at least something like a 20 kHz tone at very high volumes, or something)
[a href="index.php?act=findpost&pid=285323"][{POST_SNAPBACK}][/a]


No, not with music, because of masking. If you have bat ears and listen to test tones at unrealistic volume levels (ones which would damage your ears at lower frequencies), then maybe. Also, when transcoding with a lossy codec, the codec will lowpass anyway, for exactly the same reason.

- Lyx
I am arrogant and I can afford it because I deliver.

A Quasi-Lossy Codec

Reply #13
Test with loud music and many high frequencies:

APE 3.99 High / original sample - 47.6 MB
APE 3.99 High / same sample with 20 kHz cutoff - 47.0 MB (1.4% gained)
APE 3.99 High / same sample with 16 kHz cutoff - 43.9 MB (8.4% gained)

FLAC 1.1.2 -8 / original sample - 49.2 MB
FLAC 1.1.2 -8 / same sample with 20 kHz cutoff - 49.0 MB (0.4% gained)
FLAC 1.1.2 -8 / same sample with 16 kHz cutoff - 47.3 MB (3.9% gained)

Yeah, very useful...

A Quasi-Lossy Codec

Reply #14
Quote
With the bitrate peeling idea, would you have to set an alternate bitrate in advance when you encode it? Like, you encode at Q6/Q0 or something, so that you can recode to Q0 for your MP3 player, but use the Q6es on your computer, but you can't actually "recode" to anything else?


Basically you are just reordering and dropping unused residue data from the VQ backend.  The problem is that the codebooks need to be large in order to do this. Segher on xiph.org attempted to write a peeler; I don't know if it was him or somebody else who called it a "dumb peeler", but it was really one of the first-stage peelers and it didn't give optimal results on anything (guruboolez would be screaming bloody murder). It has been discussed numerous times before. I am pretty sure a peeler would be possible, but I have no idea who is working on it these days; it IS in fact on the bounty list, though somebody with brains would have to figure out a way to make it work ;-D.  I would imagine some noise-shaping algorithm would be involved as well. Like an orange, you can only peel off packets. Basically you would encode at -q 6 on your hard drive, and then the "peeler" would allow you to peel packets down to a target size, say -q 2, but to do that, like I said, you would need a really large codebook.
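
A toy illustration of the "layers" idea (nothing to do with Vorbis's actual bitstream; just successive refinement in general): each pass quantizes what the earlier passes missed with a finer step, and a "peeler" would simply drop the later passes.

```python
import numpy as np

def encode_passes(residue: np.ndarray, steps=(64, 16, 4, 1)):
    """Encode an integer residue signal as successive refinement passes.

    Each pass quantizes what the previous passes missed, with a finer step;
    summing all passes reconstructs the residue exactly (the final step is 1)."""
    passes, remaining = [], residue.copy()
    for step in steps:
        layer = (remaining // step) * step
        passes.append(layer)
        remaining = remaining - layer
    return passes

def peel(passes, keep: int):
    """Reconstruct using only the first `keep` passes (drop the rest)."""
    return sum(passes[:keep])

residue = np.random.default_rng(1).integers(-500, 500, size=16)
layers = encode_passes(residue)
assert np.array_equal(peel(layers, keep=4), residue)  # all passes -> exact
coarse = peel(layers, keep=2)                         # "peeled" -> coarser approximation
```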
budding I.T professional

A Quasi-Lossy Codec

Reply #15
Quote
Like an orange, you can only peel off packets.


I had an orange for breakfast today, and I didn't even notice the packets on it!

 
So this is more like an orange, where you can peel off one layer, than an onion, where you could peel off a number of arbitrary layers?

A Quasi-Lossy Codec

Reply #16
Quote
So this is more like an orange, where you can peel off one layer, than an onion, where you could peel off a number of arbitrary layers?


You're getting warmer ;-D (actually an onion would probably be a better example). In VQ codebook terminology these are called "passes", I believe. They work very similarly to the n-pass approach used in video encoding, I think.  I am not a guru in VQ, though; I am sure somebody with dev knowledge could fill you in, as I haven't looked into it in a long time.  "Layers" is a good way to think of it, though ;D

http://www.data-compression.com/vq.html is a good introduction for anybody interested in the math/technical details behind VQ in general (it doesn't cover packet peeling or codebooks in Vorbis specifically), but it's a good read if you are interested in second-stage entropy methods. It would probably be more appropriate for the sci/dev forums, but what the heck.
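
For anyone who doesn't want to wade through the whole page, the core VQ operation it describes is just nearest-neighbour lookup against a codebook; a bare-bones sketch (nothing Vorbis-specific in it):

```python
import numpy as np

def vq_encode(vectors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Return, for each input vector, the index of its nearest codeword."""
    # (N, K) matrix of distances between every vector and every codeword
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def vq_decode(indices: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Reconstruct approximations by looking the indices up in the codebook."""
    return codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 2))       # 16 codewords of dimension 2
vectors = rng.normal(size=(100, 2))       # data to be quantized
indices = vq_encode(vectors, codebook)    # only these small indices need storing
approx = vq_decode(indices, codebook)     # lossy reconstruction
```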
budding I.T professional

A Quasi-Lossy Codec

Reply #17
Quote
Test with loud music and many high frequencies:

APE 3.99 High / original sample - 47.6 MB
APE 3.99 High / same sample with 20 kHz cutoff - 47.0 MB (1.4% gained)
APE 3.99 High / same sample with 16 kHz cutoff - 43.9 MB (8.4% gained)

FLAC 1.1.2 -8 / original sample - 49.2 MB
FLAC 1.1.2 -8 / same sample with 20 kHz cutoff - 49.0 MB (0.4% gained)
FLAC 1.1.2 -8 / same sample with 16 kHz cutoff - 47.3 MB (3.9% gained)

Yeah, very useful...

I did not say that it would be very useful; it is just an answer to the original post.

A Quasi-Lossy Codec

Reply #18
I'm actually very surprised that Gabriel's lowpass idea didn't have more impact than it did.  I thought that the high frequencies were more complex, and thus didn't compress much in the first place.  Or is this only applicable to perceptual lossy audio codecs?

I guess the point is made that there are some things that can be removed from the audio that could in no way affect the perceived quality of the sound, and that's basically what I'm looking for.  There are probably other things too, and by the sound of it, that's exactly what WavPack is good at.

BTW, does anyone know of a detailed comparison of codecs that uses the most recent revision of WavPack?  AFAIK, it does compare favorably with the other popular lossless codecs, but I haven't seen a recent comparison with the 4.1 series.

A Quasi-Lossy Codec

Reply #19
Quote
Test with loud music and many high frequencies:

APE 3.99 High / original sample - 47.6 MB
APE 3.99 High / same sample with 20 kHz cutoff - 47.0 MB (1.4% gained)
APE 3.99 High / same sample with 16 kHz cutoff - 43.9 MB (8.4% gained)

FLAC 1.1.2 -8 / original sample - 49.2 MB
FLAC 1.1.2 -8 / same sample with 20 kHz cutoff - 49.0 MB (0.4% gained)
FLAC 1.1.2 -8 / same sample with 16 kHz cutoff - 47.3 MB (3.9% gained)

Yeah, very useful...
[a href="index.php?act=findpost&pid=285336"][{POST_SNAPBACK}][/a]

It seems to me that most of the compression in lossless encoders comes from compressing the stereo image. A couple of days ago I was losslessly compressing a stereo file and it compressed by about 40%; after converting it to mono, it compressed by only about 10-20%. I guess a stereo file with two identical channels would compress by more than 50%, while a stereo file with two totally different channels would compress by far less, maybe 20-30% at most. Factors like frequency bandwidth don't seem to matter that much in lossless compression.
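
A quick way to see why correlated channels compress so well is a mid/side-style transform, which most lossless codecs apply in some form (this is a generic textbook version, not any particular codec's implementation):

```python
import numpy as np

def to_mid_side(left: np.ndarray, right: np.ndarray):
    """Lossless integer mid/side transform (generic, not codec-specific)."""
    side = left - right            # zero wherever the channels agree
    mid = right + (side >> 1)      # roughly the average; exactly invertible
    return mid, side

def from_mid_side(mid: np.ndarray, side: np.ndarray):
    right = mid - (side >> 1)
    left = right + side
    return left, right

rng = np.random.default_rng(0)
left = rng.integers(-32768, 32768, size=1000, dtype=np.int32)
right = left.copy()                          # perfectly correlated "dual mono"
mid, side = to_mid_side(left, right)
assert not side.any()                        # side channel is all zeros: nearly free to code
l2, r2 = from_mid_side(mid, side)
assert np.array_equal(l2, left) and np.array_equal(r2, right)   # exact round trip
```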

A Quasi-Lossy Codec

Reply #20
Quote
It seems to me that most of the compression in lossless encoders comes from compressing the stereo image. A couple of days ago I was losslessly compressing a stereo file and it compressed by about 40%; after converting it to mono, it compressed by only about 10-20%. I guess a stereo file with two identical channels would compress by more than 50%, while a stereo file with two totally different channels would compress by far less, maybe 20-30% at most.

Yeah, AFAIR, it was called safe joint stereo?  I think every modern encoder uses advanced joint stereo algorithms that allow lossless or near-lossless stereo imaging in most cases (not at low bitrates, though).

Quote
Factors like frequency bandwidth don't seem to matter that much in lossless compression.
[a href="index.php?act=findpost&pid=285536"][{POST_SNAPBACK}][/a]

I don't think so. Have you actually tried compressing ambient or something similar? You can easily go down to 400-500 kbps, and I don't think that's due to a poor stereo image.
Infrasonic Quartet + Sennheiser HD650 + Microlab Solo 2 mk3.