
Is the non-uniform quantizer in AAC a fatal mistake?

There's no doubt that the non-uniform quantizer is good for very low bit rates. So I would never advocate removing it.


However, it makes quality at higher bitrates an iffy business, and vastly complicates the rate loop and rate convergence. 

With a uniform quantizer, the rate loop and system design for high-rate coding (i.e. close to transparent) is vastly simplified, and the rates that result for the same quality are likely to be lower.

So, what do you think? Should AAC have included a switch to allow for uniform quantization, or, perhaps, a switch controlled by codebook (sectioning) in which at least one codebook at every level beyond +-1 would represent uniform, rather than powerlaw, quantization?
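For concreteness, here is a minimal sketch of the two quantizer shapes under discussion. It keeps only the 3/4 power law and the 2^(sf/4) scalefactor gain; the actual ISO formula adds a rounding offset and a fixed scalefactor bias, so treat this as an illustration rather than a reference implementation.

```python
import numpy as np

def powerlaw_quantize(x, sf):
    # AAC-style companding: quantize |x|^(3/4) on a grid whose step follows 2^(sf/4)
    return np.sign(x) * np.round((np.abs(x) * 2.0 ** (-sf / 4)) ** 0.75)

def powerlaw_dequantize(q, sf):
    return np.sign(q) * np.abs(q) ** (4.0 / 3.0) * 2.0 ** (sf / 4)

def uniform_quantize(x, sf):
    # plain uniform quantizer with the same 2^(sf/4) step-size control
    return np.round(x * 2.0 ** (-sf / 4))

def uniform_dequantize(q, sf):
    return q * 2.0 ** (sf / 4)

x = np.array([0.3, 2.0, 15.0, 120.0])
for sf in (0, 4, 8):
    q = powerlaw_quantize(x, sf)
    err = powerlaw_dequantize(q, sf) - x
    print(sf, q.astype(int), np.round(err, 3))  # error grows with |x| under the power law
```

Because the injected error depends on the coefficient magnitude under the power law, predicting how a scalefactor change moves the rate and distortion of a band is less direct than in the uniform case.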

Edited to add:
Nah, not "fatal". Must have been feeling grumpy or something when I wrote this.
-----
J. D. (jj) Johnston

Is the non-uniform quantizer in AAC a fatal mistake?

Reply #1
There's no doubt that the non-uniform quantizer is good for very low bit rates.

Is it? I mean the idea seems like a good one considering how LBG-like codebook design algorithms work. But the entropy of the quantized samples should also be considered. I don't think it makes a big difference when the quantized samples are also properly entropy-coded. For a fixed SNR you mainly trade a small number of code vectors for a large number of code vectors with roughly the same entropy. I might be wrong, though. (*)

With a uniform quantizer, the rate loop and system design for high-rate coding (i.e. close to transparent) is vastly simplified, and the rates that result for the same quality are likely to be lower.

I'm not sure about the 2nd part (lack of experience) but I totally agree with you on the 1st part.

So, what do you think? Should AAC have included a switch to allow for uniform quantization, or, perhaps, a switch controlled by codebook (sectioning) in which at least one codebook at every level beyond +-1 would represent uniform, rather than powerlaw, quantization?

I don't think switching is a good idea. It makes picking the right codebook and scalefactors even more complicated. What should the "linear code books" look like? If only the power term is dropped you'll get a lower average spacing between quantized values, which calls for a higher scale factor. I think the code books should be roughly compatible in terms of expected SNR for the sake of scale factor predictability -- provided that it's really worth the hassle.
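As a rough illustration of the spacing point (my numbers, assuming AAC's ±8191 maximum quantized value): with the same scalefactor and the same index range, the power-law levels reach much further than uniform ones, so a uniform codebook would indeed need a higher scale factor to cover the same coefficients.

```python
import math

# With the same scalefactor and the same maximum index, the power-law levels
# q**(4/3) reach much larger magnitudes than the uniform levels q, so a uniform
# codebook needs a higher scalefactor to cover the same coefficient range.
qmax = 8191  # AAC's largest (escape-coded) quantized value
ratio = qmax ** (4 / 3) / qmax
print(f"reachable magnitude ratio: {ratio:.1f}x (= {20 * math.log10(ratio):.1f} dB of extra headroom)")
```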

If I had to design yet another lossy format, it'd probably end up using uniform quantization only, unless I'm totally wrong about (*).

edit: link added for LBG algorithm

Cheers,
SG

Is the non-uniform quantizer in AAC a fatal mistake?

Reply #2
For sure, the non-uniform quantizer makes creating fast encoders quite complicated, compared to what could be possible. I never really thought about non-linear quantization hindering high-bitrate performance, but it sounds logical.

I think it would indeed have been neat to have a uniform quant possibility.

useless rant: perhaps you should have wondered about this 15 years ago...

Is the non-uniform quantizer in AAC a fatal mistake?

Reply #3
I never really thought about non-linear quantization hindering high-bitrate performance, but it sounds logical.

Assuming quantization errors are totally random, a change of the scalefactor by x dB will result in an expected change in SNR of 3x/4 dB due to the power law. The quantization zones shrink/expand uniformly by a certain percentage, so this applies to high data rates as well. I don't see why such a power law should hinder high-bitrate performance so much.

The only "problem" is selecting the appropriate scalefactors, isn't it?
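A quick numeric sanity check of that 3x/4 slope (a sketch only: the scalefactor is modeled as a plain amplitude gain ahead of a unit-step 3/4 power-law quantizer, ignoring AAC's rounding offset):

```python
import numpy as np

rng = np.random.default_rng(0)
x = 100.0 * np.abs(rng.standard_normal(200_000))  # magnitudes of an arbitrary-scale source

def powerlaw_snr_db(x, gain_db):
    g = 10.0 ** (gain_db / 20.0)       # scalefactor treated as a plain amplitude gain
    q = np.round((x * g) ** 0.75)      # 3/4 power-law quantizer, unit step
    xhat = q ** (4.0 / 3.0) / g        # dequantize, undo the gain
    return 10.0 * np.log10(np.mean(x ** 2) / np.mean((x - xhat) ** 2))

for d in (0.0, 4.0, 8.0):
    print(f"gain {d:3.0f} dB -> SNR {powerlaw_snr_db(x, d):5.2f} dB")
# SNR climbs by roughly 3 dB per 4 dB of gain, i.e. the 3x/4 slope described above
```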

Edit: I just ran a quick simulation for memoryless Gaussian sources:

You'll see that at higher data rates the uniform quantizer is approximately 0.1 bit/sample better. In the range of 4 to 9 dB SNR the power-law quantizer seems to perform slightly better (0.03 bits/sample). In both cases SNR-maximizing quantization thresholds have been chosen. SNR is measured in dB and entropy in bits/sample.

So, it doesn't seem like a nonuniform quantizer is a 'fatal mistake'.

Replacing the scalar quantizer with a simple structured VQ codebook could easily bring this down by another 0.16 bits/sample (VQ people refer to this as "granular gain") which roughly translates to 10 kbit/s for a stereo stream. There are also other things to consider like: Is it possible -- by a clever choice of the quantizer -- to get rid of metallic/tonal artefacts (very low bit rates) when all we are supposed to hear is noise?
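A sketch of how such a measurement can be set up (plain midtread rounding rather than SNR-maximizing thresholds, so the numbers will differ slightly; sweep the step size and compare entropies at matched SNR). It also bears on the entropy question marked (*) above.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(500_000)  # memoryless Gaussian source

def entropy_bits(q):
    # empirical entropy of the quantizer indices, in bits/sample
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def snr_db(x, xhat):
    return 10 * np.log10(np.mean(x ** 2) / np.mean((x - xhat) ** 2))

def uniform(x, step):
    q = np.round(x / step)
    return q, q * step

def powerlaw(x, step):
    # 3/4 power-law companding, as in AAC's quantizer shape (midtread, no rounding offset)
    q = np.sign(x) * np.round((np.abs(x) / step) ** 0.75)
    return q, np.sign(q) * np.abs(q) ** (4 / 3) * step

for step in (0.5, 0.1, 0.02):
    qu, xu = uniform(x, step)
    qp, xp = powerlaw(x, step)
    print(f"step {step}: uniform {snr_db(x, xu):5.1f} dB / {entropy_bits(qu):4.2f} bit, "
          f"power-law {snr_db(x, xp):5.1f} dB / {entropy_bits(qp):4.2f} bit")
```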

Cheers,
SG

Is the non-uniform quantizer in AAC a fatal mistake?

Reply #4
For sure, the non-uniform quantizer makes creating fast encoders quite complicated, compared to what could be possible. I never really thought about non-linear quantization hindering high-bitrate performance, but it sounds logical.

I think it would indeed have been neat to have a uniform quant possibility.

useless rant: perhaps you should have wondered about this 15 years ago...



I did. Was told to go away and be quiet.

You'll see that at higher data rates the uniform quantizer is approximately 0.1 bit/sample better.


Assuming perfect noiseless compression. That's another story. You also have to account for the granularity of scalefactors and the scalefactor cost, which is a bleeping complicated issue.

But 0.1 bits/sample before entropy coding is a pretty big margin, even at higher rates.
-----
J. D. (jj) Johnston