Topic: Low latency codecs (Read 16375 times)

Low latency codecs

What are the choices for low latency codecs, and what is the fundamental information/perceptual trade-off?

I see that large block sizes are needed to do operations on narrow frequency bands, suitable for some masking stuff, but exploiting temporal correlation could be done against a historical reference (not introducing significant delay)?

-k

Reply #1
but exploiting temporal correlation could be done against a historical reference (not introducing significant delay)?


Most codecs don't really do this though, since it's quite difficult in practice.

 

Reply #3
...exploiting temporal correlation could be done against a historical reference (not introducing significant delay)?

Correct. But low-delay codecs are mostly used in communication scenarios where you might lose a frame during transmission. So if you lose a frame, your history is corrupted, and you're in trouble until the next history reset. Just like in video coding, by the way, where you get nasty blocking artifacts until the next I frame (which might take seconds).

Chris
If I don't reply to your reply, it means I agree with you.

Reply #4
In the case of AAC, both the Main and LTP profiles went nowhere, so I guess the gain wasn't very high. HE-AAC uses some differential coding in the HE part. It certainly doesn't seem that easy, judging by how little it is used.

Opus uses a differential coding of band energy, but constructed so packet loss can be recovered from after a few frames.
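
A rough sketch of why that kind of construction recovers from loss. This is not Opus's actual scheme, just a generic leaky predictor: the function names, the 1 dB quantizer, the test energies and the leak value of 0.8 are all made up for illustration. Each frame's band energy is coded relative to `leak` times the previous reconstructed value, so a decoder error shrinks by that factor every frame instead of persisting.

```python
def encode(energies_db, leak):
    """Leaky DPCM: code each frame's band energy relative to leak * previous value."""
    residuals, prev = [], 0.0
    for e in energies_db:
        pred = leak * prev
        r = round(e - pred)        # quantize the residual to 1 dB steps
        residuals.append(r)
        prev = pred + r            # mirror what the decoder will reconstruct
    return residuals


def decode(residuals, leak, lost=()):
    """Rebuild the energies; frames listed in `lost` arrive with no residual."""
    out, prev = [], 0.0
    for i, r in enumerate(residuals):
        pred = leak * prev
        prev = pred if i in lost else pred + r
        out.append(prev)
    return out


def drift(leak, lost_frame=5):
    """Absolute decoder error per frame when a single frame is lost."""
    energies = [40.0] * 5 + [60.0] * 35   # a 20 dB step right at the lost frame
    res = encode(energies, leak)
    clean = decode(res, leak)
    damaged = decode(res, leak, lost={lost_frame})
    return [abs(a - b) for a, b in zip(clean, damaged)]


if __name__ == "__main__":
    for leak in (1.0, 0.8):
        d = drift(leak)
        print(f"leak={leak}: error at the loss {d[5]:.2f} dB, at the last frame {d[-1]:.2f} dB")
```

With `leak=1.0` (plain differential coding) the error never goes away until something resets the history; with a leak below 1 it decays geometrically, at the price of a slightly worse prediction.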

The main problem of small block sizes is not so much the inability to operate on narrow bands (the critical bands usually group many FFT lines together anyway), but the problem that tonal signals start to leak into several adjacent frequency bands. Because they require a high SMR, you suddenly have more big coefficients that you must code accurately. And that hurts.

Reply #5
What are the choices for low latency codecs, and what is the fundamental information/perceptual trade-off?
I see that large block sizes are needed to do operations on narrow frequency bands, suitable for some masking stuff, but exploiting temporal correlation could be done against a historical reference (not introducing significant delay)?


Masking is a very fuzzy thing. As Garf said, it's not masking that gets you, it's the fact that you lose coding gain.

You can do very narrow frequency domain operations from the temporal domain. It's all the same thing, after all. E.g. in Opus CELT mode we have a single backwards-looking predictor, but it's only really useful for highly harmonic signals. For those signals it helps a lot; otherwise it's not really useful.
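
To make the "only useful for highly harmonic signals" point concrete, here is a toy single-tap predictor. It is not the CELT implementation; the lag search, frame length, lag range and test signals are all assumptions for illustration. It predicts the current frame from one delayed, scaled copy of the past signal and reports the best prediction gain over all allowed lags.

```python
import math


def tones(periods, n):
    """Sum of unit-amplitude sinusoids with the given periods (in samples)."""
    return [sum(math.sin(2 * math.pi * i / p) for p in periods) for i in range(n)]


def prediction_gain_db(signal, frame_len=120, max_lag=480):
    """Predict the last frame from one delayed, scaled copy of the past signal."""
    start = len(signal) - frame_len
    frame = signal[start:]
    energy = sum(v * v for v in frame)
    best = energy
    # lag >= frame_len keeps the reference entirely in the past (no added delay)
    for lag in range(frame_len, max_lag + 1):
        ref = signal[start - lag:start - lag + frame_len]
        den = sum(r * r for r in ref)
        if den == 0.0:
            continue
        gain = sum(x * r for x, r in zip(frame, ref)) / den   # least-squares gain
        residual = sum((x - gain * r) ** 2 for x, r in zip(frame, ref))
        best = min(best, residual)
    return 10 * math.log10(energy / max(best, 1e-12))


harmonic = tones([120, 60, 40, 30], 720)       # harmonics of one fundamental
inharmonic = tones([37.3, 53.7, 71.9], 720)    # three unrelated periods

if __name__ == "__main__":
    print(f"harmonic:   {prediction_gain_db(harmonic):.1f} dB")
    print(f"inharmonic: {prediction_gain_db(inharmonic):.1f} dB")
```

The harmonic signal repeats exactly after one period, so a single lag cancels it almost completely. The inharmonic one has no single lag that lines up all three tones at once, so the gain should come out far smaller.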

Backwards-looking prediction of multiple components has the problem that they're not easily separable. For a signal with multiple strong inharmonic tones there would need to be multiple offsets. Consider the problem of a single video block which has separate components moving in different directions at once. You could try to separate the block into over-complete components and then predict those, but the separation process is computationally hard and might not be possible to do well with low latency. (Or at least we found that short-time low latency sinusoidal coding didn't appear to work so well, though we perhaps didn't try hard enough this time around, as it seemed out of the question in terms of computational cost.)

Making more complicated prediction robust against loss is also quite tricky/impossible, and virtually all low latency applications need to deal with loss (if you can retransmit you probably don't need a low latency codec!)

There are some neat things that could be done if you don't care much about robustness...  E.g. http://ieeexplore.ieee.org/xpl/freeabs_all...rnumber=5413930    But techniques that end up resulting in NxN matrix multiplies, or approximations thereof, work a heck of a lot better (complexity-wise) for separable 8x8 pixel blocks than they do for a 240-sample audio frame.


Reply #7
Try encoding real music losslessly with DPCM and see how much compression you get.  You'll see why I said it's "difficult".

This confuses me. Is not "all" music lowpass in nature (at least as a long-term statistic)? Is not DPCM practically a high-pass pre-whitening filter/low-pass predictor?

If music in general is somewhat predictable (just like the weather tends to be like the weather the day before), I would have guessed that a simple, primitive predictor would be better than nothing.
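
A quick way to see both sides of this is to look at what a first-order DPCM (predict each sample as the previous one) does to pure sines. This is only a sketch under assumptions: the 44.1 kHz sample rate and the test frequencies are arbitrary, and there is no quantizer. For a sine at angular frequency w the residual amplitude is 2*sin(w/2) times the input, so the predictor wins big at low frequencies and actually makes things worse above fs/6.

```python
import math


def dpcm_gain_db(freq_hz, fs=44100, n=4410):
    """Energy of a sine divided by the energy of its first difference, in dB."""
    x = [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]
    d = [x[i] - x[i - 1] for i in range(1, n)]   # residual when predicting the previous sample
    signal_energy = sum(v * v for v in x[1:])
    residual_energy = sum(v * v for v in d)
    return 10 * math.log10(signal_energy / residual_energy)


if __name__ == "__main__":
    for f in (100, 1000, 7350, 10000):
        print(f"{f:>6} Hz: {dpcm_gain_db(f):+.1f} dB")
```

So on a long-term low-pass source the primitive predictor is indeed better than nothing, but any significant high-frequency content eats into the gain quickly.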

-k

Reply #8
Masking is a very fuzzy thing. As Garf said, it's not masking that gets you, it's the fact that you lose coding gain.

Is it possible to say something about how much of the compression is related to pure source-coding, and how much is relying on psycho-acoustically guided lossy coding?

FLAC can do 2:1 lossless encoding, while AAC can do 10:1 or whatever perceptually "as good as lossless" encoding. Can one assume that the first 50% of the AAC compression stems from source redundancy, while the remaining factor stems from (hopefully) irrelevancy?
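
Taking the 2:1 and 10:1 figures at face value, the split works out as below. The CD-style source rate (44.1 kHz, 16 bit, stereo) and the function name are assumptions for illustration only.

```python
def split_savings(original_kbps, lossless_ratio, lossy_ratio):
    """Split the total bitrate saving into a lossless part and the lossy remainder."""
    after_lossless = original_kbps / lossless_ratio
    after_lossy = original_kbps / lossy_ratio
    return {
        "removed_as_redundancy_kbps": original_kbps - after_lossless,
        "removed_as_irrelevancy_kbps": after_lossless - after_lossy,
        "remaining_kbps": after_lossy,
        "extra_lossy_factor": lossy_ratio / lossless_ratio,
    }


cd_kbps = 44.1 * 16 * 2   # assumed source: 44.1 kHz, 16 bit, stereo = 1411.2 kbps
result = split_savings(cd_kbps, lossless_ratio=2, lossy_ratio=10)

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

In other words, half of the original bitrate goes as redundancy, and the lossy stage then has to find another factor of 5 on top of that.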

-k

Reply #9
Can one assume that the first 50% of the AAC compression stems from source redundancy, while the remaining factor stems from (hopefully) irrelevancy?


Yes, that looks correct. Another example: MPEG-4 SLS can act as a lossless AAC coder, and when it does, it achieves ratios comparable to classic lossless codecs.

Reply #10
This confuses me. Is not "all" music lowpass in nature (at least as a long-term statistic)? Is not DPCM practically a high-pass pre-whitening filter/low-pass predictor?

If music in general is somewhat predictable (just like the weather tends to be like the weather the day before), I would have guessed that a simple, primitive predictor would be better than nothing.


"Better than nothing" is still a far cry from what the codecs achieve now. The T/F transformations they use also exploit the property you mentioned, and hence temporal correlation (but seen from a frequency perspective). Getting more out of that by exploiting correlation between transformed blocks is difficult.

Reply #11
NICAM.

But it's hardly efficient!

Cheers,
David.

Reply #12
Can one assume that the first 50% of the AAC compression stems from source redundancy, while the remaining factor stems from (hopefully) irrelevancy?


Yes, that looks correct. Another example: MPEG-4 SLS can act as a lossless AAC coder, and when it does, it achieves ratios comparable to classic lossless codecs.


A bit OT, but is there any public SLS encoder available?

Reply #13
This confuses me. Is not "all" music lowpass in nature (at least as a long-term statistic)? Is not DPCM practically a high-pass pre-whitening filter/low-pass predictor?

If music in general is somewhat predictable (just like the weather tends to be like the weather the day before), I would have guessed that a simple, primitive predictor would be better than nothing.


"Better than nothing" is still a far cry from what the codecs achieve now. The T/F transformations they use also exploit the property you mentioned, and hence temporal correlation (but seen from a frequency perspective). Getting more out of that by exploiting correlation between transformed blocks is difficult.

So a more precise answer to my initial questions would perhaps be:
1. Lossless audio compression is usually possible, and will give you 2:1 or so for a substantial delay
2. Lossy audio compression is usually possible and may give you 10:1 or so for a substantial delay
3. Very low latency audio compression is usually possible but will either give very poor compression (lossless) or very poor quality:bitrate (lossy)

Reply #14
A bit OT, but is there any public SLS encoder available?


There is one in the MPEG reference sources. Those aren't free, of course.

Reply #15
3. Very low latency audio compression is usually possible but will either give very poor compression (lossless) or very poor quality:bitrate (lossy)


The last HA listening test tells a different story (see Opus).

Reply #16
3. Very low latency audio compression is usually possible but will either give very poor compression (lossless) or very poor quality:bitrate (lossy)


I think you can have efficient low-latency lossless compression. I don't see why a coder with a backwards predictor (Monkey's Audio, MPEG ALS with -z mode) wouldn't do well. As explained in the posts above, the problem is that it cannot recover well from packet loss, which tends to go hand in hand with low-latency operation.

For pure lossy codecs, I don't think there is a good way around the coding gain issues.

Reply #17
3. Very low latency audio compression is usually possible but will either give very poor compression (lossless) or very poor quality:bitrate (lossy)


The last HA listening test tells a different story (see Opus).


That depends on what you define as very low latency. Opus can work at much lower latency (5 ms) than what was used in that test (22 ms), but at a quality cost. The poster's question was what causes this tradeoff.

Reply #18
The OP asked for low delay codecs. Opus at 22 ms is by any measure a low delay codec compared to all the other contenders (AAC, Vorbis, MP3) in the test. Further, AAC-LD, for example, is officially called "low delay" for its 20 ms. That Opus can be used down to 5 ms doesn't change the fact that a low delay codec showed considerably better performance than all the large delay codecs in the latest installment.

The theoretical information usually provided about low vs. high delay coding isn't necessarily false. I just pointed out that, in practice, a state-of-the-art low delay codec can beat even finely tuned large delay implementations.

Reply #19
Masking is a very fuzzy thing. As Garf said, it's not masking that gets you, it's the fact that you lose coding gain.

"coding gain" is the variance of one block of input divided by the variance of one block of transform output, averaged over some set of input blocks?

So the difference between coding transformed blocks of e.g. 128 samples, vs coding transformed blocks of 8*128 samples is that the latter will (for typical content) be more sparse and easily coded into few bits?
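
On the definition: as far as I know, the usual textbook transform coding gain is not input variance over output variance (an orthonormal transform preserves total variance, so that ratio is 1), but the arithmetic mean of the per-coefficient variances divided by their geometric mean. A minimal sketch under assumptions: an 8-point orthonormal DCT-II stands in for a codec's transform, and seeded AR(1) noise stands in for a low-pass source; none of this is taken from a real codec.

```python
import math
import random


def dct2(block):
    """Orthonormal DCT-II of one block."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(block[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(s * math.sqrt((1 if k == 0 else 2) / n))
    return out


def coding_gain_db(samples, block_len=8):
    """Arithmetic mean over geometric mean of the per-coefficient variances."""
    blocks = [samples[i:i + block_len]
              for i in range(0, len(samples) - block_len + 1, block_len)]
    coeffs = [dct2(b) for b in blocks]
    # Mean square per coefficient index; the inputs used here are zero-mean.
    var = [sum(c[k] ** 2 for c in coeffs) / len(coeffs) for k in range(block_len)]
    arithmetic = sum(var) / block_len
    geometric = math.exp(sum(math.log(v) for v in var) / block_len)
    return 10 * math.log10(arithmetic / geometric)


def ar1(rho, n, seed=0):
    """First-order autoregressive noise: a crude stand-in for a low-pass source."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


if __name__ == "__main__":
    print(f"white noise:     {coding_gain_db(ar1(0.0, 16000)):.2f} dB")
    print(f"AR(1), rho=0.95: {coding_gain_db(ar1(0.95, 16000)):.2f} dB")
```

White noise spreads its energy evenly over the coefficients, so the gain is close to 0 dB. The correlated source piles most of its energy into a few coefficients, and the more uneven that distribution, the larger the gain. That unevenness is the "sparsity" in question.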

-k

Reply #20
"coding gain" is the variance of one block of input divided by the variance of one block of transform output, averaged over some set of input blocks?

So the difference between coding transformed blocks of e.g. 128 samples, vs coding transformed blocks of 8*128 samples is that the latter will (for typical content) be more sparse and easily coded into few bits?


I wouldn't say typical content, but rather tonal signals. Encoding a tonal signal properly requires a higher SMR for psychoacoustic reasons, so it's especially relevant to get a good coding gain there. This is visible in practice too, in the sense that low delay codecs are relatively worse on highly tonal signals and suffer less for noisier input.

The example you give compares encoding 128 samples with encoding 1024 samples, so it doesn't really make sense. If you instead compare 8*128 vs 1*1024 samples, the latter will indeed be more sparse for tonal signals. If you take a single sine wave, for the 1024 sample case you will have 1 peak with some small sidelobes/leakage and everything else 0, whereas for the 8*128 case you will have 8 peaks, with more leakage, and fewer 0s.
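
The sine-wave comparison can be checked with a small experiment. This is a sketch under assumptions: a plain Hann-windowed DFT on non-overlapping blocks rather than the overlapped MDCT a real codec would use, an arbitrary 3100 Hz tone at 48 kHz, and an arbitrary -40 dB threshold for what counts as a coefficient that has to be coded.

```python
import cmath
import math


def hann_dft_mags(block):
    """Magnitudes of bins 0..N/2 of a Hann-windowed DFT (naive O(N^2) version)."""
    n = len(block)
    windowed = [v * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, v in enumerate(block)]
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [abs(sum(windowed[i] * twiddle[(k * i) % n] for i in range(n)))
            for k in range(n // 2 + 1)]


def significant_bins(signal, block_len, floor_db=-40.0):
    """Count bins within floor_db of each block's peak, summed over all blocks."""
    total = 0
    for start in range(0, len(signal), block_len):
        mags = hann_dft_mags(signal[start:start + block_len])
        threshold = max(mags) * 10 ** (floor_db / 20)
        total += sum(1 for m in mags if m >= threshold)
    return total


fs, freq = 48000, 3100.0
sine = [math.sin(2 * math.pi * freq * i / fs) for i in range(1024)]

if __name__ == "__main__":
    print("1 x 1024:", significant_bins(sine, 1024))
    print("8 x 128: ", significant_bins(sine, 128))
```

The leakage pattern around the peak is roughly the same width in bins for either block size, so each of the eight short blocks needs about as many significant coefficients as the single long block, and the total for the same 1024 samples is several times larger.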