HydrogenAudio

Lossless Audio Compression => Lossless / Other Codecs => Topic started by: killerstorm on 2011-07-19 09:17:10

Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 09:17:10
hi

I'm considering making a "scalable to lossless" codec similar to MPEG-4 SLS (http://en.wikipedia.org/wiki/MPEG-4_SLS) but for an arbitrary lossy encoder.
I.e. you make a lossy stream with an encoder of your choice and parameters of your choice, say, LAME MP3.
Then you pass both the original file and the lossy stream to a scalable-to-lossless encoder, let's call it WaveDelta, and it encodes the "difference" between the two. This difference is, presumably, smaller than what you get from FLAC because it uses information from the lossy stream.

Then when you want to get your original file back, you feed the lossy stream and the difference to the WaveDelta decoder and it recovers the original perfectly.
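A minimal sketch of that round trip, assuming sample-aligned decoder output; WaveDelta doesn't exist yet, so `encode_correction`/`decode_lossless` are hypothetical placeholders, and a real tool would entropy-code the correction rather than store raw deltas:

```python
# Hypothetical WaveDelta round trip: correction = original - decoded lossy,
# and adding it back recovers the original bit-exactly.
import numpy as np

def encode_correction(original, lossy_decoded):
    # The "difference" is computed against the *decoded* lossy stream.
    # int32 avoids wrap-around on 16-bit PCM.
    return original.astype(np.int32) - lossy_decoded.astype(np.int32)

def decode_lossless(lossy_decoded, correction):
    # Adding the correction back recovers the original exactly.
    return (lossy_decoded.astype(np.int32) + correction).astype(np.int16)

original = np.array([100, -200, 3000, -4000], dtype=np.int16)
lossy = np.array([98, -195, 2990, -4011], dtype=np.int16)  # decoder output
corr = encode_correction(original, lossy)
assert np.array_equal(decode_lossless(lossy, corr), original)
```

The catch, raised later in the thread, is that `lossy` must be byte-identical on every machine, i.e. the lossy decoder must be deterministic.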

Use case 1: You listen mostly to MP3 or AAC on a variety of devices, but you want to be able to recover the original files "just in case" without wasting storage space on FLACs.
Use case 2: Distribution via torrents: one can include both MP3s and WaveDelta files in a torrent. Those who want MP3s will download only the MP3s. Those who want lossless will download both and extract into their lossless format of choice. Some people might want to preview the content by downloading the MP3s first, then choose to get lossless if they like the music (incurring little overhead!).

The difference from 'dual stream' codecs like WavPack Hybrid or OptimFROG DualStream is that you get a lossy stream of your choice, say, 320 kbps MP3. This works better both for distribution (it is good when the distribution format is familiar and doesn't require additional software) and archival (you can play MP3 on a variety of devices).

The difference from MPEG-4 SLS: 1. Any lossy bitstream, not just some form of AAC. 2. Not as proprietary.

I don't know whether it will work (some people even say that it is impossible), but so far I've got pretty good preliminary results with LAME MP3 as the lossy stream (not an encoder yet, just a proof of concept of how it can be done).

So, is there any interest in a technology like this? I'm not sure if I should investigate it further...
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-19 11:24:51
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully. Your technique can't do that.

You will probably also find that the FLAC encoded difference isn't that much smaller than the original FLAC. At which point you might as well ship a normal FLAC+MP3.
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 11:54:22
Quote
You will probably also find that the FLAC encoded difference isn't that much smaller than the original FLAC. At which point you might as well ship a normal FLAC+MP3.


I'm not going to encode the difference using FLAC; I'm going to implement a very specialized codec for this purpose. And it doesn't actually compute a difference -- I mentioned that only as a metaphor. (Sorry if it was confusing, I just tried to describe it in simple terms.)

A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully. Your technique can't do that.


I can do this too (although I cannot guarantee high quality), but I don't think it is very useful.

It might make sense for different sample rates/bit depths. For example, if the original file is 96 kHz/24 bit, it is possible to resample it to 48 kHz or 44.1 kHz at 16-bit resolution and then transmit three parts:

1. lossy MP3
2. correction to 44.1 kHz/16 bit resampled file
3. correction to 96 kHz/24 bit file

Only those who have good DACs will be interested in the third part; for everyone else a good resampling will be ideal.

But I see no value in having different qualities. Usually MP3 at a reasonable bitrate is already good enough, and there is no benefit to using a correction file unless you go full lossless.

It would make sense for video, which is still problematic from a bandwidth perspective, but not for audio.
Title: Flexible 'Scalable to Lossless'?
Post by: odyssey on 2011-07-19 12:29:32
It's very interesting indeed! Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.
Title: Flexible 'Scalable to Lossless'?
Post by: 2Bdecided on 2011-07-19 12:39:52
It's a fun project, and for that reason alone you may choose to do it, but...

This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.

What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.

Cheers,
David.

Title: Flexible 'Scalable to Lossless'?
Post by: saratoga on 2011-07-19 14:24:19
Fun idea.  It won't work, but fun idea
Title: Flexible 'Scalable to Lossless'?
Post by: menno on 2011-07-19 14:35:32
Allowing any lossy bitstream as a basis will not work, because lossy decoders are usually not deterministic. You could only use the exact same decoder that you also used to create the difference file that you feed to the encoder of the residue data. This is why a SLS decoder doesn't use a regular AAC decoder for the lossy stream, but a special deterministic one that isn't even capable of using all the AAC bitstream data.
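The determinism point can be shown with a toy example: two decoders that agree on everything except the rounding rule used when converting to integer PCM will break the scheme, because the correction was built against one specific decoder's output. `decoder_a`/`decoder_b` here are contrived stand-ins, not real MP3 decoders:

```python
# Two "decoders" differing only in how they round exact halves.
import numpy as np

def decoder_a(x):
    # rounds halves away from zero (for positive input): floor(x + 0.5)
    return np.floor(x + 0.5).astype(np.int32)

def decoder_b(x):
    # rounds halves to even ("banker's rounding"), as np.rint does
    return np.rint(x).astype(np.int32)

samples = np.array([1.5, 2.5, 3.5])          # values that hit the tie cases
original = np.array([2, 3, 4], dtype=np.int32)

correction = original - decoder_a(samples)   # built against decoder A's output
assert np.array_equal(decoder_a(samples) + correction, original)      # works
assert not np.array_equal(decoder_b(samples) + correction, original)  # breaks
```

Any single-sample mismatch in the decoded lossy stream corrupts the "lossless" reconstruction, which is why SLS pins down one deterministic base decoder.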

It's a fun project, and for that reason alone you may choose to do it, but...

This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.

What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.


MPEG-4 SLS is capable of compression rates very similar to FLAC -5.
The reason lossy codecs are not generally used as a basis for lossless compression is their complexity compared to most common lossless codecs. Why use something complex (and patented) when you can use something very simple and get the same results?
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 15:01:40
I think you presume wrong. If mp3 (or any other psychoacoustic-based lossy form) was an efficient basis for lossless compression, you'd find it at the heart of the most efficient lossless codecs. You don't. Ever. Which tells you all you need to know.


Yes, I'm pretty sure that a lossy codec is not an efficient basis for lossless compression -- nobody doubts that there will be overhead compared to the lossless-only case -- but the task is to salvage whatever information is available in the lossy form.


Quote
What you'll find is that it's quite difficult to get the correction file alone to be usefully smaller than good lossless encoding of the original.


It looks like a difficult problem from DSP perspective.

But luckily my background is not DSP but general applied math, so instead of thinking about signal processing I think about how to minimize the Frobenius norm of a matrix, or something like that.

From an information-theoretic point of view, you'll have no compression gain from having access to the lossy stream if and only if the mutual information between the two signals is exactly zero, that is, they are entirely independent random variables. And that would be ridiculous.

As I said, I already have some results, quite surprising, actually: estimation shows that 64 kbps mono MP3 -- 1.45 bits per sample -- helps to eliminate about 1.45 bits of entropy per sample in the encoding of the lossless signal, so pretty much every bit is used. (It's worth noting that I've started with a somewhat suboptimal lossless encoding scheme, but there is room for improvement.)
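For reference, the 1.45 figure is just the bitrate divided by the sample rate, assuming 44.1 kHz mono:

```python
# Bits of lossy stream available per PCM sample for 64 kbps mono at 44.1 kHz.
bitrate = 64_000              # bits per second
sample_rate = 44_100          # samples per second
bits_per_sample = bitrate / sample_rate
print(round(bits_per_sample, 2))   # 1.45
```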
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 15:15:04
Allowing any lossy bitstream as basis will not work, because lossy decoders are usually not deterministic. You could only use the exact same decoder that you also used to create the difference file that you feed to the encoder of the residue data. This is why a SLS decoder doesn't use a regular AAC decoder for the lossy stream, but a special deterministic one that isn't even capable of using all the AAC bitstream data.


Good point. My plan is to make a general-purpose tool, and then it is up to other people to find a way to combine it with a deterministic lossy decoder.

MPEG-4 SLS is capable of compression rates very similar to FLAC -5.
Why lossy codecs are not generally used as basis for lossless compression is the complexity compared to most common lossless codecs. Why use something complex (and patented) when you can use something very simple and get the same results?


That's how I see it too.
Title: Flexible 'Scalable to Lossless'?
Post by: saratoga on 2011-07-19 15:17:37
For information-theoretic point of view you'll have no compression gain from having access to lossy stream if and only if mutual information between two signals is exactly zero, that is, they are entirely independent random variables. And that would be ridiculous.


That's actually not right.  You'll get zero (or more likely negative) gain if the mutual information is less than the added noise from the lossy compression step.  Remember, lossy compression adds a lot of quantization noise, which is essentially random and therefore nearly compressible.  Good luck storing a way to correct that efficiently in your correction file.
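A rough analogy for "quantization noise is essentially random" (a zlib toy, not an audio codec): a general-purpose compressor barely shrinks pseudo-random bytes, while a highly structured signal collapses.

```python
# Random bytes resist compression; structured bytes compress well.
import zlib, random

random.seed(0)
noise = bytes(random.getrandbits(8) for _ in range(10_000))
structured = bytes(i % 256 for i in range(10_000))

print(len(zlib.compress(noise, 9)))        # close to 10000: no real gain
print(len(zlib.compress(structured, 9)))   # tiny: structure compresses well
```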

Title: Flexible 'Scalable to Lossless'?
Post by: pdq on 2011-07-19 15:56:21
You mean nearly INcompressible.
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-19 16:03:13
I'm not going to encode difference using FLAC, I'm going to implement a very specialized codec for this purpose.


Oh ok. That will be rather difficult, but very interesting if you get it working well.

Quote
I can do this too (although I cannot guarantee high quality), but I don't think it is very useful.

It might make sense for different samplerates/bitrates.


SLS can do it with bit-granularity with almost perfect quality scaling. Not quite the same thing.

Quote
But I see no value in having different qualities.


Ain't that the point of your idea?
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-19 16:04:18
It's very interesting indeed! Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.


Title: Flexible 'Scalable to Lossless'?
Post by: saratoga on 2011-07-19 16:13:04
You mean nearly INcompressible.


Technically they mean about the same thing
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 16:23:02
That's actually not right.  You'll get zero (or more likely negative) gain if the mutual information is less than the added noise from the lossy compression step.  Remember, lossy compression adds a lot of quantization noise, which is essentially random and therefore nearly compressible.  Good luck storing a way to correct that efficiently in your correction file.


Note that I'm not even aiming to make lossy+correction compression that is better than pure lossless. The goal here is to make size(lossy)+size(correction) < size(lossy)+size(pure lossless), i.e. size(correction) < size(pure lossless).

As for quantization noise, it just reduces mutual information, so there is no need to take it into account separately.

E.g. suppose A is the signal to be encoded (a random variable) and B is a lossy representation of it, B = A+X, where X is noise (a random variable independent of A). Then the mutual information I(A;B) = H(B) - H(B|A) < H(B), since H(B|A) = H(X) > 0: you cannot get B knowing only A but not X. So not all of B's entropy goes into mutual information. Also, the joint entropy H(A,B) = H(A)+H(X) > H(A), which means it takes more bits to encode both A and B in the presence of noise (no shit, Sherlock!).
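As a sanity check on these identities, here is a toy numeric example (A uniform on four values, X a fair coin, B = A + X) computed from exact discrete distributions; it confirms 0 < I(A;B) < H(B):

```python
# I(A;B) = H(B) - H(B|A), with H(B|A) = H(X) for B = A + X, X independent.
from collections import Counter
from itertools import product
from math import log2

def H(dist):
    # Shannon entropy in bits of a {value: probability} distribution.
    return -sum(p * log2(p) for p in dist.values() if p > 0)

A_vals, X_vals = range(4), range(2)      # A uniform on {0..3}, X fair coin
pB = Counter()
for a, x in product(A_vals, X_vals):
    pB[a + x] += 1 / (len(A_vals) * len(X_vals))

H_B = H(pB)            # 2.25 bits
H_B_given_A = 1.0      # = H(X), one fair coin flip
I_AB = H_B - H_B_given_A
print(I_AB)            # 1.25 bits: positive, but strictly less than H(B)
```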
Title: Flexible 'Scalable to Lossless'?
Post by: saratoga on 2011-07-19 16:36:15
Note that I'm not even aiming to make lossy+correction compression that is better than pure lossless. The goal here is to make size(lossy)+size(correction) < size(lossy)+size(pure lossless), i.e. size(correction) < size(pure lossless).


Really?

As I said, I already have some results, quite surprising, actually: estimation shows that 64 kbps mono MP3 -- 1.45 bits per sample -- helps to eliminate about 1.45 bits of entropy per sample in the encoding of the lossless signal, so pretty much every bit is used. (It's worth noting that I've started with a somewhat suboptimal lossless encoding scheme, but there is room for improvement.)


Seems to me you're claiming you can do (and already have done) as well as pure lossless coding, if pretty much every bit is used

As for quantization noise, it just reduces mutual information, so there is no need to take it into account separately.


Ok, I see what you're saying.  However, if you're not actually aiming to do better than existing formats, isn't this pretty much the same as mp3HD?  In that case it's basically MP3 with a specifically defined deterministic decoder and a correction file.
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 17:26:25
Oh ok. That will be rather difficult, but very interesting if you get it working well.


Well, I hope I can get something working quite easily (judging from the kinda promising results), but polishing it into a usable state (e.g. optimizing) would be the hard part...

Quote
Quote
But I see no value in having different qualities.

Ain't that the point of your idea?


My point is that it is useful to have two available quality levels: 1) lossy and 2) lossless. (Or more with lossless at different sample rates.)

But having 1) lossy, 2) less lossy, 3) lossless is not so useful. I don't see use cases where people would want 'almost lossless' audio.

Well, it might be useful to have different bitrates for the lossy part -- e.g. 64 kbps for slow connections, 128 kbps for average ones, 200 kbps for fast ones. But I think this is a topic for lossy encoding, not hybrid/lossless. From what I've read in the SLS paper, it isn't particularly hard to implement, as the encoder already has all the information, so the lack of implementations means there is no need for it.
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-19 17:58:53
Seems to me you're claiming you can do (and already have done) as well as pure lossless coding, if pretty much every bit is used


As I've mentioned, it is on top of a quite suboptimal coding scheme: a block-transform encoder which makes no use of context (previous blocks).
I believe the information available in the lossy bitstream overlaps with the information from previous blocks, so a better encoder which used information from previous blocks would make less use of the lossy bitstream, and some information would be wasted.

Ok, I see what you're saying.  However, if you're not actually aiming to do better than existing formats, isn't this pretty much the same as mp3HD?  In that case it's basically MP3 with a specifically defined deterministic decoder and a correction file.


Oh, I hadn't heard about mp3HD. Yes, I guess it will be somewhat similar, although I'm going to make a more flexible tool.

I would love to make it better than existing formats -- in fact that's what I was doing for a while -- but it is hard.

Then again it depends on what you mean by existing formats. It would be much, much harder to compete with La than with FLAC.
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-19 18:23:58
Oh, I hadn't heard about mp3HD. Yes, I guess it will be somewhat similar, although I'm going to make a more flexible tool.


mp3HD is the same as MPEG-4 SLS but applied to mp3 instead of AAC. A bit like mp3PRO and HE-AAC.
Title: Flexible 'Scalable to Lossless'?
Post by: _m²_ on 2011-07-19 18:36:25
Interesting. I use Ogg for my portable player and OFR for archival, and it would be nice to have something that has comparable strength (to the sum of Ogg+OFR), the player compatibility of Ogg, and that frees me from the burden of maintaining two data sets.
Title: Flexible 'Scalable to Lossless'?
Post by: SebastianG on 2011-07-24 10:21:20
Quote
This difference is, presumably, smaller than what you get from FLAC because it uses information from lossy stream.
[...]
Goal here is to make
size(lossy)+size(correction) < size(lossy)+size(pure lossless),
i.e.
size(correction) < size(pure lossless).

But that's nothing special, is it? Your goal should be
size(lossy)+size(your_correction) < size(lossy)+size(wavpack_delta)
or something like this.
(where wavpack_delta is simply a WavPack-compressed difference signal).

I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.

But then again, I don't think that you could win a lot by doing something like this. Also, one would have to standardize lossy decoders to be all bit-exact. All in all, I don't think that this endeavour is worth the hassle. Sure, you will learn a thing or two while trying. But in the end, you won't have a solution with convincing features compared to simply WavPacking the delta, for example.

Based on the things you have been hinting at here, at Usenet (comp.compression) and in private email, I'd say that you're still in very early experimental stages and not going to leave this stage anytime soon. From what I can tell, you still have a lot to learn.

Quote
It looks like a difficult problem from DSP perspective.

But luckily my background is not DSP but general applied math

Luckily?
I'd say that the lack of a DSP background is to your disadvantage.

Cheers!
SG
Title: Flexible 'Scalable to Lossless'?
Post by: C.R.Helmrich on 2011-07-24 14:07:33
Although, I wonder why SLS isn't more widespread. It's almost impossible to find an encoder and any support for decoding of it.

It takes time to reach the market.  But FYI, MPEG-4 SLS = HD-AAC. http://forums.winamp.com/showthread.php?t=332010 (http://forums.winamp.com/showthread.php?t=332010)
Quote
HD-AAC Encoding is not included in this release but is still planned for a future release.


Quote from: Garf
2. This capability is mostly useful for broadcasting, not end-users
3. Any AAC decoder will decode (the AAC part of) SLS

Isn't that why it's meant for end-users? So you can play your lossless MP4 files on e.g. a mobile player supporting only lossy MP4?

Chris
Title: Flexible 'Scalable to Lossless'?
Post by: Nick.C on 2011-07-24 14:17:58
Isn't that why it's meant for end-users? So you can play your lossless MP4 files on e.g. a mobile player supporting only lossy MP4?
I don't think that most users would want to be carrying around high-bitrate lossless files when the player can only interpret the lossy core. If some quick "correction stripper" were available to send only the lossy part to the portable player, that would be advantageous.
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-24 16:18:12
I don't think that most users would want to be carrying around high-bitrate lossless files when the player can only interpret the lossy core. If some quick "correction stripper" were available to send only the lossy part to the portable player, that would be advantageous.


That functionality is available.

But much of the most advanced part of SLS, namely the bitrate scalability down to actual bits per second, isn't something most users will care about. Replacing a lossless FLAC + MP3 by a single MP4 might appeal to some people, but I won't make predictions about the uptake.
Title: Flexible 'Scalable to Lossless'?
Post by: _m²_ on 2011-07-24 16:44:37
@SebastianG:
I don't think that your stricter rules are OK. I mean that even if it's worse than wavpack lossy + wavpack delta, it would still be worthwhile because hardware support for wavpack is low.
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-24 17:17:23
@SebastianG:
I don't think that your stricter rules are OK. I mean that even if it's worse than wavpack lossy + wavpack delta, it would still be worthwhile because hardware support for wavpack is low.


Replace wavpack lossy with lossyWAV and it makes perfect sense.
Title: Flexible 'Scalable to Lossless'?
Post by: _m²_ on 2011-07-24 19:18:42
I don't get it.
Does lossyWAV come with a hybrid mode?
Title: Flexible 'Scalable to Lossless'?
Post by: saratoga on 2011-07-24 20:21:25
I don't get it.
Does lossyWAV come with a hybrid mode?


If you subtract the lossyWAV file from the original file, you get a correction file that can be used to undo the lossyWAV step.
Title: Flexible 'Scalable to Lossless'?
Post by: Nick.C on 2011-07-24 20:32:38
.... or you can just select to create a correction file at the same time as processing.
Title: Flexible 'Scalable to Lossless'?
Post by: _m²_ on 2011-07-24 21:26:55
OK, thanks for the info. I'm going to test it at some point, sounds interesting.
http://wiki.hydrogenaudio.org/index.php?title=LossyWAV (http://wiki.hydrogenaudio.org/index.php?title=LossyWAV) could use some update.
Title: Flexible 'Scalable to Lossless'?
Post by: Woodinville on 2011-07-25 05:57:40
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-25 06:38:09
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.
Title: Flexible 'Scalable to Lossless'?
Post by: Woodinville on 2011-07-25 09:29:15
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.


Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates: perceptual scaling (most important perceptual information first) or simple information-theoretic scaling? And, if so, which kind of scaling?
Title: Flexible 'Scalable to Lossless'?
Post by: hellokeith on 2011-07-25 09:55:54
Killerstorm,

Sounds like a cool idea.  Go for it.

It's not like HA is raging daily with new ideas and development (no offense to those who are).
Title: Flexible 'Scalable to Lossless'?
Post by: SebastianG on 2011-07-25 10:03:55
@SebastianG:
I don't think that your stricter rules are OK.

I think you misunderstood what I was trying to say. WavPack was just an example of a lossless audio encoder. I did not mention it because of its "hybrid" feature. In both cases, "lossy" was supposed to be the exact same stream, i.e. an mp3 stream. So...

given an mp3 stream, for example, the goal should be to create something that is smaller than wavpack_encode(mp3_decode(mp3_stream)-original_wav) but still allows us -- in combination with the mp3_stream -- to reconstruct the original PCM signal.

Why? Because wavpack_encode(mp3_decode(mp3_stream)-original_wav) is already possible today and also allows us to reconstruct the original. We can consider this a baseline that the OP has to improve upon. Otherwise, I'd simply say: why bother? (No bang for the buck.)

Still, as I said here and elsewhere, I don't see this going anywhere for many reasons.

SG
Title: Flexible 'Scalable to Lossless'?
Post by: SebastianG on 2011-07-25 10:14:50
Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates, perceptual scaling ( most important perceptual information) or simple information-theoretic scaling?  And, if so, which kind of scaling?

The only thing I know about SLS is that it uses the partially decoded AAC base layer as a prediction for the intMDCT samples. Unfortunately, that is too little to be able to answer your riddle. I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling", since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently. If I had to decide which approach to aim for, I'd aim for the first one because it makes a lot of sense to me. But maybe it's not practical/possible.

SG

(intMDCT = bijective integer-to-integer MDCT approximation)
Title: Flexible 'Scalable to Lossless'?
Post by: Garf on 2011-07-25 10:25:50
I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling", since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently.


It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-25 14:55:46
Quote
size(correction) < size(pure lossless).

But that's nothing special, is it?


Yes, but even a very simple tool which decodes the lossy stream, aligns samples, computes the delta and calls another lossless compressor might be practically useful (unlike the abstract idea that it is doable).

Your goal should be
size(lossy)+size(your_correction) < size(lossy)+size(wavpack_delta)
or something like this.
(where wackpack_delta is simply a wavpack-compressed difference signal).


Good point.

Quote
I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.


Can't you get more or less the same information by analyzing the lossy waveform? Well, maybe a format-specific analyzer can extract more useful information, but it would be more complex and not as flexible.

Quote
Based on the things you have been hinting at here, at Usenet (comp.compression) and in private email, I'd say that you're still in very early experimental stages and not going to leave this stage anytime soon. From what I can tell, you still have a lot to learn.


Well, I'm trying different ideas. But I don't need them all before I make something which satisfies the criterion above.
Title: Flexible 'Scalable to Lossless'?
Post by: killerstorm on 2011-07-25 15:36:15
The only thing I know about SLS is that it uses the partially decoded AAC base layer as a prediction for the intMDCT samples.


They use the same transform (intMDCT) for both the lossy and lossless parts (I guess the bijective integer approximation is close enough), so prediction boils down to taking quantization into account.

You're right about the rest, here's a quote from the SLS paper:

Quote
In order to achieve the desirable scalability in perceptual quality, MPEG-4 SLS adopts a rather straightforward perceptual embedding coding principle, which is illustrated in Figure 7. It can be seen that the bit-plane coding process is started from the most significant bit-planes (i.e. the first non-zero bit-planes) of all the sfb, and progressively moves to lower bit-planes after coding the current for all sfb. Consequently, during this process, the energy of the quantization noise of each sfb is gradually reduced by the same amount. As a result, the spectral shape of the quantization noise, which has been perceptually optimized by the core AAC encoder, is preserved during the bit-plane coding process.
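The scan the paper describes can be mimicked in a few lines: per-band values are emitted plane by plane from the most significant non-zero bit-plane down, so truncating the stream lowers every band by the same power of two. The integers below are made up for illustration; real SLS operates on quantized intMDCT spectra with arithmetic coding on top.

```python
# Toy bit-plane coding across scalefactor bands (sfb), MSB plane first.
def bitplanes(band_values):
    top = max(v.bit_length() for v in band_values) - 1
    return [[(v >> p) & 1 for v in band_values] for p in range(top, -1, -1)]

def reconstruct(planes):
    vals = [0] * len(planes[0])
    for plane in planes:
        vals = [(v << 1) | b for v, b in zip(vals, plane)]
    return vals

bands = [5, 12, 3, 9]                 # toy per-sfb magnitudes
planes = bitplanes(bands)
assert reconstruct(planes) == bands                          # all planes -> lossless
assert reconstruct(planes[:-1]) == [v >> 1 for v in bands]   # uniform truncation
```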

Title: Flexible 'Scalable to Lossless'?
Post by: SebastianG on 2011-07-25 17:59:18
Quote

I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.

Can't you get more-or-less same information by analyzing lossy waveform?

No (*)

(* unless you also implement a psychoacoustic model that deterministically estimates the masking thresholds. But that's way beyond practical and just a bad approximation to the information that is already available in the compressed stream )
Title: Flexible 'Scalable to Lossless'?
Post by: Woodinville on 2011-07-27 04:05:28
It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.


The first half is the case. The second half is actually not so important; the base entropy coding in AAC is pretty good, except for scalefactors when the L-R signal is very small.