Topic: Flexible 'Scalable to Lossless'?

Flexible 'Scalable to Lossless'?

Reply #25
@SebastianG:
I don't think that your stricter rules are OK. I mean that even if it's worse than wavpack lossy + wavpack delta, it would still be worthwhile because hardware support for wavpack is low.


Replace wavpack lossy with lossyWAV and it makes perfect sense.

Flexible 'Scalable to Lossless'?

Reply #26
I don't get it.
Does lossyWAV come with a hybrid mode?

Flexible 'Scalable to Lossless'?

Reply #27
I don't get it.
Does lossyWAV come with a hybrid mode?


If you subtract the lossyWAV file from the original file, you get a correction file that can be used to undo the lossyWAV step.
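
A minimal numeric illustration of that idea, assuming both files have already been decoded to integer sample arrays of the same length (the values here are placeholders, not lossyWAV output):

```python
import numpy as np

original = np.array([1000, -2000, 3000, 17], dtype=np.int32)  # original PCM samples
lossy    = np.array([ 992, -2048, 3008,  0], dtype=np.int32)  # lossyWAV-processed samples

correction = original - lossy     # the "correction" file, stored losslessly
restored   = lossy + correction   # decoding: lossy + correction == original
assert np.array_equal(restored, original)
```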

Flexible 'Scalable to Lossless'?

Reply #28
.... or you can just choose to create a correction file at the same time as processing.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)


Flexible 'Scalable to Lossless'?

Reply #30
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?
-----
J. D. (jj) Johnston

Flexible 'Scalable to Lossless'?

Reply #31
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.

Flexible 'Scalable to Lossless'?

Reply #32
A major advantage that SLS has over your technique is that it is scalable (hence the Scalable to LossLess name). You can strip off bits from the SLS stream and quality degrades gracefully.


Well, it is not so true that the degradation is perceptual at low rates, now, is it?


I'm not sure what exactly you are trying to say here.


Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates: perceptual scaling (most important perceptual information first) or simple information-theoretic scaling? And, if so, which kind of scaling?
-----
J. D. (jj) Johnston

Flexible 'Scalable to Lossless'?

Reply #33
Killerstorm,

Sounds like a cool idea.  Go for it.

It's not like HA is raging daily with new ideas and development (no offense to those who are).

Flexible 'Scalable to Lossless'?

Reply #34
@SebastianG:
I don't think that your stricter rules are OK.

I think you misunderstood what I was trying to say. WavPack was just an example of a lossless audio encoder. I did not mention it because of its "hybrid" feature. In both cases, "lossy" was supposed to be the exact same stream, i.e. an mp3 stream. So...

given an mp3 stream, for example, the goal should be to create something that is smaller than wavpack_encode(mp3_decode(mp3_stream)-original_wav) but still allows us -- in combination with the mp3_stream -- to reconstruct the original PCM signal.

Why? Because wavpack_encode(mp3_decode(mp3_stream)-original_wav) is already possible today and also allows us to reconstruct the original. We can consider this as a baseline that the OP has to improve upon. Otherwise, I'd simply say: Why bother? (no bang for the buck).
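
For concreteness, here is a rough sketch of that baseline, assuming 16-bit PCM throughout, ffmpeg as the MP3 decoder and the wavpack command-line encoder on PATH; the file names and the crude truncation-based alignment are illustrative assumptions, not part of SG's post:

```python
import os
import subprocess
import wave

import numpy as np

def read_pcm16(path):
    """Read a 16-bit PCM WAV file into an int32 array of interleaved samples."""
    with wave.open(path, "rb") as w:
        params = w.getparams()
        frames = w.readframes(w.getnframes())
    return np.frombuffer(frames, dtype="<i2").astype(np.int32), params

# 1. mp3_decode(mp3_stream): decode the lossy stream back to PCM.
subprocess.run(["ffmpeg", "-y", "-i", "input.mp3", "decoded.wav"], check=True)

original, params = read_pcm16("original.wav")
decoded, _ = read_pcm16("decoded.wav")
n = min(len(original), len(decoded))        # crude alignment: just truncate

# 2. Difference signal, wrapped to 16 bits so it stays losslessly invertible:
#    original == (decoded + delta) mod 2**16.
delta = ((original[:n] - decoded[:n]) & 0xFFFF).astype(np.uint16).view(np.int16)

with wave.open("delta.wav", "wb") as w:
    w.setparams(params)
    w.writeframes(delta.astype("<i2").tobytes())

# 3. wavpack_encode(delta): this, plus the mp3 itself, is the size to beat.
subprocess.run(["wavpack", "delta.wav", "-o", "delta.wv"], check=True)

baseline = os.path.getsize("input.mp3") + os.path.getsize("delta.wv")
print("size(mp3) + size(wavpack_delta) =", baseline, "bytes")
```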

Still, as I said here and elsewhere, I don't see this going anywhere for many reasons.

SG

Flexible 'Scalable to Lossless'?

Reply #35
Consider the difference between a perceptual codec and a lossless codec that does not understand perception. Now, what does SLS do at low rates: perceptual scaling (most important perceptual information first) or simple information-theoretic scaling? And, if so, which kind of scaling?

The only thing I know about SLS is that it uses the partially decoded AAC base layer as prediction for the intMDCT samples. Unfortunately, that is too little to be able to answer your riddle. I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling" since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently. If I had to decide which approach to aim for, I'd aim for the first one because it makes a lot of sense to me. But maybe it's not practical/possible.

SG

(intMDCT = bijective integer-to-integer MDCT approximation)

Flexible 'Scalable to Lossless'?

Reply #36
I would have expected the SLS layers to uniformly (across different frequencies) increase the signal-to-noise ratios. In that case, I would speak of "perceptual scaling" since the headroom between quantization noise and masking threshold uniformly increases. But now that you mention this, I can imagine that SLS works differently.


It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.

Flexible 'Scalable to Lossless'?

Reply #37
Quote
size(correction) < size(pure lossless).

But that's nothing special, is it?


Yes, but even a very simple tool which decodes the lossy file, aligns samples, computes the delta and calls another lossless compressor might be practically useful (as opposed to the abstract knowledge that it is doable).

Your goal should be
size(lossy)+size(your_correction) < size(lossy)+size(wavpack_delta)
or something like this.
(where wavpack_delta is simply a wavpack-compressed difference signal).


Good point.

Quote
I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.


Can't you get more-or-less the same information by analyzing the lossy waveform? Well, maybe a format-specific analyzer can extract more useful information, but it would be more complex and not as flexible.

Quote
Based on the things you have been hinting at here, at Usenet (comp.compression) and in private email, I'd say that you're still in very early experimental stages and not going to leave this stage anytime soon. From what I can tell, you still have a lot to learn.


Well, I'm trying different ideas. But I don't need them all before I make something which satisfies the criterion above.

 
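As a toy illustration of the LPC-based correction idea quoted above, here is a short numpy sketch that fits a predictor to a made-up delta signal with Levinson-Durbin and whitens it; the predictor order, the synthetic signal and the function names are my own assumptions, not anything from the thread:

```python
import numpy as np

def lpc_levinson(x, order):
    """Fit LPC coefficients a[0..order] (a[0] = 1) to x via Levinson-Durbin."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]     # autocorrelation r[0..]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                                   # reflection coefficient
        a[1:i + 1] += k * a[i - 1::-1]                   # update a[1..i]
        err *= 1.0 - k * k                               # prediction-error power
    return a, err

def whiten(x, a):
    """Prediction error e[n] = sum_k a[k] * x[n-k]; this is what would get entropy-coded."""
    return np.convolve(x, a)[:len(x)]

# Stand-in for the real "delta noise": spectrally shaped random noise.
rng = np.random.default_rng(0)
delta = np.convolve(rng.normal(scale=30.0, size=4100), [1.0, 0.9, 0.4])[:4096]

a, resid_power = lpc_levinson(delta, order=8)
residual = whiten(delta, a)

# The correction layer would store the (quantized) coefficients a[1:], the
# residual power, and an entropy-coded residual instead of the raw delta.
print("delta variance   :", delta.var())
print("residual variance:", residual.var())   # noticeably smaller when delta is shaped
```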

Flexible 'Scalable to Lossless'?

Reply #38
The only thing I know about SLS is that it uses the partially decoded AAC base layer as prediction for the intMDCT samples.


They use the same transform (intMDCT) for both the lossy and lossless parts (I guess the bijective integer approximation is close enough), so prediction boils down to taking quantization into account.

You're right about the rest; here's a quote from the SLS paper:

Quote
In order to achieve the desirable scalability in perceptual quality, MPEG-4 SLS adopts a rather straightforward perceptual embedding coding principle, which is illustrated in Figure 7. It can be seen that bit-plane coding process is started from the most significant bit-planes (i.e. the first non zero bit-planes) of all the sfb, and progressively moves to lower bit-planes after coding the current for all sfb. Consequently, during this process, the energy of the quantization noise of each sfb is gradually reduced by the same amount. As a result, the spectral shape of the quantization noise, which has been perceptually optimized by the core AAC encoder, is preserved during bit-plane coding process.


Flexible 'Scalable to Lossless'?

Reply #39
Quote

I think it's possible. I would try to estimate the temporal and spectral shape of the "delta noise" based on the information you can find in the lossy stream (scale factors and code books for MP3, for example) in order to reduce the amount of required side information in the "correction layer". Side information could be prediction coefficients and prediction residual power for an LPC-based coder.

Can't you get more-or-less the same information by analyzing the lossy waveform?

No (*)

(* unless you also implement a psychoacoustic model that deterministically estimates the masking thresholds. But that's way beyond practical and just a bad approximation to the information that is already available in the compressed stream.)

Flexible 'Scalable to Lossless'?

Reply #40
It's been a long time since I looked at the spec, but I do think it works as you describe. At very low rates I can imagine the masking threshold being far enough off that the base codec doesn't follow it very well and so the SLS bits added don't increase quality as much as one would hope, but it's still quite efficient. Even more so because the actual entropy coding should be better than that in the AAC base layer.


The first half is the case. The second half is actually not so important; the base entropy coding in AAC is pretty good, except for scalefactors when the L-R signal is very small.
-----
J. D. (jj) Johnston