Near-lossless / lossy FLAC

Reply #75 – 2007-06-15 19:10:44

Three rather unrelated but still on-topic comments:

(1) I'd like to note that it's not only the "frame size" that should match. This preprocessor and any lossless encoder exploiting zeroed LSBs should be in perfect sync (not only the same frame sizes but the same frame boundary positions).

(2) It's nice to have those isolated tools ("simplifier" and lossless encoder) but this also limits the performance. So one should either go for a combined tool with variable-length blocks or a modified lossless encoder which is smart enough to detect varying "wasted_bits" and partition the stream accordingly.

(3) Here's another technical thought which might be interesting for Thomas in case he wants to add lossy support to TAK:
Selecting "wasted_bits" to be an integer allows an encoder to control the signal-to-noise ratio in steps of 6 dB only. Compared to other lossy codecs (MP3, AAC control the SNR in steps of roughly 1.1 dB = 1.5*(3/4) dB) this 6 dB step size is quite large. This is an old idea of mine of how to get more resolution: Make it probabilistic. You can store in each frame or subframe (you might want to allow changing the resolution within a frame) the information "wasted_bits = x with probability p and x+1 with probability (1-p)" and use the same pseudo-random number generator in encoder and decoder for deciding the "wasted_bits" value per sample. Also you should think about generating the actual "wasted bits" via this RNG instead of zeroing them. This would be equivalent to subtractive dithering and avoids nonlinear distortions. Entropy coding might be a bit more complicated, though.
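
To put rough numbers on the step sizes: a minimal sketch, assuming ideal uniform quantization noise (power proportional to the square of the step size) and independent per-sample choices. The function name is made up for illustration and is not part of any codec.

```python
import math

def noise_increase_db(p: float) -> float:
    """Noise power in dB, relative to always wasting x bits, when each sample
    wastes x bits with probability p and x+1 bits with probability 1-p.

    Assumes noise power is proportional to step_size**2, so one more wasted
    bit (a doubled step) quadruples the noise power.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return 10.0 * math.log10(p * 1.0 + (1.0 - p) * 4.0)

# p = 1 is the plain "x bits" case, p = 0 is a full extra bit (about 6.02 dB);
# everything in between fills the gap.
for p in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(f"p = {p:4.2f}: +{noise_increase_db(p):.2f} dB")
```

Note that p = 0.5 lands at roughly 4 dB rather than 3 dB, because it is the noise powers that are averaged, not the dB values.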

Per sample coding could be done like this:

Code:

wbits = minWasted + (RNG.nextfloat() > p ? 1 : 0);  // randomly chosen wasted bits count (parentheses needed: ?: binds weaker than +)
waste = RNG.nextIntBits(wbits);  // randomly generated LSBs
quantized_to_code = round( (float)(current_sample - waste) / (1 << wbits) );  // sample to code (divide by 2 to the power of wbits)
quantized_actual = (quantized_to_code << wbits) + waste;  // dequantized sample

Of course, the encoder's RNG state should match the decoder's (ie. same seed).
Good news: Noise shaping doesn't need to be part of the format specification but can later be added to the encoder without breaking anything.
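
The per-sample scheme above can be sketched as a round trip in Python. This is only an illustration under assumptions: Python's `random.Random` stands in for the shared RNG, the names `encode` and `decode` are invented here, rounding is done with an integer round-half-up, and clipping at the ends of the sample range is ignored.

```python
import random

def _draw(rng, min_wasted, p):
    """Draw the per-sample wasted-bits count and the pseudo-random LSBs."""
    wbits = min_wasted + (1 if rng.random() > p else 0)
    waste = rng.getrandbits(wbits) if wbits else 0
    return wbits, waste

def encode(samples, min_wasted, p, seed):
    """Quantize integer samples; returns the codes a lossless stage would store."""
    rng = random.Random(seed)
    codes = []
    for s in samples:
        wbits, waste = _draw(rng, min_wasted, p)
        half = (1 << wbits) >> 1
        # Subtract the dither, then divide by 2**wbits with rounding.
        codes.append((s - waste + half) >> wbits)
    return codes

def decode(codes, min_wasted, p, seed):
    """Rebuild samples; the RNG must be seeded exactly as in the encoder."""
    rng = random.Random(seed)
    out = []
    for c in codes:
        wbits, waste = _draw(rng, min_wasted, p)
        out.append((c << wbits) + waste)
    return out

source = random.Random(1)
samples = [source.randint(-32768, 32767) for _ in range(1000)]
decoded = decode(encode(samples, 3, 0.5, seed=42), 3, 0.5, seed=42)
worst = max(abs(a - b) for a, b in zip(samples, decoded))
print("largest absolute error:", worst)
```

Because both sides draw from the RNG in the same order, the decoder reproduces the encoder's choices without any side information beyond `min_wasted`, `p` and the seed.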

Cheers!
SG

Reply #76 – 2007-06-15 19:54:41

Quote from: SebastianG on 2007-06-15 19:10:44

(3) Here's another technical thought which might be interesting for Thomas in case he wants to add lossy support to TAK:
Selecting "wasted_bits" to be an integer allows an encoder to control the signal-to-noise ratio in steps of 6 dB only. Compared to other lossy codecs (MP3, AAC control the SNR in steps of roughly 1.1 dB = 1.5*(3/4) dB) this 6 dB step size is quite large. This is an old idea of mine of how to get more resolution: Make it probabilistic. You can store in each frame or subframe (you might want to allow changing the resolution within a frame) the information "wasted_bits = x with probability p and x+1 with probability (1-p)" and use the same pseudo-random number generator in encoder and decoder for deciding the "wasted_bits" value per sample. Also you should think about generating the actual "wasted bits" via this RNG instead of zeroing them. This would be equivalent to subtractive dithering and avoids nonlinear distortions. Entropy coding might be a bit more complicated, though.

That's a very nice idea!

But if I build a dedicated lossy codec, I am, unlike the preprocessor, not restricted regarding the scale factors. I may divide the signal by any value I like and can therefore have quite a high resolution of the signal-to-noise steps. My very old experimental implementation of a lossy codec used steps of about 2 dB.

Thomas

Reply #77 – 2007-06-15 20:09:15

Very true. Although, what integer "scalefactor" is between 1 and 2?
BTW, the probabilistic approach can be used for steganography, which is what it was initially intended for.

Cheers!
SG

Reply #78 – 2007-06-15 20:22:52

Quote from: SebastianG on 2007-06-15 20:09:15

Very true. Although, what integer "scalefactor" is between 1 and 2?

As many as the resolution of my fixed-point integer arithmetic (multiplication with a factor 1 / "ScaleFactor") permits? I am not restricted to integer values.
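
As an illustration of the fixed-point idea: a minimal sketch, assuming a reciprocal stored with 16 fractional bits and scale factors spaced 2 dB apart. `FRAC_BITS`, `scale_factor_for_db`, `make_reciprocal` and `quantize` are names invented here; this is not TAK's actual code.

```python
import math

FRAC_BITS = 16  # assumed fixed-point precision of the stored reciprocal

def scale_factor_for_db(db: float) -> float:
    """Divisor that raises the quantization step by `db` decibels."""
    return 10.0 ** (db / 20.0)

def make_reciprocal(scale_factor: float) -> int:
    """Fixed-point representation of 1 / scale_factor."""
    return round((1 << FRAC_BITS) / scale_factor)

def quantize(sample: int, reciprocal: int) -> int:
    """Approximately round(sample / scale_factor), using one integer
    multiply, an added rounding offset, and a shift."""
    return (sample * reciprocal + (1 << (FRAC_BITS - 1))) >> FRAC_BITS

# Scale factors between 1 and 2 in 2 dB steps: about 1.26 and 1.58.
for db in (0.0, 2.0, 4.0, 6.0):
    sf = scale_factor_for_db(db)
    q = quantize(1000, make_reciprocal(sf))
    print(f"{db:.0f} dB -> scale factor {sf:.3f}, sample 1000 -> {q}")
```

The resolution of the scale factor is limited only by `FRAC_BITS`, which is the point being made: the divisor does not have to be an integer, let alone a power of two.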

But I forgot one thing: if you want to have a correction file, it's much easier and more efficient to use the integer bit removal approach, and then I would favour your clever idea.
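
A minimal sketch of why integer bit removal makes a correction file easy, assuming plain truncation (no rounding, no dither): the correction is exactly the removed LSBs, so it needs `wbits` raw bits per sample and no further modelling. The function names are hypothetical.

```python
def split_sample(sample: int, wbits: int):
    """Split a sample into the lossy part (LSBs zeroed) and the correction."""
    lossy = (sample >> wbits) << wbits          # what the lossy file stores
    correction = sample & ((1 << wbits) - 1)    # the removed LSBs, 0 <= c < 2**wbits
    return lossy, correction

def restore(lossy: int, correction: int) -> int:
    """Lossy part plus correction gives back the original sample exactly."""
    return lossy + correction

for s in (-5, 0, 1234, -32768, 32767):
    lossy, corr = split_sample(s, 3)
    print(s, "->", lossy, "+", corr, "=", restore(lossy, corr))
```

With a non-integer scale factor the residual no longer fits a fixed number of bits, which is presumably part of what makes that case less convenient.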

Thomas

Reply #79 – 2007-06-15 22:59:11

Quote from: TBeck on 2007-06-15 20:22:52

As many as the resolution of my fixed-point integer arithmetic (multiplication with a factor 1 / "ScaleFactor") permits? I am not restricted to integer values.

Hmm... Haven't thought of that, tbh.

Quote from: TBeck on 2007-06-15 20:22:52

But I forgot one thing: if you want to have a correction file, it's much easier and more efficient to use the integer bit removal approach

Hmm... I see what you mean. However, this correction file stuff is not easily combined with noise shaping, no matter how you do the correction (coding either unfiltered or filtered error samples). I wonder how MPEG4-SLS is going to solve that problem. Prior to menno's comment I believed SLS to be some kind of AAC + IntMDCT mix.

Cheers!
SG

Reply #80 – 2007-06-16 22:48:10

I like the subtractive dither idea (though of course it's useless with a pre-processor approach), but I doubt you'll find distortion without dither in this application, so it's more a case of "nice to have, just in case" rather than essential.

I don't like the idea of noise shaping in this application (though it could depend what you mean by noise shaping). I guess you can draw the line between pure lossless, and lowest possible transparent bitrate lossy anywhere you like - but the cleverer you get, the closer you get to mp3 etc - one of the points of this is that, with nothing "clever" going on, there's nothing there to unexpectedly interfere with something (anything) downstream. It could be a great format for transcoding.

You could add more clever stuff as an option, but I think it would be extremely useful to keep the minimalist approach I've outlined as an option too.

I know it's a tough call, but I was hoping for some ABX results. IMO there's not much point proceeding further until the best ears at HA have commented!

Cheers,
David.

Reply #81 – 2007-06-17 15:29:58

Quote from: 2Bdecided on 2007-06-16 22:48:10

I know it's a tough call, but I was hoping for some ABX results. IMO there's not much point proceeding further until the best ears at HA have commented!

Oh yes, please...

Unfortunately I have lost my "Golden Ears" in the last few years (that's why I dropped TAK's earlier lossy mode), so I can't help with this, although I would be very excited to do so.

I am really thinking about the possibility of adding a lossy mode to TAK. But I would only like to do it if transparency can be achieved at bitrates I myself would regard as small enough to be useful. Therefore I am very interested in some ABX results.

Later I may also provide an alternative preprocessor application (in another thread, no hijacking) which is based upon my earlier lossy approach. It works quite differently from your preprocessor. It would be interesting to compare the results.

Thomas

Reply #82 – 2007-06-17 18:09:19

Quote from: TBeck on 2007-06-17 15:29:58

... I am really thinking about the possibility to add a lossy mode to TAK. ...

Wonderful.
IMO extremely high quality lossy variants of lossless codecs are most attractive today, now that mass storage is so cheap. If a robust extremely high quality is achievable at ~ 350 kbps this is still very attractive compared to lossless. This is especially true if such a codec is available for DAPs (for mere PC use we're already in a state where many people can use a lossless codec). At the moment availability for DAPs de facto means: it's available with Rockbox firmware.

Reply #83 – 2007-06-18 01:54:16

Quote from: halb27 on 2007-06-17 18:09:19

If a robust extremely high quality is achievable at ~ 350 kbps this is still very attractive compared to lossless.

I doubt that this is possible. I would expect 400 to 450 kbps (on average) to be sufficient.

But we will never know if nobody evaluates 2Bdecided's files...

BTW: I spent today building a quick and dirty preprocessor which implements a variation of my old lossy approach:

- Preprocessing of files with output to a wave file. Hi 2Bdecided, your idea is really clever!
- Shows you, which bit rate TAK's default setting would achieve when compressing the output file.
- Select a quality level.
- Choose between 2 different filters.

Unfortunately it doesn't make sense to release the preprocessor, if nobody likes to evaluate the results of such software...

Thomas

Reply #84 – 2007-06-18 07:11:56

Quote from: TBeck on 2007-06-18 01:54:16

Quote from: halb27 on 2007-06-17 18:09:19

If a robust extremely high quality is achievable at ~ 350 kbps this is still very attractive compared to lossless.

I doubt that this is possible. I would expect 400 to 450 kbps (on average) to be sufficient.
But we will never know if nobody evaluates 2Bdecided's files...

BTW: I spent today building a quick and dirty preprocessor which implements a variation of my old lossy approach:
....
Unfortunately it doesn't make sense to release the preprocessor, if nobody likes to evaluate the results of such software...

'~ 350 kbps' was not meant to be achieved with 2Bdecided's preprocessor exactly, though that would be great. When I wrote that, I had in mind the lossy mode of TAK you were thinking about.
shadowking has done listening tests, and I too have spent quite some time on listening tests with various samples 2Bdecided gave us. So 'nobody likes to evaluate ...' is not totally correct, though a lot more testing would certainly be welcome. I have a lot of hope that Porcupine will join us. He has a very good feeling for what to look at to find the weaknesses of this kind of codec.
I'd love to get to know your preprocessor, and I'm willing to do some listening tests.

Reply #85 – 2007-06-18 07:30:10

I don't expect problems at 550k. Even shorten and rkau with lossy were fine at those bitrates when I played with them, but not good at 350k. If some data reduction is needed and one is content with 450~550k bitrate then this preprocessor will work fine. If that is its goal I don't see a problem. The goals of Dualstream and to an extent Wavpack lossy are different. Can this preprocessor be ported to meet those goals ?

Reply #86 – 2007-06-18 09:52:42

Quote from: shadowking on 2007-06-18 07:30:10

I don't expect problems at 550k. Even shorten and rkau with lossy were fine at those bitrates when I played with them, but not good at 350k. If some data reduction is needed and one is content with 450~550k bitrate then this preprocessor will work fine. If that is its goal I don't see a problem. The goals of Dualstream and to an extent Wavpack lossy are different. Can this preprocessor be ported to meet those goals ?

The important thing about my approach is that it's pure VBR.

If it works, the bitrate will be whatever the bitrate will be. You will have no control over it - it'll be completely content dependent.

You can shift the threshold upwards to decrease the bitrate, but then you'll introduce audible noise. Simple as that.

(More usefully, you can increase the bitrate by shifting the threshold downwards - this could be useful to allow multiple generations of coding. Also, if I've set the threshold in the wrong place to start with, it will have to be lowered by default - a reason to ABX!)

If my approach doesn't work, then it will be ABXable sometimes. I don't expect problems, but I don't think shadowking you can imply there's no need for people to ABX just because the bitrate is usually high. You can have a high bitrate file with audible artefacts!

If my answer hasn't covered what you had in mind, let me know what you meant by "the goals of Dualstream and to an extent Wavpack lossy".

Cheers,
David.

Reply #87 – 2007-06-18 10:28:43

Quote from: TBeck on 2007-06-18 01:54:16

Unfortunately it doesn't make sense to release the preprocessor, if nobody likes to evaluate the results of such software...

I'll have a play, if you can release it.

The things you've already discussed (taking my basic idea and enhancing it when implementing it within TAK) sound quite exciting. If you could include more ideas from your own lossy work that could be even better.

Cheers,
David.

Reply #88 – 2007-06-18 10:32:27

Ok. I understand now. The goal is transparent, pure VBR without bitrate control.

Can you try these:

http://64.41.69.21/technical/reference/keys_1644ds.wav

http://64.41.69.21/technical/reference/keys_2496.wav

Old artificial sample. Seems to be an exclusive optimfrog / wavpack problem. Noise at 350k. I ABXed it at 450k but failed at 512k. Advanced noise shaping (up) works wonders for both encoders.

Reply #89 – 2007-06-18 10:51:15

As the basic premise of reducing bitdepth "transparently" will apply to all lossless codecs, I would very much like to see a standalone implementation of the method.

Reply #90 – 2007-06-18 10:59:09

Quote from: SebastianG on 2007-06-15 22:59:11

Hmm... I see what you mean. However, this correction file stuff is not easily combined with noise shaping, no matter how you do the correction (coding either unfiltered or filtered error samples). I wonder how MPEG4-SLS is going to solve that problem. Prior to menno's comment I believed SLS to be some kind of AAC + IntMDCT mix.

In SLS the correction/scalability has nothing to do with the AAC core (except that the core defines the lower limit). The quantised AAC spectrum serves as a starting point for SLS, but there is nothing to scale in that. The scalability in SLS comes from the fact that the frames are entropy coded in bitplanes: first all MSBs in the frame are encoded, and so on, down to the LSBs. Scaling can then be achieved by simply removing some bytes from the end of the frame. Correction files (tracks) can be made by copying bytes from the end of each frame to another file or track.
The trick proposed in this thread can basically be done as post-processing in SLS instead of pre-processing.
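
The bitplane idea described above can be sketched with a toy example. This is not the SLS bitstream: there is no entropy coding, signs are ignored (the values are treated as non-negative magnitudes), and the function names are invented for illustration.

```python
def to_bitplanes(values, nbits):
    """Return MSB-first bitplanes; each plane holds one bit (0/1) per value."""
    return [[(v >> b) & 1 for v in values] for b in range(nbits - 1, -1, -1)]

def from_bitplanes(planes, nbits):
    """Rebuild values from the leading planes; missing LSB planes read as 0."""
    values = [0] * (len(planes[0]) if planes else 0)
    for i, plane in enumerate(planes):
        shift = nbits - 1 - i
        for j, bit in enumerate(plane):
            values[j] |= bit << shift
    return values

magnitudes = [5, 12, 0, 15, 9]
planes = to_bitplanes(magnitudes, nbits=4)

core = planes[:2]          # truncated frame: only the two most significant planes
correction = planes[2:]    # the cut-off tail, which could go to a correction track

print(from_bitplanes(core, 4))                # coarse version
print(from_bitplanes(core + correction, 4))   # exact again
```

Cutting planes from the end is the post-processing counterpart of zeroing LSBs before encoding.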

Reply #91 – 2007-06-18 11:09:43

It's also funny that SLS does not manage to take any advantage of this pre-processor.

Reply #92 – 2007-06-18 12:00:08

Quote from: shadowking on 2007-06-18 10:32:27

Ok. I understand now. The goal is transparent, pure VBR without bitrate control.

Exactly.

Quote

Can you try these:

http://64.41.69.21/technical/reference/keys_1644ds.wav

http://64.41.69.21/technical/reference/keys_2496.wav

Old artificial sample. Seems to be an exclusive optimfrog / wavpack problem. Noise at 350k. I ABXed it at 450k but failed at 512k. Advanced noise shaping (up) works wonders for both encoders.

Wow! They're killer samples for this algorithm, and FLAC itself. I think they're still transparent (can you try ABX please?) but look at the bitrates (all FLAC)...

keys_1644ds:
lossless: 1078kbps (ratio=0.764)
lossy: 829kbps (ratio=0.587)

keys_2496:
lossless: 4587kbps (ratio=0.995!!!)
lossy: 1742kbps (ratio=0.378)
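
For anyone checking the figures: a small sketch of the arithmetic behind the ratios, assuming both files are stereo and that the file names encode bit depth and sample rate (16-bit/44.1 kHz and 24-bit/96 kHz). Those are assumptions read off the names, not stated in the post; the helper names are made up.

```python
def pcm_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

def compression_ratio(coded_kbps: float, sample_rate_hz: int,
                      bits_per_sample: int, channels: int = 2) -> float:
    """Coded bitrate divided by the uncompressed PCM bitrate."""
    return coded_kbps / pcm_kbps(sample_rate_hz, bits_per_sample, channels)

cases = [
    ("keys_1644ds lossless", 1078, 44100, 16),
    ("keys_1644ds lossy",     829, 44100, 16),
    ("keys_2496 lossless",   4587, 96000, 24),
    ("keys_2496 lossy",      1742, 96000, 24),
]
for name, kbps, rate, bits in cases:
    print(f"{name}: ratio = {compression_ratio(kbps, rate, bits):.3f}")
```

Under those assumptions the computed ratios agree with the quoted ones to three decimals.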

Note: There's a mistake in the MATLAB script I've posted when the FFT size is larger than the lossless block size (as it is at 96kHz sampling with the parameters I was using). I believe I've fixed it for your sample, but I'll check the script more thoroughly with other sample rates before I post an update.

Cheers,
David.

Reply #93 – 2007-06-18 12:47:42

More info and warnings about these samples here:

http://64.41.69.21/technical/sample_rates/index.htm..

It seems they are transparent! The bitrate is insane, but that's unlimited VBR for you. Dualstream quality 5 was where I couldn't get good results in the past - these 'samples' may be tricking the optimfrog model. Bitrate is 458 k.

Optimfrog / Wavpack have much better compression on these. That's why VBR is very important on lower compression codecs and modes like FLAC. With optimfrog, MAC, TAK and Wavpack's -hx modes, even the 'end to all' cases will do fine with ABR 400~500k simply because they will compress well. In FLAC (or Wavpack fast / normal modes) such a 'fixed' bitrate will trigger strong noise.

Reply #94 – 2007-06-18 13:00:21

Quote from: halb27 on 2007-06-18 07:11:56

Quote from: TBeck on 2007-06-18 01:54:16

....
Unfortunately it doesn't make sense to release the preprocessor, if nobody likes to evaluate the results of such software...

...
shadowking has done listening tests, and I too have spent quite some time on listening tests with various samples 2Bdecided gave us. So 'nobody likes to evaluate ...' is not totally correct, though a lot more testing would certainly be welcome. I have a lot of hope that Porcupine will join us. He has a very good feeling for what to look at to find the weaknesses of this kind of codec.
I'd love to get to know your preprocessor, and I'm willing to do some listening tests.

Big sorry halb27!

I should have split the post. My complaining about the lack of testers was not directed to you!

Quote from: 2Bdecided on 2007-06-18 10:28:43

Quote from: TBeck on 2007-06-18 01:54:16

Unfortunately it doesn't make sense to release the preprocessor, if nobody likes to evaluate the results of such software...

I'll have a play, if you can release it.

Fine!

Quote from: 2Bdecided on 2007-06-18 10:28:43

The things you've already discussed (taking my basic idea and enhancing it when implementing it within TAK) sound quite exciting.

For now I took your great preprocessor idea to test my own, very simple approach for determining the wasted bit count. My approach possibly has more in common with the method described by SebastianG. I would have liked to add your code too, but unfortunately I know nothing about MATLAB and not very much about DSP, so this was beyond the scope of a rainy afternoon's work.

Quote from: 2Bdecided on 2007-06-18 10:28:43

If you could include more ideas from your own lossy work that could be even better.

I am not sure if you are already doing something like this: because 1024 samples is still quite a lot, I am partitioning the frame into blocks of 128 or 256 samples. Each block is being analyzed and I am using the lowest wasted bit count result for the whole frame. Safety first.
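
The "lowest result wins" rule can be sketched as follows. The per-block analysis itself is not described in the thread, so it is passed in as a function; `toy_analysis` below is a deliberately crude stand-in invented for this example, not the actual method. `detected_wasted_bits` mimics how a lossless encoder might notice the zeroed LSBs afterwards (again an assumption about the encoder).

```python
def frame_wasted_bits(frame, block_size, analyze_block):
    """Analyze each block and keep the smallest (safest) wasted-bits count."""
    blocks = [frame[i:i + block_size] for i in range(0, len(frame), block_size)]
    return min(analyze_block(block) for block in blocks)

def zero_lsbs(frame, wbits):
    """Clear the lowest `wbits` bits of every sample (plain truncation)."""
    return [(s >> wbits) << wbits for s in frame]

def detected_wasted_bits(frame):
    """Number of trailing zero bits shared by all samples (0 if all are zero)."""
    combined = 0
    for s in frame:
        combined |= s
    if combined == 0:
        return 0
    return (combined & -combined).bit_length() - 1

def toy_analysis(block):
    """Placeholder only: allow more wasted bits for louder blocks."""
    peak = max(abs(s) for s in block)
    return max(0, peak.bit_length() - 6)

frame = list(range(-512, 512))                 # 1024 samples
wbits = frame_wasted_bits(frame, 128, toy_analysis)
processed = zero_lsbs(frame, wbits)
print("chosen:", wbits, "detected:", detected_wasted_bits(processed))
```

The quietest block dictates the result for the whole frame, which is the "safety first" part.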

Thomas

Reply #95 – 2007-06-18 13:51:31

Quote from: TBeck on 2007-06-18 13:00:21

I am not sure if you are already doing something like this: because 1024 samples is still quite a lot, I am partitioning the frame into blocks of 128 or 256 samples. Each block is being analyzed and I am using the lowest wasted bit count result for the whole frame. Safety first.

I'm doing the same, but with 2 FFT sizes...

My analysis is (ignoring bugs in the code!) independent of lossless frame size. The thresholds are calculated using 20ms and 1.5ms FFTs. Looking at all the results, the lowest wasted_bits requirement within the frame is the one that is chosen.

I'm sure you could follow the MATLAB - it's only BASIC with some clever array handling. I'll try to add some more comments to the code when I get time.

Cheers,
David.

Reply #96 – 2007-06-18 13:54:57

A bit O/T, but I'm a reformed Pascal / Assembler hobby programmer and find the whole concept of playing with audio quite appealing - however, not £1350 appealing (commercial MATLAB licence cost). I found Scilab and FreeMat almost immediately - which would you recommend as being easier to port the MATLAB code to?

Reply #97 – 2007-06-18 14:51:32

I haven't tried myself, but GNU Octave is supposed to be good.

EDIT: The first problem is getting audio data in. I use my own routines, hacked from the basic MATLAB routines wavread and wavwrite, to add the ability to handle various sample rates, bit depths, and numbers of channels. Since they're hacks of copyrighted code, I don't feel comfortable sharing them, but wavread and wavwrite are good for 44.1kHz 16-bit stereo.

Cheers,
David.

Reply #98 – 2007-06-18 15:19:12

Would it be ok to discuss my preprocessor in this thread?

I don't want to perform some kind of hijacking, but it seems to be the right context.

Thomas

P.S.: It seems to be possible to attach files to ordinary threads, but how do I do it? Do I have to belong to another member group to be allowed to?

Reply #99 – 2007-06-18 17:10:23

I think "Developers" can upload files anywhere. "Members" can only upload in the uploads forum. I guess you can PM a moderator to become a "Developer".

Of course you can discuss your pre-processor here, though if we're both going to get people to ABX, it might get a bit confusing. I guess it depends if you expect them to merge, or not.

I am not an open source zealot, but for a useful discussion, you'll have to share pretty much exactly what it's doing - otherwise there's little hope of finding relevant problem samples without doing an exhaustive test.

Cheers,
David.
