FLAC compression improvement patch

Reply #25 – 2014-08-10 22:14:19

Quote from: ChronoSphere on 2014-08-10 22:03:04

I am not sure I completely understand. I tried encoding a file with these parameters and for me the "-8 -A tukey(0.5) -A partial_tukey(2) -A punchout_tukey(3)" gave a worse compression rate than simply using -8... Or are those benchmarks with your custom code applied?

Yes, this is with the new code, so it doesn't work unless you compiled from git with the patch I mailed to flac-dev applied. The fact that you didn't get an error is not really a bug, but it's not convenient either: any apodization function that is not valid is silently dropped. So you've probably been using -8 -A tukey(0.5), or even -8 -A tukey(0) (which is a rectangular window) if your locale uses a comma as the decimal separator.

Quote

edit: I'd still want an option to allow me to do an exhaustive search, even if the gains are minimal.

Sure, you'd just use -8 -e

Reply #26 – 2014-08-11 13:57:36

Quote from: ktf on 2014-08-10 22:14:19

Quote
edit: I'd still want an option to allow me to do an exhaustive search, even if the gains are minimal.

Sure, you'd just use -8 -e

Doesn't -8 already imply -e? In fact doesn't -7 already include -e?

Reply #27 – 2014-08-11 14:39:22

Quote from: lithopsian on 2014-08-11 13:57:36

Quote from: ktf on 2014-08-10 22:14:19
Quote
edit: I'd still want an option to allow me to do an exhaustive search, even if the gains are minimal.

Sure, you'd just use -8 -e

Doesn't -8 already imply -e? In fact doesn't -7 already include -e?

It does at the moment. However, KTF suggests replacing -e with the new apodization functions for the -8 preset. Combining the new functions (which improve compression more than -e) with -e provides only 0.1% additional compression at a significant speed penalty. Essentially -e is not that effective any more because the new functions have already covered most of the improvement.

It makes sense to me to remove -e from the preset when the new apodization functions come in.

Reply #28 – 2014-08-11 14:40:03

Currently, it does. But ktf is suggesting to drop it in case his patch gets accepted, because the gain is not worth the encoding speed hit for most users.

edit: bah, too late.

Reply #29 – 2014-08-11 15:39:07

So you propose

Code:

-6 == -l 8 -b 4096 -m -r 6 -A tukey(0.5) -A partial_tukey(2)
-7 == -l 8 -b 4096 -m -r 6 -A tukey(0.5) -A partial_tukey(2) -A punchout_tukey(3)

According to my tests, the new -7 preset compresses only slightly better than the new -6 but is noticeably slower.

Maybe it's better to define it as

Code:

-7 == -l 12 -b 4096 -m -r 6 -A tukey(0.5) -A partial_tukey(2)

?
Faster encoding, better compression but slightly lower decoding speed.

Reply #30 – 2014-08-11 16:34:29

I was considering something similar as well, but seeing that the differences are not that big, I thought it was a good idea to at least keep the decoding behaviour the same.

The thing is, the gap between -6 and -7 is currently that small too, at least with my dataset. Currently, changing from -6 to -7 improves compression by 0.083 percentage points at the cost of dropping the encoding speed to 0.46x. With this proposal, changing from -6 to -7 improves compression by 0.065 percentage points at the cost of dropping the encoding speed to 0.65x.

Code:

           ----- Setting -----                                                  , compr , encsp, decsp
-8 --no-exhaustive-model-search -A tukey(0.5);partial_tukey(2);punchout_tukey(3), 56.511, 64.0 , 379.9
-7 --no-exhaustive-model-search -A tukey(0.5);partial_tukey(2);punchout_tukey(3), 56.647, 74.5 , 403.0
-8                                                                              , 56.690, 60.1 , 379.0
-6 -A tukey(0.5) -A partial_tukey(2)                                            , 56.712, 115.2, 399.9
-7                                                                              , 56.833, 84.2 , 402.6
-6                                                                              , 56.916, 182.6, 402.3

But of course, I'd like to hear what your results are! I was just seeing the very small difference between -5 and -6 and the gain between -7 and -8 and thought: maybe this could be spread out more evenly.
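The two -6 to -7 comparisons can be re-derived from the table. The following is a minimal Python sketch that uses only the figures quoted in this post; the names `results` and `step` are made up for illustration, and the speed change is expressed as the ratio of the two encoding speeds.

```python
# (compression in %, encoding speed) copied from the table in this post.
results = {
    "old -6": (56.916, 182.6),
    "old -7": (56.833, 84.2),
    "new -6": (56.712, 115.2),  # -6 -A tukey(0.5) -A partial_tukey(2)
    "new -7": (56.647, 74.5),   # -7 --no-exhaustive-model-search plus the three windows
}

def step(slower, faster):
    """Return (gain in percentage points, speed ratio) when moving
    from the `faster` setting to the `slower` one."""
    c_slow, s_slow = results[slower]
    c_fast, s_fast = results[faster]
    return c_fast - c_slow, s_slow / s_fast

old_gain, old_ratio = step("old -7", "old -6")
new_gain, new_ratio = step("new -7", "new -6")
print(f"current presets:  {old_gain:.3f} points smaller, speed x{old_ratio:.2f}")
print(f"proposed presets: {new_gain:.3f} points smaller, speed x{new_ratio:.2f}")
```

Read this way, the proposed -7 buys slightly less compression than the current -7 does over its -6, but it also costs a smaller share of the encoding speed.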

Reply #31 – 2014-08-11 16:34:34

@ktf
Did you compare the compression results to Grigory's flacCL? He must use some kind of brute forcing at -8 also. Maybe worth a look into its code.

Reply #32 – 2014-08-11 17:34:46

Quote from: ktf on 2014-08-11 16:34:29

But of course, I'd like to hear what your results are!

"old" means current preset, "ktf" means new presets proposed by ktf, "my" is the preset from post 30: -l 12 -b 4096 -m -r 6 -A tukey(0.5) -A partial_tukey(2)

Code:

setting     comp.ratio   enc.speed
-----------------------------------
old -5      64.61        331.9
old -6      64.61        324.2
old -7      64.55        185.3
old -8      64.37        141.9

ktf -6      64.44        228.7
ktf -7      64.41        158.6
ktf -8      64.23        133.7

my  -7      64.28        200.3

Reply #33 – 2014-08-11 18:05:32

BTW: it's possible to write tukey(0.5) in a locale-independent way: tukey(5e-1)
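Why the exponent form survives a comma locale can be illustrated with a small simulation. This is only a sketch: it assumes the parameter is parsed the way C's `strtod` parses numbers (longest valid numeric prefix, using the locale's decimal separator), which the thread implies but does not show. `parse_like_strtod` is a hypothetical helper, not a flac function, and it ignores details such as leading whitespace, hex floats, and inf/nan.

```python
import re

def parse_like_strtod(text, decimal_point="."):
    """Parse the longest numeric prefix of `text`, treating only
    `decimal_point` as the decimal separator. Returns 0.0 if no
    number can be read at all (simplified strtod-like behaviour)."""
    dp = re.escape(decimal_point)
    pattern = re.compile(
        r"[+-]?(\d+(" + dp + r"\d*)?|" + dp + r"\d+)([eE][+-]?\d+)?"
    )
    m = pattern.match(text)
    if m is None:
        return 0.0
    # Convert the locale separator back to '.' so float() accepts it.
    return float(m.group(0).replace(decimal_point, "."))

for arg in ("0.5", "0,5", "5e-1"):
    print(arg,
          "point locale:", parse_like_strtod(arg, "."),
          "comma locale:", parse_like_strtod(arg, ","))
```

Under a comma locale, parsing "0.5" stops at the point and yields 0, which matches the tukey(0) case described earlier in the thread, while "5e-1" contains no separator at all and reads as 0.5 under either convention.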

Reply #34 – 2014-08-11 18:25:49

Quote from: Wombat on 2014-08-11 16:34:34

Maybe worth a look into its code.

I might, but as getting to know a piece of code takes quite some time, not very soon.

Quote from: lvqcl on 2014-08-11 17:34:46

[...]

You're right, it's probably a better idea not to hold on to keeping decoding exactly the same. FLAC's compression settings are already littered with these archaic non-LPC presets that look weird/bad in comparisons with other lossless codecs.

Quote from: lvqcl on 2014-08-11 18:05:32

BTW: it's possible to write tukey(0.5) in a locale-independent way: tukey(5e-1)

If I'd thought of that a week ago... The whole locale point/comma thing is confusing. If the USA finally convert to SI, I guess the Dutch (among others) should switch to a different number format. It's just confusing to have to work with a point in certain applications and most calculators, and a comma in others.

Reply #35 – 2014-08-11 18:34:49

I thought that FLACCL is GPU-accelerated Flake. But I'm not 100% sure.

Reply #36 – 2014-08-11 18:46:18

Quote from: lvqcl on 2014-08-11 18:34:49

I thought that FLACCL is GPU-accelerated Flake. But I'm not 100% sure.

CUETools.Flake kind of started as a C# port of flake, but by now it is a completely different kind of beast. FLACCL is a GPU-accelerated CUETools.Flake.
I would be more interested to see how CUETools.Flake compares to FLAC with proposed changes at -7 and -8, because FLACCL cannot be directly compared performance-wise.

Reply #37 – 2014-08-12 01:36:34

I had never tried this newer "CUETools.Flake.exe". It is indeed a different animal from the old flake. At -5/-6 it compresses as well as or better than -8 with the regular git build I have been using lately. Higher compression comes with a speed penalty, but it reaches roughly the same compression as the speedy flacCL.

@ktf
I remember you asked about variable blocksize. This CUETools.Flake version has it built in and working. If the implementation is correct, one can expect ~0.25% better compression, not more.

I only encoded a few albums, so these numbers may not be of much help.

Reply #38 – 2014-08-21 16:34:51

What I like about punchout_tukey is that you don't actually have to calculate autocorrelation coeffs 3 times for its 3 variants. About 1/3 of each of those windows consists of zeroes, which you can easily ignore during calculation, eliminating almost 1/3 of the work and making it cost about as much as two window functions, not three. And being smart about it, you can also eliminate almost another 1/3 of the work by calculating segments that consist entirely of ones only once instead of twice (for the two of the 3 variants that don't have this segment zeroed out). So you basically get all 3 autocorrelation coeff sets for the price of one. Plus you can reconstruct the coeffs for the full tukey almost for free from the same material.
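To picture the window shapes being discussed, here is a simplified Python sketch. It is an assumption-laden illustration, not the code from the patch: the Tukey formula is the textbook one, the helper names mirror the option names only for readability, and any overlap or taper details of the real partial_tukey/punchout_tukey implementations are left out.

```python
import math

def tukey(n, alpha=0.5):
    """Textbook Tukey (tapered cosine) window of length n."""
    w = [1.0] * n
    taper = int(alpha * (n - 1) / 2)  # samples in each cosine ramp
    for i in range(taper):
        v = 0.5 * (1.0 - math.cos(math.pi * i / taper))
        w[i] = v
        w[n - 1 - i] = v
    return w

def partial_tukey(n, parts, alpha=0.5):
    """`parts` windows, each keeping only one 1/parts slice of the block."""
    windows = []
    for m in range(parts):
        start, end = m * n // parts, (m + 1) * n // parts
        windows.append([0.0] * start + tukey(end - start, alpha) + [0.0] * (n - end))
    return windows

def punchout_tukey(n, parts, alpha=0.5):
    """`parts` windows, each with one 1/parts slice zeroed ("punched out")."""
    windows = []
    for m in range(parts):
        start, end = m * n // parts, (m + 1) * n // parts
        windows.append(tukey(start, alpha) + [0.0] * (end - start) + tukey(n - end, alpha))
    return windows

block = 4096
for index, window in enumerate(punchout_tukey(block, 3)):
    zeros = sum(1 for v in window if v == 0.0)
    print(f"variant {index}: {zeros / block:.2f} of the window is zero")
```

In this simplified form each punchout variant zeroes one third of the block, which is the part of the autocorrelation work the post describes skipping.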

Also, you don't have to actually use all 3 variants to calculate residual. You can choose the best variant just by looking at prediction error coeffs that are side-product of lpc coeff calculation.

My initial tests show that using those window functions and some of those tricks i can achieve the same level of compression as CUETools.Flake -8 at 2/3 of the CPU time, or i can get at least ~0.04 percentage points better compression at the same speed. Maybe this doesn't sound like much, but you have to take into account CUETools.Flake -8 was already doing much better than flac -8.

Reply #39 – 2014-08-21 16:45:35

It's a pity that libFLAC probably cannot make use of these optimisations, as the whole LPC encoding stage would have to be rewritten. For now at least I try to be as unobtrusive as I can with these changes. The window stage was plugged in later as was explained in the thread linked earlier in this discussion by TBeck, and because of that it isn't very flexible. In fact, this whole work-around probably wouldn't exist if the windowing stage were a little more flexible.

About the residual calculation: libFLAC does calculate the residual only once per frame anyway. See stream_encoder.c in the libFLAC source code:

Code:

/* Exact Rice codeword length calculation is off by default.  The simple
 * (and fast) estimation (of how many bits a residual value will be
 * encoded with) in this encoder is very good, almost always yielding
 * compression within 0.1% of exact calculation.
 */
#undef EXACT_RICE_BITS_CALCULATION

This estimation is based on the prediction error. I have checked, gains seem to be <0.01%, so it works really well.
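The difference between an exact Rice length and a cheap estimate can be sketched as follows. This is an illustration in Python, not libFLAC code: the estimate is written in the same spirit as the comment quoted above (it needs only the sample count and the sum of absolute residuals), but the exact expression in stream_encoder.c may differ, and all function names here are hypothetical.

```python
def zigzag(v):
    """Fold a signed residual into an unsigned value
    (0, -1, 1, -2, ... becomes 0, 1, 2, 3, ...)."""
    return 2 * v if v >= 0 else -2 * v - 1

def exact_rice_bits(residuals, k):
    """Exact length: unary quotient, one stop bit, and k binary bits per sample."""
    return sum((zigzag(v) >> k) + 1 + k for v in residuals)

def estimated_rice_bits(residuals, k):
    """Estimate from the partition sum alone (assumed form, see lead-in)."""
    n = len(residuals)
    abs_sum = sum(abs(v) for v in residuals)
    # Folding roughly doubles |v|, hence the shift by k - 1 instead of k.
    folded = abs_sum >> (k - 1) if k > 0 else abs_sum << 1
    # About half of the samples are negative and fold to 2|v| - 1.
    return n * (1 + k) + folded - (n >> 1)

partition = [0, -1, 2, -3, 4]
for k in range(3):
    print(k, exact_rice_bits(partition, k), estimated_rice_bits(partition, k))
```

The point of such an estimate is that the sum of absolute values can be computed once per partition and then reused for every candidate Rice parameter, instead of walking the residual again for each one.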

Reply #40 – 2014-08-21 16:50:06

Quote from: ktf on 2014-08-21 16:45:35

This estimation is based on the LPC error. I have checked, gains seem to be <0.01%, so it works really well.

I thought this estimation is a kind of dry run of residual calculation, where residual is calculated, but only the sums are stored. So it is fast, but still it processes the whole frame for each window, which can be eliminated.

Reply #41 – 2014-08-21 16:58:38

Quote from: Gregory S. Chudov on 2014-08-21 16:50:06

I thought this estimation is a kind of dry run of residual calculation, where residual is calculated, but only the sums are stored. So it is fast, but still it processes the whole frame for each window, which can be eliminated.

You're right, the undef I quoted only removes the residual as a parameter to set_partitioned_rice, not to precompute_partition_info_sums_. My bad.

edit: Now I think of it, that optimisation might be relatively easy to implement in libFLAC.

Reply #42 – 2014-08-21 18:03:55

Does it make sense to build and upload flac.exe with new settings for -6...-8? (so that anyone can experiment with them)

Reply #43 – 2014-08-21 19:59:58

If it ends up accepted into the official branch, I think it could make sense. Just like we played around with that optimized git version. I'd be interested in running my own benches anyway, I think there are others, too.

Reply #44 – 2014-08-21 20:55:09

Quote from: ChronoSphere on 2014-08-21 19:59:58

If it ends up accepted into the official branch, I think it could make sense.

The patch has been posted to the dev list but it has been awfully quiet there the last two weeks.

Reply #45 – 2014-08-21 21:26:48

Yes, I saw that. That's why I said "if".
Maybe people are on vacation, it's August after all...

Reply #46 – 2014-08-22 10:43:15

Quote from: lvqcl on 2014-08-21 18:03:55

Does it make sense to build and upload flac.exe with new settings for -6...-8? (so that anyone can experiment with them)

Such a binary should be clearly marked, I think, both in the vendor string and in the command line output. As far as I can see, the changes this patch introduces are in non-critical areas (i.e., it should be impossible for this change to create non-compliant, broken or corrupted files), but just to be sure it should probably be clear that this is not a stable version.

The testing I've done is quite exhaustive, but I wouldn't be surprised if someone is able to find a better compression preset with the new apodization functions, which would be nice. I've only tested stacking apodizations (to be sure the encoder doesn't compress worse than when using tukey(0.5)), but it might be that that isn't really necessary.

Reply #47 – 2014-08-22 21:36:58

I was just rereading your post and pondering over various improvements. Please bear with me

Quote from: Gregory S. Chudov on 2014-08-21 16:34:51

Also, you don't have to actually use all 3 variants to calculate residual. You can choose the best variant just by looking at prediction error coeffs that are side-product of lpc coeff calculation.

I was just looking into whether this can be used in libFLAC, but I'm not sure whether this could work. These partial_tukey and punchout_tukey windows work because they leave out part of the signal. The LPC_error apparently calculates the error when converting the autocorrelation into LPC coefficients. However, it is the error between the linear predictor and the original signal that counts, because that is what the residual actually is. There are two steps in between: calculating the autocorrelation and windowing the signal. Am I right?

Oh, and as a sidenote: minimizing the error between the autocorrelation and the predictor is a long shot from minimizing the residual, right? I understand that calculating the residual for every step is expensive (if even possible), but I'm just wondering whether this is simplified too much?

Quote

My initial tests show that using those window functions and some of those tricks i can achieve the same level of compression as CUETools.Flake -8 at 2/3 of the CPU time, or i can get at least ~0.04 percentage points better compression at the same speed. Maybe this doesn't sound like much, but you have to take into account CUETools.Flake -8 was already doing much better than flac -8.

For some reason I didn't read this the first time. Great work, nice improvement!

Reply #48 – 2014-08-26 21:50:37

Quote from: ktf on 2014-08-22 21:36:58

I was just rereading your post and pondering over various improvements. Please bear with me

Quote from: Gregory S. Chudov on 2014-08-21 16:34:51
Also, you don't have to actually use all 3 variants to calculate residual. You can choose the best variant just by looking at prediction error coeffs that are side-product of lpc coeff calculation.

I was just looking into whether this can be used in libFLAC, but I'm not sure whether this could work. These partial_tukey and punchout_tukey windows work because they leave out part of the signal. The LPC_error apparently calculates the error when converting the autocorrelation into LPC coefficients. However, it is the error between the linear predictor and the original signal that counts, because that is what the residual actually is. There are two steps in between: calculating the autocorrelation and windowing the signal. Am I right?

I'm not really good at signal processing theory, but apparently LPC_error is supposed to predict the error between the linear predictor and the original signal, and it does that quite well most of the time. That's the whole point - it lets us choose the best order (FLAC__lpc_compute_best_order) and optionally window function without actually calculating all the residuals. I do it in CUETools.Flake and i like the results. In the version below, compression levels from 2 to 6 choose windows from the set of partial tukey / punchout tukey based on LPC_error.
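The idea of ranking windows by the predicted error, without computing any residual, can be sketched with a plain Levinson-Durbin recursion. This is a minimal Python illustration under stated assumptions, not the CUETools.Flake or libFLAC code: the function names are hypothetical, and dividing the predicted error by the window energy (so that windows with zeroed parts are not favoured merely for passing less signal) is a choice made for this sketch.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err): predictor polynomial a[0..order] with a[0] = 1,
    and the prediction error energy that the recursion predicts."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # signal is already perfectly predicted (or silent)
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, err

def window_score(signal, window, order):
    """Predicted error per unit of window energy (normalisation is an assumption)."""
    energy = sum(w * w for w in window)
    if energy == 0.0:
        return float("inf")
    windowed = [s * w for s, w in zip(signal, window)]
    _, err = levinson_durbin(autocorrelation(windowed, order), order)
    return err / energy

def pick_window(signal, windows, order=8):
    """Index of the window with the smallest predicted error; no residual is computed."""
    scores = [window_score(signal, w, order) for w in windows]
    return min(range(len(windows)), key=scores.__getitem__)
```

Whether the predicted error tracks the real residual size closely enough is exactly the question raised in the quoted post; the sketch only shows that the ranking itself needs nothing beyond the autocorrelation of each windowed block.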

Would be nice to see how this competes in a lossless codec comparison.

Reply #49 – 2014-08-27 01:36:15

A very simple test of mine. The new flake version is interesting. Only a few files get slightly bigger at -8, but some get clearly smaller. On my few samples, new flake -6 even compresses better than old flake -8. Maybe a somewhat less aggressive setting would already be all right for -6. Is the flacCL binary ready yet?

Code:

flac 1.30 
-8
4:54 2.386.302.297

old flake
-8 
4:42 2.379.879.473
-6 
3:33 2.381.719.880

new flake
-8
4:32 2.377.468.510
-6
3:09 2.379.445.148
