New FLAC compression improvement

Re: New FLAC compression improvement

Reply #75 – 2021-10-30 11:33:20

Quote from: Porcus on 2021-10-27 23:17:19

But hi-rez performance is absolutely not consistent

I guess it's because most of the CDDA frequency bandwidth in typical audio files is used to encode audible signal with more predictable harmonic and transient structure. On the other hand, hi-res files can have gross amounts of modulator noise from AD converters, idle tones, obvious low pass at 20-24kHz, resampling artifacts, and occasionally real musical harmonics and transients in the ultrasonic frequencies. They are quite different things and therefore hard to predict.

Re: New FLAC compression improvement

Reply #76 – 2021-10-31 19:23:39

Quote from: bennetng on 2021-10-30 11:33:20

and occasionally real musical harmonics and transients

Does my sensitive snout smell the sweet scents of sarcasm ... ?

Indeed I suspect the same reasons.
And even if it weren't, it would still be well justified to tune the encoder more for CDDA than for formats with less market share.

Still it would have been nice to find something that catches several phenomena cheaply. I ran mass-testing on one hirez corpus and then observed the same thing on another - but with very few files in both, I still wonder if it is spurious.

Re: New FLAC compression improvement

Reply #77 – 2021-11-01 11:34:39

Quote from: Porcus on 2021-10-31 19:23:39

Does my sensitive snout smell the sweet scents of sarcasm ... ?

Maybe, but speaking of sarcasm and unpredictable factors:
https://www.audiosciencereview.com/forum/index.php?threads/sound-liaison-pcm-dxd-dsd-free-compare-formats-sampler-a-new-2-0-version.23274/post-793538
Those who produce hi-res recordings couldn't hear those beeps in the first place?

I've seen lossy video codecs can be tuned to optimize for film grain, Anime and such, but don't know if it is relevant to lossless audio or not.

Re: New FLAC compression improvement

Reply #78 – 2021-11-05 12:33:06

Quote from: Porcus on 2021-10-27 23:17:19

Anyone feels like testing the flac-irls-2021-09-21.exe posted at https://hydrogenaud.io/index.php?topic=120158.msg1003256#msg1003256 with parameters as below? Rather than me writing walls of text about something that could be spurious ...

For CDDA:
* Time (should be similar) and size: -8 against -8 -A "tukey(5e-1);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)" against -8 -A "welch;partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)" (I expect no big differences between the latter two.)
* Time (should not be too different) and size: -7 -p against -8 -A "welch;flattop;partial_tukey(3/0/999e-3);punchout_tukey(4/0/8e-2)" (or replace the welch by the tukey if you like)
* Time (should not be too different) and size (will differ!): -8 -e against -8 -p against -8 -A "welch;partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2);irlspost(1)"
Note, it is irlspost, not irlspost-p.

For samplerate at least 88.2:
* -8 against -8 -A "gauss(3e-3);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)"
* For each of those two: How much does -e improve size?
* How much larger and faster than -e, is -8 -A "gauss(3e-3);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2);irlspost(1)" ?

My tests indicate that the gauss(3e-3) combination impresses nobody on CDDA, makes very little difference on most hirez files - but for a few it could be a percent. And, then the "-e" improvement was a WTF. But hi-rez performance is absolutely not consistent ... well it is much better than the official release.

I'm too lazy to be putting together the encoding times table, but here's a few compression comparisons:

16/44.1, 16/44.1, 16/44.1, 16/44.1, 16/44.1, 16/44.1, 24/44.1, 16/48, 24/48, 24/96, 24/192, 24/192

Based on what we have there, flac-irls-2021-09-21 -8 -p outperformed every other parameter set in eleven cases out of twelve, -8 -e winning in one case. What surprised me was that flac-irls-2021-09-21 -8 -p (and in some cases just -8) also outperformed flaccl v2.1.9 -11 in eight cases out of twelve, flaccl curiously winning in both 24/192 cases and in two 16/44.1 cases out of six. All of the outperforming, mind, was achieved by a very slim margin.

In terms of speed, I did a quick test with Amon Tobin's How Do You Live LP, 16/44.1.
For flac-irls-2021-09-21 -8 -p, total encoding time was 0:53.509, 50.33x realtime, while for flaccl -11 it was 0:23.978, 112.33x realtime.
Decoding time for flac-irls-2021-09-21 -8 -p tracks was 0:03.284, 818.845x realtime, and for flaccl -11 it was 0:03.496, 769.270x realtime.

P.S. Whoops, I missed -8 -A "gauss(3e-3);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)". Oh well.

Re: New FLAC compression improvement

Reply #79 – 2021-11-05 12:47:53

@rrx
What syntax do you use to view percentage rates in the "Compression" tab of Playlist View?

Re: New FLAC compression improvement

Reply #80 – 2021-11-05 12:53:03

Quote from: Adil on 2021-11-05 12:47:53

@rrx
What syntax do you use to view percentage rates in the "Compression" tab of Playlist View?

Good question.

Code:

$if($or($strcmp($ext(%path%),cue),$stricmp($ext(%path%),ifo),$stricmp($info(cue_embedded),yes)),$puts(percent,$div($div($mul(100000000,%length_samples%,%bitrate%),%samplerate%),$mul($info(channels),%length_samples%,$if($strcmp($info(encoding),lossless),$info(Bitspersample),16))))$left($get(percent),$sub($len($get(percent)),3))','$right($get(percent),3),$puts(percent,$div($mul(800000,%filesize%),$mul($info(channels),%length_samples%,$if($stricmp($info(encoding),lossless),$info(Bitspersample),16))))$left($get(percent),$sub($len($get(percent)),3))','$right($get(percent),3))'%')

Re: New FLAC compression improvement

Reply #81 – 2021-11-05 13:02:02

Thank you very much!

Re: New FLAC compression improvement

Reply #82 – 2021-11-05 14:26:40

Besides the testing here has anyone an idea if there ever will be another official flac release? Since the cancelled 1.34 version it became very silent.

Re: New FLAC compression improvement

Reply #83 – 2021-11-05 15:55:01

Would be bloody annoying if not, when ktf has found the improvements that gave the independent implementations the upper hand over the official for fifteen years, and fixed bugs that made files blow up.

Re: New FLAC compression improvement

Reply #84 – 2021-11-09 19:35:28

Okay, I have something fresh to chew on for those interested: a Windows 64-bit binary is attached. Code is here but needs cleaning up before this could be merged.

I've rewritten the partial_tukey(n) and punchout_tukey(n) code into something new: subblock(n). Partial_tukey(n) and punchout_tukey(n) still exist, this is a (faster) reimplementation recycling as many calculations as possible. It is rather easy to use:

using flac -A subblock(2) is roughly similar to using flac -A tukey(7e-2);partial_tukey(2/0/14e-2)
using flac -A subblock(3) is roughly similar to using flac -A tukey(5e-2);partial_tukey(2/0/1e-1);partial_tukey(3/0/15e-2);punchout_tukey(3/0/15e-2)
using flac -A subblock(4) is roughly similar to using flac -A tukey(4e-2);partial_tukey(2/0/8e-2);partial_tukey(3/0/12e-2);punchout_tukey(3/0/12e-2);partial_tukey(4/0/16e-2);punchout_tukey(4/0/16e-2)

The main benefit is that it is a bit faster and also much cleaner to read. The reason for the weird tukey parameters (4e-2 etc.) is that the sloped parts for the main tukey and the partial tukeys have to be the same, as the code works by subtracting the autocorrelation calculated for the partial tukey to calculate the punchout tukey.

Attached is also a PDF with three lines:

the darkblue one with triangles is the current git (without any of the patches mentioned in this thread)
the green one with squares is with all patches up until now
the lightblue one with 'diamonds' is the binary I'm attaching, in which settings -6, -7, -8 and -9 use the new subblock apodization

From right to left are encoder presets -4, -5, -6, -7 and -8, with the darkgreen extending further left with -9 and the lightblue extending further left with -8 -A subblock(6), -9 and finally -9 -A subblock(6);irlspost-p(3). To make it even more complex. The darkgreen -9 was defined as -A "tukey(5e-1);partial_tukey(2);punchout_tukey(3);irlspost-p(3)" with -e, the lightblue -9 is defined as "subblock(3);irlspost-p(4)" without -e. That last change was a suggestion from @Porcus

As you can see in the graph, this change makes presets -6, -7 and -8 a little faster. -9 is now 3 times as fast (mind that the scale isn't logarithmic) but compression is slightly worse. Perhaps I should increase to irlspost-p(5).

edit: I just realized that subblock is probably a confusing name, so perhaps I'll come up with another.

Re: New FLAC compression improvement

Reply #85 – 2021-11-10 03:21:59

Still the same 29 CDs. Not sure -9 is convincing enough.

IRLS-subblock beta -8
7.590.501.741 Bytes

IRLS-subblock beta -8 -ep
7.584.176.739 Bytes

IRLS-subblock beta -9
7.588.567.011 Bytes

IRLS-subblock beta -9 -ep
7.584.089.533 Bytes

older results:
IRLS beta -9
7.583.395.627 Bytes

IRLS beta -9 -ep
7.582.193.957 Bytes

CUEtools flake -8
7.591.331.615 Bytes

Re: New FLAC compression improvement

Reply #86 – 2021-11-10 04:33:20

Missed that one, sorry. Only slightly slower than -9

IRLS-subblock beta -8 -p
7.585.883.902 Bytes

Re: New FLAC compression improvement

Reply #87 – 2022-08-25 13:54:51

So the build posted in https://hydrogenaud.io/index.php/topic,122179.msg1014061.html#msg1014061 has implemented the "subblock" as "subdivide_tukey(n)" and also the double precision fix. I ran it through the same high resolution corpus as in Reply #54. I forgot all about padding this time, all numbers are with default padding, file sizes as reported by Windows.

TL;DR up to -8:
* New -7 is damn good and takes about the same time as old -8. Consistent with results on the test builds posted here.
* -6 is still like -7 -l 8, right? The thing is, -6 was completely useless on this material. -7 would be within seconds (actually, re-runs indicate that the seven seconds quoted below could be a high estimate of the time difference), and the size gain over -6 would dwarf the -5 to -6
* Nothing says -l 12 is the sweet spot. It could be higher than 12.
(It is known that higher -l is not unambiguously better, too high -l might lead to bad estimates and bad choices made; Indeed, results are ambiguous going from -7 -l 14 to -7 -l 15. But, all files improved going from -8 -l 14 to -8 -l 15.
Obvious first step is to test whether -l 13 or -l 14 should be considered for presets -7 or -8.)

TL;DR for those who want more than -8:
* -8e and -8p get completely murdered here. In particular, higher subdivide_tukey achieves smaller files in shorter time. I've had some surprises on the relationship between -8e and -8p, but on this corpus, either can be improved upon at half the time.
* Somewhat higher -l appears to improve even more and cheaper. While all files are bigger with -8 -l 14 than with -8 -l 15 and the improvement is cheap per MB, I would be careful about drawing conclusions on what -l to recommend. After all, small corpus and only high resolution.
* size gains (= s(h)avings!) are often concentrated on only a few signals.

Results. Initially I was curious to see about what -8e -A "subdivide_tukey(n)" would outcompress -8 -A "subdivide_tukey(n+1)". Turns out ... well, numbers are telling. Uncompressed wave is 14 620 215 098 file size (seven hours three minutes). FLAC sizes are file sizes, no tags but default padding used.
1.3.4 -8: 464 seconds for 57.85 percent of wav size.
New -3: 244 seconds for 57.35. Not all albums improve over old -8. Kayo Dot is up nearly a percent, from 2 264 254 762 to 2 285 220 291
New -5: 311 seconds for 56.34.
New -6: 434 seconds for 56.27. Not attractive! 123 seconds more than -5 for 10.3 MB gains - spend a few seconds more and gain 47.
New -7: 441 seconds for 55.95. Time pretty close to old -8. All improved over old -8 (Kayo Dot only .7 percent though.)
New -7 -l 13: 463 seconds for 55.93. Just because how -7 improves over -6, I wondered whether it makes sense to stretch the -l even further. It seems to do so: a third of the size gain in an eleventh of the time penalty.
New -7 -l 14: 463 seconds here too. 55.92, and The Tea Party EP accounts for > 1.5 MB of the gain over -l 13. Indicates that FLAC rarely chooses this high order? Anyway, -l 14 takes out half the size gain between -7 and -8.
New -7 -l 15: 477 seconds for 55.91. Ambiguous result: remove TTP and sizes would be bigger than -7 -l 14. TTP still .7 MB bigger than -8.
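
The "not attractive" verdict on -6 can be checked from the posted figures alone. Below is a small sketch that turns consecutive (seconds, percent) pairs into extra time and megabytes saved. The numbers are copied from the list above; since the percentages are rounded to two decimals, the megabyte figures are approximate, and the function name is just for illustration.

```python
WAV_BYTES = 14_620_215_098  # uncompressed size quoted in the post

# (label, seconds, percent of wav size), copied from the list above
results = [
    ("-5", 311, 56.34),
    ("-6", 434, 56.27),
    ("-7", 441, 55.95),
]


def marginal_gains(rows, wav_bytes=WAV_BYTES):
    """Return (from, to, extra_seconds, megabytes_saved) for consecutive rows."""
    out = []
    for (a, ta, pa), (b, tb, pb) in zip(rows, rows[1:]):
        saved_mb = (pa - pb) / 100 * wav_bytes / 1e6
        out.append((a, b, tb - ta, saved_mb))
    return out


for a, b, dt, mb in marginal_gains(results):
    print(f"{a} -> {b}: {dt} s more, about {mb:.1f} MB smaller")
```

On these figures, -5 to -6 costs 123 seconds for roughly 10 MB, while -6 to -7 costs 7 seconds for roughly 47 MB.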

-8 and above follow, all figures are 55.8xy hence a third decimal for these small differences. "-8 -st4" is short for -8 -A "subdivide_tukey(4)" etc.
plain -8: 673 seconds for 55.895. The TTP EP accounts for more than half of the size gain over -7, (4 422 584), despite being only 4.2 percent of total duration.
-8 -l 14: 734 seconds for 55.860. 70 percent of the size gain over -8 is TTP.
-8 -l 15: 749 seconds for 55.848. All files improve over -8 -l 14 (contrary to using -7), but again most (1MB) of the size gain is TTP.
-8e: 38 minutes and 55.879. Larger than -8 -A "subdivide_tukey(4)" (one file is a kilobyte smaller though), and slower than -8 -A "subdivide_tukey(7)"
-8p: 48 minutes and 55.873. All the -8p files are larger than with -8 -A "subdivide_tukey(5)". (Ambiguous against (4).)
-8 st4: 15 minutes, 55.870. Again, TTP accounts for most of the size gain over -8: 1 983 892 of 3 660 748. (Which in turn is ~1/40th of a percentage point. All this over plain -8, no -e nor -p)
-8 st5: 20 minutes, 55.855. Takes this much to beat -8 -l 14. TTP accounts for nearly half the 2 MB size gain over st=4 and is now 19 kilobytes smaller than -8 -l 14, Cult of Luna (nine percent of duration) thirty percent.
-8 st6: 26 minutes, 55.845. Takes this much to beat -8 -l 15 (on average - the TTP is bigger than the -8 -l 15 version). TTP nearly half, TTP&CoL three quarters of size gain over st=5.
-8 st7: 36 minutes, 55.838. TTP still bigger than -8 -l 15. TTP&CoL three quarters of size gain over st=6.
-8 st8: 41 minutes, 55.833. TTP&CoL three quarters of size gain.
-8 st9: 51 minutes, 55.829. TTP&CoL nearly three quarters of size gain.
-8 st10: 59 minutes, 55.825. TTP&CoL are seventy percent of the half a megabyte gained.
-8 st11: 90 minutes, 55.823. Ditto. That was a jump in time.
-8e -A "subdivide_tukey(5)": 93 minutes, 55.842. Between -8 st6 and 7 in size.

Why go to these extremes? More to (t)establish that -e is not worth it. An initial test on one file indicated that the "-e" would overtake "the next n in subdivide_tukey(n)" around 5. Then I ran an overnight test, and lo and behold: -8e -A "subdivide_tukey(5)" puts itself between 6 and 7 (Cult of Luna makes for half the gain over subdivide_tukey(6)) - and is so slow that well, here you got -8 -A "subdivide_tukey(11)" just to compare. Not saying it is at all useful.

Re: New FLAC compression improvement

Reply #88 – 2022-08-25 14:58:55

Oh, one more thing tested: Does an additional flattop help the -8? Spoiler: not worth it.

Reason to ask: The subdivide_tukey sets out to test many windowing functions by designing them to recycle calculations. It is based on a successful choice, but the recent modification is designed for speed. Interesting question is then if forcing in an additional function makes for big improvements - possibly costly yes. If it does not: fine!
Choosing flattop as it is quite different from tukey and, back in the day when both CPUs and FLAC were slower, it was often included when people tested "Ax2".

Result: Does not take much extra time, doesn't make much improvement. Tried both orders, remember -8 took 673 seconds:
-8 -A "subdivide_tukey(3);flattop". 698 seconds, improves 194 kilobytes or 0.0013 percentage points.
-8 -A "flattop;subdivide_tukey(3)". slightly slower, a kilobyte bigger.
(Didn't bother to check individual files.)

Conclusion: subdivide_tukey is fine! At least on this high resolution corpus.

Oh, and:

Quote from: Porcus on 2022-08-25 13:54:51

(It is known that higher -l is not unambiguously better.

... in size. (Speed totally disregarded.)

Re: New FLAC compression improvement

Reply #89 – 2022-08-26 03:59:03

On the 29 CD corpus this git build pretty much behaves the same as the IRLS-subblock beta above.
-8 -p (7.585.569.422 Bytes) is absolutely usable with that while -ep for my taste is much too slow to justify.

I tried that -8 -A "subdivide_tukey(9)" and with these CD resolution files it seems not to scale as well.
7.587.283.854 Bytes

Re: New FLAC compression improvement

Reply #90 – 2022-08-26 19:35:48

I'll have a go too

Here's a comparison of FLAC 1.3.4 and current git (freshly compiled with the same toolchain) for 16-bit material. Compression presets are -0 at the top left through -8 at the bottom right.

Here's the same thing for 24-bit material

Here's a comparison on the same material, but starting with -4 at the top left and with a few additions

The same thing for 24-bit material

edit: I just forgot to say, these additions seem to have made -e useless, I have plans to make -p 'useless' as well. -e and -p are brute-force approaches, and I see a possibility to have something outsmart -p fast enough to be universally applicable. I've worked with @SebastianG on this. As soon as the next FLAC release is out and the code is no longer in need of fixes, maybe I can work on that. In the meantime, I've posted a math question on StackExchange but sadly with little reply. Maybe someone here can help out.

Re: New FLAC compression improvement

Reply #91 – 2022-08-26 22:31:51

24 bits, but still 44.1 kHz or 48 kHz? Strange things seem to happen at higher sampling rates.
(Which brings me to a completely different question: why is the subset requirement less restrictive for higher sampling rates? Getting streamability would be a harder task when you have to process more data per second, so why allow for an additional computational burden in the algorithm precisely where it becomes harder in the data as well? Is it so that, twenty years ago when there hardly was any high-res, one accepted that a "FLAC" player could just specify that it didn't accept higher resolutions?)
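
For context, the sample-rate-dependent part of the subset can be written down as a tiny check. This is a sketch of the limits as I read the format's subset rules (block size at most 16384 in general; at 48 kHz and below, block size at most 4608 and LPC order at most 12). It leaves out the other subset conditions, and the function name is hypothetical.

```python
def within_subset_limits(sample_rate_hz, blocksize, lpc_order):
    """Check only the blocksize / LPC order limits of the streamable subset.

    Assumption: limits as stated in the lead-in; other subset rules
    (e.g. on Rice partition order) are deliberately not modelled here.
    """
    if blocksize > 16384 or lpc_order > 32:
        return False
    if sample_rate_hz <= 48000:
        return blocksize <= 4608 and lpc_order <= 12
    return True


print(within_subset_limits(44100, 4096, 12))   # CDDA with -8 style settings
print(within_subset_limits(44100, 8192, 12))   # larger block at 44.1 kHz
print(within_subset_limits(96000, 8192, 14))   # higher order at 96 kHz
```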

As for the StackExchange question ... integer programming is not my thing, and it isn't easy [enter joke about being NP-hard ...].
Also here the objective is - well at least nearly - to find a predictor that makes the encoded residual short. And the integer solution to the matrix eq is strictly speaking a solution to a different problem?
(Typically, how much of the time is spent on solving for the predictor and how much is spent on packing the residual?)

Re: New FLAC compression improvement

Reply #92 – 2022-08-27 09:07:49

Quote from: Porcus on 2022-08-26 22:31:51

24 bits, but still 44.1 kHz or 48 kHz? Strange things seem to happen at higher sampling rates.

I'll rerun with upsampled material.

Quote

(Which brings me to a completely different question: why is the subset requirement less restrictive for higher sampling rates? Getting streamability would be a harder task when you have to process more data per second, so why allow for an additional computational burden in the algorithm precisely where it becomes harder in the data as well? Is it so that, twenty years ago when there hardly was any high-res, one accepted that a "FLAC" player could just specify that it didn't accept higher resolutions?)

I don't know what the rationale was. Looking at the testbench results it is clear many hardware players still simply ignore high-res material.

I can only conjecture the idea here was that the blocksize for CDDA translates to a higher blocksize if you want the same number of blocks per second. If it is still music sampled at higher rates, this makes sense: if a certain blocksize and predictor order makes sense for music in general, then double the sample rate would imply double the blocksize and double the predictor order. This would imply music waveforms remain stable (i.e. well-predictable) for about 100ms.

This is actually how ffmpeg encodes FLAC: it doesn't set a blocksize, but a timebase. For 44.1kHz material this maxes out the blocksize at 4906, at 96kHz it picks 8192 and at 192kHz it uses 16384. It sticks to standard blocksizes.
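
To make the "timebase, not blocksize" idea concrete, here is a minimal sketch. It is not ffmpeg's code: it simply assumes a target of about 100 ms (the figure from the previous paragraph) and a candidate list restricted to power-of-two block sizes, and picks the largest candidate that fits. Under those assumptions it reproduces the block sizes quoted in the post, reading the 44.1kHz figure as 4096 as corrected later in the thread.

```python
def blocksize_for_rate(sample_rate_hz, target_ms=100,
                       sizes=(256, 512, 1024, 2048, 4096, 8192, 16384)):
    """Pick the largest candidate block size not exceeding target_ms of audio.

    Assumptions (not taken from ffmpeg): a 100 ms target and a
    power-of-two candidate list.
    """
    target_samples = sample_rate_hz * target_ms // 1000
    fitting = [s for s in sizes if s <= target_samples]
    return max(fitting) if fitting else min(sizes)


for rate in (44100, 48000, 96000, 192000):
    print(rate, blocksize_for_rate(rate))
```

The design choice is that the number of blocks per second stays roughly constant across sample rates, which matches the conjectured rationale for the looser subset limits above 48 kHz.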

Quote

As for the StackExchange question ... integer programming is not my thing, and it isn't easy [enter joke about being NP-hard ...].
Also here the objective is - well at least nearly - to find a predictor that makes the encoded residual short. And the integer solution to the matrix eq is strictly speaking a solution to a different problem?

The objective is to find a predictor that produces a residual that can be stored with the least bits, yes. Translating this into a mathematical description is rather hard: it isn't a least-squares problem, but it isn't a least-absolute-deviation problem either.

Quote

(Typically, how much of the time is spent on solving for the predictor and how much is spent on packing the residual?)

For the not-so-brute-force presets (like 3, 4 and 5) about one third of the time is spent on finding a predictor, one third on calculating the residual and one third on other things like reading, writing, logic etc. preset 5 only brute-forces the stereo decorrelation part, nothing else.

Preset 8 is closer to half of the time spent on finding a predictor, half of the time calculating a residual and a negligible amount of time on other things. For -8ep pretty much all of the time is spent calculating residuals for various ways of expressing the same predictor.

Now for the reason I think the integer solution is a possible improvement: look at the 24-bit graphs I just posted. The size of a single subframe at preset 8 is 24 * 4096 * 0.715 = 70000 bits. Assuming the predictor order is 10 on average, the predictor takes up 10*15 = 150 bits, which is 0.2%. With -p, the precision can at most be lowered from 15 to 5, meaning the predictor cannot grow smaller than 50 bits, saving 0.15%. However, in many cases, using -p gives you more than that theoretical 0.15%. Why is this?
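
As a quick check of the figures in the previous paragraph, the arithmetic can be spelled out. All inputs are the ones stated in the post (24 bits, block size 4096, ratio 0.715, order 10, precision 15 down to 5); only the function name is invented.

```python
def predictor_overhead(bits_per_sample=24, blocksize=4096, ratio=0.715,
                       order=10, precision_hi=15, precision_lo=5):
    """Return (subframe bits, predictor share in %, max saving from -p in %)."""
    subframe_bits = bits_per_sample * blocksize * ratio
    coef_bits_hi = order * precision_hi   # 150 bits at full precision
    coef_bits_lo = order * precision_lo   # 50 bits at the coarsest precision
    share = 100 * coef_bits_hi / subframe_bits
    max_saving = 100 * (coef_bits_hi - coef_bits_lo) / subframe_bits
    return subframe_bits, share, max_saving


bits, share, saving = predictor_overhead()
print(f"subframe: {bits:.0f} bits, predictor: {share:.2f}%, max -p saving: {saving:.2f}%")
```

So storing coarser coefficients can account for at most roughly 0.14 to 0.15 percent; anything beyond that has to come from the coarser predictor fitting better.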

I think the problem is that the predictor is calculated in double precision floating point numbers, and the rounding treats each predictor coefficient as independent. However, these coefficients aren't independent. As -p simply successively tries coarser roundings, it might hit a point where the most important coefficients 'round the right way'. That would explain why the possible savings are higher than one would expect from only looking at the space saved by storing smaller predictor coefficients.

Now, one way to round coefficients smarter is by not treating each coefficient individually but by treating them together, as a vector. That's why I think this is a closest vector problem, but I'm not sure.
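
To illustrate what "trying coarser roundings" means, here is a simplified sketch of quantizing a floating-point predictor to a given precision with a common shift, rounding each coefficient to nearest on its own. It is an illustration only: the function name is made up, and the reference encoder's own quantization routine may differ in its details.

```python
import math


def quantize(coeffs, precision):
    """Round float coefficients to signed `precision`-bit integers.

    Returns (integer coefficients, shift); the effective predictor is
    q[i] / 2**shift. Simplified: independent round-to-nearest per coefficient.
    """
    cmax = max(abs(c) for c in coeffs)
    limit = (1 << (precision - 1)) - 1          # largest magnitude allowed
    shift = math.floor(math.log2(limit / cmax))  # largest shift that still fits
    q = [round(c * 2 ** shift) for c in coeffs]
    return q, shift


z = [1.9, -0.95]  # made-up unquantized predictor
for precision in (15, 10, 5):
    q, shift = quantize(z, precision)
    effective = [v / 2 ** shift for v in q]
    print(precision, q, shift, effective)
```

At 5 bits this toy predictor becomes exactly [2.0, -1.0], a visibly different predictor rather than a slightly noisier copy, which is the effect the post suspects -p benefits from.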

Re: New FLAC compression improvement

Reply #93 – 2022-08-27 10:15:20

Quote from: ktf on 2022-08-27 09:07:49

As -p simply successively tries coarser roundings, it might hit a point where the most important coefficients 'round the right way'. That would explain why the possible savings are higher than one would expect from only looking at the space saved by storing smaller predictor coefficients.

Now, one way to round coefficients smarter is by not treating each coefficient individually but by treating them together, as a vector. That's why I think this is a closest vector problem, but I'm not sure.

And here is where I was - for no good reason but "came to think of" - suspecting:
* The closest vector would not solve the ultimate problem
* Rounding off is a way of "just moving coefficients in some direction, hoping for an improvement, and comparing"
So, conjecture: part of the success of -p is not saving space for the coefficients, but "randomly" finding a better fit (I think that is what you are suggesting too?).
But then: unless the closest vector is truly a good choice for the ultimate problem, it is not at all given that you should go to great efforts finding it. It is not at all given that you should go to great efforts minimising an L1 norm. For example, it could be that searching in a direction that happens to improve would pay off.
If it tries each coefficient independently, does it then (for order 10) try 10 individual round-offs first? And if one is for the worse, does it round the other direction?

(... calling for a post-processor that takes an encoded FLAC file and works from there)

Re: New FLAC compression improvement

Reply #94 – 2022-08-27 19:59:28

Here is the upsampled material. This is the exact same material as for the 16-bit tests, but upsampled. So, there is no content above 20kHz.

Here's presets -0 through -8

Here's -4, -5, -6, -7, -8, -8 -A subdivide_tukey(4), -8 -A subdivide_tukey(6), -8e and -8ep.

The difference here is indeed stunning.

Quote from: Porcus on 2022-08-27 10:15:20

And here is where I was - for no good reason but "came to think of" - suspecting:
* The closest vector would not solve the ultimate problem

Yes, you are right. When the ultimate problem is 'get a quantized predictor that produces the smallest possible subframe' then yes, this is not that solution. However, it might be used as a part in that ultimate solution.

Quote

* Rounding off is a way of "just moving coefficients in some direction, hoping for an improvement, and comparing"
So, conjecture: part of the success of -p is not saving space for the coefficients, but "randomly" finding a better fit (I think that is what you are suggesting too?).

To remain lossless, the predictor in the FLAC format uses integer coefficients only. So, if one doesn't use integer programming to get to an integer solution all the way, there has to be some rounding. And yes, what you conjecture is what I suggested.

Quote

But then: unless the closest vector is truly a good choice for the ultimate problem, it is not at all given that you should go to great efforts finding it.

Yes, I'm not sure that this will help at all, but it seems good enough of an idea to pursue. It just seems that there should be a better way to get a quantized predictor than some simple rounding.

Quote

It is not at all given that you should go to great efforts minimising an L1 norm. For example, it could be that searching in a direction that happens to improve would pay off. If it tries each coefficient independently, does it then (for order 10) try 10 individual round-offs first? And if one is for the worse, does it round the other direction?

I'd like to get off the brute-force approach for a while. If you want to check individual round-offs, that would take 24 residual calculations at least for a 12th order predictor, which would make it about twice as slow as just using -p (which tries 10 different roundings). Furthermore, that misses relations between coefficients. It might be that rounding the 1st coefficient up means that the second coefficient should be rounded down to compensate, if they are linked. Finding such a relation would require a lot more brute-force.

However, as it seems to me, these relations are embedded in the matrix that is solved to find the (unquantized) predictor. If it is not solved directly and the result then rounded, but used directly to find an integer vector, then these relations can be used to find a predictor that comes closer to the original values.
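
One way to write that idea down, under the assumption that the unquantized predictor z comes from a plain least-squares fit with normal equations Az = b (A symmetric positive definite). The symbols d, s, L and p are introduced here for illustration and are not from the thread.

```latex
% Squared-error cost of a predictor c, with r_0 the signal energy:
%   E(c) = r_0 - 2 b^{T} c + c^{T} A c, \qquad A z = b .
% Moving away from the optimum z by d costs exactly a quadratic form in A:
\[
  E(z + d) - E(z) \;=\; -2 b^{T} d + 2 z^{T} A d + d^{T} A d \;=\; d^{T} A d .
\]
% With integer coefficients x and shift s, the stored predictor is x / 2^{s},
% so d = 2^{-s} x - z. Using a Cholesky factor A = L L^{T}:
\[
  \min_{x \in \mathbb{Z}^{p}} \; \bigl\| L^{T} \bigl( 2^{-s} x - z \bigr) \bigr\|_2^{2}
\]
% i.e. a closest vector problem in the lattice generated by 2^{-s} L^{T},
% with target point L^{T} z.
```

Plain rounding minimises the distance without the L factor, so it ignores exactly the cross-coefficient relations mentioned above. Note that this still targets squared error, which, as said earlier in the thread, is not the same as the smallest encoded subframe.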

Re: New FLAC compression improvement

Reply #95 – 2022-08-28 10:38:56

Yeah, so ... getting a gradient by say, starting with solving for a z, rounded off (floor/closest/whatever) to a vector x of integers, then
calculate total size S(x); then proceeding to
calculate S(x+e_i)-S(x) for each standard unit vector e_i,
and one will have done 1 + 12 = 13 full compressions just to get a direction of steepest descent.
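
A rough sketch of that probing step, to make the 1 + 12 count tangible. The cost function here is a stand-in: it sums absolute residuals instead of running a full compression, and the names are hypothetical. The prediction itself uses the usual integer form, a weighted sum of previous samples shifted right.

```python
def residual_cost(samples, coeffs, shift):
    """Proxy for S(x): sum of absolute residuals (NOT the real encoded size)."""
    order = len(coeffs)
    cost = 0
    for n in range(order, len(samples)):
        acc = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        cost += abs(samples[n] - (acc >> shift))
    return cost


def probe_directions(samples, coeffs, shift):
    """Return S(x) and S(x + e_i) - S(x) for every unit vector e_i."""
    base = residual_cost(samples, coeffs, shift)   # 1 evaluation
    deltas = []
    for i in range(len(coeffs)):                   # plus one per coefficient
        trial = list(coeffs)
        trial[i] += 1
        deltas.append(residual_cost(samples, trial, shift) - base)
    return base, deltas


ramp = list(range(100))                 # toy signal: a straight line
base, deltas = probe_directions(ramp, [32, -16], shift=4)  # 2*x[n-1] - x[n-2]
print(base, deltas)
```

For a 12th-order predictor this loop alone is 13 cost evaluations, and it only looks in the positive direction of each coefficient; probing both directions takes the 24 residual calculations mentioned in the previous reply.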

Instead, we conjecture that a good choice for x has Ax-b small - and then starting with the naive round-off of the "exact argmin" z, we want to get closer, cheaper than by full compression?

Then for the "outsmart" part: the z vector minimizes a norm that, well we know it isn't likely to be the best? Which spawns the question: if another norm (say L1) is conjectured to be a better choice (but computationally nowhere as tractable, cf the IRLS builds), and you want to round off in a direction that improves - why not use the roundoffs to pursue an improvement in that one?

Re: New FLAC compression improvement

Reply #96 – 2022-08-29 13:31:47

Back to the graphs. I didn't realize that FLAC 1.3.4 also has a "useless -6". But there are other settings that are ... well, "easy to improve upon at costs the user would already have accepted". Including -8, but the "-8" name is kinda reserved - it is easier to play with -6 and -7.

If we disregard presets that "serve special purposes" (that would be 0 to 3), then a "reasonable choice of presets" would make for a convex graph: the return (in terms of bytes saved) should diminish in CPU effort, because the encoder presets should pick the low-hanging fruit first.
In particular: if we extrapolate the line from -4 to -5 and, further to the right in these graphs, it hits or overshoots a dot (say -7), then I would claim "if you are willing to wait the extra time taken for the improvement of -5 over -4, then you would also accept the time taken for the improvement to -7". This does assume that the user's time/size trade-off is constant, which I think is reasonable for users who are indeed willing to go beyond default (and especially for those willing to wait for -8).

Which is what I mean by "useless -6":
* If you are willing to wait for -6 (over -5), you should be willing to wait for -7.
* And if you are willing to wait for -5 (from -4, the previous dot in the convex minorant!), then "at least nearly" you would be willing to wait for -7 for CDDA (at least you shouldn't complain much!), and you should definitely in your graph of upsampled material.
Furthermore:
* For CDDA: If you are willing to go from -7 to -8, you should also be willing to run -8p.
* For the upsamples: If you are willing to go from -7 to -8, then you should at least nearly accept -A subdivide_tukey(6)
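
The "convex minorant" argument can be sketched as a lower convex hull over (time, size) points: a preset that ends up above the hull is "useless" in the sense used here. The example below is an illustration only. It uses the new -5 to -8 figures from Reply #87 (the high resolution corpus) rather than the graphs discussed in this post, and the function names are made up.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o) using the time and size coordinates."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull (monotone chain) of (seconds, percent, label) points."""
    hull = []
    for p in sorted(points):
        # Drop the previous point if it lies on or above the line to p
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# (seconds, percent of wav size, preset) from Reply #87
presets = [
    (311, 56.34, "-5"),
    (434, 56.27, "-6"),
    (441, 55.95, "-7"),
    (673, 55.895, "-8"),
]

on_hull = [label for _, _, label in lower_hull(presets)]
print("on the convex minorant:", on_hull)
print("above it:", [p[2] for p in presets if p[2] not in on_hull])
```

On those figures -6 is the only preset that falls above the hull, which is the same conclusion reached by eye.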

Now here is a question:
Can the flac encoder easily accommodate different presets depending on signal? Like what ffmpeg does:

Quote from: ktf on 2022-08-27 09:07:49

This is actually how ffmpeg encodes FLAC: it doesn't set a blocksize, but a timebase. For 44.1kHz material this maxes out the blocksize at 4906, at 96kHz it picks 8192 and at 192kHz it uses 16384. It sticks to standard blocksizes.

Reference FLAC seems well tuned to select 4096 (4906 was a typo) for CDDA; the "also-standard" 4608 seems not to be of much use (maybe some curious people can test if it is an idea for 48k ... hm, another idea could be -b <list of blocksizes>, but right now you are not in the mood for brute force I see ;-) .)

Say, -8 could be unchanged for signals up to 64k (threshold could be determined by a bit of testing), and from then on and up, selection -8 would invoke also -b 8192 -l 14 (and maybe a higher subdivide_tukey). But that comes at a risk of people complaining over slowdowns. After all, git -8 is so much slower than 1.3.4 -8 on high resolution (although it IMHO pays off!) that maybe, maybe, a split-by-sample-rate option should rather speed up high resolution.
So, alternatives might be e.g. a semi-documented --Best with capital B. Don't call it --super-secret-totally-impractical-compression-level, because it is likely to be "practical" to everyone who would find -8 worth it. The "-9" is a bit dangerous to spend ... or well, "-9" could stand for "higher compression than -8, but beware it will change in git whenever we feel like testing a possible improvement" (and thus you are free to alter -9 to an IRLS in a test build too).

Re: New FLAC compression improvement

Reply #97 – 2022-08-29 17:56:59

As a somewhat basic end user who doesn’t understand the vast majority of the technical discussion here over the past few days - I do like the idea of some sort of preset that always enables the most extreme form of compression, with no other considerations to processing speed or anything else. With the caveat that it’s experimental and may change across future builds. If I’m not misunderstanding that this is what’s being proposed?

If it has a cute name like --super-secret-totally-impractical-compression-level, then even better!

Re: New FLAC compression improvement

Reply #98 – 2022-08-29 19:00:43

Quote from: MrRom92 on 2022-08-29 17:56:59

If I’m not misunderstanding that this is what’s being proposed?

Actually it isn't precisely what is being done. Sometimes we/I cannot help ourselves/myself putting on some stupidly slow compression job before going away for the week, and by posting it here I have probably fueled that misunderstanding big time.

But if you want the most extreme form of compression, then FLAC is not the codec. FLAC was created for low decoding footprint - i.e. to be played back on ridiculously low power devices - it takes less computing power than e.g. mp3 to decode! If you look at ktf's lossless tests at http://www.audiograaf.nl/downloads.html , you will see that there are codecs that out-compress FLAC but require a lot more computational effort.

FLAC is extremely "asymmetric": There is hardly any limit to how slow you can get FLAC encoding by going beyond the presets (the -0 through -8) and directly accessing the filter bank and ordering brute-force calculation of as much as possible - yet it will still decode at ultra-light footprint by a fifteen-year-old decoder in a device that was considered low-power even at that time. This compression improvement thread is more about finding improvements within practical speeds - or at least, "sort of practical" speeds; again, sometimes I cannot help myself. Surely I have exceeded the "practical" every once in a while, but still.

There are even compressors that aren't even remotely useful for playback and that you won't find in ktf's comparisons up there. Look at https://hydrogenaud.io/index.php/topic,122040.msg1010086.html#msg1010086 when I put an encoder at work for like twelve hours plus on this single track to achieve something that would take twice as much time decoding than playing (well on an old CPU!) - but at a size much smaller than any playback-useful codec.
On the other hand, sometimes FLAC works wonders that other codecs cannot.

"--super-secret-totally-impractical-compression-level" was indeed a name used in some very old flac versions. I don't know when it disappeared.

Re: New FLAC compression improvement

Reply #99 – 2022-08-29 21:38:25

I tried -8 -p in ye olde FLAC 1.2.1 (-e was already enabled with -8 back then), and not only did it make compression much slower, but the compression ratio actually got worse!

I admire your tenacity with trying to find better compression options when the potential gains are so small.
