Topic: New FLAC compression improvement

Re: New FLAC compression improvement

Reply #50
ffmpeg already in 2006 ... hm.

This is probably my final test, and it is neither nice nor "fair": it is the single worst TAK-able track in my CD collection, Merzbow's "I lead you towards glorious times" off the Venereology album. For those not familiar with this kind of noise music, you can listen on YouTube to understand the reason for these insane bitrates.

For ktf's double-precision build there is a simple TL;DR: absolutely no difference on this track.
After I put every flac file through metaflac to remove everything including padding, ktf's build produces a file bit-identical to that of flac 1.3.1 for each setting. (Those were: -0 through -8, -4 -e through -8 -e, and -4 -p -e through -8 -p -e. More than I intended, but there was a "wtf?" or two here.)

Also, some more bit-identicals:
flac.exe produces bit-identical files within each of the following groups: (-3, -4, -4 -e, -4 -p -e); (-6, -7, -8); (-6 -e, -7 -e, -8 -e); (-6 -p -e, -7 -p -e, -8 -p -e).
flake -11 and -11 --vbr 4 produce bit-identical files ... and they are quite a bit better than any other FLAC.
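
For anyone wanting to repeat the bit-identical check, here is a minimal Python sketch. It is not what was used for this post, and the function names are made up for illustration. It assumes the files have already been stripped of metadata and padding, and it groups them by SHA-256 digest so that identical outputs end up together.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large FLAC files need little memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(str(path))
    # Only groups with at least two members are interesting here.
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

Feeding it one output file per encoder setting would give groups like the ones listed above.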


I deliberately picked a track that I knew would make for strange orderings, in that TAK cannot handle it (only -4 without e and m gets within .wav size) - but look at how ffmpeg cannot agree with its own ordering. Also look at where flake put its -8 and -9. flac.exe only misses the order once, in that -0 produces a smaller file than -1.


Lazy screendump coming up. The "cholesky" is the same kind of option as above.

Oh, OptimFrog managed to get down to 53 222 937, but I don't have any fb2k component for it.

Re: New FLAC compression improvement

Reply #51
This is probably my final test, and it is neither nice nor "fair": it is the single worst TAK-able track in my CD collection
Couldn't help myself since there was another discussion on decoding efficiency, tried the ape. "Revised" worst end of the file size list:

1418 TAK -p1
1424 Monkey's Insane
1424 Monkey's High
1424 TAK -p0
1424 Monkey's Extra High
1426 Monkey's Normal

... so even when TAK fails at beating PCM, it succeeds at improving over Monkey's.

Re: New FLAC compression improvement

Reply #52
After the 'double autocorrelation' change which has been submitted to FLAC quite a while ago, I've been busy improving the IRLS code for which I started this topic. Source code can now be found here: https://github.com/ktmf01/flac/tree/autoc-double-irls

Please see the image below


Compared are the exe I dumped here the 16th of June, the exe I'm attaching now and CUEtools.Flake 2.1.6. Presets for FLAC are, from left to right, -4, -5, -6, -7, -8, -8e, -8ep and -9. Presets for Flake are -4, -5, -6, -7, -8, -8 --vbr 4. As you can see, the largest difference is because of the 'double autocorrelation' change, which is clearly visible from -4 to -8ep. However, the change from old -9 to new -9 is what I've been working on. The IRLS code is now much faster, used more efficiently and compresses slightly better.

Feel free to try the exe, look through the source etc. Please be cautious with the exe: I've been quite busy tuning but have done little testing. I've only tested on CDDA material, maybe there are still some surprises on 96kHz/24-bit material left.
Music: sounds arranged such that they construct feelings.

Re: New FLAC compression improvement

Reply #53
Again my boring 29 CDs

IRLS beta -9
7.583.395.627 Bytes

IRLS beta -9ep
7.582.193.957 Bytes

older numbers
flac-native -9
7.586.738.858 Bytes

flac-native -9 -ep
7.581.988.737 Bytes

-9 speed has indeed improved, thanks!
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Re: New FLAC compression improvement

Reply #54
maybe there are still some surprises on 96kHz/24-bit material left.
I said the Hydrogenaud.io autoc-double testversion 2021063 -7 was good on hi-rez, this one is even better - on some material. Your new -9 is slooow on this material, good I didn't test the first one.

tl;dr on the below specified four hours of 96/24 (no fancy compression options given!)

-9: spends 40 minutes to achieve 57.25%
-8e: spends 15:44 to achieve 57.30% (compared to Hydrogenaud.io autoc-double testversion 2021063 it shaves off .31 points at a cost of 8 seconds)
-8: spends 5:36 to achieve 57.33% (savings: 0.38 points, costs 12 seconds). ffmpeg at -8 gets inside that 0.38 interval, no matter whether it uses lpc_order 2 (spending 6:28) or 6 (at 17 minutes)
-7: spends 3:38 to achieve 57.37%, which is still better than the autoc-double testversion's -8e
-6: spends 3:11 to achieve 57.83%, which is not good compared to -7. Here and down to -4, the difference to the autoc-double testversion is at most .17
-5: spends 2:10 to achieve 57.92%. -4: spends 2:00 to achieve 57.98%. That's on par with ffmpeg -5, but twice as fast.

I tested CUETools.Flake -4 to -8, not so much variation, spending from 8:27 down to 3:04 for 58.23% to 58.48%.
I tested 1.3.1 at -8 -e (the -e by mistake); it took twelve minutes for 58.97% and was worst for all files - except that ordinary -8 was half a point worse.


But a lot of the improvement is due to an album and an EP out of four. Your new -7 is faster than 1.3.1 -8 and yields savings from half a percentage point up to 8.5 (!!) percentage points, and it is the biggest file that is least compressible.



Material: to get done in a day, I selected the following four hours from the above 96/24 corpus, in order of (in)compressibility:

* Kayo Dot: Hubardo. 93 minutes, prog.metal. Needs high bitrate despite not sounding as dense as the next one.
All about the same, all within half a percent point. And this is the biggest file of them all
Best: flake -8 at 65.72, then your new -9 at .73. (Heck, even OptimFrog -10 only beats this by 1 point.)

* Cult of Luna: The Raging River. 38 minutes sludge metal/post-hardcore.
Large variation, flake does not like this.
Best: New -9 at 59.45. -7 and up shaves a full percentage point over the autoc-double testversion. ffmpeg -8 about as good. ffmpeg -5 at 60.8. flake -8 at 62.59, 1.3.1 even a point worse at -8 -e.

* The Tea Party: Tx20. An EP, only 18 minutes Moroccan Roll. Earlier tests reveal: differs significantly between encoding options.
Large variations. Your -9: 53.95. Your -7 beats your new -6 by 3.2 points and your previous -8e by half that margin. ffmpeg varies by 3 points - here is the file where one more lpc pass makes for .1 rather than .02. Flake runs 60 to 61. flac 1.3.1 62 and 63.

* Open Goldberg Variations.  82 minutes piano, compresses to ~47 percent. Earlier tests reveal: doesn't use high Rice partition order.
Best: ffmpeg -8, but between 46.71 and 46.92 except flac -1.3.1 (add a point or more).



Done on an SSD, writing the files takes forty to sixty seconds. Percentages are file sizes without metadata, padding or seektables, but those don't matter on the percentages for such big files anyway.

Re: New FLAC compression improvement

Reply #55
I don't know how subset-compliant ffmpeg is on hirez ...
AFAICT it should be on 88.2k - 192k by default and above that if you force block size to 16384.

From what I understand, prediction order is not limited when the sampling frequency is >48000, and partition order is limited to 8, but this is the max that ffmpeg is using on any level, which leaves block size. ffmpeg is using a 105 ms block size, which translates to:
44100 - 4608
48000 - 4608
88200 - 8192
96000 - 8192
176400 - 16384
192000 - 16384
352800 - 32768
384000 - 32768
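
The table is consistent with picking the largest of the standard FLAC block sizes (192, 576 times a power of two, 256 times a power of two) that does not exceed 105 ms worth of samples. Here is a small Python sketch under that assumption. It is meant to reproduce the numbers above, not to mirror ffmpeg's source, and the names are illustrative.

```python
# Block sizes that have a dedicated code in the FLAC frame header.
STANDARD_BLOCK_SIZES = sorted(
    [192] + [576 << n for n in range(4)] + [256 << n for n in range(8)]
)


def block_size_for(sample_rate, block_time_ms=105):
    """Largest standard block size not exceeding block_time_ms of audio."""
    target = sample_rate * block_time_ms // 1000
    candidates = [size for size in STANDARD_BLOCK_SIZES if size <= target]
    # Fall back to the smallest standard size for very low sample rates.
    return max(candidates, default=STANDARD_BLOCK_SIZES[0])


for rate in (44100, 48000, 88200, 96000, 176400, 192000, 352800, 384000):
    print(rate, "-", block_size_for(rate))
```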
Notice that ffmpeg uses a non-default block size: 4608 for 44100/16 with compression_level 8. And there is no option in ffmpeg to set block size.
"-frame_size 4096" works for me.

Re: New FLAC compression improvement

Reply #56
@ktf : On my computer, -9 -e generates same files as -9, and -9 -p -e the same (in an awful lot of time) as -9 -p; is that to be expected? I tried the 96/24 and a small CDDA set too.
Asking because it could depend on CPU-specifics.

(I wonder if combinations of -9, -e and/or -p will do calculations that couldn't lead to improvements. I don't know if there is any demand for any optimization; on one hand, who uses -p -e after all? on the other, yes those who use -9 -p apparently do get the same as -9 -e -p, so ...)


partition order is limited to 8 but this the max that ffmpeg is using on any level
[...]
"-frame_size 4096" works for me.
Interesting. Thx, that leaves room to test whether it matters - and if so, whether defaults are optimized.

Re: New FLAC compression improvement

Reply #57
As you can see above, -ep only slightly improves compression because the SSE double precision already doesn't leave much room. I use a Ryzen 5900x.

Re: New FLAC compression improvement

Reply #58
@ktf : On my computer, -9 -e generates same files as -9, and -9 -p -e the same (in an awful lot of time) as -9 -p; is that to be expected?
The same behaviour here. -e seems to do nothing with -9

Re: New FLAC compression improvement

Reply #59
Preset -9 is equivalent to -b 4096 -m -l 12 -e -r 6 -A "tukey(5e-1);partial_tukey(2);punchout_tukey(3);irlspost-p(3)", so it already includes -e.

irlspost-p takes the result from evaluating tukey(5e-1);partial_tukey(2);punchout_tukey(3) and iterates on it, with the final iteration also being evaluated with -p. Using -9p gives only very small gains, as irlspost-p already uses -p (and IRLS usually results in the smallest file).

If you want even better compression, I'd recommend either using -9 -r 8 in the case of electronic music (some chiptune can really gain a lot by using this) or -9 -A "tukey(5e-1);partial_tukey(2);punchout_tukey(3);irlspost-p(10)" or an even higher number for irlspost-p. I haven't seen much improvement with more than 10 iterations, but perhaps this is different for hi-res material.
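
To make it easier to vary the suggested parameters, here is a small Python sketch that assembles the argument list. The flag values are copied from this post and apply to the experimental IRLS build only. The helper name and its parameters are made up for illustration.

```python
def preset9_args(irls_iterations=3, rice_order=6):
    """Argument list equivalent to -9 as described above, with two knobs exposed."""
    windows = [
        "tukey(5e-1)",  # 5e-1 instead of 0.5 avoids locale-specific decimal marks
        "partial_tukey(2)",
        "punchout_tukey(3)",
        f"irlspost-p({irls_iterations})",
    ]
    return ["-b", "4096", "-m", "-l", "12", "-e",
            "-r", str(rice_order), "-A", ";".join(windows)]


# The two variations suggested above: higher Rice partition order, more iterations.
print(preset9_args(rice_order=8))
print(preset9_args(irls_iterations=10))
```

Passing the list to a process launcher as separate arguments, rather than as one shell string, sidesteps quoting problems with the semicolons and parentheses.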

Re: New FLAC compression improvement

Reply #60
Are there any known decoding problems going above -r 6?
Playing with the parameter I see no real gains and even slightly worse results for 24-96 stuff.
I think the default -9 setting is well chosen. -p is simply too slow for its minimal gains.

Re: New FLAC compression improvement

Reply #61
Are there any known decoding problems going above -r 6?
Not that I know of. ffmpeg and CUEtools.Flake use it in some presets by default. For most music it is not worth the trade-off, but for chiptune (Game Boy emulation) I've seen gains of 3 - 4%. There, the high partition order actually switches on low-frequency square wave transitions.

I think the default -9 setting is well chosen. -p is simply too slow for its minimal gains.
Yes, with the irlspost-p I tried to get the gains I saw with using -p but for minimal speed loss. As the irlspost stuff takes the best predictor from the tukey windows and does a few iterations on them, the result from the irls process usually gives the smallest frame.

From the small difference between -9 and -9p you can see that these iterations sometimes give worse results instead of better ones: an improvement between -9 and -9p implies that it comes from the regular tukey apodizations, and thus that the IRLS process did not improve upon them. It also shows that the process usually works very well.

Re: New FLAC compression improvement

Reply #62
It seems that adding another "-e" still slows it down, so apparently some combinations force the encoder to do the same work twice.

Anyway, I will be putting the build at some overnight work out of curiosity, but avoid the -p and -e.

Question: Does irlspost-p(N) correspond to ffmpeg's lpc_order N in terms of passes, or to N-1 or to N+1 or justforgeteverythingaboutcomparing?


Edit: Oh, and on the Rice: The Open Goldberg Variations album would frequently end up in same files at a number of different -r settings.

Re: New FLAC compression improvement

Reply #63
It seems that adding another "-e" still slows it down, so apparently some combinations force the encoder to do the same work twice.
Are you sure? I cannot think of a way that can happen

Quote
Question: Does irlspost-p(N) correspond to ffmpeg's lpc_order N in terms of passes, or to N-1 or to N+1 or justforgeteverythingaboutcomparing?
I assume you mean -lpc_passes? If you want to compare anything, I'd say comparing to irlspost(N) instead of irlspost-p(N) is more fair. irlspost(5) would correspond to -lpc_passes 6, but the algorithms are quite a bit different.

The basic idea is the same, but the execution is very different. In both implementations the basic weight is the inverse of the residual of the previous pass. This is the so-called L1-norm. The implementation in ffmpeg adds a term to the absolute value of the residual before inversion, and that term grows smaller every iteration. I don't know what the idea behind that is, and it seems counterproductive to me. My implementation in libFLAC weighs according to the L1-norm, but has a cut-off in place for small residuals, which is something suggested by most books I read on the subject. The cut-off is currently at 2, but this is tunable. It makes sure small residuals don't get too much attention, and it protects against division by zero. I have experimented with much larger values, and some music seems to compress better with values over 50, for example, so this still needs tuning.
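
To illustrate the weighting described above, here is a minimal pure-Python sketch of IRLS for a linear predictor. It is not the libFLAC or ffmpeg code. All names are illustrative, and the only details taken from this post are the weight (inverse absolute residual of the previous pass) and the cut-off of 2.

```python
def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def weighted_lpc(signal, order, weights):
    """Predictor coefficients minimising sum(w[n] * e[n]**2)."""
    a = [[0.0] * order for _ in range(order)]
    b = [0.0] * order
    for n in range(order, len(signal)):
        past = [signal[n - k - 1] for k in range(order)]
        for i in range(order):
            b[i] += weights[n] * past[i] * signal[n]
            for j in range(order):
                a[i][j] += weights[n] * past[i] * past[j]
    return solve(a, b)


def residual(signal, coeffs):
    """Prediction error per sample; the warm-up samples are set to zero."""
    order = len(coeffs)
    return [0.0] * order + [
        signal[n] - sum(c * signal[n - k - 1] for k, c in enumerate(coeffs))
        for n in range(order, len(signal))
    ]


def irls_lpc(signal, order, iterations=3, cutoff=2.0):
    """Start from plain least squares, then reweight by 1 / max(|residual|, cutoff)."""
    coeffs = weighted_lpc(signal, order, [1.0] * len(signal))
    for _ in range(iterations):
        res = residual(signal, coeffs)
        weights = [1.0 / max(abs(r), cutoff) for r in res]
        coeffs = weighted_lpc(signal, order, weights)
    return coeffs
```

Weighting the squared error by the inverse of the previous absolute error makes each term behave roughly like an absolute error, which is why this approximates an L1 fit. The cut-off keeps the weight finite when a residual is zero.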

My implementation also weighs with a moving average of the data. The moving average window is currently 128 samples wide, which is tunable and related to the -r parameter. The reason is the way Rice partitions work: a large residual in one part of the block does not have to have the same impact in another part of the block. By using a moving average as a proxy for this effect, the IRLS algorithm can try to optimize the whole block for the minimal number of bits, instead of only the hardest-to-predict parts.
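
The post does not spell out how the moving average enters the weight, so the following Python sketch is only one possible reading. It computes a 128-sample moving average of the absolute residual and, as a purely hypothetical combination, uses its inverse (with the same cut-off) as the weight. Function names are made up.

```python
def moving_average_abs(values, width=128):
    """Mean of |values| over a window of up to `width` samples around each index."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + abs(v))
    half = width // 2
    out = []
    for i in range(n):
        lo = max(0, i - half)       # window starts half a width before i
        hi = min(n, lo + width)     # and is clipped at the end of the block
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def smoothed_weights(residuals, width=128, cutoff=2.0):
    """Hypothetical weight: inverse of the local average magnitude, with a cut-off."""
    return [1.0 / max(avg, cutoff) for avg in moving_average_abs(residuals, width)]
```

With weights like these, a loud stretch of the block is judged relative to its own surroundings, which is in the spirit of Rice parameters being chosen per partition.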

Re: New FLAC compression improvement

Reply #64
Are you sure? I cannot think of a way that can happen

I ran two rounds -e and one -p -e, happened on all three. Will run them over a few times to see if it was a coincidence.

By the way, was there any reason that the new -5 to -8 should improve compression over the double precision version I tested above? (Because, apparently they do improve. Again only tested the 96/24 files.)


Quote
I assume you mean -lpc_passes?
Yes.

Quote
If you want to compare anything, I'd say comparing to irlspost(N) instead of irlspost-p(N) is more fair. irlspost(5) would correspond to -lpc_passes 6, but the algorithms are quite a bit different.

Depends on whether you think irlspost(N) is ready for test runs?
Naming suggests that irlspost(N) is like irlspost-p(N) but without the final "p" - and so that the algorithms that are "quite a bit different" are those two vs ffmpeg, and not irlspost vs irlspost-p?

Re: New FLAC compression improvement

Reply #65
By the way, was there any reason that the new -5 to -8 should improve compression over the double precision version I tested above? (Because, apparently they do improve. Again only tested the 96/24 files.)
Yes, the code has seen some improvements since I uploaded that binary. They have been included in the most recent binary.

Depends on whether you think irlspost(N) is ready for test runs?
Naming suggests that irlspost(N) is like irlspost-p(N) but without the final "p" - and so that the algorithms that are "quite a bit different" are those two vs ffmpeg, and not irlspost vs irlspost-p?
3 'apodizations' have been added. They aren't apodizations, but that was a good way to introduce it and it works in the exact same part of the code. Those three are irls(N), irlspost(N) and irlspost-p(N). They work with the exact same code, but the way in which they interact with the rest of the code is different.

irls(N) does iterations 'from scratch', irlspost(N) takes the best of the previous apodizations as a starting point (hence the post, it works like post-processing), irlspost-p(N) is the same but with precision search just for the end results of the IRLS process. The inner process of all three is the same.

Re: New FLAC compression improvement

Reply #66
First: false alarm on -e taking more time. Probably it was because I kept files on a spinning drive approaching full, and so I/O would always take more time on the second run (and I would do -9 before -9 -e).

So I re-started the overnight jobs to get accurate encoding times for the full seven-hour 96/24 corpus. Here are a few at 4096 except ffmpeg at its default 8192 and Flake at variable blocking:

* 2644 for flac.exe 1.3.1 at -8 -p -e . It spent 170 minutes on this, only because I had to test how low the new could go and still beat it:
* 2641: In four minutes, your new -2 compresses better than 1.3.1 -8 -p -e. That is worth something eh? :-D
* 2605: as good as CUETools' Flake got it within subset. (Not timed, from earlier,  -8 --vbr)
* 2597: new -5 (five minutes)
* 2585: ffmpeg -compression_level 12 (No time, from earlier.)

Bearing in mind the improvement from 1.3.1 and Flake down to the next one, the compression improvements from -7 aren't much:
* 2578: new -7 (< 7 minutes).
* 2577: ffmpeg -compression_level 12 -lpc_type cholesky -lpc_passes 2 (45 minutes), narrowly beaten by your new -8:
* 2577: new -8 (10 minutes)
* 2575: new -8 -p -e takes 3h38min, that is slower than 2x realtime, and I got better compression from "-9 without -e" (not timed)
* 2575: 69 minutes for ffmpeg -compression_level 12 -lpc_type cholesky -lpc_passes 4
* 2574: 92 minutes for -9. The improvement over "without -e" is 0.48 kbit/s
* 2574: 93 minutes for ffmpeg -compression_level 12 -lpc_type cholesky -lpc_passes 6, narrowly beats -9. And another 24 minutes on -lpc_passes 8 only improved a fraction of a kbit/s.
* 2574: 84 minutes for "-9 with -e replaced by -r 8", beats -9 by 0.15 kbit/s - that is due to The Tea Party EP (which also by the way is better off at -b 2048)
* 2574: -9 -p improves 0.35 over -9 and is so slow I won't touch it again.
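
The "slower than 2x realtime" remark is just audio duration divided by encoding time. A tiny Python sketch, assuming the corpus is exactly seven hours long (the post only says "seven-hour"):

```python
def realtime_factor(audio_seconds, encode_seconds):
    """How many seconds of audio are encoded per second of wall-clock time."""
    return audio_seconds / encode_seconds


corpus = 7 * 3600                    # assumed corpus length in seconds
slow = 3 * 3600 + 38 * 60            # -8 -p -e took 3h38min
print(round(realtime_factor(corpus, slow), 2))
```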


For what this limited testing is worth, it points at:
* -7 is so near -8 that ... Should one beef up -8 to get a proper difference? Try to look for a different apodization function? Look for overlap, add a fourth, do -r 8 or ... ? On the other hand, it is slower than the stock compile.
* -8 -p -e must die. Die. Oh it doesn't even move. Oh yes it does ... in a few hours. But nobody uses it eh? Or well maybe somebody does because it is perceived as the "best" preset.
* For those who want to wait for -9, it is on par with what ffmpeg can do in the same amount of time (ffmpeg: -lpc_passes costs 6 minutes per pass. I didn't calculate until after running the even-only).
* While I don't think people will do -9 without having CPU time to spend, consider whether a new -9 should force -e included, when it isn't in -0 to -8. Maybe better keep it optional, as omitting it doesn't lead to unreasonable compressions.
* The average improvements for the tougher modes, are driven by a few signals - the TTP EP and Cult of Luna, primarily. That points towards an adaptive "this is enough!" mode, if anyone bothers to implement it. Or, actually, even if one doesn't want to go for the variable blocking strategy, then one could do two full -5 runs only to guesstimate the best for-the-file blocksize before running -9, only adding ten percent to the overall time. We have an expensive -p and an expensive -e, so heck ...

Re: New FLAC compression improvement

Reply #67
More overnight testing. The question I was thinking of asking is: how much CPU time do you have to spend to s(h)ave the next megabyte? (Amounting to about a third of a kbit/s on this corpus.) You would expect this marginal time cost to increase as you pick the ripe fruits first, and the following suggests that it holds (compare to 1.3.1, where going from -8 to -8 -e saves 62 MB in a quarter of an hour, that is around fifteen seconds per megabyte):
* Going new -5 (or -6, also just tested, is not much better than -5) to new -7: a few seconds per megabyte s(h)aved. Did I say that new -7 is damn good?!
* new -7 to -8: a minute per megabyte s(h)aved
* -8 to -9: about ten minutes for the next megabyte. Sounds horrible, but it is better than going -8 to -8 -p -e.
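
The arithmetic behind these figures, as a small Python sketch. It assumes a megabyte of 10^6 bytes and a corpus of exactly seven hours; both are my assumptions, and the function names are illustrative.

```python
def seconds_per_megabyte(extra_seconds, megabytes_saved):
    """Marginal CPU time spent for each additional megabyte saved."""
    return extra_seconds / megabytes_saved


def kbps_per_megabyte(corpus_seconds, megabyte=1_000_000):
    """Average bitrate change (kbit/s) that one saved megabyte corresponds to."""
    return megabyte * 8 / corpus_seconds / 1000


# 1.3.1 going from -8 to -8 -e: 62 MB saved in a quarter of an hour.
print(seconds_per_megabyte(15 * 60, 62))
# One megabyte spread over seven hours of audio.
print(kbps_per_megabyte(7 * 3600))
```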

But then at this "ten minutes for the next" (for those willing to spend that CPU time going from -8 to -9) - then interesting things happen:
There are alternatives that get you a megabyte per ten minutes, and several of them seem to add up nicely! Given that the "cost" per megabyte just jumped by a factor of ten, you would expect the next one to jump further? No, not necessarily:
The following are about in the same minutes-for-the-next-megabyte and can be combined on top of -8:
* A manual "-9 without -e"
* adding "-r 8 -e" to the previous
* Two more apodization functions: I just naively continued the "partial_tukey(2);punchout_tukey(3)" pattern with -A "tukey(5e-1);partial_tukey(2);punchout_tukey(3);partial_tukey(4);punchout_tukey(5)"

That the marginal cost stays flat for a while, suggests that there is some smart setting to be found that catches the lion's share of the improvement much cheaper. What to try?
(Idea: maybe try an approach where higher order partial_tukeys are done by rather than making 4 each of length 1/4, take the first and last 1/2, and first and last 1/4 and leave the middle to a different function name?)



What I actually intended with these tests was something else: my idea was that with these improvements, there might be settings that are no longer well suited because they only take time doing what the new code already picks up. (Like -p ... pending more testing, the documentation could indicate that this is worth less than -e.) Apart from -7 getting so close to -8, I didn't find anything consistent. Some isolated strange things are going on; like, I tried two files with -9 -l 32 (which encodes at half real-time!) and one of them became bigger than -9.



Also, wishlist items:
* After displaying compression ratio with three decimals (should be enough for most purposes, but ...), it wouldn't hurt with a "(+)" if it is >1 and "(=)" if it hits exactly the same audio size (which it sometimes will, say if recompressing with a -r 8 that makes no difference).
* --force-overwrite-iff-higher-compression . (Which spawns the question: if your irlsposts are post-processing algorithms, could they take as starting point an existing flac file without doing the initial compression?)
* Maybe abort with error if user gives -A "blah blah blah without closing quotation mark -- "filename with space.wav" ? I learned it leaves flac.exe hanging doing nothing responding nothing.
* For the documentation: the fact that flac accepts 5e-1, avoiding locale-specific 0.5 vs 0,5. Should be recommended.

Re: New FLAC compression improvement

Reply #68
* A manual "-9 without -e"
Sadly, even a -9 without -e internally forces the -e on the IRLS code because I haven't implemented a 'guess' method for the predictor order for IRLS code. Perhaps I could check what would happen if the code just defaulted to the highest order calculated.

Re: New FLAC compression improvement

Reply #69
Perhaps I could check what would happen if the code just defaulted to the highest order calculated.
Why not ... since it is not my time. Well actually: presets 0 through 8 all have the 2x2=4 combinations of -N, -N -p, -N -e and -N -p -e, so, not a bad idea either.

Did another test on a different computer. Earlier on we found that double precision did well on higher frequency content (you did an aggressive lowpass to get the point across), and I was curious whether it carries over to even higher-rez. Turns out yes - and also your most recent update does.
This is one track only, so all reservations taken. I went over to http://www.2l.no/hires/ and picked the Britten track in 352.8/24 as well as 88.2/24 (that's in the 96 column ...) and ran it with (I) stock 1.3.1, (II) your "native" as posted here, (III) your double precision as posted here, and (IV) your most recent IRLS.
Did -8 and -8 -e.  And then (V) on the new build: -8 -A "tukey(5e-1);partial_tukey(2);punchout_tukey(3);irlspost(3)" (no -p!)  and -9. Then, (VI): replaced the irlspost(3) by irlspost(7) and for -9, bumped up to irlspost-p(7). And finally, your new -7.

Observed the gains:
(I) 4515 kbit/s at -8. Bitrate savings to -e: 42 resp. 19.
(II) Nine kbit/s better than 1.3.1, and savings to -e comparable to those for 1.3.1, ending up at 6842 resp. 2127
(III) HUGE savings for the 352 file! Bitrates 6525 resp. 2101.
Going to -e saves another 30 resp ... rounded off to 0.
(IV) 6444 vs again 2101. And, rounded off to integer kbit/s, -e gains nothing.
(V) irlspost(3) makes no difference. -9 gains 5 resp. 10 kbit/s.
(VI) Just don't. One file up a couple hundred bytes, one down about the same.

Also, going -8 to -6 hurts the 352 the most. Same about -4 to -3 and -1 to -0. Going -8 to -7 does hardly anything, neither does -6 to -5 to -4 or -2 to -1. Only  -3 to -2 hits the 88.2 slightly more.

Re: New FLAC compression improvement

Reply #70
Observed the gains:
(I) 4515 kbit/s at -8. Bitrate savings to -e: 42 resp. 19.
(II) Nine kbit/s better than 1.3.1, and Savings to -e comparble as for 1.3.1, ending up at 6842 resp 2127
(III) HUGE savings for the 352 file! Bitrates 6525 resp. 2101.
Going to -e saves another 30 resp ... rounded off to 0.
(IV) 6444 vs again 2101. And, rounded off to integer kbit/s, -e gains nothing.
(V) irlspost(3) makes no difference. -9 gains 5 resp. 10 kbit/s.
(VI) Just don't. One file up a couple hundred bytes, one down about the same.
Could you explain this a little more? I can't follow. (I) has only 1 result for -8 but (II) and (III) have 2 which are in completely different ballparks? The number in the 6000-range is the 352.8kHz file and the one in the 2000-range the 88.2kHz file, I think? What is the 4000-range number for (I)?

Re: New FLAC compression improvement

Reply #71
Gosh, sorry. (4515 is the average of the two.) And then I got something wrong because (I) and (II) yield the same at -e. Restating (I) and (II):

(I): -8 yields 6884 and 2146. -8 -e yields 6842 vs 2127, improving 42 resp. 19.
(II): -8 yields 6869 and 2142. Adding a "-e": again 6842 resp 2127, so the "-e" improvements are down to 25 resp. 15.

That's still about the same ballpark when you compare to what gives the BIG differences on the 352.8 file. While the 88.2 varies with at most 46 kbit/s - that is still two percent! - the big one is reduced by six percent, and the contributions can be summarized in order of significance:

* 327 (that is nearly five percent) by going from your "native" build -8 -e to the double-precision build at -8
* 79 by going native -8 -e to IRLS -8 (without -e)
* Twenties to forties: adding "-e" in (I), (II), (III)
* A few kbit/s: Going 1.3.1 -8 to native -8; and, adding "-p" to IRLS -8;
* Zero to one, all on the IRLS build: adding "-e" to -8; further adding "-p" to -8 -e and to -9.

Here since I had to do things over again, I also let the IRLS beta do -p. The bitrates for the big file for the IRLS version are
6444 for -8 -e (gains 1208 bytes, that is 15 parts per million, over -8)
6439 for -9
6438 for -8 -p (improves over -8 -e, contrary to my uh, well-founded prejudices ...),
6437 for -8 -p -e
6437 for -9 -p

So the huge gain is the move to double precision; and then on top of that, your new build improves further, dominating the advantages that "-e" used to give. Then my interpretation was that new -8 is so good that "-e" no longer saves much; but, here -p did something.

(Hm, is there any way to make an "intelligent guess at doing halfway -e, halfway -p, halfway various -b" without going full brutal on it all? Let's not call the semiexpensive  switch "--sex", rather ... hey, -x is not taken.)



Oh, and one more thing. On the seven-file corpus, where I said -9 -p is so slow I wouldn't touch it again: nevertheless, to get a more accurate timing of it, I gave it an overnight job to arrive at around 1x realtime, i.e.:
-9 -p = -9 -p -e took around twice the time of -8 -p -e.

Re: New FLAC compression improvement

Reply #72
New testing with conflicting results, CDDA corpus and new higher-rez.

A bit of tuning on the previous 96/24 corpus found that I could use -8 -Ax4, the four being punchout_tukey(7) and partial_tukeys 7, 4 and 1 (= the default tukey(5e-1)), to beat -9 - comfortably, and in a fraction of the time.
So to see if that was a more universal observation, I ditched the old corpus and found a new one. Results vary wildly from CDDA to hi-rez.

Second and third columns are improvement in 1/100 percent (not percentage points!) over new -8 and over the previous boldfaced setting in the column; the "-b 4608" is over the setting in question.

Second column is CDDA, third is hirez.  Positives are savings. Fourth column is encoding time. For "reference": new -7 (damn good I say!), new -5, and the original 1.3.1 -8 files.
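
Since "1/100 percent" versus "percentage points" is easy to mix up, here is a small Python sketch of the two measures. The function names and the raw-PCM argument are illustrative, not taken from the post.

```python
def improvement_hundredths_of_percent(reference_bytes, candidate_bytes):
    """Relative size reduction versus the reference, in units of 1/100 percent."""
    return 10_000.0 * (reference_bytes - candidate_bytes) / reference_bytes


def improvement_percentage_points(raw_bytes, reference_bytes, candidate_bytes):
    """Difference between the two compression ratios, both in percent of raw PCM size."""
    return 100.0 * (reference_bytes - candidate_bytes) / raw_bytes


# A candidate that is 100 bytes smaller than a 1,000,000-byte reference
# scores 1.0 in the table's unit; a larger candidate scores negative.
print(improvement_hundredths_of_percent(1_000_000, 999_900))
print(improvement_hundredths_of_percent(1_000, 1_001))
```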

*EDIT:* Facepalm, the table had only encoding time for the full thing, not CDDA. And -e is more expensive for higher rez, so that -8 -e actually takes less time than the Ax4 (458 vs 475 seconds). Trying to edit manually:
Setting etc                     CDDA    hirez   encoding time (s)
1.3.1 -8 to new -8             -11.7   -542.1   n/a
-5 to -8                       -48.8   -167.6   149
-7 to -8                        -3.7    -17.5   242
(This is -8)                     0        0     347
  -b 4608 impr:                 -0.6      2.2
-e over -8                       1.3     13.5   869 (CDDA: 458)
  over previous                  1.3     13.5
  -b 4608 impr:                 -0.6      2.3
Ax4 over -8                      4.5     35.7   838 (CDDA: 475)
  over previous                  3.2     22.1
  -b 4608 impr:                 -0.3      3.1
Ax4 -r8 over -8                  4.6     35.6   894
  over previous                  0.1     -0.1
  -b 4608 impr:                 -0.3      3.4
Ax4 -r8 -e over -8               5.8     42.0   3429
  over previous                  1.2      6.4
  -b 4608 impr:                 -0.3      3.5
-9 over -8                      10.9     29.1   2820 (CDDA: 1537)
  over previous                  5.1    -12.9
  -b 4608 impr:                 -1.4      3.7
-9 -r8 over -8                  11.0     29.0   3794
  over previous                  0.1     -0.1
  -b 4608 impr:                 -1.3      3.7
... thanks to whomever posted the link to https://theenemy.dk/table/



What we can see:
* No reason to use -8 -e.  The "Ax4" beats -e quite a lot, at about the same time.
* -9 for CDDA: improvement over Ax4 even more than Ax4 improves over -e, but expensive in time. 
* -9 for hi-rez: loses to Ax4
* -b 4608 hurts CDDA slightly (unexpected), benefits hi-rez (expected).
* Net effect of -r 8 is around zero.



Music then.  
CDDA, I sorted my collection (tracks, not images!) on audio MD5, started somewhere and highlighted about 2 GB FLAC (1.3.1) files consecutive in the sorting. That's Beastie Boys and Black Sabbath, Zappa and Muse, Skinny Puppy and Creedence ... but overall heavy on electric guitar-driven rock, prog and metal.
hi-rez, I took a few albums and topped up with what was hi-rez from an Earache free sampler - makes this even heavier metal oriented than the CDDA part. Got to two GB there too.

No classical this time, because I have not much more > CDDA than what I have already used. 

Re: New FLAC compression improvement

Reply #73
Well as if I haven't already posted enough misunderstandings of mine, here I think I made another.

found that I could use -8 -Ax4, where the four being punchout_tukey(7) and partial_tukeys 7, 4 and 1 (=the default tukey(5e-1) to beat -9

Let's see: partial_tukey(N) creates N functions and as I think @ktf tried to teach me, each of them apparently takes about as much computational effort as any other. (I casually tested ... it seems so.)
Back in the younger neolithic, this forum discussed how -Ax2 was interesting but not worth it; well -8 is now an Ax6 and the above is a nineteen to beat -e.

And for -9 and IRLS: a -Ax19 -e to beat -Ax6+irlspost+-e ...
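
The "Ax6" and "Ax19" counts can be checked mechanically. This is a small Python sketch that counts window functions in an -A specification, assuming partial_tukey(N) and punchout_tukey(N) each expand to N windows and everything else counts as one. The second assumption is mine, and it is not meaningful for the irls pseudo-apodizations.

```python
import re

# Matches e.g. partial_tukey(2) or punchout_tukey(3/0/8e-2); the count is the first argument.
MULTI_WINDOW = re.compile(r"(partial_tukey|punchout_tukey)\((\d+)(?:/[^)]*)?\)")


def count_windows(spec):
    """Number of window functions evaluated for a semicolon-separated -A spec."""
    total = 0
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        match = MULTI_WINDOW.fullmatch(item)
        total += int(match.group(2)) if match else 1
    return total


print(count_windows("tukey(5e-1);partial_tukey(2);punchout_tukey(3)"))
print(count_windows("punchout_tukey(7);partial_tukey(7);partial_tukey(4);tukey(5e-1)"))
```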

(Looks like the battle between brute force and a sensible algorithm. Except, it seems, the sensible algorithm takes so much time that semi-Brutus can give it a few casual stabs and then go home.)

Re: New FLAC compression improvement

Reply #74
Does anyone feel like testing the flac-irls-2021-09-21.exe posted at https://hydrogenaud.io/index.php?topic=120158.msg1003256#msg1003256 with parameters as below? Rather than me writing walls of text about something that could be spurious ...


For CDDA:
* Time (should be similar) and size: -8 against -8 -A "tukey(5e-1);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)"  against -8 -A "welch;partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)"  (I expect no big differences between the latter two.)
* Time (should not be too different) and size: -7 -p against -8 -A "welch;flattop;partial_tukey(3/0/999e-3);punchout_tukey(4/0/8e-2)" (or replace the welch by the tukey if you like)
* Time (should not be too different) and size (will differ!): -8 -e against -8 -p against -8 -A "welch;partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2);irlspost(1)" 
Note, it is irlspost, not irlspost-p.


For samplerate at least 88.2:
* -8 against -8 -A "gauss(3e-3);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2)"
* For each of those two: How much does -e improve size?
* How much larger and faster than -e, is -8 -A "gauss(3e-3);partial_tukey(2/0/999e-3);punchout_tukey(3/0/8e-2);irlspost(1)" ?

My tests indicate that the gauss(3e-3) combination impresses nobody on CDDA and makes very little difference on most hirez files - but for a few it could be a percent. And then the "-e" improvement was a WTF. But hi-rez performance is absolutely not consistent ... well, it is much better than the official release.