Topic: FLAC v1.4.x Performance Tests

Re: FLAC v1.4.x Performance Tests

Reply #50
To whom it may concern:
On my quest to find a setting for v1.4.1 that rivals my favourite encoding setup (v1.3.3 with -7 = my sweet spot for encoding time and compressed file size), I found these settings for v1.4.1.
The goal was to achieve the same (or better) encoding speed with better compression.
Code:
Reference:
FLAC Binary: flac133_case.exe
FLAC Option: -7
 Average time  = 22.682 seconds (5 rounds), Encoding speed = 476.67x
 FLAC file size = 1.168.025.916 Bytes (= 61,241% of WAV size)
For my timings I only used Case's Haswell build, since this was the fastest on my computer.
I haven't varied the windowing functions, because frankly I have little idea what I'm going to do there...
Code:
a) FLAC Option: -l11 -b4096 -m -r6 -A subdivide_tukey(2)
 Average time  = 22.927 seconds (3 rounds), Encoding speed = 471.58x <= worse encoding speed: 477x -> 472x
 FLAC file size = 1.167.741.823 Bytes (= 61,226% of WAV size) <= better compression: 0.015 percent points

b) FLAC Option: -l11 -b4096 -m -r5 -A subdivide_tukey(2)
 Average time  = 22.134 seconds (5 rounds), Encoding speed = 488.48x <= better encoding speed: 477x -> 488x
 FLAC file size = 1.167.807.739 Bytes (= 61,229% of WAV size) <= better compression: 0.012 percent points
 
c) FLAC Option: -l11 -b3072 -m -r5 -A subdivide_tukey(2)
 Average time  = 21.051 seconds (5 rounds), Encoding speed = 513.62x <= better encoding speed: 477x -> 514x
 FLAC file size = 1.167.708.945 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points

d) FLAC Option: -l11 -b3584 -m -r5 -A subdivide_tukey(2)
 Average time  = 20.729 seconds (3 rounds), Encoding speed = 521.58x <= best encoding speed: 477x -> 522x
 FLAC file size = 1.167.755.713 Bytes (= 61,227% of WAV size) <= better compression: 0.014 percent points

e) FLAC Option: -l11 -b3328 -m -r5 -A subdivide_tukey(2)
 Average time  = 20.866 seconds (3 rounds), Encoding speed = 518.16x <= better encoding speed: 477x -> 518x
 FLAC file size = 1.167.700.585 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points
So it's basically -l11 and -r5 with block sizes varied between 4096 and 3072 samples.
I also tested these settings with another list of WAV files (some 3 hrs of playing time, mostly rock music) and the ranking of the results was the same.
So it's going to be either d) (block size = 0x0E00) or e) (block size = 0x0D00) for me.

Re: FLAC v1.4.x Performance Tests

Reply #51
The "natural" result first: As -r6 does -r5 plus another little try, a) compresses better than b). And it spends only a split second more. 

More odd is that a) outcompresses -7.  a) is basically just -7 with the "12th" order parameter forced to 0. The only explanation I can come up with, is the fact that FLAC guesstimates first and calculates more exactly when it has picked what it thinks is best, and here it seems that - contrary to guesstimate - putting that parameter to zero actually improves. It is known since back in the day that too high -l could lead to this, but it is a bit surprising that it still kicks in between 11 and 12, if only by 0.015 percentage points.

The others are alternative block sizes.  All of those are divisible by ... well all are indeed divisible by 256, so they don't put restrictions on -r.
Have you tried -b 2048 (edit: or 2304)? Sometimes that improves. 2048 means twice as many blocks as the default -b 4096, so twice the block overhead - but the other side of the coin is that each block has its own predictor, which means it could offer a better fit.
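
A rough Python sketch of the block-count side of that trade-off. It only counts blocks; the per-block overhead in bytes is not modelled, and the one-hour 44.1 kHz length is just an assumed example, not a figure from the tests above.

```python
def blocks_needed(total_samples: int, blocksize: int) -> int:
    """Number of blocks needed to hold total_samples (per channel)."""
    # Integer ceiling division: the last block may be partially filled.
    return (total_samples + blocksize - 1) // blocksize

# Assumed example length: one hour of 44.1 kHz audio.
ONE_HOUR_SAMPLES = 44100 * 3600

for blocksize in (4096, 3584, 3072, 2304, 2048):
    print(blocksize, blocks_needed(ONE_HOUR_SAMPLES, blocksize))
```

Halving the block size doubles the number of block headers but also doubles the number of predictors fitted, which is the trade-off described above.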

Also you don't need to type out all this. Synonyms:
a) -7 -l 11
b) -7 -l 11 -r 5
c) -7 -l 11 -r 5 -b 3072
d) -7 -l 11 -r 5 -b 3584
e) -7 -l 11 -r 5 -b 3328

Edit: Did you try "cde with -r 6"? Expected: tiny improvement at tiny cost, just like a) over b). Any case of -r5 outcompressing -r6 would, I suppose, be "due to guesstimation error".

Re: FLAC v1.4.x Performance Tests

Reply #52
Here's what I got with a concatenated 6h43m54s .wav file over 7 CDs in different genres.

-5
Total encoding time: 0:48.563, 499.02x realtime
2066932342

-6
Total encoding time: 0:58.828, 411.95x realtime
2062800852

-l 12 -b 3456
Total encoding time: 0:47.531, 509.86x realtime
2060849025

-7
Total encoding time: 1:07.625, 358.36x realtime
2056437870

PS: 3456 = 1152 * 3

Re: FLAC v1.4.x Performance Tests

Reply #53
@Porcus:
Quote
More odd is that a) outcompresses -7
It does not. It outcompresses -7 from v1.3.3.
Code:
FLAC Binary: flac141-case-haswell.exe (860160 Bytes)
FLAC Option: -7
 Average time  = 25.416 seconds (10 rounds), Encoding speed = 425.40x <= worse speed: 477x -> 425x
 FLAC file size = 1.167.014.383 Bytes (= 61,188% of WAV size) <= better compression: 0.053 percent points
which was ruled out because it is much slower than -7 on v1.3.3

I also tried blocksizes of 2048 and >3584, but results are worse.

I'm aware of the synonyms, but I prefer to use the "full" settings in my tests, just to see all the parameters and not have to remember what "-7" stands for.

Going to test cde with -r6 asap...

Re: FLAC v1.4.x Performance Tests

Reply #54
https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/
Code:
10.1.1.  Blocksize bits

   Following the frame sync code and blocksize strategy bit are 4 bits
   referred to as the blocksize bits.  Their value relates to the
   blocksize according to the following table, where v is the value of
   the 4 bits as an unsigned number.  In case the blocksize bits code
   for an uncommon blocksize, this is stored after the coded number, see
   section uncommon blocksize (#uncommon-blocksize).


      +=================+===========================================+
      | Value           | Blocksize                                 |
      +=================+===========================================+
      | 0b0000          | reserved                                  |
      +-----------------+-------------------------------------------+
      | 0b0001          | 192                                       |
      +-----------------+-------------------------------------------+
      | 0b0010 - 0b0101 | 144 * (2^v), i.e. 576, 1152, 2304 or 4608 |
      +-----------------+-------------------------------------------+
      | 0b0110          | uncommon blocksize minus 1 stored as an   |
      |                 | 8-bit number                              |
      +-----------------+-------------------------------------------+
      | 0b0111          | uncommon blocksize minus 1 stored as a    |
      |                 | 16-bit number                             |
      +-----------------+-------------------------------------------+
      | 0b1000 - 0b1111 | 2^v, i.e. 256, 512, 1024, 2048, 4096,     |
      |                 | 8192, 16384 or 32768                      |
      +-----------------+-------------------------------------------+

                                  Table 13
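
To see where the block sizes tried in this thread land in that table, here is a small Python sketch of the lookup. It is written from the quoted table only; the function name is made up, and it does not check any other blocksize limits the format may impose elsewhere.

```python
def blocksize_bits(blocksize: int) -> int:
    """Return the 4-bit blocksize code per the quoted Table 13."""
    # Range assumed from the table alone: the largest "uncommon" value is
    # stored minus 1 in 16 bits.
    if not 1 <= blocksize <= 65536:
        raise ValueError("blocksize outside the range the table can express")
    if blocksize == 192:
        return 0b0001
    # 0b0010 - 0b0101: 144 * 2^v, i.e. 576, 1152, 2304, 4608
    for v in range(0b0010, 0b0101 + 1):
        if blocksize == 144 * 2 ** v:
            return v
    # 0b1000 - 0b1111: 2^v, i.e. 256 ... 32768
    for v in range(0b1000, 0b1111 + 1):
        if blocksize == 2 ** v:
            return v
    # Everything else is "uncommon": stored minus 1 as 8 or 16 bits.
    return 0b0110 if blocksize <= 256 else 0b0111

for b in (4096, 3584, 3456, 3328, 3072, 2304, 1152):
    print(b, format(blocksize_bits(b), "04b"))
```

So 3072, 3328, 3456 and 3584 all count as uncommon block sizes with a 16-bit field, while 4096, 2304 and 1152 have their own codes.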

Re: FLAC v1.4.x Performance Tests

Reply #55
Quote
It does not. It outcompresses -7 from v1.3.3.
Ah, I cannot read.

But, try 1.3.3 at -5. Reason: You say -7 was your sweet spot, but then it is relevant: how much did you actually pay (in seconds and milliseconds) for the -7 compression improvement over -5?

Re: FLAC v1.4.x Performance Tests

Reply #56
The tests in Reply #52 were done using Case's GCC 12.2.0 build. I tried disabling AVX in the BIOS to compare the difference, but flac.exe simply crashed.

The tests below used the Xiph build, which does not crash; for each setting, the two results show the difference between running with and without AVX.

5
Total encoding time: 0:54.640, 443.52x realtime
2066932338
Total encoding time: 0:50.906, 476.06x realtime
2066932342

6
Total encoding time: 1:09.578, 348.30x realtime
2062800850
Total encoding time: 1:01.516, 393.95x realtime
2062800852

-l 12 -b 3456
Total encoding time: 0:52.922, 457.92x realtime
2060849025
Total encoding time: 0:50.953, 475.62x realtime
2060849021

7
Total encoding time: 1:16.656, 316.14x realtime
2056437872
Total encoding time: 1:10.250, 344.97x realtime
2056437869

Re: FLAC v1.4.x Performance Tests

Reply #57
@Porcus:
Here are the -r6 results, side-by-side with the previously posted -r5:
(better/worse and faster/slower always compared to the "reference" v1.3.3 -7)
Code:
c5) FLAC Option: -l11 -b3072 -m -r5 -A subdivide_tukey(2)
 Average time  = 21.051 seconds (5 rounds), Encoding speed = 513.62x <= better encoding speed: 477x -> 514x
 FLAC file size = 1.167.708.945 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points

c6) FLAC Option: -l11 -b3072 -m -r6 -A subdivide_tukey(2)
 Average time  = 22.011 seconds (3 rounds), Encoding speed = 491.21x <= faster encoding (477x -> 491x)
 FLAC file size = 1.167.688.587 Bytes (= 61,223% of WAV size) <= better compression: 0.018 percent points


d5) FLAC Option: -l11 -b3584 -m -r5 -A subdivide_tukey(2)
 Average time  = 20.729 seconds (3 rounds), Encoding speed = 521.58x <= best encoding speed: 477x -> 522x
 FLAC file size = 1.167.755.713 Bytes (= 61,227% of WAV size) <= better compression: 0.014 percent points

d6) FLAC Option: -l11 -b3584 -m -r6 -A subdivide_tukey(2)
 Average time  = 21.606 seconds (3 rounds), Encoding speed = 500.41x <= faster encoding (477x -> 500x)
 FLAC file size = 1.167.713.160 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points


e5) FLAC Option: -l11 -b3328 -m -r5 -A subdivide_tukey(2)
 Average time  = 20.866 seconds (3 rounds), Encoding speed = 518.16x <= better encoding speed: 477x -> 518x
 FLAC file size = 1.167.700.585 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points

e6) FLAC Option: -l11 -b3328 -m -r6 -A subdivide_tukey(2)
 Average time  = 21.926 seconds (3 rounds), Encoding speed = 493.11x <= faster encoding (477x -> 493x)
 FLAC file size = 1.167.669.980 Bytes (= 61,222% of WAV size) <= better compression: 0.019 percent points

tldr; speed is clearly slower with -r6 while compression gains are "nothing to write home about"  ;)

Talking about -5: I found my sweet spot at -7 because I had used -8 ever since I started with FLAC; the gain in speed from dropping to -7 was remarkable, and I didn't care about the compression loss of 0.039 percentage points. Using -5 would boost my encoding speed by some 50% while losing 0.43% of disk space. Really worth considering if you're planning to re-encode your whole collection.

Re: FLAC v1.4.x Performance Tests

Reply #58
Quote
Here's what I got with a concatenated 6h43m54s .wav file over 7 CDs in different genres.
-l 12 -b 3456
Total encoding time: 0:47.531, 509.86x realtime
2060849025

Just tried sundance's fastest setting on my data using the same Case GCC 12.2.0 build:

-l11 -b3584 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:54.484, 444.79x realtime
2058772911

Re: FLAC v1.4.x Performance Tests

Reply #59
That's faster here too, but it loses compression, so it doesn't meet my goal (to achieve the same or better encoding speed with better compression than -7 on v1.3.3)...
In the end, you'll have to make up your mind what you're after...  ;)
Code:
FLAC Binary: flac141-case-haswell.exe (860160 Bytes)
FLAC Option: -l12 -b3456 <= bennetng setting
 Average time  = 16.319 seconds (3 rounds), Encoding speed = 662.55x <= way faster (477x -> 662x)
 FLAC file size = 1.168.849.826 Bytes (= 61,284% of WAV size) <= worse compression: -0.043 percent points
But your blocksize is very close to my fastest setting along with better compression:
Code:
FLAC Binary: flac141-case-haswell.exe (860160 Bytes)
FLAC Option: -l11 -b3456 -m -r5 -A subdivide_tukey(2) <= bennetng block size
 Average time  = 20.825 seconds (3 rounds), Encoding speed = 519.18x <= faster encoding (477x -> 519x)
 FLAC file size = 1.167.698.586 Bytes (= 61,224% of WAV size) <= better compression: 0.017 percent points

Re: FLAC v1.4.x Performance Tests

Reply #60
My data with -l11 -b3456 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:55.047, 440.24x realtime
2058918957

Also worth noting that my data set's uncompressed size is 4274945852 bytes, so it is quite different from your data set.

Re: FLAC v1.4.x Performance Tests

Reply #61
@bennetng:
Why do you think a larger data set makes a remarkable difference? Because it covers a greater variety of music/genre/styles or something I didn't think of yet? Btw. my encoding times are one-file-at-a-time, single core, running the timer64'd flac binary in a console window.
Just out of curiosity: How did you find your "magic" blocksize of 3456? Did you try all blocksizes in steps of 128 samples, or was this a "lucky punch"?  ;)

Re: FLAC v1.4.x Performance Tests

Reply #62
What I meant was that your compressed data set is in general around 61.2% of PCM, but mine is in general around 48.2%; even with -5 it is still around 48.35% (see Reply #52). That means the baseline compression ratios are quite different. So mixing both data sets yields something like 54.7% for better corpus averaging. For example, one of ktf's plots also showed a similar ratio:
http://audiograaf.nl/losslesstest/revision%205/Average%20of%20all%20CDDA%20sources.pdf

The table in Reply #54 shows some common blocksizes including 576, 1152, 2304 or 4608. 1152 is used in presets 0-2, and 4096 for 3-8. So 3456 is just a convenient number I got from the original presets.

Re: FLAC v1.4.x Performance Tests

Reply #63
The 7 CDs used:


Hdcd Sampler Volume 2
https://www.discogs.com/release/6921177-Various-Hdcd-Sampler-Volume-2


Kaitou Saint Tail Original Soundtrack (Disc 1)
https://vgmdb.net/album/61630


Ondekoza (VDR-25231)
Can't find a suitable link for this specific CD, but in general Japanese arrangements featuring Taiko and Shamisen.
https://en.wikipedia.org/wiki/Ondekoza


Persona 2: Innocent Sin ~ The Errors of Their Youth
https://vgmdb.net/album/4383


Picture Of Primitive Hunting (Chinese Ancient Music)
https://www.discogs.com/release/16142867-Various-%E5%8E%9F%E5%A7%8B%E7%8B%A9%E7%8C%8E%E5%9B%BE-%E4%B8%AD%E5%9B%BD%E5%8F%A4%E4%B9%90-Picture-Of-Primitive-Hunting-Chinese-Ancient-Music


Tchaikovsky : 1812, Marche slave
https://www.amazon.com/Tchaikovsky-Marche-slave-Peter-Ilyich/dp/B000001GDT


何婉盈 / Elaine
https://youtu.be/kB7vQQ7hmcg

Re: FLAC v1.4.x Performance Tests

Reply #64
Seems your test corpus is easier to encode than my stuff (80% Classic Rock, some 10% Blues, no Classical tracks, no speech). Sadly, lots of the music from the 90s and later suffers from heavy dynamic range compression (DR < 6) and is a challenge for lossless compression.
Just finished some tests with a random selection of 160 audio files (all CDDA) from my collection (WAV file size = 6.111.491.436 bytes) and the results don't differ much from my regular test set:
FLAC file size = 3.736.404.536 Bytes (= 61,137% of WAV size, avg. bitrate = 863 kbps)

Re: FLAC v1.4.x Performance Tests

Reply #65
A lot of metal here, CDDA averages 918 or something with 1.3.x at -8.
Classical music section encodes to the low 600s even if there are (literal!) tons of loud organ pipes.

Re: FLAC v1.4.x Performance Tests

Reply #66
My neighbour was nice enough to give me a Classical CD (Mozart) for this test:
Code:
FLAC Binary: flac141-case-haswell.exe (860160 Bytes)
FLAC Option: -l11 -b3456 -m -r5 -A subdivide_tukey(2)
 Average time  = 5.128 seconds (3 rounds), Encoding speed = 553.66x
 FLAC file size = 220.191.908 Bytes (= 43,964% of WAV size, avg. bitrate = 620 kbps)

FLAC Binary: flac133_case.exe (718848 Bytes)
FLAC Option: -7
 Average time  = 5.712 seconds (3 rounds), Encoding speed = 496.99x
 FLAC file size = 220.654.634 Bytes (= 44,056% of WAV size, avg. bitrate = 622 kbps)
So compression of this kind of music is in bennetng's ballpark. And still outperforms v1.3.3 -7  :))


Re: FLAC v1.4.x Performance Tests

Reply #68
Here is a very biased corpus with only EDM music showing -b2880 is optimal. I would try something divisible by 512 or 576 without further subdividing, as the differences are too small.

EINHÄNDER ORIGINAL SOUNDTRACK
https://vgmdb.net/album/14

Dariusburst Original Soundtrack
https://vgmdb.net/album/16136

carpe diem "SENKO no RONDE" ORIGINAL SOUND TRACKS Volume 2
https://vgmdb.net/album/4419

BORDER DOWN -Sound Tracks-
https://vgmdb.net/album/311

PCM
2822769308

-6
Total encoding time: 0:40.313, 396.94x realtime
1937591945
68.6415%

-l11 -b3456 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:37.360, 428.32x realtime
1933346426
68.4911%

-l11 -b2560 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:37.734, 424.07x realtime
1933285028
68.4889%

-l11 -b3072 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:37.500, 426.72x realtime
1933147042
68.4841%

-l11 -b2880 -m -r5 -A subdivide_tukey(2)
Total encoding time: 0:37.938, 421.79x realtime
1933105922
68.4826%

-7
Total encoding time: 0:45.453, 352.05x realtime
1932486517
68.4607%
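
For anyone wanting to reproduce the percentages and the "x realtime" figures, a small Python sketch follows. The realtime check assumes the PCM figure is plain CDDA (44.1 kHz, 16 bit, stereo) with negligible header overhead; that is an assumption, not something stated above. The function names are made up for illustration.

```python
# Assumed CDDA data rate: 44100 samples/s * 2 channels * 2 bytes per sample.
CDDA_BYTES_PER_SECOND = 44100 * 2 * 2

def percent_of_pcm(flac_bytes: int, pcm_bytes: int) -> float:
    """Compressed size as a percentage of the uncompressed PCM size."""
    return 100.0 * flac_bytes / pcm_bytes

def realtime_factor(pcm_bytes: int, encode_seconds: float) -> float:
    """Audio duration divided by encoding time (the 'x realtime' figure)."""
    return pcm_bytes / CDDA_BYTES_PER_SECOND / encode_seconds

PCM_BYTES = 2822769308

# The -6 result from this post: 0:40.313 and 1937591945 bytes.
print(round(percent_of_pcm(1937591945, PCM_BYTES), 4))
print(round(realtime_factor(PCM_BYTES, 40.313), 2))
```

Both values should land within rounding distance of the 68.6415% and 396.94x posted for -6.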

 

Re: FLAC v1.4.x Performance Tests

Reply #69
-b 3456 makes for larger files in my corpus (on first few tests). 0.05 to 0.06 percent, so not much, but not the other way.

Re: FLAC v1.4.x Performance Tests

Reply #70
What FLAC settings should you avoid?
(... if you have my CDDA test corpus and no special considerations.)


Idea: How many bytes do you get for spending an extra second encoding? 
You would pick the low-hanging fruit first. That means, settings that pick "expensive" improvements should be avoided until you are squeezing the last drops out. For example, in my tests, -e is not worth it because you can get the same improvement cheaper elsewhere. (At least, up to settings "nobody" will want to use.) It surely is material dependent; for example, @Gravity Stupor has posted two examples here and here about -e actually not being completely dead.


Some initial tl;dr's:
* Avoid -e.
* Avoid -6.
* -p: only  for "-8p", as -7p is not worth it. If you want something in between -8 and -8p, then -8 -A subdivide_tukey(4) is not a bad thing - and maybe even -8 -A subdivide_tukey(5) also makes sense, but somewhere around there you would rather jump to -8p.
* -r8 is not worth it. -r7 ... that depends. I tried an Intel-equipped Dell business laptop and a Ryzen-equipped Acer consumer laptop, and I wouldn't use -r7 on the latter, it took too much time. The Intel Dell shows quite a bit of timing variability, but it seems it does the -r part a bit faster for some strange reason - anyway, I guess it is only in consideration for those who are already at -8p even if I could find a "non-p" setting where it wasn't hopeless?

Since -8 is "-7 but changing the subdivide_tukey from 2 to 3" [there are some fine details about that, but forget those], the natural continuation would be to increase that number - and if you for simplicity apply the rule of thumb that above -5, there is -7, -8, -8p and higher subdivide_tukey(N), you won't do that much wrong.
That said, -7 -l 11 can serve the purpose @sundance tested it to obtain. And myself I found a higher-than-8 customized setting that worked fairly well - the -A "tukey(666e-3);subdivide_tukey(3/333e-3)" which takes the -8 windowing functions, the -5 tapering function, makes them more different from each other and combines them. But that might be a spurious result. But these are fine-tunings, and don't change the impression that the "natural choices" are pretty good with the exceptions in the above bullet items.


Assumptions made:
* FLAC subset or bust - and you will anyway choose -4 or higher
* If you think saving B bytes for a second extra encoding time is acceptable improvement, you will also accept waiting another second for another B bytes saving.  Only when the savings per second falls, you've had enough.
The dubious part of this assumption is that it requires you to behave as if you knew the outcome in advance.
* FLAC decodes fast enough for you: you don't care about decoding CPU footprint.  See https://hydrogenaud.io/index.php/topic,123025.msg1016398.html#msg1016398 and the bottom chart at http://audiograaf.nl/losslesstest/revision%205/Average%20of%20all%20CDDA%20sources.pdf .
Nothing decodes as fast as FLAC, but if you want something for mass-decoding from SSD or the like, you might want to use -6 or oddballs like -6p or -8p -l 8.
* My hardware and test corpus (and official 1.4 build) are sane enough for testing :-)

One implication is that curves like the 3rd diagram at https://hydrogenaud.io/index.php/topic,120158.msg1014227.html#msg1014227 "should be convex" (i.e.: stretch a ribbon around it, you won't select a point that is above the ribbon).  You see that -6 violates that: it lies above the straight line from the -5 to the -7 points.  And so you should not choose -6: if you are willing to wait for the improvement -5 to -6, you would also be willing to wait for -7.
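
The ribbon picture can be sketched as a lower convex hull over (encoding time, file size) points. Below is a minimal Python sketch, not anything from the FLAC tools; the function names are made up for illustration, and the three data points are the Case GCC 12.2.0 results from Reply #52, so they come from a different corpus than the one discussed in this post.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o) in (seconds, bytes) coordinates."""
    return (a[1] - o[1]) * (b[2] - o[2]) - (a[2] - o[2]) * (b[1] - o[1])

def efficient_settings(points):
    """points: (label, seconds, bytes) tuples. Returns labels on the ribbon."""
    pts = sorted(points, key=lambda p: (p[1], p[2]))
    # Drop settings that are slower without being smaller than one kept so far.
    frontier = []
    for p in pts:
        if not frontier or p[2] < frontier[-1][2]:
            frontier.append(p)
    # Lower convex hull (monotone chain): pop points lying on or above the
    # straight line between their neighbours.
    hull = []
    for p in frontier:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return [label for label, _, _ in hull]

# Times and sizes as posted in Reply #52.
reply_52 = [
    ("-5", 48.563, 2066932342),
    ("-6", 58.828, 2062800852),
    ("-7", 67.625, 2056437870),
]
print(efficient_settings(reply_52))
```

On those three points -6 drops out, because it sits above the line from -5 to -7, which matches the argument above.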

I am in the land of imprecise measurements here, as this involves dividing by the difference between times that may vary between runs.  So I did three measurements and picked the median time for each "genre section" (see signature link), on each of an Intel-equipped Dell business laptop and a much cheaper Ryzen-equipped Acer Aspire.  Still, the latter is more reliable.


Settings tested:
-4 (well even more) to -8 with or without -e or -p. 
-8 with higher subdivide_tukey(N); that is the same as -7 with higher subdivide_tukey(N): -7 implies N=2, upping it to N=3 yields standard -8, so also N=4 to N=8 are a natural continuation. (Edit: to test. Not saying the borderline between natural to use and not, lies precisely between N=8 and N=9.)
-8 with an additional function; partial_tukey(2) was tested for having a different taper parameter than subdivide_tukey(3), but "more promising" was  -A "tukey(666e-3);subdivide_tukey(3/333e-3)" that adds "only one" tukey function, but changes the tapering "in opposite directions" to make them more different.
-8ep
Also various -r settings, but not on everything.
Finally, a few results with the lighter-than-7 -7 -l 11 because @sundance started on it for time saving purposes. -7 -l 11 isn't a bad thing! (Could also have tested -l 10 then, but I didn't.)
With reference to @sundance 's testing: alternative -b values were of no use here.


To the results:
What eliminates -e, is that you can get better compression cheaper. Smaller and faster than -8e are -8 -A subdivide_tukey(5) to (7). Smaller and faster than -8ep are -8p -A subdivide_tukey(5) to (7).
What eliminates -6, is the "convexity" argument: if you are willing to pay for the improvement from -5 to -6, then the improvement from -6 to -7 is so much cheaper: the next byte saved costs less time than the previous.
What eliminates -r 8 is the same as -6. Although, I have not checked whether -8p -A subdivide_tukey(big number) -r 8 can be improved upon by -8p -A subdivide_tukey(bigger number) when "big" is too high.
And I don't think -r7 is useful until you are already at least at -8p.
Since -r7 is "questionable", one may wonder whether -r5 saves a lot of time with insignificant size difference? Turns out that it doesn't save much time, if I allow myself to deviate a little bit from the assumptions and say "at best you won't bother".

When to use -p? As said above, somewhere around -8 -A subdivide_tukey(4) to (5) you will rather take the (quite big!) time cost to get the benefit, as increasing the subdivide_tukey parameter will increase time at small benefit.
At -8p you are at about the same s(h)avings per second as going from WavPack -hx to -hx4.

Then the convexity argument could even rule out -5 in favour of "either -4 or -7" - even more so if one finds a good "lighter -7 alternative". Of course -5 will be used quite a lot out of being the default, but the argument for -4 would be as follows: If you think -5 is better than -7, as the time saved makes up for the size, you get about as good a time saving per megabyte to go to -4. (Not to -3, reveals a brief check.)

The above considerations don't appear to depend heavily on what "genre section" of my corpus I used. Sure some features are more pronounced on this or that, and there might possibly be an odd "error" in the sense that if I had deleted myself down to only one, I would have eliminated otherwise - but the big picture remains.

Lighter than -7, you said? Actually, -7 -l 11 might be considered. It saves some ten percent time on the total; but more significantly, the extra time from -5 to -7 is cut by a third. But the size gain from -5 to -7 then? Oh, you only forego ten percent of that.

And finally, how do all these considerations compare to FLAC 1.3? Not tested so much, but as you can expect: the double precision improvement picks some low-hanging fruit, so going -5 to -7 is not as lucrative as it was. On the Ryzen: 1.3.x -5 to -7 would save you 143 kilobytes per extra second taken, and this number is now reduced to 105. For -7 to -8, you would save 21 kilobytes per extra second of encoding, now down to 8.


Some numerical examples: on the Acer/Ryzen, not the most expensive CPU but I got more consistent timings.
-5 takes 419 seconds for 12034842978 bytes, -7 takes 603 seconds for 11976656017 bytes. Size difference divided by time difference becomes 316 kilobytes.
But if instead we went only to -7 -l 11, that would be 535 seconds for 11982589839 bytes. Now we can calculate two sizediff/timediff ratios: -5 to -7 -l 11 becomes 449, while -7 -l 11 to -7 becomes 87.
Those are as they should be, 449 is > 87; had the order been the other way around, -7 -l 11 would have been outright bad.
-7 to -8: saves about 30k per second. Again, makes sense that this is < 87, that means we pick those two fruits in the correct order.
-8 to -8 -A "tukey(666e-3);subdivide_tukey(3/333e-3)"  (mentioned in a post above): saves about 10k per second
-8 -A "tukey(666e-3);subdivide_tukey(3/333e-3)"  to -8 -A subdivide_tukey(4): saves about 4k.
Going forth to (5) and then from there to -8p and then from -8p to -8p -A "tukey(666e-3);subdivide_tukey(3/333e-3)": around 3k at each step.
-8p -A "tukey(666e-3);subdivide_tukey(3/333e-3)" to -8p -A subdivide_tukey(4): down to half a k. Only slightly less from (4) to (5)

What about -r7 being bad (on the Ryzen)? From -8p to -8p -r7 you only save like 0.36 kilobytes per extra second. And from -8p -r7 to -8p -r8: only 22 bytes. But it looks like the Intel i7 does -r7 faster than the Ryzen does and can be worth it somewhere.
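
The sizediff/timediff ratios in the numerical examples can be recomputed with a few lines of Python. This is only a sketch using the times and sizes exactly as posted above; the helper name is made up, and since the posted times are rounded to whole seconds, small deviations from the quoted kilobyte figures are to be expected.

```python
def bytes_saved_per_second(faster, slower):
    """Each argument is (seconds, bytes); returns bytes saved per extra second."""
    t_fast, size_fast = faster
    t_slow, size_slow = slower
    return (size_fast - size_slow) / (t_slow - t_fast)

# Figures as posted for the Acer/Ryzen.
p5 = (419, 12034842978)      # -5
p7_l11 = (535, 11982589839)  # -7 -l 11
p7 = (603, 11976656017)      # -7

print(round(bytes_saved_per_second(p5, p7) / 1000))      # -5 to -7
print(round(bytes_saved_per_second(p5, p7_l11) / 1000))  # -5 to -7 -l 11
print(round(bytes_saved_per_second(p7_l11, p7) / 1000))  # -7 -l 11 to -7
```

The first step should save more per second than the second one; if the order were reversed, the intermediate setting would not be worth stopping at.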



Re: FLAC v1.4.x Performance Tests

Reply #71
Oh, and:
To speed up -7, one could of course consider dropping the subdivide_tukey(2) (that would then default to subdivide_tukey(1)=tukey(5e-1)). But that gives worse results than going down to -7 -l 11.

Then a stupid error, essentially a common factor of three due to three runs:
Quote
And finally, how do all these considerations compare to FLAC 1.3? Not tested so much, but as you can expect: the double precision improvement picks some low-hanging fruit, so going -5 to -7 is not as lucrative as it was. On the Ryzen: 1.3.x -5 to -7 would save you 143 kilobytes per extra second taken, and this number is now reduced to 105. For -7 to -8, you would save 21 kilobytes per extra second of encoding, now down to 8.
Wrong, that was the sum over three runs. Luckily, the relationships between them are all good: you can multiply them all by three (well, that gets you the mean rather than the median I have used elsewhere, but no big deal).
Quote
-5 takes 419 seconds for 12034842978 bytes, -7 takes 603 seconds for 11976656017 bytes. Size difference divided by time difference becomes 316 kilobytes.
This is correct with 1.4.1: Divide 316 by three and you get 105-ish. (Three quite even runs and roundoffs ...)

Re: FLAC v1.4.x Performance Tests

Reply #72
@john33
I tried some files from the 2L website.
http://www.2l.no/hires/
Code:
filename                              MD5 sum
-------------------------------------------------------------------------------
2L-038_01_stereo_FLAC_44k_16b.flac             80b5c0c20168c21073c95699a0bbf992
2L-064_stereo192kHz_01_08.flac                 576ed036eb1ffe78bdb40a131e6dd23f
2L-120_01_stereo.mqacd.mqa.flac                622afbe406784c4cd54226350b2289c9
2L-125_stereo-352k-24b_04.flac                 3f3ffbdb84654e7fb22767fedcfaa30e
2L-139_01_stereo.mqa.flac                      78f2e3afc697cbc4d3ff43567e968187
2L48SACD_14_stereo_96k.flac                    1fb4209b9db97a0089baf37ca5846214
The files are then converted to wav without changing the original bit-depth and sample rate on a RAM drive, then converted to flac with these settings.
X

Then I disabled and enabled AVX support in the BIOS, and CPU-Z reported that AVX, AVX2 and FMA3 are affected. Some flac builds crashed with AVX disabled, so here are the tests on some non-crashing builds:

Xiph
Total encoding time: 2:32.250, 14.12x realtime
425513526 bytes
Total encoding time: 1:12.562, 29.63x realtime
425513433 bytes

Free encoder pack
Total encoding time: 2:17.781, 15.60x realtime
425513499 bytes
Total encoding time: 1:36.328, 22.32x realtime
425513499 bytes

https://hydrogenaud.io/index.php/topic,123014.msg1016215.html#msg1016215
Total encoding time: 2:26.344, 14.69x realtime
425513526 bytes
Total encoding time: 1:13.328, 29.32x realtime
425513471 bytes

https://www.rarewares.org/files/lossless/flac-1.4.1-x64.zip
Total encoding time: 2:35.859, 13.79x realtime
425513666 bytes
Total encoding time: 2:36.781, 13.71x realtime
425513632 bytes

The rarewares build is the only one that showed almost no speed difference; is this expected?
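
To put a number on the difference, here is a short Python sketch that turns the "Total encoding time" strings into a speedup factor. It assumes the first line of each pair is the AVX-disabled run and the second the AVX-enabled one, which the order of the sentence above suggests but does not state outright; the helper names are made up.

```python
def parse_time(text: str) -> float:
    """Convert an 'm:ss.mmm' string as printed by flac into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)

def speedup(first: str, second: str) -> float:
    """How many times faster the second run was than the first."""
    return parse_time(first) / parse_time(second)

# Time pairs as posted above (assumed order: AVX disabled, then enabled).
runs = {
    "Xiph": ("2:32.250", "1:12.562"),
    "Free encoder pack": ("2:17.781", "1:36.328"),
    "rarewares generic": ("2:35.859", "2:36.781"),
}
for name, (first, second) in runs.items():
    print(name, round(speedup(first, second), 2))
```

Under that assumption the Xiph build roughly doubles its speed, while the rarewares generic build stays at about 1.0.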

Re: FLAC v1.4.x Performance Tests

Reply #73
The rarewares build is generic with no cpu optimisations so I'm not really surprised.

Re: FLAC v1.4.x Performance Tests

Reply #74
Thanks. Here are results with some AVX-only builds.

Case GCC 12.2.0
Total encoding time: 1:11.218, 30.19x realtime
425513472 bytes

http://www.rarewares.org/files/lossless/flac-1.4.1-x64-znver2-GCC1220.zip
Total encoding time: 1:13.328, 29.32x realtime
425513429 bytes

znver3
Total encoding time: 1:11.891, 29.91x realtime
425513429 bytes

http://www.rarewares.org/files/lossless/flac-1.4.1-x64-AVX2%20-GCC1220.zip
Total encoding time: 1:12.250, 29.76x realtime
425513472 bytes

Case Haswell
Total encoding time: 1:16.328, 28.17x realtime
425513511 bytes

It seems that the Ryzen builds have no compatibility issue with my Intel CPU.