Topic: FLAC v1.4.x Performance Tests

Re: FLAC v1.4.x Performance Tests

Reply #250
Here is an example where -p works better than -e:
https://www.soundliaison.com/index.php/studio-masters/856-ray-carmen-gomes-inc
The 768kHz file is free for download. --lax is required for 768kHz files.

PS H:\> measure-command{h:\flac -f *.wav --lax -8e -b16384}|select totalseconds
wrote 641447814 bytes, ratio=0.663

TotalSeconds
------------
  52.4976358


PS H:\> measure-command{h:\flac -f *.wav --lax -8p -b16384}|select totalseconds
wrote 641307624 bytes, ratio=0.663

TotalSeconds
------------
   68.536428


The additional windows are not very effective if the spectrum does not have a smooth decaying trend in the higher frequencies. Blindly using a higher subdivide_tukey(n) is just a waste of time.

PS H:\> measure-command{h:\flac -f *.wav --lax -8 -b16384 -A "subdivide_tukey(6)"}|select totalseconds
wrote 641444963 bytes, ratio=0.663

TotalSeconds
------------
  37.5487237


PS H:\> measure-command{h:\flac -f *.wav --lax -8 -b16384 -A "subdivide_tukey(5);tukey(75e-2);gauss(5e-2);blackman"}|select totalseconds
wrote 641444667 bytes, ratio=0.663

TotalSeconds
------------
  33.4128744


PS H:\> measure-command{h:\flac -f *.wav --lax -8 -b16384 -A "subdivide_tukey(5);welch;hann;flattop"}|select totalseconds
wrote 641443864 bytes, ratio=0.663

TotalSeconds
------------
   33.674331


Usually, the quickest way to reduce size is increasing -l, but setting -l too high can harm decoding speed.
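
To put the five timed runs in this post side by side, here is a minimal Python sketch that expresses each result relative to the -8e run. The timings and byte counts are copied from the console output above; the helper name relative_to is only for illustration.

```python
# Results copied from the runs above: (label, seconds, bytes written).
RUNS = [
    ("-8e", 52.4976358, 641447814),
    ("-8p", 68.536428, 641307624),
    ("-8 -A subdivide_tukey(6)", 37.5487237, 641444963),
    ("-8 -A subdivide_tukey(5);tukey(75e-2);gauss(5e-2);blackman", 33.4128744, 641444667),
    ("-8 -A subdivide_tukey(5);welch;hann;flattop", 33.674331, 641443864),
]

def relative_to(baseline, run):
    """Return (extra seconds, bytes saved, saving in percent) versus the baseline run."""
    _, base_secs, base_bytes = baseline
    _, secs, size = run
    saved = base_bytes - size
    return secs - base_secs, saved, 100.0 * saved / base_bytes

for run in RUNS[1:]:
    extra, saved, pct = relative_to(RUNS[0], run)
    print(f"{run[0]}: {extra:+.1f} s, {saved} bytes saved ({pct:.4f} %)")
```

Read this way, -8p costs about 16 extra seconds for roughly 0.02 percent, while the three -A variants land within a few kB of -8e in noticeably less time.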

Re: FLAC v1.4.x Performance Tests

Reply #251
Here is an example that -p works better than -e:
https://www.soundliaison.com/index.php/studio-masters/856-ray-carmen-gomes-inc
The 768kHz file is free for download. --lax is required for 768kHz files.
Looks like conversion from DSD made without high-frequency noise filtering, not normal PCM.

Re: FLAC v1.4.x Performance Tests

Reply #252
The file was originally recorded in DXD, then played through a Studer tape machine and re-digitized at 768kHz using an RME interface with an AKM ADC. ADCs these days are mostly multibit delta-sigma, hence the rise in noise. You can also see a faint dip at 150kHz. That is where the tape bias was: someone at ASR complained about it, and Sound Liaison filtered out the bias tone.

Read this for the whole story:
https://www.audiosciencereview.com/forum/index.php?threads/finally-music-we-can-buy-in-768-khz-sampling-rates.29544/

You can see the rise of noise when the RME interface is operating at high sample rates even when using PCM.
https://archimago.blogspot.com/2019/03/measurements-look-at-audio-ultra-high.html

Re: FLAC v1.4.x Performance Tests

Reply #253
48 kHz sample rate. Heavy (quite heavy indeed) metal.

Questions:
* For high sampling rates, 1.4.x would lead to quite impressive improvements. For CDDA, 1.4.0 "at preset N" would by and large beat 1.3.x "at preset N+1" - for high resolution, new -3 or -4 would beat old -8pe. What about for 48 kHz?
tl;dr: New -4 beat old -8p (didn't try -8ep).
* Block size. Is it a good idea to go up to the next standard block size (namely -b 4608) to more closely "maintain time per block"? (Relative to 4096 samples for 44.1 kHz.)
tl;dr: from -7 up it did improve, but at -8p the improvement was only 0.003 percent, so ... do you care?

Corpus: Took everything I had of lossless 48 kHz. foobar2000 reports 58 percent 24-bit and 42 percent 16-bit. 
Not a well-balanced corpus: Mostly heavier metal, indeed a quarter of the forty-ish GB came from one single publisher of doom/stoner samplers.

Results for 739 files, sorted by file size. All are 1.4.1 except those marked "1.3.4"
Bytes         Setting
46230128389   3b4608
46223003175   3
45103714620   1.3.4 -8b4608
45098718825   1.3.4 -8
45075623983   1.3.4 -8pb4608
45067893180   1.3.4 -8p
45048763872   4b4608
45045173684   4
44984600667   5b4608
44978677031   5
Below this line, -b 4608 will improve
44610097727   7
44607112273   7b4608
44587426732   8
44585780154   8r7
44584859496   8r8
44583580023   8b4608
44582524413   8e
44581743757   8r7b4608
44580834028   8r8b4608
44578633297   8eb4608
44572663788   8r7 -A "subdivide_tukey(5)"
44567943410   8r7 -A "subdivide_tukey(5)" -b4608
44561132601   8p
44559756763   8pb4608

Not sure if 4608 is worth the effort compared to just encoding and being done with it; the biggest impact here is 0.01 percent, but ... anyway, to the questions I raised, this test indicates the following:
* So -3 is not enough to slay 1.3.x, but it seems you don't have to go much above 44.1 to see how even low 1.4.x presets are better than anything 1.3 could accomplish.
* This is material where -r8 does matter, and that suggests that smaller block size could be advantageous.
* But still, -b 4608 improves once the predictor already is good enough, which requires 1.4.x. And with 1.4.x it does happen earlier (i.e. for lower subdivide_tukey) than for 96/24, see below. At default -5, -b 4608 was slightly harmful, but even at -7 it would help.
Actually, the biggest difference that -b 4608 made was at -8r7 -A subdivide_tukey(5); that was the only one that crossed the 0.01 percent mark. 
But then the difference is down to 0.003 percent at -8p, so ... mostly academic interest this.

However the corpus may reduce the benefit of -b 4608. At least, for 96/24 it was the classical music that benefited the most from adjusting block size, and heavier music (like here) did not benefit as much.
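
For anyone who wants to redo the arithmetic behind the "0.01 percent" and "0.003 percent" figures, here is a small Python sketch. The byte counts are taken from the flac 1.4.1 rows of the results list in this post; the dictionary layout and the function name gain_percent are my own choices, not anything from flac.

```python
# (bytes at default block size, bytes with -b 4608), flac 1.4.1 rows only.
PAIRS = {
    "-3": (46223003175, 46230128389),
    "-4": (45045173684, 45048763872),
    "-5": (44978677031, 44984600667),
    "-7": (44610097727, 44607112273),
    "-8": (44587426732, 44583580023),
    "-8e": (44582524413, 44578633297),
    "-8r7 -A subdivide_tukey(5)": (44572663788, 44567943410),
    "-8p": (44561132601, 44559756763),
}

def gain_percent(default_bytes, b4608_bytes):
    """Positive result means -b 4608 produced the smaller total."""
    return 100.0 * (default_bytes - b4608_bytes) / default_bytes

for setting, (default_bytes, b4608_bytes) in PAIRS.items():
    print(f"{setting}: {gain_percent(default_bytes, b4608_bytes):+.4f} %")
```

The sign flips between -5 and -7, and only the -8r7 -A subdivide_tukey(5) pair gets past 0.01 percent.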


High resolution coming up.
[...]
* -l 13 to -l 15 have something to them, but careful: It does not seem to be the case directly off -7 or -8. Say -8 -l 13 is not good, but -8 -A [something slow] -l 13 is. A bit of testing indicates that -l 13 starts saving space at -A subdivide_tukey(5) and -l 14 at (6).
With high-res classical music, -l 13 is the setting that improves over -7.
* -b 8192 also needs "-A [something slow]", it seems also to do harm when applied to -7 or -8 plain. But it doesn't help much here.

... for 48 kHz and 4608, it didn't need "something slow".

 

Re: FLAC v1.4.x Performance Tests

Reply #254
How about dividing your metals into two groups?
[1] A lot of fast drumming especially the higher pitched ones with strong transients (hi-hat, snare, rim shot...)
[2] Mostly heavy in guitar and bass but in general slower paced.
Does one of them benefit from a different block size than the other? (edit: including -b below 4096)

Re: FLAC v1.4.x Performance Tests

Reply #255
-e requires fairly low noise to work. Spek's spectrogram only has about 14 bits of dynamic range. The attached noisy.flac looks identical to clean.flac which can be misleading. Better use other tools like SoX to view the spectrum.

H:\>flac -f *.wav -8p
clean.wav: wrote 562708 bytes, ratio=0.425
noisy.wav: wrote 687112 bytes, ratio=0.519

H:\>flac -f *.wav -8e
clean.wav: wrote 515394 bytes, ratio=0.390
noisy.wav: wrote 687255 bytes, ratio=0.519

The advantage of -e is gone after I added some -80dB noise.
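
If you want to try something similar yourself, here is a rough Python sketch that builds a clean test tone and a copy with white noise at -80 dB. It is not how clean.wav and noisy.wav in this post were made: the 440 Hz tone, the 44.1 kHz mono 16-bit format and reading "-80dB" as RMS relative to full scale are all assumptions of mine.

```python
import math
import random
import struct
import wave

RATE = 44100          # assumed sample rate; the post does not state it
FULL_SCALE = 32767    # 16-bit PCM peak value

def make_samples(seconds=1.0, noise_dbfs=None, seed=0):
    """Return 16-bit mono samples: a 440 Hz sine at half scale plus optional white noise."""
    rng = random.Random(seed)
    sigma = FULL_SCALE * 10 ** (noise_dbfs / 20) if noise_dbfs is not None else 0.0
    out = []
    for n in range(int(seconds * RATE)):
        value = 0.5 * FULL_SCALE * math.sin(2 * math.pi * 440 * n / RATE)
        if sigma:
            value += rng.gauss(0.0, sigma)   # Gaussian noise with RMS equal to sigma
        out.append(max(-32768, min(32767, round(value))))
    return out

def write_wav(path, samples):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# Example use (writes two files into the current folder):
#   write_wav("clean.wav", make_samples())
#   write_wav("noisy.wav", make_samples(noise_dbfs=-80.0))
```

Encoding both files with -8p and -8e should then show whether the effect reproduces on your material; a pure sine is of course a much easier signal than real music.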

Re: FLAC v1.4.x Performance Tests

Reply #256
How about dividing your metals into two groups?
[1] A lot of fast drumming especially the higher pitched ones with strong transients (hi-hat, snare, rim shot...)
[2] Mostly heavy in guitar and bass but in general slower paced.
Does one of them benefit from a different block size than the other? (edit: including -b below 4096)

I took ~10 GB (238 files) from the "least distorted" end of it. 

Now higher block size is not that good:
* I had to go all the way to -A <something higher> again ...
* ... and enter a "p" in there, and 4096 rules.  Even with -8p -A <something higher>.

Everywhere in the following, -A is short for -A "subdivide_tukey(4/125e-3);tukey(7e-1);flattop".

10396908384   -7b2048
10390256536   -8b2048
10387465483   -8b2048 -A
10380212639   -8b2304 -A
10374886478   -8pb2048
10372850371   -8pb2048 -A
10370624591   -7b4608
10370154789   -7
10368208665   -8b3072 -A
10366619584   -8pb2304 -A
10364101954   -8b4608
10363963614   -8
10360997863   -8 -A
10360867169   -8b4608 -A 
<-- the only case where a different block size helps!
10359300604   -8pb3072 -A
10357464670   -8pb4608
10356736011   -8p
10354455085   -8pb4608 -A
10353968178   -8p -A

.

You see that -b 2048 degrades -8p -A "subdivide_tukey(4/125e-3);tukey(7e-1);flattop" down to worse compression than -7. 2304 is much better.

What was included: Prog, heavy post-rock and not-so-growling guitars.  Not much Iron Maiden-alike metal - well, https://zephaniahband.bandcamp.com/track/destiny was part of it (not the album, this track from a sampler).  And there is music like https://houseofmythology.bandcamp.com/track/the-power-of-love from former black metal act Ulver.  (Yes former, just listen.)
And a couple of bootleg albums too (here is a long one, Nine Inch Nails & David Bowie: https://ninlive.com/shows/1995/19951011.html) and a couple of vinyl rips. 

What was omitted: to give you an idea, https://bspliveseries.bandcamp.com/track/you-write-your-name-in-my-skin-live-2 .  Yes the drum machine provides for some transients, not all slow - but the guitar is not strummed in fast succession.  Lots of slow heavy music in the remaining 30 GB where -b 4608 could be of help.

Re: FLAC v1.4.x Performance Tests

Reply #257
Thanks, good to have some data.

Re: FLAC v1.4.x Performance Tests

Reply #258
Yeah, your results with lower block sizes [table below now includes a couple of -b 3456 too] have puzzled me a bit in general, as I only very rarely experience the same. So I tried a few more settings and found out that on this corpus, -r7 and even -r8 have some impact. That should indicate that by halving the block size, you get for free a "better" partitioning for Rice'ing - but that is not enough by far: halving block size inflates files by around .15 percentage points (tenfold what the -A thing improves!) Hm ... have you tried, whenever lower block size improves, whether -r7 or even -r8 would make for the same benefit?

Some more tests - to compare with other "minor impact" changes in parameters - are filled in to the table below.
Also some tests are not included in the table, as I scripted them with, *cough* different padding and I didn't bother to go back and re-do it. They are not comparable with the table, but they are comparable "with each other", so I give size orderings (worse to better):
-8eb2304  >  -8eb2304 -A  >  -8eb8192  >  -8eb8192 -A  >  -8eb4608  >  -8e  >  -8eb4608 -A  >  -8e -A  >  -8eb4608 -A
Then for -3/-5, it seems that 3000s fare well, but default is never far from best.
Then for -2, ordering: b512 > b1024 > b1152 default > b8192 with --lax > b3456 > b4608 > b2048 > b3072 > b3072 > b4096 > b2304.  I guess those who contemplate -l 0 have other concerns than .15 percentage points size impact though.

Table then, this time with compression ratios. Again the "-A" signifies -A "subdivide_tukey(4/125e-3);tukey(7e-1);flattop".

Ratio      Setting
100,000%   .wav
63,004%    -8 --no-mid-side (i.e. dual mono)
62,075%    -5b8192 --lax
62,026%    -5b2048
61,971%    -5
61,660%    -8Mb2048
61,615%    -8r2 -b2048
61,601%    -7b2048
61,573%    -8r4 -b2048
61,561%    -8b2048
61,552%    -8r8 -b2048
61,545%    -8b2048 -A
61,507%    -8r2
61,505%    -7 -l 11
61,502%    -8b2304 -A
61,489%    -8M
61,470%    -8pb2048
61,466%    -8b8192 --lax
61,462%    -7b3456
61,458%    -8pb2048 -A
61,445%    -7b4608
61,442%    -7
61,431%    -8b3072 -A
61,429%    -8b3456
61,422%    -8r4
61,421%    -8pb2304 -A
61,413%    -8b3456 -A
61,406%    -8b4608
61,406%    -8
61,403%    -8r7
61,400%    -8r8
61,388%    -8 -A
61,387%    -8b4608 -A
61,378%    -8pb3072 -A
61,367%    -8pb4608
61,365%    -8pb3456
61,363%    -8p
61,349%    -8pb4608 -A
61,346%    -8p -A
61,342%    -8r9 -l 13 --lax

The table includes dual mono and -M (which selects decorrelation strategy adaptively) and carelessly I did not think over some of it being mono already - but that material (namely the NIN+Bowie bootleg) amounts only to 4.5 percent of the total file size.
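
To make the "percentage points" comparison concrete, here is a minimal Python sketch using a handful of ratios from the table in this post (decimal commas turned into decimal points). The selection of rows and the name delta_points are mine.

```python
# Compression ratios in percent of the .wav size, copied from the table above.
RATIO = {
    "-8": 61.406,
    "-8b2048": 61.561,
    "-8 -A": 61.388,
    "-8r8": 61.400,
    "-8p": 61.363,
}

def delta_points(setting, baseline="-8"):
    """Difference to the baseline in percentage points (positive = bigger files)."""
    return round(RATIO[setting] - RATIO[baseline], 3)

for name in RATIO:
    print(f"{name}: {delta_points(name):+.3f} points versus -8")
```

Halving the block size costs 0.155 points at -8, while the -A setting gains 0.018, so "tenfold" is the right order of magnitude.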

Re: FLAC v1.4.x Performance Tests

Reply #259
I compiled a CDDA list that I thought would work best with -b3456 without using --lax. Turns out -b2304 won. They are mostly J-pop, not necessarily very fast but in general have percussion with good transients and not too much reverb.
https://youtu.be/wZ6D6ikU7Qs
https://youtu.be/L-0cJqZ5WU4
https://youtu.be/MM8RufZr5lw
https://youtu.be/pYnLO7MVKno
Total length 8h56m42s, around 63.98% of original size when compressed.

-8b4608 -r8
3635937892 bytes

-8b2048
3635016402 bytes

-8
3634663623 bytes

-8 -r8
3634586639 bytes

-8b4608 -r8 -A subdivide_tukey(5/2e-1)
3634481652 bytes

-8b3456
3634303829 bytes

-8b3456 -r8
3634260975 bytes

-8b3072 -r8
3634036597 bytes

-8p -b4608 -r8
3633576341 bytes

-8b2304
3633523804 bytes

-8b2304 -r7
3633489680 bytes

-8b2304 -r8
3633480956 bytes

-8b3456 -r8 -A subdivide_tukey(5/2e-1)
3633136956 bytes

-8b2304 -r8 -A subdivide_tukey(5/2e-1)
3632658660 bytes

Looks like 1024 and 1152 based block sizes are not really correlated to sample rates, otherwise -b2048 should not perform this badly. The -b3456 result in my previous test may have involved material with fewer transients. If you still want to try CUETools.Flake, -8 --vbr 4 will be more effective with -r 8 and -s search.

Re: FLAC v1.4.x Performance Tests

Reply #260
Why -8b2304 is so much better than -8b2048, putting them on opposite sides of the 3456/4096 ...
... probably we are in for another brute force.

Edit: Also -r8 helps -8, much more than it helps -8b<lower>, "the 8th r" doesn't help that much over the seventh ...

Re: FLAC v1.4.x Performance Tests

Reply #261
I compiled a CDDA list that I thought would work best with -b3456 without using --lax. Turns out -b2304 won. They are mostly J-pop, not necessarily very fast but in general have percussion with good transients and not too much reverb.
https://youtu.be/wZ6D6ikU7Qs
https://youtu.be/L-0cJqZ5WU4
https://youtu.be/MM8RufZr5lw
https://youtu.be/pYnLO7MVKno
Total length 8h56m42s, around 63.98% of original size when compressed.
[...]
If you still want to try CUETools.Flake, -8 --vbr 4 will be more effective with -r 8 and -s search.
Files were transcoded from APE, FLAC and WavPack instead of wav, multi-threaded.

flac 1.4.2

-8b2304 -r8 -p
Total encoding time: 2:05.453, 256.68x realtime
3630058884 bytes

-8b2304 -r8 -pe
Total encoding time: 24:46.406, 21.66x realtime
3629421254 bytes

CUETools.Flake 2.2.2 (MD5 mismatch in 2 files)

-8 -r 8 -b 4608
3635394735 bytes

-8 -r 8 -b 2048
3635167153 bytes

-8 -r 8
3634138483 bytes

-8 -r 8 -b 3456
3633820714 bytes

-8 -r 8 -b 3072
3633680354 bytes

-8 -r 8 -b 2304
3633559971 bytes

The fixed block sizes tests above all have around 500x speed.

-8 -r 8 --vbr 4
Total encoding time: 1:26.344, 372.94x realtime
3627508218 bytes

-8 -r 8 --vbr 4 -s search
Total encoding time: 3:54.438, 137.35x realtime
3625410640 bytes

So vbr is an amazing thing... if there is no MD5 mismatch. Perhaps just scan for integrity after encoding, and re-encode the corrupted files with another encoder. In fact, after seeing this glitch I even scan files encoded with the Xiph encoders, if I am going to delete the original.
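
The "scan for integrity after encoding" step could be scripted roughly like this in Python. It is only a sketch: it assumes the reference flac executable is on the PATH and relies on its -t (test) and -s (silent) options; the function names and the folder layout are made up for the example.

```python
import subprocess
from pathlib import Path

def build_test_command(path, flac_exe="flac"):
    """Command line that asks flac to decode-test one file (-t) quietly (-s)."""
    return [flac_exe, "-t", "-s", str(path)]

def find_bad_files(folder, flac_exe="flac"):
    """Return the .flac files under folder for which the flac test run exits non-zero."""
    bad = []
    for path in sorted(Path(folder).rglob("*.flac")):
        result = subprocess.run(build_test_command(path, flac_exe))
        if result.returncode != 0:
            bad.append(path)   # candidate for re-encoding with another encoder
    return bad

# Example use:
#   for path in find_bad_files("encoded"):
#       print("re-encode:", path)
```

This only checks each file against its own stored MD5 and frame checksums. It does not compare against the source audio, so a bit-compare against the originals is still the stronger test before deleting anything.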

Re: FLAC v1.4.x Performance Tests

Reply #262
Encoding from .ape will skew the timings, as .ape takes even longer time decoding than encoding. But as long as conditions are equal for each run, cardinal time figures are nothing but indications anyway, in this thread where the number of compiles x CPUs probably match the number of FLAC options humans have ever hand-coded ...

In fact, after seeing this glitch I even scan files encoded with the Xiph encoders, if I am going to delete the original.
I always do. What if the process is aborted for whatever stupid reason, leaving a partial file?
Sure there is the -V , but for mass conversion: running a foo_bitcompare on all, to obtain one single line saying no differences, that is more idiot-proof than a human (myself) reading flac.exe's output.

Of course until fb2k v2 & foo_bitcompare are updated to treat 32-bit integer losslessly (that isn't the case yet I think?!) one has to use a different approach for those ... but then they aren't many. I don't have music in that format.

(And even with that zealous attitude of mine ... the first floating-point .wav's I downloaded, were Audition's format, and I should have WavPack'ed them using official wavpack.exe rather than through foobar2000.)

Re: FLAC v1.4.x Performance Tests

Reply #263
-0 does not pick the fastest block size!  Test done on CDDA with official 1.4.1 x64.

Since different block sizes have been tested, I did that for -0 and -2. Recall the difference between those is that -0 does dual-mono while -2 brute-forces the stereo decorrelation strategy. Both have an implicit -r3 which I have not touched.
I took all multiples of 512 and 576 up to 4608. Recall that -0 to -2 use 1152=2*576.

Computer: my friend's Ryzen-equipped not-so-expensive Acer consumer laptop, which delivers more consistent results than my Intels.
Corpus: As in my signature ... well nearly: by mistake one file had a second copy. 39 CDs. But I corrected sizes, they are 38.

Timings are median of 6 runs (i.e. average between the two middle ones); first I did each setting separate, three runs; then I did three runs of -0b512, three of -0b576 etc.
Results, thanks to https://theenemy.dk/table/ , are sorted by time.
"BS" - for "BigSlow" but surely intended to be read as "bullshit" yes - indicates it is both bigger & slower than the one immediately above. "bs" indicates that if all the "BS" were removed, this would become a BS.
Setting   Time     Bytes            Mark
0b3072    275,19   13 304 034 333
0b3456    277,32   13 304 683 018   BS
0b2880    277,32   13 304 059 146   bs
0b2560    278,28   13 305 036 984   BS
0b2048    278,28   13 303 957 236
0b2304    280,21   13 301 667 605
0b1728    283,91   13 316 001 267   BS
0b1536    288,60   13 321 899 365   BS
0b1152    292,01   13 332 311 287   BS
0b1024    293,64   13 342 207 656   BS
0b4096    295,91   13 304 264 090   bs
0b4608    299,06   13 307 182 585   BS
2b1728    298,27   12 724 538 882
2b1536    301,65   12 730 340 579   BS
2b2880    301,34   12 713 304 251
0b0512    301,34   13 442 182 994   BS - well really, this is a "-0" down in the "-2" bunch
2b2560    302,90   12 714 138 123   bs - really a capital BS, only "saved by" the "-0"
2b3072    302,78   12 713 394 065   bs
2b2304    304,22   12 710 560 120
2b1152    302,78   12 740 658 828   BS
2b3456    303,08   12 714 271 305   bs
2b2048    308,75   12 712 682 405   bs
0b0576    317,19   13 418 999 908   BS - another "-0" here
2b1024    323,56   12 750 559 594   bs - this too saved from "BS" by a "-0"
2b0512    327,24   12 851 045 240   BS
2b0576    357,17   12 827 742 003   bs
2b4608    385,73   12 717 232 770   bs
2b4096    386,28   12 714 139 624   bs

Inferences:  Well I don't really believe this to be any universal truth. Why should -0b2048 be faster than -0b2304 while -2b2048 is slower than -2b2304? But some patterns are obvious. For example, the "extremes" are quite slow, and not the best. 
* The smallest files are obtained with -b2304 in both settings. I guess the only reason for -2 is to avoid using multiplication while still squeezing more bytes out, so ... there you go. And -b2304 was only a couple of percent off the fastest block size as well.
* -0 to -0b2304 saves 4 percent time and a quarter percent size. -0 to -0b3572 (tripling block size) saves six percent in speed then.

I have no idea why -2b4096 and -2b4608 are that slow.
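
The "BS" / "bs" marks in this post can be reproduced mechanically. Below is a minimal Python sketch of my reading of the rule: walking down the list in the order given, an entry is "BS" if it is bigger than the entry right above it, and "bs" if it is not, but is still bigger than the smallest entry seen so far. The function name classify is hypothetical, and only the first six rows of the results are used to keep it short.

```python
# First six rows of the results in this post: (setting, time, bytes), in listed order.
ROWS = [
    ("0b3072", 275.19, 13304034333),
    ("0b3456", 277.32, 13304683018),
    ("0b2880", 277.32, 13304059146),
    ("0b2560", 278.28, 13305036984),
    ("0b2048", 278.28, 13303957236),
    ("0b2304", 280.21, 13301667605),
]

def classify(rows):
    """Mark each row '', 'BS' or 'bs'; rows must already be ordered fastest first."""
    marks = []
    previous = None   # size of the row immediately above
    best = None       # smallest size among all faster rows
    for _name, _time, size in rows:
        if previous is not None and size > previous:
            marks.append("BS")   # bigger and slower than the row right above
        elif best is not None and size > best:
            marks.append("bs")   # would be BS once the BS rows above are removed
        else:
            marks.append("")     # smaller than everything faster
        previous = size
        best = size if best is None else min(best, size)
    return marks

for (name, _time, _size), mark in zip(ROWS, classify(ROWS)):
    print(name, mark)
```

The unmarked rows are exactly the ones where going slower actually buys a smaller file.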

Re: FLAC v1.4.x Performance Tests

Reply #264
Block size impact on "-5" and "-7" speeds. (Atop -5 because -5 is default, atop -7 because -7 is good.)

Damn me I forgot to record sizes on this one, but a couple of manual checks indicate that nah, nothing here that is worth it in cost/benefit terms.
But "interesting" it may be, even if the time impact from default is just a few percent saved - and -b 3456 seems to be fastest both at -5 and -7.

Same computer, compile, files (CDDA) and setup as the previous posting, but this time it is only median of three runs, and this time I managed to remove the 39th. Sorted by time:

5b3456   385,554
5b3072   386,659
5b2880   389,014
5b2304   390,926
5b1728   391,627
5b2560   391,999
5b2048   393,695
5b1536   395,008
5b1152   403,659
5b4608   411,612
5b4096   413,726
5b1024   414,359
5b0576   457,707
5b0512   466,306

7b3456   566,989
7b3072   571,83
7b2880   571,947
7b2560   576,015
7b2304   581,654
7b2048   589,71
7b4608   590,918
7b4096   592,326
7b1728   594,442
7b1536   605,249
7b1152   636,17
7b1024   650,702
7b0576   777,917
7b0512   805,091


So whoever came up with -b3456 (@bennetng, I think?) might have a bonus in this.
However, I never got -b3456 to produce the best compression - and to whomever came up with -7 -l 11 to speed up -7 slightly (@sundance , I think?), that one made for both faster encode and smaller files than -7b3456 in this corpus.

Re: FLAC v1.4.x Performance Tests

Reply #265
I don't have enough RAM disk space to benchmark encoding, data listed here are decoding speed, using foo_benchmark and single thread in RAM disk. Previous tests indicate relative decoding speed vs encoding parameters can be highly CPU-dependent, so keep this in mind.


Some people may choose -0 because it decodes faster.

-0
8618108022 bytes, 1797.101x realtime

-0b2048
8603509098 bytes, 1836.416x realtime

-0b2304
8602686728 bytes, 1841.398x realtime

-0b3072
8605905301 bytes, 1845.695x realtime

-0b3456
8606959660 bytes, 1838.831x realtime

-0b4096
8607581496 bytes, 1851.142x realtime

-0b4608
8610113289 bytes, 1857.613x realtime

-3 and -8 are the lowest and highest presets that default to -b4096. As for --no-mid-side, most, if not all, of the test materials are in normal stereo.

-3b1152
8214442723 bytes, 1454.438x realtime

-3b2048
8179862763 bytes, 1523.949x realtime

-3b2304
8177409458 bytes, 1523.017x realtime

-3b3072
8178587836 bytes, 1508.774x realtime

-3b3456
8179963450 bytes, 1523.766x realtime

-3
8182244258 bytes, 1517.872x realtime

-3b4608
8186319940 bytes, 1524.288x realtime

At -8 I think there is no need to test anything below -b2048, except for Merzbow fans.

-8b2048
7951523906 bytes, 1377.707x realtime

-8b2304
7945563198 bytes, 1360.829x realtime

-8b3072
7938910470 bytes, 1350.458x realtime

-8b3456
7936429482 bytes, 1323.356x realtime

-8
7932854429 bytes, 1331.976x realtime

-8b4608
7932931331 bytes, 1320.299x realtime

At last, -5:

-5b2048
7987232470 bytes, 1420.886x realtime

-5b2304
7983765417 bytes, 1417.812x realtime

-5b3072
7983025331 bytes, 1423.270x realtime

-5b3456
7983619874 bytes, 1410.768x realtime

-5
7984866814 bytes, 1408.704x realtime

-5b4608
7988286008 bytes, 1408.448x realtime

Decode from RAM disk (what I did in this test) is still slower than "Load whole file into memory first" with around 1400-1500x decoding speed at -8 and 2000-2100x at -0, but I don't have enough RAM to do this.

Corpus total length 23h20m25s, around 53.52% at -8 to match ktf's graphs.

The playlist is deliberately built to achieve balance, with classical, electronic, ethnic, jazz, new age, pop, speech etc, including Eastern and Western works. The attached corpus.txt is not very well organized and shows a lot of "game music", but those are mostly individual tracks from different albums, while some other files are big images that do not show track names. Anyway "game music" is just all kinds of genres used in games, except they don't have too many vocals. This highly deliberate effort gave the intended results at -8 in terms of file sizes I suppose. The lower presets are most likely limited by -l and -r, but -b1152 is still too low for the lower presets.

Re: FLAC v1.4.x Performance Tests

Reply #266
In fact, after seeing this glitch I even scan files encoded with the Xiph encoders, if I am going to delete the original.
I always do. What if the process is aborted for whatever stupid reason, leaving a partial file?
[...]
(And even with that zealous attitude of mine ... the first floating-point .wav's I downloaded, were Audition's format, and I should have WavPack'ed them using official wavpack.exe rather than through foobar2000.)
I have many WavPack files directly saved with Audition without going through wav. WavPack saves markers and loops and they can be read by other software like Sound Forge and Reaper, therefore I also have 16 and 24-bit WavPack files.

Re: FLAC v1.4.x Performance Tests

Reply #267
I don't have enough RAM disk space to benchmark encoding, data listed here are decoding speed
[...]
Decode from RAM disk (what I did in this test) is still slower than "Load whole file into memory first" with around 1400-1500x decoding speed at -8 and 2000-2100x at -0, but I don't have enough RAM to do this.
Corpus total length 23h20m25s, around 53.52% at -8 to match ktf's graphs.
Even CUETools.Flake is happy with the corpus and showed no error.

-8 --vbr 4
7925706124 bytes, 1317.942x realtime

-8 -r 8 --vbr 4
7925683221 bytes, 1333.292x realtime

Adding -s search makes the encoding speed comparable to -8p in flac 1.4.2 and therefore not tested.

Re: FLAC v1.4.x Performance Tests

Reply #268
Some people may choose -0 because it decodes faster.

Just to have repeated this for the record: in ktf's test done on an AMD processor, -3 decodes faster than -0.

Also -1 and -2 are -0M and -0m respectively, so the only difference in decoding are the transformations from mid+side / left+side / right+side to dual mono - and potentially the following: In the above link, we see that -2 decodes slightly faster than -1, and that is likely due to better compression and thus less data to handle and unpack (what else could it be?)

In your test, -3 is slower than -0, so this is where your Intel behaves ... not like AMD ;-)


Then:
-3: It seems that 1152 is slow and everything else is about equal. Don't know how much variation you would get by a re-run.
-8: I don't know why lower block sizes are faster here, but it might be that they use less complicated Rice partitioning - that is, 4096 could get another subdivision of two, compared to 2048? Just thinking aloud.

Re: FLAC v1.4.x Performance Tests

Reply #269
So I realize there's a very steep wall of diminishing returns when using encoding options beyond simply using -8.
I randomly tested this about a week ago.  Using Nine Inch Nails - The Fragile, full album as a single wave file.

CPU = AMD 5850U
Original wave size = 1046.52 MiB
-8 = 20 seconds, 628.39 MiB  (60.04%)
-m -b 4096 -p -r7 -l18 -A subdivide_tukey(21/15e-1) = 283 minutes, 626.29 MiB  (59.84%) -0.02%
-m -b 4096 -p -e -r7 -l24 -A subdivide_tukey(7) = 564 minutes, 625.95  (59.81%) -0.23%
-m -b 4096 -p -e -r7 -l32 -A subdivide_tukey(21/15e-1) = 8972 minutes, 625.47 MiB  (59.76%) -0.37%

6.25 days to shave off just under 3 MiB  compared to just using -8!

Re: FLAC v1.4.x Performance Tests

Reply #270
-m -b 4096 -p -e -r7 -l32 -A subdivide_tukey(21/15e-1) = 8972 minutes, 625.47 MiB 
8972 minutes! That is what i call a performance test  8)

Re: FLAC v1.4.x Performance Tests

Reply #271
So I realize there's a very steep wall of diminishing returns when using encoding options beyond simply using -8.
You can argue for "beyond -5" and "beyond -7" there as well. Try those!

Yes there is no "practical limit" to how slow you can get the encoding. I have also once run it over a few days I was away (the -A enough-for-five-days line here). Your test was on a high resolution thing (otherwise the -l would have called you to invoke --lax), where -e often makes more difference than to CDDA.
Still, try the following, which won't take nearly as much time:
-8p
-8e
-m -b 8192 -r9 -l 15 -A "tukey(7e-1);punchout_tukey(4);subdivide_tukey(12);welch"
-m -b 8192 -r9 -l 15 -A "tukey(7e-1);punchout_tukey(4);subdivide_tukey(12);welch" -p
-m -b 8192 -r9 -l 15 -A "tukey(7e-1);punchout_tukey(4);subdivide_tukey(12);welch" -e
and see how they compare.
If it does good ... well how to tell? Need to run all possible combinations brute-force. That is what takes time.


-m -b 4096 -p -r7 -l18 -A subdivide_tukey(21/15e-1) = 283 minutes, 626.29 MiB  (59.84%) -0.02%
-m -b 4096 -p -e -r7 -l24 -A subdivide_tukey(7) = 564 minutes, 625.95  (59.81%) -0.23%
-m -b 4096 -p -e -r7 -l32 -A subdivide_tukey(21/15e-1) = 8972 minutes, 625.47 MiB  (59.76%) -0.37%
There was something very unreasonable about your differences, as you earned only half a megabyte between the two latter. And indeed you have not quoted them correctly. They are
-0.20 not 0.02
-0.23 yep
-0.27 not 0.37. Well it is really 0.28 after roundoff.
So going from slow to ultra-slow gives you slightly less than 0.08 - not the mighty 0.37-0.02=0.35 your numbers could suggest.

(Oh, but it is percentage points - you get nearly half a percent ;) )
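
The corrected figures are easy to recompute from the sizes quoted above. A minimal Python sketch; the short labels for the three slow runs are mine:

```python
WAV_MIB = 1046.52   # original wave size from the quoted post

# Encoded sizes in MiB from the quoted post.
SIZES_MIB = {
    "-8": 628.39,
    "-l18 run": 626.29,
    "-l24 run": 625.95,
    "-l32 run": 625.47,
}

def percent_of_wav(size_mib):
    """Encoded size as a percentage of the original .wav size."""
    return 100.0 * size_mib / WAV_MIB

def points_saved_vs_8(size_mib):
    """Percentage points of the .wav size saved relative to plain -8."""
    return percent_of_wav(SIZES_MIB["-8"]) - percent_of_wav(size_mib)

for label, size in SIZES_MIB.items():
    print(f"{label}: {percent_of_wav(size):.2f} % of wav, "
          f"{points_saved_vs_8(size):.2f} points better than -8")
```

Relative to the -8 file itself rather than the .wav, the slowest run saves 100 * (628.39 - 625.47) / 628.39, which is a bit under half a percent.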

Re: FLAC v1.4.x Performance Tests

Reply #272
Apologies for the typos on the percentages.  My window to edit my post closed before I realized.  It's CDDA audio, so I should have specified the --lax option was used.

I had run other random tests, but only included a few.  I find that running either -e or -p in combination with subdivide_tukey gives better compression results than using -e and -p combined, while being faster.  I also noticed using higher values can sometimes result in worse compression than using smaller values.

Re: FLAC v1.4.x Performance Tests

Reply #273
Oh, I forgot that The Fragile is a double CD  :-[

I find that running either -e or -p in combination with subdivide_tukey gives better compression results than using -e and -p combined, while being faster.
My experience is that for CDDA, you can outdo -e in shorter time by subdivide_tukey around 5. Above that, go for -8p. And that -8p -A subdivide_tukey(reasonably high) still will outdo -8pe.
For high resolution, the -e still isn't useless.

The thing is, brute-forcing "roundoffs" (well -e tries to put the upper coefficients equal to zero, in which case you don't have to store them, that saves space) does not only find the best trade-off between the bits the coefficients take up and the bits the coefficients will save - it also "unpredictably" moves the coefficients somewhere in a direction which every now and then is "better without us knowing why before actually calculating it". Brute-forcing means to actually go through the encoding for a lot of different (possibly only-slightly-different) coefficient vectors, and even if the resolution doesn't really save much, it could "by chance" be better. And even more so if FLAC's way of "guesstimating first to pick the best which is then calculated thoroughly" is not-so-good - and that has evidently been tested on CDDA.

I also noticed using higher values can sometimes result in worse compression than using smaller values.
Higher number in subdivide_tukey(N)? That also changes the effective tapering, which does not necessarily make things better or worse, but if you do many tests you should see both directions. Your 21/15e-1 means that the "full tukey" part of it has a taper of 1.5/21 = 1/14, which is around 0.07, and in my tests that would be too close to a rectangle. Hence my suggestion to include a separate tukey with more tapering.
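
As a quick check of that taper arithmetic, here is a tiny Python sketch. It simply takes the reading given in this post, that subdivide_tukey(N/P) leaves the full-block window with a taper of P/N, and applies it to settings used in this thread; the function name is made up and nothing here calls into flac.

```python
def full_window_taper(n, p):
    """Taper fraction of the full-block window for subdivide_tukey(n/p), per the post above."""
    return p / n

# Settings that appear in this thread, written as (N, P).
for n, p in [(21, 1.5), (5, 0.2), (4, 0.125)]:
    print(f"subdivide_tukey({n}/{p}): full window taper {full_window_taper(n, p):.3f}")
```

All three come out well below the 0.7 of a separate tukey(7e-1), which is the point of adding that window explicitly.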

Re: FLAC v1.4.x Performance Tests

Reply #274
And just to get some idea on what makes differences on The Fragile:
* Three MB saved -5 to -7.
* Another half a MB-ish saved -7 to -8.
* Another half a MB-ish -8 to -8p. (Midway in between: -8r7.)
* Another half a MB-ish -8p to -8pr8
* Another half a MB-ish for stacking up with -A "subdivide_tukey(12);tukey(7e-1);punchout_tukey(4);welch;hann;flattop"