FLAC v1.4.x Performance Tests


Re: FLAC v1.4.x Performance Tests

Reply #450 – 2023-10-31 22:49:27

Quote from: Porcus on 2023-10-26 23:38:56

-e and the model selection algorithm. Evidence from upsamples (and ... higher block sizes)

Block sizes, then. Yes, and more -e as well. (And adding a blackman function, at @bennetng 's suggestion.)

TL;DR: for upsampled material,

-b8192 is good on upsamples. Improves all the albums at 192 kHz and 33 of 38 at 96 kHz - relative to default -b4096.
But going further to -b16384 might be harmful, by up to 0.4 percentage points. True, these large blocks benefit 10 of 14 classical albums (and for classical music you might consider increasing -l as well, which I have not tested rigorously) - but that doesn't save many bytes.
Actually, even at as high as 576 kHz (still upsampled from CDDA, yes), -b16384 might be worse than -b8192 - but then -e matters so much that one should test whether -eb16384 improves over -eb8192 ... if one were at all interested in tuning anything to signals that have three and a half octaves empty on top. (Well, we know that vendors are doing that sort of nonsense!)

A summary of compression ratios:

Only every other artist labeled by name. The bottom curves reveal that the settings don't matter that much, but for some albums it isn't insignificant in percents.
The apodization chosen for the "-A ..." is -A "subdivide_tukey(3);blackman", i.e. like -8 with an additional blackman. We shall see the impact of that extra function.

Some albums in particular:

The leftmost is the harpsichord. Hard to compress. No surprise.
The third one, where 96 kHz compresses to 28.7, nearly as much as CDDA at 31.3, is Bruckner: Motets. Vocals!
Not surprisingly, metal is hardest to compress. "Worst" is the Sodom live album - with crowd noise and all that. It shall turn out that it is the one that benefits the most from -e when upsampled.
Also the first among the "other" section, Tori Amos, requires a high bitrate. Biased choice: I chose that particular Tori Amos album over a "huh, is that supposed to be that hard to compress?" Then the Armand Van Helden techno also takes some bits to compress.
The Miles Davis album doesn't have that much stereo separation, so no surprise the ratio is low.

Not a single subframe encoded as VERBATIM. Some CONSTANT subframes that are absolutely not 0, an artefact of the resampling.

The impact of block size:
First, to see if the resampling to 24-bit by itself makes a big impact, I also did a resample to a frequency not that far from 44.1: 64 kHz, 24 bits. (That is where the surprise of the -e came in, more about that later.)

It isn't so that 44.1/16 -> 64/24 "changes everything": Sure the audio format makes a quantitative impact, sure the upsampling makes b2048 worse and high block sizes less harmful - but the "shape of the graphs" is not too different.

Explanations and remarks:

"-8 vs +blackman": All but the grey are done with -A "subdivide_tukey(3);blackman". The grey doesn't apply blackman and is slightly bigger. Not much, you can see.
Classical music aggregate: Already at CDDA, -b8192 improves over -b4096. For 64 kHz: coincides with the blue -b16384, that is why it isn't visible. So there is no reason to use 16384 here.
The eternal issue of % vs points improvement? My files aren't uncompressed, so "points" is then kinda irrelevant. And the corpus is split about evenly between the three sections, measured by compressed size. Yet I used points here. In a way it overweights the classical music in particular.

.

More severe upsampling. Still sticking to percentage points, though that might not be the best metric.
Also I forgot the following, which I have sometimes suggested: when doubling block size, one could - to compare apples more closely to apples - try to maintain partition size by increasing the -r. So, -b4096 -r6 (as -8 does), -b8192 -r7, -b16384 -r8. I did not do that!
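For anyone wanting to check the -b/-r pairing suggested above: a minimal sketch of the arithmetic, assuming the smallest Rice partition is simply the block size divided by 2 to the power of the maximum partition order. The function name is made up for illustration.

```python
def min_partition_samples(block_size: int, max_rice_order: int) -> int:
    """Smallest Rice partition, in samples, when a block is split 2**order ways."""
    if block_size % (1 << max_rice_order) != 0:
        raise ValueError("block size must be divisible by 2**order")
    return block_size >> max_rice_order


# The three pairings from the post: doubling -b while adding one to -r
pairs = [(4096, 6), (8192, 7), (16384, 8)]
for b, r in pairs:
    print(f"-b{b} -r{r}: partitions as small as {min_partition_samples(b, r)} samples")
```

All three pairings come out at 64 samples, which is the sense in which they keep the partition size fixed.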

At 192, classical music benefits 0.06 percentage points from -b16384 vs -b8192. It is a bit more in percent, when it compresses to 20. But the "other" section is harmed by 0.07 (and compresses to 22). And the heavier music section, which compresses to 30, is harmed by 0.05.
But, every album at 192 and 33 of 38 albums at 96, benefit from 8192.
At 192, you see the extra blackman starts to matter a little bit, relatively speaking - at least for the high bitrate albums, not surprised to find the impact there.
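To make the points-vs-percent distinction concrete, here is a small sketch that converts the figures quoted above. It assumes "compresses to 20" means the compressed size is about 20 percent of the uncompressed size, and it takes the rounded numbers from the post at face value; the function name is hypothetical.

```python
def points_to_percent(points: float, ratio_percent: float) -> float:
    """Convert a change in percentage points (of uncompressed size)
    into percent of the compressed file size."""
    return 100.0 * points / ratio_percent


# (gain in points from -b16384 vs -b8192, approximate compression ratio in percent)
# Negative gain means -b16384 made the files bigger.
cases = {"classical": (0.06, 20.0), "other": (-0.07, 22.0), "heavier": (-0.05, 30.0)}
for name, (points, ratio) in cases.items():
    print(f"{name}: {points:+.2f} points is {points_to_percent(points, ratio):+.2f} percent")
```

The same number of points weighs more on material that already compresses well, which is why the classical section gets overweighted when points are used.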

.

Impact of -e. Why bother?
First, confirming that -e makes a difference you don't see in CDDA.
Second: If the model selection does badly for a certain block size, then improving the algorithm could overturn everything above.

The 64 kHz had this exceptionally small impact of -e - even, in some instances producing bit-identical files - so it is not (only) the resampling procedure that makes for the -e impact:

More severe upsampling again. When upsampled an octave or two, graphs start to wiggle big time:

These are percentage points. "Triple-ish magnitude" in the 192 kHz chart is for the most bad-ass figures; for the Sodom album, -8e is seven percent smaller than -8. However with -b8192 -A "subdivide_tukey(3);blackman" it is down to slightly above five. And furthermore, -b8192 is better than -b4096 both with and without -e and the -e benefit is smaller.

Re: FLAC v1.4.x Performance Tests

Reply #451 – 2023-11-01 00:07:34

More block size testing. This time on much smaller material I obtained as 96/24 (well some percent were 88.2/24). Not saying the high resolution isn't "fake".

TL;DR:
Depends on corpus, especially if you are looking for a "winner" - but 8192 seems like a safer bet, always within plus/minus 0.4 percent of 4096.
Block size 16384 might perform outright badly, like > 1 percent worse than 4096 and 0.9 percent worse than 8192 - and its benefits are predictably small: it only halves the already low block overhead of 8192.
Impacts are not that different with -e or -p.
-p or -e best? For most files, -p makes for smaller files than -e, but the exceptions could have much bigger impact.

Settings:

Tried -7b<block size> with and without -e, and with -r as follows (which was not done in the previous post, as I forgot all about it then): -b4096 keeps its -r6, partition allowed to be as low as 64 samples; -b8192 gets -r7 which then also allows partitions as low as 64 samples; -b16384 gets -r8 for same reason
Same but with -A subdivide_tukey(5) for better compression, with and without -p or -e (but not both).

Figures based on the latter. FLAC build used: https://hydrogenaud.io/index.php/topic,123176.msg1033168.html#msg1033168 with multi-threading.

Corpus and results:
* 7 first tracks merged to one file from Kimiko Ishizaka: Die Kunst der Fuge for solo piano, 96/24 free download from here.
Block size 16384 saves 0.04% over 8192.
* 7 first tracks merged to one file from Nine Inch Nails: The Slip. 96/24 free download.
4096 best, 8192 costs 0.02%, 16384 costs 0.23%
* 7 first tracks merged to one file from Kayo Dot: Hubardo. 96/24 purchase from here, as used in https://hydrogenaud.io/index.php/topic,120158.msg1003288.html#msg1003288
8192 best, others cost 0.6 to 0.7 percent.
* EP merged to one file: The Tea Party Tx20. 96/24 purchase. As mentioned in the same thread.
@bennetng tells me it must be intentionally clipped for rawer sound, and this is one where I know that 2048 is good. So 4096 beats the two others, and the impact is as big as 0.4 and 0.9 percent.
And -e beats -p.

Tracks and the like:
* Nearly a minute in one file of Anal Trump: That Makes Me Smart!. 88.2/24, from https://analtrump.bandcamp.com/album/that-makes-me-smart
Grindcore, also wants smaller block size. Bigger costs 0.1 resp. nearly 0.4 percent.
And -e beats -p, making 0.3% smaller files.
* Track: Thy Catafalque: https://thycatafalque.bandcamp.com/track/erdgeist-2021
Melodious black metal, this is called. The only 16384 win outside the "classical" bracket, and very narrowly by 0.02%, and 4096 costs 0.15%
* 16 single tracks from https://www.discogs.com/release/25346929-Various-HDTracks-2022-Hi-Res-Sampler . Omitted tracks 2 and 9 which are normal sample rate.
. Up to track 8 and 12 and 13: The "more classical music" including modern classical. Higher block size better, but max impact 0.1 percent over -b8192 - which in turn saves nearly 0.3 percent over 4096.
. But: 14 (vocal music) and 18 (violin) narrowly prefer 8192.
. 10: Blues Company. Here 4096 wins by a "sizeable" impact. A quarter of a percent over 8192 and 0.74% better than 16384.
. 17: Even "worse" is the piano jazz, the second-biggest impact in 4096's favour. 0.3 and 1.1 percent bigger.
. The remaining three: 8192 wins, but never more than by 0.2.

Counting it up, -b16384 "lands most victories", but that is because there are so many classical/modern classical tracks here. Even still, -b8192 is about as good measured with unweighted average: better with the "lightest" -7 based settings AND with the heaviest -8p -A subdivide_tukey(5) etc setting (but not without the -p).
For the heaviest setting tried, make the following comparison:
* Choose the block size that gives the smallest file for each file,
vs
* Choose always 4096 resp. always 8192, resp always 16384
The "always" make for 0.14% larger resp 0.03% larger resp. 0.04% larger in unweighted average, and it is a corpus that in unweighted average is likely imbalanced in favour of larger block sizes.
In comparison, -p saves 0.06%.
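The "best per file" vs "always one block size" comparison above can be sketched as follows. The file names and byte counts are made up for illustration and are not figures from this test; the function name is hypothetical too.

```python
from statistics import mean


def always_penalty(sizes: dict[str, dict[int, int]]) -> dict[int, float]:
    """Unweighted mean excess, in percent, of always using one block size
    over picking the smallest result for each file."""
    block_sizes = sorted(next(iter(sizes.values())))
    return {
        b: mean(
            100.0 * (per_file[b] / min(per_file.values()) - 1.0)
            for per_file in sizes.values()
        )
        for b in block_sizes
    }


# Hypothetical compressed sizes in bytes, keyed by file and then by block size
sizes = {
    "a.flac": {4096: 1000, 8192: 1010, 16384: 1030},
    "b.flac": {4096: 2020, 8192: 2000, 16384: 2002},
}
for block_size, penalty in always_penalty(sizes).items():
    print(f"always -b{block_size}: {penalty:.2f}% larger than per-file best")
```

Because every file counts equally regardless of length, a corpus with many short classical tracks will pull the result towards whatever those tracks prefer.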

So ... for 96 kHz, block size 8192 could be something to consider, but even thinking of 16384 at that sampling rate is "for those so inclined" (as if compression improvements above -8 aren't already).

Re: FLAC v1.4.x Performance Tests

Reply #452 – 2023-11-01 07:07:39

The higher the resampling ratio is, the smoother the resulting waveform is with more sine-looking patterns. To exploit the pattern a longer sample history lookup is required and -e reduced this requirement.

8k white noise

8k white noise upsampled to 48k

overlapping two plots

Re: FLAC v1.4.x Performance Tests

Reply #453 – 2023-11-01 09:26:40

In general, the higher the oversampling factor, the better the sound quality, right?

Re: FLAC v1.4.x Performance Tests

Reply #454 – 2023-11-01 10:18:32

There are different resampling algorithms, but before talking about which is "good" or "bad", not all resamplers or not all resampling settings are designed to remove as much high frequency content above the original sample rate's Nyquist as possible. In such cases, flac's -e setting won't be very effective in file size reduction.

If people believe (hydrogenaud.io has TOS#8, so "believe" is not enough) upsampling a CDDA file to something like 352.8kHz before sending to the DAC can improve perceived sound quality, they can do this on their own files on the fly using the playback software's DSP options instead of obtaining such files from content providers.

Re: FLAC v1.4.x Performance Tests

Reply #455 – 2023-11-01 13:38:13

Quote from: bennetng on 2023-11-01 07:07:39

The higher the resampling ratio is, the smoother the resulting waveform is with more sine-looking patterns. To exploit the pattern a longer sample history lookup is required and -e reduced this requirement.

This is where I get (mildly) surprised at what actually happens. Yes "more sine-looking", but not "sine" in the sense that two past samples determine the entire thing. Nearer that when you take it to the extreme (way more than the 6x in the above waveform), but still. (Also we have seen that even with -e, reference FLAC doesn't predict sines by order 2, likely due to quantization to integers, so even if upsampling would smoothen it to sine-like, it wouldn't reduce the order all the way.)
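On the "two past samples determine the entire thing" point: an ideal sine obeys x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly, but once the samples are rounded to integers the same order-2 predictor leaves a small residual. A minimal sketch with a hypothetical 1 kHz tone at 44.1 kHz; it only illustrates the sample-rounding effect and does not model the encoder, which also stores its predictor coefficients in quantized form.

```python
import math


def recurrence_residuals(samples, omega):
    """Residuals of the order-2 predictor x[n] = 2*cos(omega)*x[n-1] - x[n-2]."""
    c = 2.0 * math.cos(omega)
    return [samples[n] - (c * samples[n - 1] - samples[n - 2])
            for n in range(2, len(samples))]


rate, freq, count = 44100, 1000.0, 1000  # hypothetical test tone
omega = 2.0 * math.pi * freq / rate
exact = [math.sin(omega * n) for n in range(count)]
quantized = [round(32767 * s) for s in exact]  # rounded to 16-bit integer range

max_exact = max(abs(r) for r in recurrence_residuals(exact, omega))
max_quantized = max(abs(r) for r in recurrence_residuals(quantized, omega))
print("unquantized sine, largest residual:", max_exact)
print("integer samples, largest residual:", max_quantized)
```

The residual on the integer samples is bounded by about two units, since it is a combination of three rounding errors of at most half a unit each, but it is not zero, so a higher order can still pay off.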

When invoking -e (taking that as best shot for "best" predictor), I "often" see the following: When the model selection algorithm suggests 11, -e will select 12 for the CDDA original and 10 for the upsample. Hm?
If you hypothetically could choose the ten samples -2, -4, -6, ... , -20 you would likely get the interpolated samples predicted well too, but there is no provision for that in the FLAC format.

"Mildly" surprised only, I am used to getting surprises here. And we aren't comparing random apple to random apple, when one of them is well-tuned to its own home ground.
If we look at other codecs, going CDDA -> high resolution reveals that they were well tuned for CDDA. http://audiograaf.nl/losslesstest/Lossless%20audio%20codec%20comparison%20-%20revision%206%20-%20hires.html : Monkey's (which uses big blocks) misbehaves for the highest resolutions, where WavPack isn't much good without -x. (Even -x1 improves so much one shouldn't wavpack hi-rez without it.)
Or maybe it is FLAC that is good because it doesn't spend bits modeling long-term patterns that aren't there? Even if we see that FLAC compression often can be improved quite a lot for those signals?

Re: FLAC v1.4.x Performance Tests

Reply #456 – 2023-11-01 14:36:49

More sine-looking, but a complete sine is not necessary, or not even sine, just some curves. When the signal is bandlimited to a certain cutoff point, the samples cannot be bent to the opposite direction instantaneously because it will create a glitch which is not bandlimited. It must take several samples (depends on resampling ratio) to reach the opposite direction, and those "several samples" can be exploited even when a long term lookup is not possible.
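A rough way to see this numerically, as a sketch only: take white noise, upsample it 6x (the ratio of the 8k to 48k plots earlier in the thread) with a crude Hann-windowed sinc interpolator, and compare how well the fixed second-order predictor 2*x[n-1] - x[n-2] does on each. The interpolator here is an assumption made for illustration and is not the resampler anyone in this thread used.

```python
import math
import random


def upsample(x, factor, half_width=8):
    """Hann-windowed sinc interpolation (illustration quality only)."""
    out = []
    for m in range(len(x) * factor):
        t = m / factor
        centre = int(math.floor(t))
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < len(x):
                d = t - k
                sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
                window = 0.5 + 0.5 * math.cos(math.pi * d / half_width)
                acc += x[k] * sinc * window
        out.append(acc)
    return out


def fixed2_residual_ratio(x):
    """Residual energy of the predictor 2*x[n-1] - x[n-2], relative to signal energy."""
    residual = [x[n] - 2.0 * x[n - 1] + x[n - 2] for n in range(2, len(x))]
    return sum(r * r for r in residual) / sum(v * v for v in x[2:])


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(400)]
smooth = upsample(noise, 6)
print("white noise:", fixed2_residual_ratio(noise))
print("upsampled 6x:", fixed2_residual_ratio(smooth))
```

On the raw noise the predictor makes things worse (the ratio is well above 1), while on the upsampled version it should remove most of the energy, using only two samples of history.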

Quote from: Porcus on 2023-11-01 13:38:13

Or maybe it is FLAC that is good because it doesn't spend bits modeling long-term patterns that aren't there? Even if we see that FLAC compression often can be improved quite a lot for those signals?

Maybe, especially when you see bloat when using WavPack's -hh or APE's extra high or insane, trying too hard to find patterns when it turns out nothing can be exploited. I observed similar phenomena in other areas like video codecs as well: the encoder can be set to analyze macroblocks of different sizes, and enabling every block size can result in bloat for certain content.
https://shopdelta.eu/h-265-video-coding-standard_l2_aid860.html

[edit]
Also flac allows a lot of user settings while many other codecs are preset-based. Other codecs either only provide hard-coded settings which cannot be tweaked, or the tweaks are not exposed to users.

Take WavPack as example, I would actually like to ask Bryant if it is possible to build a preset which is faster than -x4 to -x6 to specifically optimize for upsampled content without breaking backward compatibility.

flac is like "I can do a lot of things but you need to tell me how to do it right". Other codecs are like "trust me or don't use me".

Re: FLAC v1.4.x Performance Tests

Reply #457 – 2023-11-01 16:30:26

Quote from: bennetng on 2023-11-01 14:36:49

More sine-looking, but a complete sine is not necessary, or not even sine, just some curves. When the signal is bandlimited to a certain cutoff point, the samples cannot be bent to the opposite direction instantaneously because it will create a glitch which is not bandlimited. It must take several samples (depends on resampling ratio) to reach the opposite direction, and those "several samples" can be exploited even when a long term lookup is not possible.

Yeah sure, and that explains why the in-betweens can be well compressed (and why the higher-res compression levels in the top graph in #450 are so much lower). But that is not precisely the same as to say that the optimal prediction length with twice as many samples is half the time. Sure you can predict the next few samples well, since you got the smoothness - but that is not the same as to predict one that is say 10/44100ths of a second away.

BTW, you got any idea how computationally costly it is to perform a simple signal analysis to check upper frequency? If that is where the model selection algorithm could be improved, I mean.

Quote from: bennetng on 2023-11-01 14:36:49

[edit]
Also flac allows a lot of user settings while many other codecs are preset-based. Other codecs either only provide hard-coded settings which cannot be tweaked, or the tweaks are not exposed to users.[/edit]

ktf's tests use the standard presets. But of course, those are tuned too - and retuned. But obviously: not specifically for high resolution. Yet high resolution material makes FLAC (1.4!) catch well up with heavier codecs - despite what we see, there are often quite significant improvements possible.

It is not the format. The quite unique thing is how the FLAC reference encoder allows a lot of user choice that other codec formats could very well support, had anyone bothered to implement it into the encoder.
ALAC? Not too well known: CUETools' ALAC encoder lets you pick apodization functions among welch, hann, flattop, tukey, and offers search effort also between flac.exe's default method and its "-e". I tested that as well (corporate codecs should die, though!)

Monkey's certainly has a different philosophy, compressing to precisely the same bitstream as nineteen years ago. (There is a reservation for high resolution, actually.) Even, it stores its MD5 hash computed on the encode, not on the uncompressed audio.

Quote from: bennetng on 2023-11-01 14:36:49

Take WavPack as example, I would actually like to ask Bryant if it is possible to build a preset which is faster than -x4 to -x6 to specifically optimize for upsampled content without breaking backward compatibility.

The answer is yes, it is "possible" :-)
(Pending a successful design, you will have to use --threads .)

Re: FLAC v1.4.x Performance Tests

Reply #458 – 2023-11-01 17:36:42

Quote from: Porcus on 2023-11-01 16:30:26

BTW, you got any idea how computationally costly it is to perform a simple signal analysis to check upper frequency? If that is where the model selection algorithm could be improved, I mean.

If the goal is to catch up with -e's performance without using -e, the analysis does not need to be very accurate. For example, you can generate a prime number sine like the 4567Hz one, but with some numbers beyond 16kHz, and the benefit of -e will diminish and be overtaken by -p. I already tested on the mp3 corpus that -e is not useful for a 16k cutoff, and you tried the 64k upsampled content too.

The usual spectrum analyzers in DAWs or visualization software often default to more than a thousand samples and can show the spectrum of a CD image within several seconds, and can do animated spectral analysis during playback, which is trivial. For example, foobar's built-in spectrogram can be set to the lowest 128 and one can still easily identify an 18kHz cutoff on an mp3 file, more sensitive than -e.
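To give an idea of how little is needed, here is a minimal sketch of a coarse cutoff check with 128-sample frames, the same size as the spectrogram setting mentioned above. It is a hypothetical illustration, not anything reference flac does: the function name, the -40 dB floor and the test tones are all assumptions, and a real implementation would use an FFT rather than this naive DFT.

```python
import cmath
import math


def estimate_cutoff_hz(samples, rate, frame=128, floor_db=-40.0):
    """Frequency of the highest bin whose summed power is within floor_db of the peak."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame) for n in range(frame)]
    bins = frame // 2 + 1
    power = [0.0] * bins
    for start in range(0, len(samples) - frame + 1, frame):
        seg = [samples[start + n] * window[n] for n in range(frame)]
        for k in range(bins):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame)
                      for n in range(frame))
            power[k] += abs(acc) ** 2
    threshold = max(power) * 10.0 ** (floor_db / 10.0)
    top = max(k for k in range(bins) if power[k] >= threshold)
    return top * rate / frame


# Hypothetical 96 kHz signal with nothing above 15 kHz (tones chosen on bin centres)
rate = 96000
tones = (1500.0, 6000.0, 15000.0)
signal = [sum(math.sin(2.0 * math.pi * f * n / rate) for f in tones)
          for n in range(1024)]
print("estimated cutoff:", estimate_cutoff_hz(signal, rate), "Hz")
```

With 128-sample frames at 96 kHz the bins are 750 Hz wide, so the estimate is coarse and window leakage pushes it up by a bin or so, but that would still be enough to tell "empty above roughly 16 kHz" from "content up to Nyquist".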

Re: FLAC v1.4.x Performance Tests

Reply #459 – 2023-11-04 17:42:20

Quote from: german87 on 2023-11-01 09:26:40

In general, the higher the oversampling factor, the better the sound quality, right?

The oversampling ratio on a typical DAC is 64-1024x, so whatever ratio is built into the file (1-2x) is basically irrelevant compared to that, except insofar as it complicates efficiently compressing the audio.

Re: FLAC v1.4.x Performance Tests

Reply #460 – 2024-02-24 23:43:37

Dear @Wombat, where can I get your high-speed FLAC 1.4.3 build, suitable for a processor with the following features?

Re: FLAC v1.4.x Performance Tests

Reply #461 – 2024-02-25 00:58:43

I once tried to do something faster for my j5005, but gcc optimized compiles did roughly nothing. Since your T7250 is even slower and older, I am sorry.

Re: FLAC v1.4.x Performance Tests

Reply #462 – 2024-02-25 12:09:04

@Kraeved: I don't know what the Rarewares compiles require, but https://hydrogenaud.io/index.php/topic,123025.msg1029768.html#msg1029768 indicates that AVX is key to improvements over the official build.
Why don't you run a comparison? I don't think anyone else has posted figures on a (*looks up*) 2007 CPU.

Re: FLAC v1.4.x Performance Tests

Reply #463 – 2024-02-25 12:41:00

From memory, the non-AVX2 64 bit compiles are SSE-3 as the maximum requirement.

Re: FLAC v1.4.x Performance Tests

Reply #464 – 2024-02-25 13:13:55

@Wombat, I mean do you have 1.4.3 GCC build without AVX requirement? So far I use x64 one from Rarewares.

Code:

FLAC v1.4.3 Release bundle 2023-06-23
Latest Release, flac.exe, metaflac.exe. win32 compile is XP friendly.

win32-nonXP Download (555kB)
win32 Download (689kB)
x64 Download (537kB)                <<<<<<<<<<<<<<<<<<<<--- This one.
x64-AVX2 Download (995kB)

Re: FLAC v1.4.x Performance Tests

Reply #465 – 2024-02-25 15:58:26

I guess you mean a generic GCC 13.2.0 compile then. I attached one. Runs fine on my j5005.

Re: FLAC v1.4.x Performance Tests

Reply #466 – 2024-02-28 14:39:49

Thank you, @Wombat. I compressed and re-compressed several WAV and FLAC albums, but, indeed, noticed no difference in speed. Also I would like to thank all those developers who contribute to future-proof solutions, not just rolling out some binaries to show off, that run smoothly across the globe for decades, even on vintage computers.

Re: FLAC v1.4.x Performance Tests

Reply #467 – 2024-02-28 14:56:30

Quote from: Kraeved on 2024-02-28 14:39:49

even on vintage computers.

flac for MS-DOS: https://hydrogenaud.io/index.php/topic,123374.0.html

Re: FLAC v1.4.x Performance Tests

Reply #468 – 2024-03-02 06:26:13

Those of you who have been aware of the development of FLAC all these years, please tell me if there are any significant changes between versions 1.3.4 and 1.4.3? I ask because the amazing FSLAC lossy encoder by @C.R.Helmrich that works like LossyWAV is still based on version 1.3.4, and I'm worried.

Re: FLAC v1.4.x Performance Tests

Reply #469 – 2024-03-02 07:59:00

1.4.x improves compression, especially on high resolution, but also on enough CD material to make the difference you see in the y-axis:
http://www.audiograaf.nl/losslesstest/revision%205/Average%20of%20all%20CDDA%20sources.pdf
http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20CDDA%20sources.pdf
The differences in time taken are due to a new computer (Intel CPU this round).

Re: FLAC v1.4.x Performance Tests

Reply #470 – 2024-03-12 22:58:46

How is it that SoX creates a file smaller than FLAC -8 and CUETools.Flake -8?

Re: FLAC v1.4.x Performance Tests

Reply #471 – 2024-03-12 23:19:20

Padding (for future tags). Reference flac spends 8196 bytes on that by default. CUETools.Flake spends 4096.
You can of course reclaim the space, but then you will have to rewrite the entire file (which isn't much ...) if you add any tags.

Actually, compare the -sox and the -flac8 in a text editor. You will see they are identical except at the beginning.
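If you would rather count the padding than eyeball it in a text editor, here is a minimal sketch that walks the metadata blocks at the start of a native FLAC stream and adds up the PADDING blocks. It relies on the documented layout ("fLaC" marker, then blocks with a 4-byte header: last-block flag plus 7-bit type, then a 24-bit big-endian length, with type 1 being PADDING). The function name is made up, and it will not handle files with an ID3 tag prepended.

```python
def padding_bytes(data: bytes) -> int:
    """Total bytes taken by PADDING blocks (payload plus 4-byte headers)."""
    if data[:4] != b"fLaC":
        raise ValueError("not a native FLAC stream")
    pos, total = 4, 0
    while True:
        header = data[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated metadata")
        is_last = bool(header[0] & 0x80)   # top bit marks the last metadata block
        block_type = header[0] & 0x7F      # 0 = STREAMINFO, 1 = PADDING, ...
        length = int.from_bytes(header[1:4], "big")
        if block_type == 1:
            total += 4 + length
        pos += 4 + length
        if is_last:
            return total


# Synthetic example: STREAMINFO (34 bytes) followed by a final 8192-byte PADDING block
example = (b"fLaC"
           + bytes([0x00]) + (34).to_bytes(3, "big") + bytes(34)
           + bytes([0x81]) + (8192).to_bytes(3, "big") + bytes(8192))
print(padding_bytes(example))
```

The synthetic example gives 8196, which would be consistent with the figure above if the default is an 8192-byte padding payload plus its 4-byte block header. On a real file, reading the first megabyte or so into `data` should be plenty unless it carries large embedded artwork.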

Re: FLAC v1.4.x Performance Tests

Reply #472 – 2024-05-06 20:49:10

This has probably been well covered over the last nearly-nineteen pages, but:
Is there any particular reason why 64-bit flac.exe should be so much faster than 32-bit?

Re: FLAC v1.4.x Performance Tests

Reply #473 – 2024-05-07 06:46:22

Last time I checked (which is a while ago) difference was about 30%, which I don't think is extraordinary. There are lots of reasons for 64-bit compiles to be faster. 64-bit mode on x86 has double the number of registers for SIMD (SSE, AVX etc), it can do 64-bit math twice as fast, and SSE2 is standard, whereas 32-bit compiles don't use SSE2 throughout the whole program (only when explicitly coded) for compatibility reasons.

Re: FLAC v1.4.x Performance Tests

Reply #474 – 2024-05-07 09:58:37

It'll be interesting to see how APX (the biggest thing being doubled registers) improves things. Probably ~10% (pure speculation) in general but every workload will be different.

https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
