Topic: FLAC v1.4.x Performance Tests

Re: FLAC v1.4.x Performance Tests

Reply #350
BTW, should it matter at all what CPU is used for compiling?

(Compatibility issues here are only AVX for one Rarewares build and which ones of yours? Plus AVX512 for your v4?)
The CPU shouldn't play a role for compiling unless you tell the compiler to use "native" optimization and it tries to detect the CPU in use.
The last binaries are all AVX2 if not stated otherwise.
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Re: FLAC v1.4.x Performance Tests

Reply #351
Some multithreading results posted in that thread: https://hydrogenaud.io/index.php/topic,124437.msg1030783.html#msg1030783

In the last line of the reply, the "three -j1 runs discarded" (from that table) were done with Wombat's "GCC" build, to check whether times were about the same for the "v3" build posted above. No differences worth writing home about. The "v3" differed very little from ktf's build, and the GCC build might also have had some inconsistent timing from being run immediately after a different setting (say its -5r7 ran after another build doing something that was more than twice as intensive).

Anyway, "v3" didn't make for miracles on that computer.

Re: FLAC v1.4.x Performance Tests

Reply #352
Many thanks for more numbers! Though the last v3+v4 were meant more to check whether AVX-512 helps.
If this was your i5-7500T it also doesn't like the -falign-functions=32 compiler flag.

Re: FLAC v1.4.x Performance Tests

Reply #353
It was the i5-7500T, which (I had misread it) is 4 cores / 4 threads.

But here are some figures from my i5-1135G7. Lower is better (and more negative is better). Only one run each, so take with a grain of salt.
There are two different things in the table here: typically, the percentages are the "overhead penalty" of actual time vs the idealized "j1 time / # of threads". Edit: The "j9" is corrected to be 8 threads, which is what the CPU has. The -j9 setting was to verify that nothing too stupid happens.
But the "time w4 vs w3" line has nothing to do with overhead; it measures how much the time changed moving from your v3 to your v4. You see benefits at -8, but the other way around for -5, -3 and -2er7. At -2r0 things are going so fast anyway that I won't trust the numbers without multiple runs, but at the very least the sign is negative more often than positive.
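For anyone wanting to redo these percentages from raw timings, here is a minimal sketch of the two calculations as described above. The function names and the sample timings are made up for illustration; they are not taken from the table.

```python
def overhead_pct(actual_time, j1_time, threads):
    """Overhead of a multithreaded run relative to the idealized j1_time / threads."""
    ideal_time = j1_time / threads
    return (actual_time / ideal_time - 1.0) * 100.0


def time_change_pct(new_time, old_time):
    """Relative change in time when moving from one build to another (negative = faster)."""
    return (new_time / old_time - 1.0) * 100.0


# Hypothetical timings, purely illustrative:
# a -j2 run taking 60 s where -j1 took 100 s has an ideal time of 50 s.
print(round(overhead_pct(60.0, 100.0, 2)))   # 20
# a build taking 90 s where the older one took 100 s.
print(round(time_change_pct(90.0, 100.0)))   # -10
```

For the -j9 column, threads would be passed as 8, per the edit note above.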

The two rightmost columns are the same encoding parameter plus the "-M" to check if the overhead is as nasty as previous figures suggested (it is).
I also ran a range of -0r0 -b <something>, and there is no sign that the "v4" is worth it.

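-8:            | j1 time/diff  | j2 ovrhd/diff | j3 ovrhd/diff | j4 ovrhd/diff | j5 ovrhd/diff | j8 ovrhd/diff | j9 ovrhd/diff | -M j1 time/diff | -M j2 ovrhd/diff
v5             | 121           | 8%            | 23%           | 40%           | 55%           | 124%          | 138%          | 83              | 97%
Wombat3        | 113           | 16%           | 29%           | 43%           | 70%           | 138%          | 158%          | 81              | 96%
Wombat4        | 102           | 23%           | 34%           | 46%           | 75%           | 153%          | 155%          | 75              | 111%
time w4 vs w3  | −9%           | −4%           | −5%           | −7%           | −6%           | −4%           | −10%          | −8%             | 0%
-5:
v5             | 49            | 24%           | 56%           | 80%           | 105%          | 239%          | 246%          | 40              | 88%
Wombat3        | 46            | 29%           | 56%           | 77%           | 123%          | 248%          | 243%          | 39              | 91%
Wombat4        | 47            | 46%           | 55%           | 111%          | 127%          | 257%          | 270%          | 41              | 84%
time w4 vs w3  | 1%            | 14%           | 0%            | 20%           | 3%            | 3%            | 9%            | 5%              | 1%
-3:
v5             | 37            | 26%           | 68%           | 100%          | 148%          | 312%          | 307%          | 38              | 92%
Wombat3        | 36            | 25%           | 62%           | 109%          | 146%          | 320%          | 317%          | 38              | 83%
Wombat4        | 36            | 32%           | 66%           | 120%          | 162%          | 335%          | 345%          | 38              | 99%
time w4 vs w3  | −1%           | 4%            | 1%            | 4%            | 5%            | 2%            | 5%            | 0%              | 8%
-2er7:
v5             | 70            | 16%           | 38%           | 52%           | 75%           | 180%          | 212%          | 50              | 102%
Wombat3        | 65            | 20%           | 39%           | 64%           | 86%           | 188%          | 227%          | 48              | 104%
Wombat4        | 66            | 17%           | 37%           | 64%           | 94%           | 203%          | 229%          | 48              | 95%
time w4 vs w3  | 2%            | −1%           | 1%            | 1%            | 6%            | 7%            | 2%            | 1%              | −4%
-2r0:
v5             | 44            | 24%           | 57%           | 85%           | 153%          | 377%          | 320%          | 38              | 91%
Wombat3        | 39            | 46%           | 81%           | 123%          | 180%          | 394%          | 362%          | 38              | 101%
Wombat4        | 38            | 55%           | 63%           | 111%          | 205%          | 379%          | 360%          | 36              | 120%
time w4 vs w3  | −1%           | 5%            | −12%          | −7%           | 7%            | −4%           | −2%           | −5%             | 4%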

Re: FLAC v1.4.x Performance Tests

Reply #354
I guess the original v5 is without AVX-2. So for high compression, at least, the speedup from no AVX-2 to AVX-2 to AVX-512 scales well.
For -5 the numbers are a bit surprising.
Again thanks for the numbers!

Re: FLAC v1.4.x Performance Tests

Reply #355
In order not to hijack @ktf 's comparison update thread with too many FLAC specifics:
The fastest flac.exe encoding is now faster than decoding. I replicated this on a small sample of 192/24 files on an i5-7500T, repeated a few times to give you "ballpark" timings. Everything done on an internal SSD to the same SSD.
FLAC files: created using -0, or for the even faster times: -0 --no-md5

16 seconds / 14 seconds: decoding, respectively, -0 files / -0 --no-md5 files
13 seconds / 8 seconds: encoding -0 / -0 --no-md5
10 seconds / 6 seconds: flac -t on -0 / --no-md5
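
Ballpark timings like these can be collected with a small wrapper along the following lines. This is only a sketch: `time_command` is a made-up helper, the file names in the comments are placeholders, and the demo times a do-nothing Python process as a stand-in so that it runs without flac installed.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run a command several times and return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command so the sketch is runnable anywhere.
demo = time_command([sys.executable, "-c", "pass"], repeats=3)
print(f"median: {statistics.median(demo):.3f} s")

# With flac on the PATH, the three cases above would be something like
# (in.wav / in.flac / out.* are placeholder names):
#   time_command(["flac", "-0", "-f", "-o", "out.flac", "in.wav"])   # encode
#   time_command(["flac", "-d", "-f", "-o", "out.wav", "in.flac"])   # decode
#   time_command(["flac", "-t", "in.flac"])                          # test
```

Disk caching and whatever else the machine is doing will move the numbers around, so a median over several runs is about as much as this can promise.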

I did much of the same thing on a larger corpus with 96 kHz files too, similar results.

Now if I add -b4096 to everything, numbers improve slightly. Same order as above:
13/11 decoding
11/7 encoding
9/5.5 test

Add a few tenths of a second to get the difference between -0b4096 and -1b4096, the latter being faster than -0 as well. Actually, -2b4096 (optimizing joint stereo / dual mono over -0's dual mono only) was at -0 speed, give or take a few tenths of a second. It looks like -b2048 could be just as fast as -b4096, which would be in line with what I have seen before, but I am not going to pretend to that kind of accuracy.

Re: FLAC v1.4.x Performance Tests

Reply #356
Didn't read the whole thread, so apologies if this was mentioned...
I'm seeing significantly higher bitrates with FLAC 1.4.2 and 1.4.3 for "simple" signals, like sine waves, compared to FLAC 1.3.4, both at -8 compression.
For example, a 1kHz 48k-16bit sine tone is at 164 kbps with 1.3.4 and at 231 kbps with 1.4.3.

Pink noise, on the other hand, has almost the same bitrate with both versions.

Is this known/expected?

Re: FLAC v1.4.x Performance Tests

Reply #357
Is this known/expected?
No, not really. I'm able to reproduce this.

FLAC 1.4.0 had some major changes that benefit most sources, but not all. Apparently the sine wave you mention is one of the cases that do not benefit. However, a 1kHz sine sampled at 44.1kHz or a 1.01kHz sine sampled at 48kHz show a much smaller loss.
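
One way to see how these three test tones differ is to look at how many samples it takes before an ideal, undithered sampled sine repeats exactly. The sketch below only does that arithmetic; `repeat_length` is a made-up name, and whether this property is what drives the bitrate difference is not established here.

```python
from fractions import Fraction


def repeat_length(freq_hz, sample_rate_hz):
    """Samples after which an ideal sampled sine of integer frequency repeats exactly.

    Samples per cycle is sample_rate / freq = p / q in lowest terms;
    the sampled sequence then repeats every p samples (after q full cycles).
    """
    return Fraction(sample_rate_hz, freq_hz).numerator


print(repeat_length(1000, 48000))   # 48   (exactly 48 samples per cycle)
print(repeat_length(1000, 44100))   # 441  (10 cycles of 44.1 samples)
print(repeat_length(1010, 48000))   # 4800 (101 cycles)
```

So the 1 kHz / 48 kHz tone is the special case where every cycle lands on identical sample values; the other two take much longer to repeat.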
Music: sounds arranged such that they construct feelings.

Re: FLAC v1.4.x Performance Tests

Reply #358
Just tried a 10 seconds full scale dithered 1kHz sine at 16/48.

1.4.3
253kbps -8
146kbps -8p
114kbps -8e

1.3.4
186kbps -8
157kbps -8e
143kbps -8p

[edit] Attached a 4567 Hz sine; 1.4.3's -8e performs much better than 1.3.4's.

Re: FLAC v1.4.x Performance Tests

Reply #359
One thing is that "-8" itself differs, because it is now synonymous with a different set of options, with different apodization functions.
But the same happens at -5. bennetng's file recompressed:
322377 bytes with 1.3.4 win64 at -5 (379745 (bigger!) by adding --lax -l32)
381923 bytes with 1.4.2 win64 at -5 (382278 (bigger!) by adding --lax -l32)

Adding -p, something similar happens:
222297 bytes with 1.3.4 win64 at -5p
268901 bytes with 1.4.2 win64 at -5p
 
-e instead of -p reverses the order; now 1.4 makes the smaller file:
276571 bytes with 1.3.4 win64 at -5e
269589 bytes with 1.4.2 win64 at -5e

-pe then, 1.3.4 is back winning, but not at -l32:
184466 bytes with 1.3.4 win64 at -5pe (down to 183542 by adding --lax -l32)
188771 bytes with 1.4.2 win64 at -5pe (down to 180144 by adding --lax -l32)
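
To put the byte counts above on a common scale, here is a small sketch that turns each 1.3.4 / 1.4.2 pair into a relative size change. The helper name and the dictionary layout are just for illustration; the byte counts are the ones listed above.

```python
def size_change_pct(new_bytes, old_bytes):
    """Relative size of new_bytes versus old_bytes, in percent (positive = bigger)."""
    return (new_bytes / old_bytes - 1.0) * 100.0


# (1.3.4 size, 1.4.2 size) in bytes, copied from the figures above
pairs = {
    "-5":             (322377, 381923),
    "-5p":            (222297, 268901),
    "-5e":            (276571, 269589),
    "-5pe":           (184466, 188771),
    "-5pe --lax -l32": (183542, 180144),
}

for setting, (old, new) in pairs.items():
    print(f"{setting:16s} 1.4.2 vs 1.3.4: {size_change_pct(new, old):+.1f}%")
```

That works out to roughly +18.5% at -5 and +21.0% at -5p, against −2.5% at -5e, +2.3% at -5pe and −1.9% at -5pe --lax -l32.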

Edit: Could get it down as far as this:
161713 bytes with 1.4.2 win64 at -pe --lax -l32 -A<tonsofthem>. -r4 or -r15 didn't matter, -r3 inflated it one byte
139241 by adding -b 32768 -r15
136487 by adding -b 65535 , confirming that once the predictor is good enough, ... or am I interpreting it wrong?


Since -e makes the difference, is there something about the model guesstimation algorithm?

Re: FLAC v1.4.x Performance Tests

Reply #360
https://hydrogenaud.io/index.php/topic,123025.msg1025264.html#msg1025264
https://hydrogenaud.io/index.php/topic,123025.msg1027285.html#msg1027285
A good thing about flac 1.4 is that the effect of -8e is quite predictable when the input files have a lot of unused spectral space, so the first thing to try with simple sine waves is to use -e.

Anyway, the tunings are still based on a very large corpus rather than a very specific set of test samples, like waveforms from the South Pole or 384-channel brainwaves.

Re: FLAC v1.4.x Performance Tests

Reply #361
Yes, I think tuning for specific samples will result in a bad overall tuning.

Re: FLAC v1.4.x Performance Tests

Reply #362
Yeah. The observation that -e improves on certain material does suggest that the model selection algorithm could be improved, but until one can actually capture both these and those signals, it is not a good idea to chase the oddballs.


429428 bytes for the above file with TAK -p4m
420210 bytes for wavpack -hhx6


More sines tested: https://hydrogenaud.io/index.php/topic,122444.0.html

Re: FLAC v1.4.x Performance Tests

Reply #363
A weirdness or two, though. (Got 1.4.3 on this computer and replicated with that.)

-e -l <N> would never pick order = N. At most order = N-1. Or do I misinterpret the flac -a output? (I read the "order=" part, which does not start at 0. order=7 means coefficients enumerated 0 through 6, and was the highest I got out of -l7.)
Anyway, a bunch of .ana files attached.

Also, with -b48000 - so the blocks "nearly repeat", but not exactly - the predictor coefficients do vary quite a lot between the frames. However as a sine should be perfectly replicable with order = 2 - presuming sufficient precision, which I am too lazy to check out - there would be a whole lot of different predictor vectors that would make for equally good prediction.
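
The order = 2 claim follows from the identity sin(wn + p) = 2 cos(w) sin(w(n-1) + p) - sin(w(n-2) + p), so each sample of an ideal sine is predicted exactly from the previous two. Below is a quick numerical check, as a sketch only: the function name, the phase offset and the amplitude are arbitrary choices, and it uses a floating-point coefficient, not FLAC's quantized coefficients, so it says nothing about what the encoder actually picks.

```python
import math


def max_order2_residual(freq_hz, sample_rate_hz, count,
                        amplitude=32767.0, quantize=False):
    """Largest absolute error of the predictor x[n] ~ 2*cos(w)*x[n-1] - x[n-2]."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    x = [amplitude * math.sin(w * n + 0.3) for n in range(count)]
    if quantize:
        # Round to integers, as 16-bit PCM would store them (no dither).
        x = [float(round(v)) for v in x]
    c1 = 2.0 * math.cos(w)
    return max(abs(x[n] - (c1 * x[n - 1] - x[n - 2])) for n in range(2, count))


# Unquantized sine: residual is only floating-point noise.
print(max_order2_residual(1000, 48000, 1000))
# Integer samples: each rounding error is at most 0.5, so the residual
# stays within 0.5 * (1 + |2*cos(w)| + 1) <= 2.
print(max_order2_residual(1000, 48000, 1000, quantize=True))
```

So with sufficient precision the residual vanishes, and with integer samples it stays within a couple of LSBs; that leaves room for many different higher-order predictor vectors to do about equally well.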