Topic: FLAC v1.4.2 (Release)

Re: FLAC v1.4.2 (Release)

Reply #1
Improve ability to tune compile for a certain system (for example with -march=native) when combining with --disable-asm-optimizations: plain C functions can now be better optimized

interesting, can't wait for new benchmarks
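
For anyone who wants to try the combination quoted above, here is a minimal sketch of how the configure call could be put together. The `--disable-asm-optimizations` switch and `-march=native` come from the release note; the helper name `flac_configure_command`, its parameters and the `-O3` default are only assumptions for illustration, not part of FLAC's build system.

```python
import shlex


def flac_configure_command(march="native", disable_asm=True, opt_level="-O3"):
    """Build an argument list for FLAC's ./configure script (illustrative only)."""
    # CFLAGS is passed as a VAR=value argument, which autoconf scripts accept.
    cflags = f"{opt_level} -march={march}"
    args = ["./configure", f"CFLAGS={cflags}"]
    if disable_asm:
        # Drop the hand-written routines so the plain C code is used and tuned.
        args.append("--disable-asm-optimizations")
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line that could be pasted into a terminal.
    print(shlex.join(flac_configure_command()))
```

Swapping `march="native"` for something like `"haswell"` would give a binary that also runs on machines other than the build host, at the cost of not using newer instructions.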



 

Re: FLAC v1.4.2 (Release)

Reply #5
GCC has "alderlake" and "znver4" too.

Re: FLAC v1.4.2 (Release)

Reply #6
GCC has "alderlake" and "znver4" too.
Unfortunately, the cost tables for znver4 currently reuse the tables for znver3, so it's a WIP; there isn't really an easy way to tune optimally for Zen 4 yet. The best bet for proper tuning on new hardware is AOCC, AMD's LLVM fork, which should be the first place optimal tunings appear (although AFAIK it doesn't have Zen 4 tunings yet; AMD is normally slow to update software, and this seems to be no exception).

Intel is traditionally better at baking in compiler/kernel support well in advance, but that lead has slipped from years to months and sometimes to post-release, so I don't know how optimised GCC's 13th-gen support is.

Re: FLAC v1.4.2 (Release)

Reply #7
I just checked it out. AOCC 3.2 seems to be the latest, and it is based on Clang 13, which is when znver3 support came in. I'm not finding anything for znver4 in Clang yet.

Re: FLAC v1.4.2 (Release)

Reply #8
Intel (John@Rarewares) vs. GCC vs. Clang builds: I'm not into that. Is there any word on which is preferred for my old computer, an Intel Core i7 2600K? I hope they're all safe to use, because I need to reconvert a lot of audio from image files with embedded CUE sheets to single files. Thanks.


Re: FLAC v1.4.2 (Release)

Reply #10
@Case

Thank you VERY much!

Re: FLAC v1.4.2 (Release)

Reply #11
Improve ability to tune compile for a certain system (for example with -march=native) when combining with --disable-asm-optimizations: plain C functions can now be better optimized

interesting, can't wait for new benchmarks
This new option in flac 1.4.2 allows clearly faster 16-bit performance but slower 24-bit performance. A good solution for CD-only apps like CUETools.
Attached is a GCC compile using -O3, fast math and --disable-asm-optimizations. In the 1.4.x performance thread I have a faster -Ofast compile, but it was mentioned that -Ofast enables things that can do strange things to the apps hosting these binaries. If that is so, I think such builds don't belong in the official 1.4.2 thread. I didn't experience any problems, but nonetheless. I am still using the CPU optimization -march=haswell. Even optimizing for Zen 3 makes only a minuscule speed difference on my Zen 3. It seems flac does not benefit much from instructions newer than Haswell's. Case found that out long ago.
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Re: FLAC v1.4.2 (Release)

Reply #12
For me there is no significant speed difference between your -O3 binary and the -Ofast-manyflags-noasm you posted on Oct. 23rd.
With my setup, they are the fastest v1.4.2 binaries so far (strictly CDDA material)

Re: FLAC v1.4.2 (Release)

Reply #13
Thank you for testing, sundance. Attached are the GCC 12.2.0 files, also built with -O3 and fast math but with asm-optimizations left in.

Re: FLAC v1.4.2 (Release)

Reply #14
This one is on par with your -Ofast build (wo/noasm) from Oct. 23rd:
https://hydrogenaud.io/index.php/topic,123025.msg1018150.html#msg1018150
Code:
FLAC Binary: flac142-x64-gcc1220-O3+fastmath-wombat_2022-10-28.exe (737280 bytes)
FLAC Option: -7
- Average time =  25.674 seconds (5 rounds), Encoding speed = 421.12x
- FLAC size = 1.167.014.374 bytes (= 61,188% of WAV size, ~863 kbps)
What exactly does this "noasm" option do?

P.S. Sorry, wrong thread...
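
As a sanity check on the benchmark figures above, the numbers can be cross-checked against each other. This is only a sketch that assumes the test corpus is strictly CDDA (44.1 kHz, 16-bit, stereo, as mentioned earlier in the thread) and ignores WAV header overhead; the function name `benchmark_summary` is made up for this example.

```python
# CDDA data rate: 44.1 kHz sample rate, 2 bytes per sample, 2 channels.
CDDA_BYTES_PER_SECOND = 44_100 * 2 * 2


def benchmark_summary(seconds, speed_x, flac_bytes):
    """Derive audio length, compression ratio and bitrate from benchmark output."""
    # Encoding speed is audio duration divided by wall-clock time.
    audio_seconds = seconds * speed_x
    # Estimated size of the uncompressed PCM data (headers ignored).
    wav_bytes = audio_seconds * CDDA_BYTES_PER_SECOND
    return {
        "audio_seconds": audio_seconds,
        "ratio_percent": 100 * flac_bytes / wav_bytes,
        "kbps": flac_bytes * 8 / audio_seconds / 1000,
    }


if __name__ == "__main__":
    # Figures copied from the benchmark output in this post.
    summary = benchmark_summary(25.674, 421.12, 1_167_014_374)
    for key, value in summary.items():
        print(f"{key}: {value:.3f}")
```

Under those assumptions the corpus works out to roughly three hours of audio, and the derived ratio and bitrate land close to the reported 61,188% and ~863 kbps, so the three figures are consistent with each other.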

Re: FLAC v1.4.2 (Release)

Reply #15
What exactly does this "noasm" option do?
It simply removes all hand-coded instruction-set specific functions. These functions normally take precedence, and cannot be fully optimized by a compiler. By removing them, the compiler optimizations can really shine. It turns out GCC does a better job on the functions used for 16-bit audio, but does a poorer job on the 24-bit audio specific ones.
Music: sounds arranged such that they construct feelings.
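
To picture what "take precedence" means in the explanation above, here is a toy sketch of that kind of selection. It is not libFLAC's actual dispatch code; the routine names and the feature list are hypothetical and only show the idea that a hand-written variant wins when present, and that a noasm build leaves only the plain C version for the compiler to optimize.

```python
def pick_implementation(cpu_features, asm_enabled=True):
    """Return which variant of a (hypothetical) routine would be used."""
    # Hand-written variants, most specific first. All names are made up.
    handwritten = [
        ("avx2", "routine_intrin_avx2"),
        ("sse2", "routine_intrin_sse2"),
    ]
    if asm_enabled:
        for feature, name in handwritten:
            if feature in cpu_features:
                # A matching hand-written variant takes precedence.
                return name
    # In a noasm build this is the only candidate, so compiler flags such as
    # -march=native apply to the code that actually runs.
    return "routine_plain_c"


if __name__ == "__main__":
    cpu = {"sse2", "avx2"}
    print(pick_implementation(cpu, asm_enabled=True))
    print(pick_implementation(cpu, asm_enabled=False))
```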


Re: FLAC v1.4.2 (Release)

Reply #17
I cleaned up my config and the additional flags I use for compiling.
Single-file speeds increased slightly. For example, the binary with disable-asm-optimizations now reaches ~135x on a single 16-44.1 file, versus ~132x before (-8 -p).

Re: FLAC v1.4.2 (Release)

Reply #18
GCC has "alderlake" and "znver4" too.
Unfortunately, the cost tables for znver4 currently reuse the tables for znver3, so it's a WIP; there isn't really an easy way to tune optimally for Zen 4 yet. The best bet for proper tuning on new hardware is AOCC, AMD's LLVM fork, which should be the first place optimal tunings appear (although AFAIK it doesn't have Zen 4 tunings yet; AMD is normally slow to update software, and this seems to be no exception).

Intel is traditionally better at baking in compiler/kernel support well in advance, but that lead has slipped from years to months and sometimes to post-release, so I don't know how optimised GCC's 13th-gen support is.

I just checked it out. AOCC 3.2 seems to be the latest, and it is based on Clang 13, which is when znver3 support came in. I'm not finding anything for znver4 in Clang yet.
AOCC 4.0 is now out with znver4 support. I can't speak to its quality, but the benchmarks look okay.

https://www.phoronix.com/review/amd-aocc-4


Re: FLAC v1.4.2 (Release)

Reply #20
Here are compiles using the more generic CPU target x86-64-v3 instead of haswell. It covers similar capabilities, up to AVX2, in the hope of better performance across more modern CPU types.
Inside are builds with Clang 16.0.1 and GCC 12.2.0, plus a "disable-asm-optimizations" version with faster 16-bit performance but slower 24-bit performance, for apps like CUETools or EAC.
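
Since an x86-64-v3 build will not run on CPUs that lack AVX2 (pre-Haswell Intel parts such as the i7 2600K asked about earlier fall in that group), a quick feature check can help before picking a binary. The sketch below assumes Linux-style flag names as they appear in /proc/cpuinfo (for example `abm` for LZCNT and `pni` for SSE3); the function names are made up, and on Windows the flag list would have to come from another tool.

```python
# Feature flags (Linux /proc/cpuinfo naming, as assumed here) for x86-64-v3.
X86_64_V3_FLAGS = {
    # Required since x86-64-v2
    "cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3",
    # Added by x86-64-v3
    "avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave",
}


def missing_for_v3(cpu_flags):
    """Return the sorted list of x86-64-v3 flags absent from cpu_flags."""
    return sorted(X86_64_V3_FLAGS - set(cpu_flags))


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag list of the first CPU entry (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []


if __name__ == "__main__":
    try:
        missing = missing_for_v3(read_linux_cpu_flags())
    except OSError:
        print("Could not read /proc/cpuinfo on this system.")
    else:
        print("x86-64-v3 OK" if not missing else f"Missing: {missing}")
```

An empty result suggests the x86-64-v3 builds should run; anything listed as missing means an older or more generic build is the safer pick.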

Re: FLAC v1.4.2 (Release)

Reply #21
I cleaned up my config and the additional flags I use for compiling.
Single-file speeds increased slightly. For example, the binary with disable-asm-optimizations now reaches ~135x on a single 16-44.1 file, versus ~132x before (-8 -p).

flac-1.4.2-x64-gcc1220-O3+fastmath+manyflags2

Hands down, the fastest flac compile I have ever used on my system. It encodes at up to, umm, 1000x on CDDA files... not a typo.

Two words: blown away.