I just optimized Musepack for OS X
Reply #3 – 2004-08-16 01:58:42
Quote: "If no serious sound issues are revealed, I will release the (relatively minor) changes for mass consumption."

Quote: "The (L)GPL zealots will come after you."

Yeah, I know. All I did was use a square-root estimate (which is very close, and MUUUUCH faster), enable the built-in fast math routines, change the FFT code to Apple's vBigDSP, and change the compiler options to "-fast -mcpu=7400 -mtune=7400 -ftracer --param max-gcse-passes=8 -fsingle-precision-constant".

In fact, it should be me complaining about (L)GPL violations! The source code that is available to the public compiles into a completely different build than the binaries released on www.musepack.net. They apparently have some more optimization tricks up their sleeve that are either not enabled in the public source or not available to the public at all. For example, compiling the public source code with ONLY the compiler optimizations, I got around 2.0x performance. The official build got around 2.5x.

The interesting thing to me was this: building from the publicly available SOURCE, the biggest performance hit was the use of sqrt(). It was over 25% of execution time. However, in the public BUILDS, it appeared that they never even USED the system's sqrt(), using some magical inlined sqrt function instead. Not to mention they disabled debug symbols, so I can't even tell which functions are slow in the official build without reading the assembly. I'd like to know who made the OS X builds so I can add their "hidden" optimizations to my code!

Anyway, I believe this thread should be about testing for quality regressions, and not bickering over whether there is source or not. It's not like I'm selling it, and if you really REALLY want the source, I'll talk to you in private. The optimizations are nothing special.
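For readers wondering what a "square-root estimate" can look like: the post does not say which method was used, so the following is only a minimal sketch of one common approach, not the code from this build. It assumes 32-bit IEEE 754 floats and positive, normal inputs. The function name sqrt_estimate is hypothetical. It halves the exponent with integer arithmetic to get a rough guess, then applies one Newton-Raphson step, which should leave a relative error of roughly 0.2% or less.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the Musepack sources.
 * Approximates sqrt(x) for positive, normal IEEE 754 single-precision
 * values. Zero and negative inputs simply return 0. */
static float sqrt_estimate(float x)
{
    uint32_t bits;
    float guess;

    if (x <= 0.0f)
        return 0.0f;

    /* Reinterpret the float as an integer without aliasing problems. */
    memcpy(&bits, &x, sizeof bits);

    /* Shifting right by one halves the biased exponent; adding
     * 0x1fc00000 (half of the bit pattern of 1.0f) restores the bias. */
    bits = (bits >> 1) + 0x1fc00000u;
    memcpy(&guess, &bits, sizeof guess);

    /* One Newton-Raphson step: average the guess with x / guess. */
    return 0.5f * (guess + x / guess);
}
```

Whether an error of that size matters for an encoder is exactly the kind of thing the listening tests in this thread are meant to find out. Dropping the Newton step would be faster still, but the raw guess can be off by about 6%.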
Here are some of my tests.

official build:
    %|avg.bitrate| speed|play time (proc/tot)| CPU time (proc/tot)| ETA
100.0  179.7 kbps  2.81x    2:00.6    2:00.6     0:42.9    0:42.9

optimized:
    %|avg.bitrate| speed|play time (proc/tot)| CPU time (proc/tot)| ETA
100.0  179.3 kbps  3.60x    2:00.6    2:00.6     0:33.5    0:33.5

There is a 0.4 kbps drop in bitrate (179.7 to 179.3 kbps), which is most likely caused by the combination of reduced sqrt precision and Apple's FFT. The speed gain is quite substantial: 3.60x versus 2.81x, or about 28% faster, with CPU time dropping from 42.9 to 33.5 seconds. I can't ABX the difference between the two files encoded at --standard --xlevel. Of course, I can't ABX them against the original either...
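The speed column can be cross-checked from the other columns: it appears to be play time divided by CPU time. The sketch below redoes that arithmetic for the two runs above. The helper names parse_time_seconds and speed_factor are hypothetical, and the "m:ss.s" format is assumed from the output shown here rather than from any encoder documentation.

```c
#include <stdio.h>

/* Hypothetical helper: convert a "m:ss.s" string into seconds.
 * Returns -1.0 if the text does not match that shape. */
static double parse_time_seconds(const char *text)
{
    int minutes;
    double seconds;

    if (sscanf(text, "%d:%lf", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Encoding speed relative to real time: audio duration / CPU time.
 * Assumes both strings are well formed and the CPU time is non-zero. */
static double speed_factor(const char *play_time, const char *cpu_time)
{
    return parse_time_seconds(play_time) / parse_time_seconds(cpu_time);
}
```

With the figures above, speed_factor("2:00.6", "0:42.9") comes out near 2.81 and speed_factor("2:00.6", "0:33.5") near 3.60, matching the reported speeds. The ratio of the two is about 1.28.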