WavPack 4.75.0 has been released

WavPack 4.75.0

Changes:
  • complete reorganization of the library source code for modularity
  • assembler optimizations added for encode/decode on x86 and x64
  • assembler optimizations for decoding on ARMv7 (Linux)
  • miscellaneous bug fixes

General information about WavPack can be found on the WavPack website.

Source code and official binaries are available for download here.

Reply #1
After discussions starting with this post on the release candidate thread, I went ahead and added assembly optimizations for the "extra" (-x) processing mode for mono audio. I originally did not do this because it seemed like sort of an edge case, but have since recalled that multichannel 5.1 audio contains two channels that are encoded in mono (FC and LFE) and that the original mono code was not very fast (except on ICC compiles with modified source -- thanks Case for reminding me of that).

This does not affect mono encoding without the "extra" mode, nor does it affect pure stereo at all, but for 5.1 audio it's now faster than the other compiles. Obviously it's too late for the 4.75.0 release, but maybe something will be found that can justify an update. In the meantime I have uploaded 32-bit and 64-bit binaries here.

These are the updated results on my i7 machine and a 24-bit, 5.1 test file encoded with -hx2:

version      encode     decode
------------------------------
4.70.0       21.53s     7.63s
4.70.0 x64   25.46s     7.30s
------------------------------
4.75.0       19.19s     5.27s
4.75.0 x64   18.39s     5.12s
------------------------------
4.75.1       16.44s     -----
4.75.1 x64   16.02s     -----

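To put the encode column above into relative terms, here is a minimal sketch that computes how much wall-clock time each newer build saves against 4.70.0 of the same architecture. The numbers are copied from the table; the helper name `percent_faster` and the dictionary layout are only illustrative.

```python
# Encode timings in seconds, copied from the table above
# (24-bit 5.1 test file, encoded with -hx2).
encode_seconds = {
    "4.70.0": 21.53,
    "4.70.0 x64": 25.46,
    "4.75.0": 19.19,
    "4.75.0 x64": 18.39,
    "4.75.1": 16.44,
    "4.75.1 x64": 16.02,
}


def percent_faster(old: float, new: float) -> float:
    """Reduction in wall-clock time going from `old` to `new`, in percent."""
    return (old - new) / old * 100.0


for arch in ("", " x64"):
    baseline = encode_seconds["4.70.0" + arch]
    for version in ("4.75.0", "4.75.1"):
        label = version + arch
        saved = percent_faster(baseline, encode_seconds[label])
        print(f"{label:<11} {saved:5.1f}% less encode time than 4.70.0{arch}")
```

This is a single-run comparison on one machine and one file, so small differences should be read with some caution.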

bb10, hope you can give this a shot! 

--David

Reply #2
Quote
bb10, hope you can give this a shot!

--David


Thanks and of course!

CPU: Core i7 2630QM
Norah Jones - 12. Nightingale [4:12] (24-bit 6ch WAV) -h -x2 option:

build              |   encode 1    |    encode 2    |    decode 1    |     decode 2   |
4.70.0 x86         |     37.88s    |     38.05s     |     11.10s     |     11.28s     |
4.70.0 x64         |     39.88s    |     39.76s     |     10.73s     |     10.72s     |
4.75.0 x86         |     33.75s    |     34.06s     |      8.59s     |      8.73s     |
4.75.0 x64         |     32.79s    |     32.59s     |      8.48s     |      8.68s     |
4.75.1 x86         |     29.17s    |     29.19s     |       --       |       --       |
4.75.1 x64         |     28.60s    |     28.61s     |       --       |       --       |
gcc 4.75.1 x86     |     27.63s    |     27.72s     |      8.86s     |      8.97s     |
gcc 4.75.1 x64     |     27.19s    |     27.16s     |      8.28s     |      8.45s     |
lvqcl-icc x86      |     27.91s    |     27.86s     |      8.11s     |      8.17s     |
lvqcl-icc x64      |     27.54s    |     27.60s     |      7.65s     |      7.80s     |
lvqcl-msvc2k13 x86 |     28.75s    |     28.86s     |      8.84s     |      9.03s     |
lvqcl-msvc2k13 x64 |     28.60s    |     28.54s     |      8.45s     |      8.28s     |
icc-mmx 4.70.0 x32 |     36.99s    |     36.62s     |       --       |       --       |
icc-mmx 4.70.0 x64 |     28.10s    |     28.03s     |       --       |       --       |
icc-aw 4.75.1 x86  |     27.93s    |     27.97s     |      8.06s     |      8.19s     |
icc-aw 4.75.1 x64  |     27.09s    |     27.07s     |      7.65s     |      7.48s     |


Norah Jones - 12. Nightingale [4:12] (16-bit 6ch WAV) -h -x2 option:

build              |   encode 1    |    encode 2    |    decode 1    |     decode 2   |
4.70.0 x86         |     37.00s    |     36.98s     |     10.24s     |     10.26s     |
4.70.0 x64         |     36.19s    |     36.21s     |      9.90s     |     10.00s     |
4.75.0 x86         |     33.70s    |     33.56s     |      8.46s     |      8.34s     |
4.75.0 x64         |     32.56s    |     32.38s     |      8.29s     |      8.39s     |
4.75.1 x86         |     29.00s    |     29.05s     |       --       |       --       |
4.75.1 x64         |     28.49s    |     28.50s     |       --       |       --       |
gcc 4.75.1 x86     |     27.97s    |     27.88s     |      8.58s     |      8.64s     |
gcc 4.75.1 x64     |     27.22s    |     27.19s     |      8.28s     |      8.13s     |
lvqcl-icc x86      |     27.87s    |     27.74s     |      7.71s     |      7.67s     |
lvqcl-icc x64      |     27.51s    |     27.65s     |      7.46s     |      7.37s     |
lvqcl-msvc2k13 x86 |     28.81s    |     28.99s     |      8.44s     |      8.48s     |
lvqcl-msvc2k13 x64 |     28.51s    |     28.38s     |      8.12s     |      8.10s     |
icc-mmx 4.70.0 x32 |     41.94s    |     41.96s     |       --       |       --       |
icc-mmx 4.70.0 x64 |     28.89s    |     28.93s     |       --       |       --       |
icc-aw 4.75.1 x86  |     28.00s    |     28.07s     |      7.76s     |      7.69s     |
icc-aw 4.75.1 x64  |     27.06s    |     26.95s     |      7.36s     |      7.35s     |



4.75.1 is now the fastest build with 16bit files! But the 4.70.0 icc build with apply_weight is still ever so slightly faster with 24bit files.

Thanks to lamedude for providing the 4.75.0 icc binary. I wonder if a 4.75.1 icc binary would make a difference.

Reply #3
4.75.1
64bit has modified apply_weight (using line 456 & 465).

Reply #4
I suspect that the definition of apply_weight doesn't matter now that both 32-bit and 64-bit WavPack use asm code. And the difference between en/decoders made by different compilers should be rather small.

Anyway, I compiled the latest sources with GCC 4.9.2, MSVS 2013 Update 4 and ICC XE 2015 Update 4. (No changes in the source code, just vanilla settings.)

Reply #5
Quote
4.75.1 is now the fastest build with 16bit files! But the 4.70.0 icc build with apply_weight is still ever so slightly faster with 24bit files.
Thanks to lamedude for providing the 4.75.0 icc binary. I wonder if a 4.75.1 icc binary would make a difference.

Thanks so much for testing all these and posting the results! I suspect that the little difference for 24-bit files is now in the realm of CPU to CPU variation because on my i7 the icc version is slightly slower. I'll still take this as a win!

An icc binary of 4.75.1 would probably be a little faster still, but it would be pretty minor I think because as lvqcl says, almost all the time is spent in assembly language now.

Quote
4.75.1
64bit has modified apply_weight (using line 456 & 465).

Sorry, I should have noticed that the apply_weight modification would have to be in two places now because of some "improvements" I made to those macros (they really should make more sense now). But anyway, I see you figured it out...

Quote
Anyway, I compiled the latest sources with GCC 4.9.2, MSVS 2013 Update 4 and ICC XE 2015 Update 4. (No changes in the source code, just vanilla settings.)

Very cool...thanks for generating these! I'm glad to see that it wasn't too much trouble to build them on newer MSVS; it was a little bit of a hack for me to get that working on 2008 (I had to edit the project files by hand). Kudos to Microsoft if it just worked; kudos to you if it didn't!

Reply #6
Quote
I'm glad to see that it wasn't too much trouble to build them on newer MSVS; it was a little bit of a hack for me to get that working on 2008 (I had to edit the project files by hand). Kudos to Microsoft if it just worked; kudos to you if it didn't!

There's a small nuisance: MSVS 2013 tries to build the 32-bit exe with the /SAFESEH option, and the WavPack .asm modules don't use it. So it's necessary either to remove /SAFESEH from the .exe linker options, or to add this option to the *_x86.asm files. No problems with 64-bit builds.

Reply #7
Updated the tables. lamedude's 4.75.1 64-bit icc encode takes the crown once again and is followed closely by gcc, though the difference between icc and the rest isn't as big as last time with 4.70.0.

Reply #8
Nice!

wavpack -hx2 <input_pcm_s16le44kHz.wav>:
4.70.0: 4.13s
4.75.1: 3.16s

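A measurement like the one above could be reproduced with a small timing wrapper along these lines. This is only a sketch: the function name `time_command` is made up for illustration, and the example times a stand-in command so that it runs even where no wavpack binary is installed.

```python
import subprocess
import sys
import time


def time_command(argv, runs=2):
    """Run `argv` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's own output so only the elapsed time matters.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command so the sketch runs anywhere; replace the list with
# something like ["wavpack", "-hx2", "input.wav"] to time a real encode
# (the input file name here is hypothetical).
for i, seconds in enumerate(time_command([sys.executable, "-c", "pass"]), start=1):
    print(f"run {i}: {seconds:.2f}s")
```

When repeating a real encode, the output file from the first run will already exist, so either delete it between runs or use whatever overwrite switch your build offers. Running each command at least twice, as in the tables earlier in the thread, helps spot disk-cache effects.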

Thank you!

Reply #9
Heads up: they nerfed MMX on Skylake. On page 69:
Quote
In processors based on the Skylake microarchitecture, the functionality of the MMX instruction set is unchanged from prior generations. But many MMX instructions are constrained to execute to one port with half the instruction throughput relative to prior microarchitectures.


Reply #10
It's actually page 11-69 (section 11.16.5):
Quote
The MMX instructions with throughput constraints include:
• PADDS[B/W], PADDUS[B/W], PSUBS[B/W], PSUBUS[B/W].
• PCMPGT[B/W/D], PCMPEQ[B/W/D].
• PMAX[UB/SW], PMIN[UB/SW].
• PAVG[B/W], PABS[B/W/D], PSIGN[B/W/D].

But WavPack uses only the pcmpeqd and paddusw instructions, so maybe it won't suffer much from this.

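To check the two instructions against the quoted list, the bracket shorthand can be expanded into individual mnemonics. The sketch below does only that; the patterns are copied from the quote, the two instruction names come from the post, and the helper `expand` is an illustrative name.

```python
import re

# Bracket shorthand copied from the quoted list above.
CONSTRAINED_PATTERNS = [
    "PADDS[B/W]", "PADDUS[B/W]", "PSUBS[B/W]", "PSUBUS[B/W]",
    "PCMPGT[B/W/D]", "PCMPEQ[B/W/D]",
    "PMAX[UB/SW]", "PMIN[UB/SW]",
    "PAVG[B/W]", "PABS[B/W/D]", "PSIGN[B/W/D]",
]


def expand(pattern: str) -> list[str]:
    """Expand e.g. 'PADDS[B/W]' into ['PADDSB', 'PADDSW']."""
    match = re.fullmatch(r"([A-Z]+)\[([A-Z/]+)\]", pattern)
    if match is None:
        return [pattern]  # no bracket part, already a plain mnemonic
    stem, suffixes = match.groups()
    return [stem + suffix for suffix in suffixes.split("/")]


constrained = {name for p in CONSTRAINED_PATTERNS for name in expand(p)}

# The two MMX instructions the post says WavPack uses.
for name in ("pcmpeqd", "paddusw"):
    status = "constrained" if name.upper() in constrained else "not listed"
    print(f"{name}: {status}")
```

Both names do expand out of the list (PCMPEQ[B/W/D] and PADDUS[B/W]), so any limited impact would presumably come from how rarely they occur in the hot loops, not from avoiding the constrained set. That is an inference, not something measured here.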