Topic: optimised WavPack encoder

Reply #25
Sorry, I've been a bit busy these days. Now that I've finally found some time, I've finished one of the alternatives to the BSR stuff. It gives me a speedup of up to 8% on a P3 Celeron (-f -x6), and it's also faster than the original code on my old AMD K6.

wisodev, could you do some testing again? Please try 'nasm -O2' first. You'll find the patch in this file: wp-4.32-jfl2b.diff.gz


Reply #27
Quote from: he-jo
Sorry, I've been a bit busy these days. Now that I've finally found some time, I've finished one of the alternatives to the BSR stuff. It gives me a speedup of up to 8% on a P3 Celeron (-f -x6), and it's also faster than the original code on my old AMD K6.

wisodev, could you do some testing again? Please try 'nasm -O2' first. You'll find the patch in this file: wp-4.32-jfl2b.diff.gz

OK, I have added these optimizations, but I have only run one quick test so far and the results are not so good. The binaries and sources are here. Maybe someone else can run some tests to confirm my results.

Reply #28
I've updated the patch wp-4.32-jfl2b.diff.gz to avoid a memory stall. On a Celeron (Tualatin) I measured a speedup of 10% compared to the original wavpack sources.

wisodev, it would be nice if you could provide binaries and test results again.

Thanks a lot,
Jo.

Reply #29
What happened to the PowerPC optimized version?
The wavpack-4.31-ppc.diff patch fails to apply to both the WavPack 4.31 and 4.32 source code:
Quote
patch -p0 < wavpack-4.31-ppc.diff
patching file bits.c
Hunk #1 FAILED at 149.
1 out of 1 hunk FAILED -- saving rejects to file bits.c.rej
patching file pack.c
Hunk #1 FAILED at 567.
1 out of 1 hunk FAILED -- saving rejects to file pack.c.rej
patching file unpack3.c
Hunk #1 FAILED at 1583.
Hunk #2 FAILED at 1604.
Hunk #3 FAILED at 1988.
3 out of 3 hunks FAILED -- saving rejects to file unpack3.c.rej
patching file wavpack.h
Hunk #1 FAILED at 415.
Hunk #2 FAILED at 518.
2 out of 2 hunks FAILED -- saving rejects to file wavpack.h.rej
patching file words.c
Hunk #1 FAILED at 69.
Hunk #2 FAILED at 93.
Hunk #3 FAILED at 144.
Hunk #4 FAILED at 419.
Hunk #5 FAILED at 1302.
Hunk #6 FAILED at 1325.
Hunk #7 FAILED at 1376.
7 out of 7 hunks FAILED -- saving rejects to file words.c.rej

Am I missing something, or do you simply need to update the patch (hopefully for 4.32)?
Thanks!

Reply #30
The problem is that the WavPack sources for Unix have DOS line endings too. You need to remove the carriage returns from these files before you can apply the patch. You can do that with the following commands:

Code:
mkdir src
for FILE in *.[ch]; do tr -d '\r' < "$FILE" > "src/$FILE"; done
rm *.[ch]
mv src/* .
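
A minimal sketch for checking whether the conversion is needed, or whether it worked, assuming a POSIX shell. The helper name crlf_files is made up for illustration and is not part of WavPack or patch. It prints every .c/.h file in a directory that still contains a carriage return.

```shell
# Hypothetical helper: list the .c/.h files in a directory (default: the
# current one) that contain at least one carriage return byte.
crlf_files() {
    dir="${1:-.}"
    cr="$(printf '\r')"                  # a literal CR character
    for f in "$dir"/*.[ch]; do
        [ -f "$f" ] || continue          # skip the unexpanded glob when nothing matches
        if grep -q "$cr" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running crlf_files . before and after the commands above should go from listing the sources to printing nothing. With GNU patch, adding --dry-run to the patch command then shows whether the hunks would apply, without modifying any file.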

Reply #31
Quote
I've updated the patch 'wp-4.32-jfl2b.diff.gz' on http://he-jo.net/download/wavpack/ to avoid a memory stall. On a Celeron (Tualatin) I measured a speedup of 10% compared to the original wavpack sources.

wisodev, it would be nice if you could provide binaries and test results again.

Thanks a lot,
Jo.


As stated in the Upload thread, the new binaries will be available very soon.

PS: Sorry for the double post, but I missed this post.

Reply #32
Quote from: he-jo
The problem is, that the WavPack sources for Unix have DOS line endings too.
OK, I see.
I'll change the line endings and try again...

Thanks!



Reply #35
Oh, this is great! I'm very happy now that we have a faster binary for Athlons as well. I had hoped to find a solution that is good for all common processors, but the P4 may still be a problem. We'll see if we can get rid of the BSR code in the future.

Reply #36
Are you considering adding MMX optimizations for negative values of dpp->term? This was discussed some time ago. I think there is a few percent of improvement to be had there too: this code path is taken in about 25% of all calls to the function decorr_stereo_pass. Or is there a reason not to optimize it? This is just a suggestion; I know it takes a lot of time to do such things.

Besides, the one test on a P4 showed that BSR and MMX work pretty nicely on these machines. The newest optimization has not been tested on a P4, but I think it should do well too. But I am just speculating here ;-)

I am looking forward to your future optimizations!!!

Reply #37
Quote
Are you considering adding MMX optimizations for negative values of dpp->term? This was discussed some time ago. I think there is a few percent of improvement to be had there too: this code path is taken in about 25% of all calls to the function decorr_stereo_pass. Or is there a reason not to optimize it? This is just a suggestion; I know it takes a lot of time to do such things.
I already did this a month ago, but tests on my old AMD K6 showed that the code was actually slower. You can imagine that this result wasn't very motivating. In the meantime, though, I've got the impression that testing on this obsolete processor isn't really reliable. So maybe it's worth having another look at the code.
Quote
Besides, the one test on a P4 showed that BSR and MMX work pretty nicely on these machines. The newest optimization has not been tested on a P4, but I think it should do well too. But I am just speculating here ;-)
Yes, that's what I meant. BSR could still be faster on a P4, because the 'jfl2b' code uses floating-point instructions, which have a long latency on these processors. Theoretically, I could nearly double the throughput of the function by improving parallelism, but I'm afraid that this wouldn't give much gain in the end.
Quote
I am looking forward to your future optimizations!!!
Me too. I really like doing this work, and it's nice to make people happy that way. But, you know, it's often just a matter of time. We'll see what I'm able to do.

Reply #38
I ran the test on my P4. I can see some improvement, but BSR is still quite a lot faster. Keep up the good work, he-jo.

Reply #39
I have uploaded the latest optimizations; the binaries and sources are available here for download. It looks very good. For more details, check the Upload thread.

I find this a bit confusing: your link points to post #1, where one is redirected to post #21, but that post seems to be old as well (last edited on May 23, while the new binary should be from May 30), so I'm not sure if it is really the correct one.

Wouldn't it be better to just have post #1 point to the latest and greatest version available?

Cheers and compliments for the great work you're all doing!

Sergio
Sergio
M-Audio Delta AP + Revox B150 + (JBL 4301B | Sennheiser Amperior | Sennheiser HD598)


Reply #41
Quote
I find this a bit confusing: your link points to post #1, where one is redirected to post #21, but that post seems to be old as well (last edited on May 23, while the new binary should be of May 30) so I'm not sure if it is really the correct one.

Wouldn't be better to just have Post #1 to point to the latest and greatest version available?

Cheers and compliments for the great work you're all doing!

Sergio


Yes, it is my fault. I have not updated these redirections, and I know that this is not the best solution. This will be corrected; I hope that this time I can solve the problem and make things as clear as possible. I am only human ;-).

Reply #42
Don't worry, wisodev, we are all human, or at least we should be! :-))
Anyway, the link in post #21 is the good one, isn't it?

Cheers

Sergio

Edit: spelling

Edit 2: Oh, now I see! You edited post #1 and the correct link is now in post #31. Thanks!

Reply #43
No problem!

---

I think that when the next update is available, I will edit post #1 and place all downloads there, and add a redirection to post #1 in the original posts (the ones that currently contain links). This will be less confusing, and keeping all downloads in one post will clarify the situation. But I am open to other solutions too.
