Topic: Ogg Vorbis acceleration project (Read 132494 times)

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #150
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

Ogg Vorbis acceleration project
Reply #151
Can anyone give some guidance on how to compile this under Linux (Ubuntu)? Regards.

  • LigH
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #152
A few more statistics, transcoding 01:42:28 h of a 5.1 AC3 on a Phenom-II X4 945 using BeSweet with DPL-II downmix and fixed gain (to avoid including the normalization pass):

Generic
06:42 (686)
06:04 (P4)

Lancer
04:30 (SSE)
03:51 (SSE2)
03:50 (SSE3)

The gap between generic and extreme optimization is quite impressive, and even the gap between SSE and SSE2 is still remarkable. But decoding and downmixing take their time too, so a certain degree of saturation is to be expected.
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum
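
The timings in the reply above can be turned into speedup factors with a few lines of Python. This is only a sketch of the arithmetic: the figures are copied from the post, while the helper name `to_seconds` and the choice of the generic 686 build as the baseline are assumptions made for illustration.

```python
# Encode times (mm:ss) as posted for the 01:42:28 AC3 transcode.
timings = {
    "Generic 686": "06:42",
    "Generic P4": "06:04",
    "Lancer SSE": "04:30",
    "Lancer SSE2": "03:51",
    "Lancer SSE3": "03:50",
}


def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Assumed baseline: the slowest (generic 686) build.
baseline = to_seconds(timings["Generic 686"])
speedups = {name: baseline / to_seconds(t) for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name:12s} {timings[name]}  {factor:.2f}x vs. generic 686")
```

On these figures the SSE3 build runs at roughly 1.75 times the speed of the generic 686 build. Since the measured time also covers AC3 decoding and downmixing, the encoder-only gain is probably larger.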

  • Brazil2
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #153
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.

Thanks, but unfortunately, and unlike your previous builds, it no longer runs on older OSes (pre-XP SP2), on which the VC2010 runtimes can't be installed.

But this might be helpful: http://mulder.googlecode.com/svn/trunk/Uti...rLib/README.txt

  • LigH
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #154
There are reasons why such old OSes are deprecated. One excuse would be running them offline.
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #155
Thanks, but unfortunately, and unlike your previous builds, it no longer runs on older OSes (pre-XP SP2), on which the VC2010 runtimes can't be installed.

But this might be helpful: http://mulder.googlecode.com/svn/trunk/Uti...rLib/README.txt

OK, what optimisation does your CPU support?
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

  • LigH
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #156
It's a question of PE building and linking rather than of CPU optimizations, john33. The limit is not the CPU, but the OS and its set of supported Windows API functions.
  • Last Edit: 04 July, 2012, 09:41:20 AM by LigH
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum
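
One part of what the OS checks at load time is recorded in the PE header itself: the minimum OS and subsystem versions the linker wrote into the optional header. The sketch below reads those two fields using only the standard library. The function name `pe_min_versions` is hypothetical, and the sketch does not list the imported API functions, which is the other half of the point made above.

```python
import struct


def pe_min_versions(data: bytes) -> dict:
    """Return the minimum OS and subsystem versions recorded in a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unknown optional header magic")
    # These version fields sit at the same offsets in PE32 and PE32+.
    os_major, os_minor = struct.unpack_from("<HH", data, opt + 40)
    sub_major, sub_minor = struct.unpack_from("<HH", data, opt + 48)
    return {"os": (os_major, os_minor), "subsystem": (sub_major, sub_minor)}
```

Calling it on the bytes of an executable, for example `pe_min_versions(open(path, "rb").read())`, returns both version pairs. A subsystem version higher than the running Windows version makes the loader reject the file regardless of which CPU instructions the code uses.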

  • Brazil2
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #157
OK, what optimisation does your CPU support?

MMX, SSE, SSE2, SSE3, SSSE3 and I'm usually using your P4 optimized builds.
Thank you

Ogg Vorbis acceleration project
Reply #158
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.


Hi, John. But what is the difference between your new compiles and this?

I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?
  • Last Edit: 04 July, 2012, 11:28:18 AM by Steve Forte Rio

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #159
OK, what optimisation does your CPU support?

MMX, SSE, SSE2, SSE3, SSSE3 and I'm usually using your P4 optimized builds.
Thank you

Try this: http://www.rarewares.org/files/ogg/oggenc2...cerSSE2_OLD.zip
and perhaps you could let me know if it's OK?
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #160
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.


Hi, John. But what is the difference between your new compiles and this?

I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?

I couldn't say with any certainty, but judging by the size of the executables, probably the only difference is that they were not compiled with the libsamplerate resampler.
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

  • Brazil2
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #161
Try this: http://www.rarewares.org/files/ogg/oggenc2...cerSSE2_OLD.zip
and perhaps you could let me know if it's OK?

Brilliant! Works like a charm, thanks a lot
Code:
G:\Test\>oggenc2 -h
OggEnc v2.87 (LancerMod(SSE2) based on aoTuV b6.03 [20110424])
(c) 2000-2005 Michael Smith <msmith@xiph.org>
& portions by John Edwards <john.edwards33@ntlworld.com>

  • lvqcl
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #162
My versions of oggenc2.exe don't include the SRC and FLAC libraries, and I commented out all relevant options and calls.

@john33: in your compiles these options are disabled too. I think that's not what you want, so 3 source files with re-enabled options are attached to the post.

  • Raimu
  • [*]
Ogg Vorbis acceleration project
Reply #163
Quote
Hi, John. But what is the difference between your new compiles and this?
I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?

Some tests (out of interest) on my PC reveal that john33's current binaries are slightly but noticeably faster than the ones in your link, at the very least.
  • Last Edit: 04 July, 2012, 01:09:07 PM by Raimu

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #164
My versions of oggenc2.exe don't include the SRC and FLAC libraries, and I commented out all relevant options and calls.

@john33: in your compiles these options are disabled too. I think that's not what you want, so 3 source files with re-enabled options are attached to the post.

Thanks, but the versions at Rarewares have these enabled.

EDIT: I just realised that the options were disabled in the oggenc2 code!  I had enabled the inclusion of the libs in the compiles and hadn't checked the code!
  • Last Edit: 04 July, 2012, 02:05:13 PM by john33
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

  • john33
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #165
All of the above oggenc2 compiles have been updated at Rarewares. Sorry for the confusion!
John
----------------------------------------------------------------
My compiles and utilities are at http://www.rarewares.org/

Ogg Vorbis acceleration project
Reply #166
All of the above oggenc2 compiles have been updated at Rarewares. Sorry for the confusion!


Great work, thanks a lot.

Although the version by lvqcl is still faster on my machine. I use oggenc2 32bit SSE3 from here (index.php?act=findpost&pid=784966) and foobar converts a FLAC at around 49x, while your compile is at 42x.

  • LigH
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #167
Your machine. Aha.

We all know your machine.

Oh, no, this is your first post, so how could we?

Hint: http://hwinfo.com/
  • Last Edit: 07 July, 2012, 05:53:10 AM by LigH
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum

Ogg Vorbis acceleration project
Reply #168
It's a Core 2 Duo laptop with a P8600 + 4 GB RAM on Win7.

  • lvqcl
  • [*][*][*][*][*]
  • Developer
Ogg Vorbis acceleration project
Reply #169
Although the version by lvqcl is still faster on my machine. I use oggenc2 32bit SSE3 from here (index.php?act=findpost&pid=784966) and foobar converts a FLAC at around 49x, while your compile is at 42x.


Try the LancerSSE2_OLD build. It is faster than the other versions (except x64).

Ogg Vorbis acceleration project
Reply #170
With John's LancerSSE2_OLD I get the same speed as with your SSE3 version.

Ogg Vorbis acceleration project
Reply #171
Out of curiosity I tested all 32bit oggenc2 compiles again, and here are the results:


John33:

sse  35.69x
sse2 38.40x
sse3 38.60x

sse2old 47.19x


lvqcl:

sse  38.80x
sse2 47.94x
sse3 47.73x


I'm not familiar with compiling, so I wonder why there is such a huge step in speed from SSE to SSE2 while SSE2 and SSE3 are on the same level?
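
To put numbers on the "huge step", the relative change between instruction-set builds can be computed from the speeds listed above. This is a small sketch of that arithmetic only: the figures are the ones posted, the `sse2old` build is left out, and the helper name `gain_percent` is an assumption for illustration.

```python
# Encoding speeds (multiples of real time) as posted in the reply above.
speeds = {
    "John33": {"sse": 35.69, "sse2": 38.40, "sse3": 38.60},
    "lvqcl": {"sse": 38.80, "sse2": 47.94, "sse3": 47.73},
}


def gain_percent(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for builder, s in speeds.items():
    step1 = gain_percent(s["sse"], s["sse2"])
    step2 = gain_percent(s["sse2"], s["sse3"])
    print(f"{builder}: SSE->SSE2 {step1:+.1f}%, SSE2->SSE3 {step2:+.1f}%")
```

For lvqcl's builds the SSE to SSE2 step is a gain of roughly 24%, while the SSE2 to SSE3 step stays under one percent in either direction for both sets of builds, which is likely within run-to-run noise.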

  • LigH
  • [*][*][*]
Ogg Vorbis acceleration project
Reply #172
This effect isn't due to compiling as such (the C compiler only translates the source routines, which are not very CPU optimized; in-depth instruction-set optimization is done more efficiently via hand-written assembler code).

The efficiency boost between different instruction sets depends on the algorithm to be optimized and on the differences between the instruction sets. So specifically for Vorbis encoding, SSE2 seems to introduce very useful new instructions (relative to SSE only), but the new instructions in SSE3 (relative to SSE2 only) are only marginally useful for the Vorbis algorithms.
http://forum.gleitz.info - das deutsche doom9/Gleitz-Forum

Ogg Vorbis acceleration project
Reply #173
The efficiency boost between different instruction sets depends on the algorithm to be optimized and on the differences between the instruction sets. So specifically for Vorbis encoding, SSE2 seems to introduce very useful new instructions (relative to SSE only), but the new instructions in SSE3 (relative to SSE2 only) are only marginally useful for the Vorbis algorithms.

Thanks for clarifying.

Is that also the reason there is no SSE4 compile: that it introduces too few useful instructions compared to SSE3?

  • Raimu
  • [*]
Ogg Vorbis acceleration project
Reply #174
Quote
Is that also the reason there is no SSE4 compile: that it introduces too few useful instructions compared to SSE3?


I was under the impression that the reason is more along the lines of SSE4* being an umbrella term for a clustermess of very different instruction sets, some of which only work on newish Intel CPUs and others only on newish AMD CPUs, and all of which can only be effectively optimized for with pretty new and specific compilers.
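
The "umbrella term" point shows up in how CPUs report their features: SSE4.1, SSE4.2 and SSE4a are separate flags. As a rough sketch, assuming an x86 Linux system where `/proc/cpuinfo` has a `flags` line, the following lists which of the three a CPU reports. The function name `sse4_variants` and the label strings are hypothetical.

```python
# Flag names as they appear on the "flags" line of Linux's /proc/cpuinfo.
SSE4_FLAGS = {
    "sse4_1": "SSE4.1 (introduced by Intel)",
    "sse4_2": "SSE4.2 (introduced by Intel)",
    "sse4a": "SSE4a (AMD)",
}


def sse4_variants(cpuinfo_text: str) -> list:
    """List which SSE4 variants the first CPU in the cpuinfo text reports."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the colon is a space-separated flag list.
            flags = set(line.split(":", 1)[1].split())
            return [label for flag, label in SSE4_FLAGS.items() if flag in flags]
    return []
```

On Linux it could be called as `sse4_variants(open("/proc/cpuinfo").read())`. A single "SSE4" build would have to pick one of these subsets, which fits the explanation given above.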