Topic: Ogg Vorbis acceleration project (Read 133242 times)

Ogg Vorbis acceleration project
Reply #100
Never mind. I just found out my system's processor doesn't support SSE3. I'd have to have one that's WinXP/SSE/SSE2/foobar2000 compatible. So if something like that isn't available, I know the guys over at dBpoweramp have their own aoTuVb.5.7/Lancer builds for use with their program, so that would be the only way, for me, to get a stable, up-to-date Lancer build to use.
  • Last Edit: 06 August, 2010, 06:24:11 PM by RazorBoy143

Reply #101
Is it possible to make a universal encoder that could recognize which SSE instructions your processor supports?

  • forart.eu
Reply #102
Not only that!

DarkWave Studio automatically selects x86/x64 and the correct SSE* instructions to use...

  • [JAZ]
Reply #103
@forart.eu: I guess you are aware that you cannot go with an "if" statement every time you have to decide whether to use one technology or another, right?

By this I mean that for an application to switch properly and automatically (or even manually via a setting) between using SSE or not, without losing all the gains, the programmer has to write fully independent code paths for those operations ("if" statements in inner loops really do slow things down).

Yet this also means lots of work for the programmer, since he has to do manually what a compiler does automatically for him.
I guess the best way to do so for a multi-file program would be to have different DLLs, each one with the same code but compiled (by the compiler) for a different processor. The main program could then choose which DLL to load depending on where it is run. Anyway, this is not a perfect solution.


Also, when you throw in x86/x64, are you talking about an installer or an application? If it is an installer, the point is moot, since here we were talking about an executable program.

The only program that I've seen doing a sort of "universal binary" for x86 and x64 is Microsoft's (or Mark Russinovich's) Process Explorer. Version 11 of this software, when run on an x64 machine, extracts an x64 binary from itself and then executes that file (the downloaded file is 3.7MB; the x64 file is 950KB). So if you add it up, there's clearly more space used than just having two separate files. It just makes it handier when you're talking about small files.
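
A rough sketch of the "independent code paths" idea discussed above, kept inside a single binary instead of separate DLLs: the CPU is queried once at startup and a function pointer is set, so the inner loop itself contains no feature checks. This is only an illustration, not code from Lancer or aoTuV. The names `scale_fn`, `scale_scalar`, `scale_sse` and `pick_scale` are made up, and `__builtin_cpu_supports` is a GCC/Clang builtin, so other compilers would need a CPUID call instead.

```c
#include <stddef.h>

/* The SSE path is only built when the compiler is GCC/Clang and is
   allowed to emit SSE code (always true on x86-64). */
#if (defined(__GNUC__) || defined(__clang__)) && defined(__SSE__)
#include <xmmintrin.h>
#define HAVE_SSE_PATH 1
#endif

/* Hypothetical kernel signature: multiply n samples by gain, in place. */
typedef void (*scale_fn)(float *buf, size_t n, float gain);

/* Plain C path: works on every CPU. */
static void scale_scalar(float *buf, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

#ifdef HAVE_SSE_PATH
/* SSE path: four floats per iteration, scalar loop for the tail. */
static void scale_sse(float *buf, size_t n, float gain)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(buf + i, _mm_mul_ps(_mm_loadu_ps(buf + i), g));
    for (; i < n; i++)
        buf[i] *= gain;
}
#endif

/* Decide once, at startup; the hot loop never tests CPU features. */
static scale_fn pick_scale(void)
{
#ifdef HAVE_SSE_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        return scale_sse;
#endif
    return scale_scalar;
}
```

The caller would store the result of `pick_scale()` once and call through it afterwards. A real build would still have to compile each path with its own instruction-set flags (or put each one in its own DLL, as suggested above), which is where the extra work comes from.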

Reply #104
Quote
Never mind. I just found out my system's processor doesn't support SSE3. I'd have to have one that's WinXP/SSE/SSE2/foobar2000 compatible. So if something like that isn't available, I know the guys over at dBpoweramp have their own aoTuVb.5.7/Lancer builds for use with their program, so that would be the only way, for me, to get a stable, up-to-date Lancer build to use.

Try the SSE2 compile here: (oggenc2.7z - http://www.hydrogenaudio.org/forums/index....mp;#entry668288 ; aoTuV beta 5.7 vorbis encoder with some parts of Lancer project ). I have been able to use this with dBpoweramp on a Windows XP computer whose processor supports only up to SSE2.

Also, on checking dBpoweramp's website, it seems that they've updated the available Ogg Vorbis Lancer encoder installer for dBpoweramp to the b5.7 2009-03-03 version, from the previous b5 2006-10-24 version they had on the site a few months ago ( http://forum.dbpoweramp.com/showthread.php?t=18713 ). I do not know the source or compiler of this one, though.
  • Last Edit: 14 August, 2010, 10:22:29 AM by galacticninja

  • AshenTech
Reply #105
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong with it" message when I tried to use it in foobar2000 :-(


try compat mode and set it to vista, see if that helps.


I don't understand. Could you be more specific?

2 ways to do this are listed here:

http://www.sevenforums.com/tutorials/316-c...ility-mode.html

http://lifehacker.com/5466628/learn-to-use...with-older-apps

hope this helps.

  • demi
Reply #106
I tried your build.

TEST SETUP:
CPU: AMD Athlon II X4 (208*14)
OS: Win7 64bit
Encoder: BS; (LancerMod [20100720](SSE3) based on aoTuV b5d [20090301])

I converted FLAC to Ogg at q5. It seems to generate correct files, but CPU usage is abnormal.
I ran 4 encoders simultaneously, and each process consumed around 5% of CPU time.
So the 4 processes consume just 20% of CPU time; 80% is free.
It looks like a lack of I/O performance, but I don't think that matters, because I tested on a free 7200rpm HDD.
The SSE2 version has this problem too.

In addition, I tested another encoder, 'BS; (LancerMod [20091214](SSE3) based on aoTuV b5d [20090301])'.
It works great and is faster than john's earlier build. Peak speed is up to 150x, fantastic!

I hope this helps john's work.

Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip

I have to say that for standard length song tracks, i.e., approx. 4 mins, there seems to be negligible speed difference between them on a Q6600 @ 3.2GHz with 8GB DDR2, although any difference will no doubt be more apparent on a longer encoding exercise.

Feedback and experience with these would be welcome.

TIA.
  • Last Edit: 19 August, 2010, 12:10:27 AM by demi

Reply #107
I've just tried accelerated oggenc on my new Core i3. Here are some short results:

Oggenc2.85 using aoTuVb5.7 P4 version - 36.79x
oggenc2.85-aoTuVb5.7-Lancer - 58.14x

Windows 7 x32, Core i3 530 @ 2.94GHz, 2x2 GB DDR3-1333

Great speedup, thanks for your work


P.S. Maybe this is a stupid question, but is it possible to use the SSE4.1/4.2 optimizations that are available on the latest Intel CPUs?
  • Last Edit: 16 September, 2010, 03:46:25 PM by Steve Forte Rio
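
To put the two speeds reported above (36.79x and 58.14x) in perspective, here is a small sketch of the arithmetic. The speed figures are multiples of realtime, so the ratio between them gives the speedup, and dividing a track length by the speed gives the encoding time. The function names are made up for illustration, and the 240-second track length used below is just an assumed example.

```c
/* Ratio of two encoding speeds, both given as multiples of realtime. */
static double speedup_ratio(double base_x, double fast_x)
{
    return fast_x / base_x;
}

/* Seconds needed to encode audio_seconds of audio at speed_x times
   realtime. */
static double encode_seconds(double audio_seconds, double speed_x)
{
    return audio_seconds / speed_x;
}
```

With the numbers above, `speedup_ratio(36.79, 58.14)` comes out at roughly 1.58, and an assumed 4-minute track drops from about 6.5 seconds to about 4.1 seconds.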

  • AlexDDR
Reply #108
Is there a version of aoTuV b5.7 (oggenc or vorbis.dll) with SSE3 and MT (multi-threading)? I can only seem to find the normal version.

  • IgorC
Reply #109
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933

lvqcl builds have no issues.

Reply #110
I would love an updated enhanced ogg encoder too. The latest libogg and all that and SSE3 and SSE4. What would be even better would be a multicore & sse4 version. Regards

  • lvqcl
  • Developer
Reply #111
Quote
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933

lvqcl builds have no issues.

Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

(I wonder why algorithms in this file are so sensitive to optimizations made by ICC)
  • Last Edit: 05 March, 2011, 09:55:48 AM by lvqcl

  • lvqcl
  • Developer
Reply #112
Quote
Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

Note2: the problem was in the code
Code: [Select]
    e->mdct_win[i]=sin(i/(n-1.)*M_PI);
    e->mdct_win[i]*=e->mdct_win[i];

ICC at highest optimization level doesn't generate code for the second line... Replacing it with the following code solves this problem:

Code: [Select]
    float t = sin(i/(n-1.)*M_PI);
    e->mdct_win[i] = t*t;
  • Last Edit: 07 March, 2011, 08:52:23 AM by lvqcl

  • robert
  • Developer
Reply #113
Is this a known bug of the Intel compiler? Did it print any warnings about unsafe optimizations being used, so that one has a chance to see the problem coming? I guess I'll have to do some code review of LAME, looking out for similar potential problems.

  • lvqcl
  • Developer
Reply #114
Quote
Did it print some warnings about unsafe optimizations used

Don't see any.

But I also noticed that "Interprocedural Optimization" option was set to Multi-File (/Qipo). Changing this option for envelope.c  to Single-File (/Qip) solves this problem, too.

icl.exe: Version 12.0.2.154 Build 20110112

Added: http://software.intel.com/en-us/forums/sho...ead.php?t=62095 -- "Bug in Intel C++ compiler when using option /Qipo ... Intel C++ v11.0.066"


Added [20110505]: The bug still exists in Intel® C++ Composer XE 2011 Update 3 (icl.exe Version 12.0.3.175 Build 20110309)
  • Last Edit: 07 May, 2011, 01:17:40 PM by lvqcl

  • Isayama
Reply #115
Hi everyone,

Sorry for bumping this nearly one-year-old thread, but I was wondering, related to this thread (aoTuV beta 6.02): where can I download the fastest (and latest) SSE3 (or SSE2, or even SSE4.1) 64-bit version of accelerated oggenc2? lvqcl's one is not online anymore (from this post). Has there been any new progress in that area?

Thanks anyway for all those interesting discussions.

  • lvqcl
  • Developer
Reply #116
AoTuV b6.03 compiled with ICC 12.1: [ Specified attachment is not available ]
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

[ Specified attachment is not available ]
  • Last Edit: 06 February, 2012, 01:00:33 PM by lvqcl

  • skamp
  • Developer
Reply #117
Thank you very much for the updated binaries. With the Win64 SSE3 binary under linux with wine, I get 59x, versus 37x with my natively compiled aotuv binary.
  • Last Edit: 05 February, 2012, 07:20:25 PM by skamp
See my profile for measurements, tools and recommendations.

  • forart.eu
Reply #118
Dunno if this can help in any way, but here's a reply from Eric Gur (Processor Client Application Engineer @ Intel Corp.) to my message about an MT library:

Quote
For threading I recommend using Intel's free TBB library. It's very fast, cross platform, simple to use and has an important feature - malloc replacement.
I used it in a previous project - 1M lines of code, multithreaded application on Linux x64. Just the malloc replacement boosted performance by 3x without changing any code (1 line in the makefile).


BTW, there are a number of malloc replacements available, including this and one from Google...
  • Last Edit: 07 February, 2012, 04:54:56 AM by forart.eu

  • Isayama
Reply #119
Quote
AoTuV b6.03 compiled with ICC 12.1

Many thanks lvqcl for your quick and effective answer! I don't know anything yet about compiling, but I think I'll start giving it a shot... I've seen you gave your optimization options in another thread, so I'll start with that :-)
Sorry to ask, but is there any storage site or ftp server where you upload your compiles, or do you do it on an on-demand basis? :-)

Thanks again anyway, now I can encode an album in ogg in no time, which was kind of a problem so far.
Keep up the good work, and cheers for the help!

  • lvqcl
  • Developer
Reply #120
TWIMC -- aoTuV b5.7 compiled with ICC 12.1. [ Specified attachment is not available ]

  • vinnie97
Reply #121
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?

  • saratoga
Reply #122
Quote
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?


Are you running a version of 7zip before 9.04? If so, update, as that's when LZMA2 support was added.

  • OggY68
Reply #123
Quote
AoTuV b6.03 compiled with ICC 12.1: (Attachment Link)
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

(Attachment Link)


Thanks, I managed to compile your sources using GCC with SSE3 acceleration as shared libraries (libvorbis, libvorbisenc and libvorbisfile) natively on my Linux box. Encoding a CD to Ogg now takes 30 seconds less on my old Athlon X2 4600XP. Unfortunately, oggdec and ogg123 can no longer decode with the new libvorbisfile lib: after ov_raw_seek is called, the programs exit with a "Segmentation Fault". After this function is called for the first time, the data members of vorbis_info, like mode, rate, ..., show only junk numbers like "-1223863434", hence the programs crash ...
  • Last Edit: 05 April, 2012, 03:51:59 PM by OggY68

  • LigH
Reply #124
I am simply amazed about the speed gain!

I compared different encoders and, for Ogg Vorbis specifically, several builds, encoding a whole CD image (697 MB) on an AMD Phenom II X6 1045T, 2700 MHz. Times were taken with ptime; best of 3 consecutive runs.



Germany uses a decimal comma. Basic oggenc2 builds are from RareWares.

That shouldn't leave any doubt that Ogg Vorbis, fine-tuned by Lancer, is now probably the most efficient audio encoder in practice, in terms of a weighted balance between quality efficiency and speed efficiency. The FhG AAC encoder is close, but lacks bitrate tuning flexibility (there is quite a large gap between VBR presets 4 and 5, which target 128 and 192 kbps).
  • Last Edit: 28 April, 2012, 08:08:05 AM by LigH
http://forum.gleitz.info - the German doom9/Gleitz forum