Topic: Ogg Vorbis acceleration project (Read 190051 times)

Ogg Vorbis acceleration project

Reply #100
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's compatible with WinXP, SSE/SSE2 and foobar2000. If nothing like that is available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build.

Ogg Vorbis acceleration project

Reply #101
Is it possible to make a universal encoder that detects which SSE instruction sets your processor supports?

Ogg Vorbis acceleration project

Reply #102
Not only that!

DarkWave Studio automatically selects x86/x64 and the correct SSE* instructions to use...
F.O.R.A.R.T. npo

Ogg Vorbis acceleration project

Reply #103
@forat.eu: I guess you are aware that you cannot use an "if" statement every time you have to decide whether to use one technology or another, right?

By this I mean that for an application to switch properly between using SSE or not, automatically or manually via a setting, without losing all the gains, the programmer has to write fully independent code paths for those operations. ("If" statements really do slow things down.)

Yet this also means a lot of work for the programmer, since he has to do manually what a compiler does for him automatically.
I guess the best approach for a multi-file program would be to have different DLLs, each built from the same code but compiled (by the compiler) for a different processor. The main program could then choose which DLL to load depending on where it is run. Still, this is not a perfect solution.
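
A minimal sketch of the "independent paths" idea in C, assuming GCC or Clang on x86: the CPU is checked once at startup and a function pointer is set, so no per-sample "if" is needed. The names `scale`, `scale_generic`, `scale_sse` and `init_dispatch` are hypothetical and not taken from oggenc2 or Lancer, and plain SSE is used only to keep the example short. The same pattern would apply to SSE2/SSE3 paths or to choosing which DLL to load.

```c
#include <assert.h>
#include <stddef.h>

/* Every implementation of the kernel shares one signature. */
typedef void (*scale_fn)(float *data, size_t n, float gain);

/* Portable path: works on any CPU. */
static void scale_generic(float *data, size_t n, float gain)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        data[i] *= gain;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <xmmintrin.h>

/* SSE path: four floats per iteration, then a scalar tail. */
__attribute__((target("sse")))
static void scale_sse(float *data, size_t n, float gain)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(data + i, _mm_mul_ps(_mm_loadu_ps(data + i), g));
    for (; i < n; i++)
        data[i] *= gain;
}
#endif

/* Callers always go through this pointer; it starts on the safe path. */
scale_fn scale = scale_generic;

/* Call once at startup: the CPU check happens here, not per sample. */
void init_dispatch(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        scale = scale_sse;
#endif
}
```

After `init_dispatch()` has run, `scale(buf, n, gain)` costs one indirect call per buffer, whichever path was picked. On other compilers or architectures the sketch simply stays on the generic path.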


Also, when you throw in x86/x64, are you talking about an installer or an application? If it is an installer, the point is moot, since here we are talking about an executable program.

The only program I've seen doing a sort of "universal binary" for x86 and x64 is Microsoft's (or Mark Russinovich's) Process Explorer. Version 11 of this software, when run on an x64 machine, extracts an x64 binary from itself and then executes that file (the downloaded file is 3.7 MB; the x64 file is 950 KB). So if you add it up, this clearly uses more space than just having two separate files. It is just handier when you're talking about small files.

Ogg Vorbis acceleration project

Reply #104
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's compatible with WinXP, SSE/SSE2 and foobar2000. If nothing like that is available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build.

Try the SSE2 compile here: (oggenc2.7z - http://www.hydrogenaudio.org/forums/index....mp;#entry668288 ; aoTuV beta 5.7 Vorbis encoder with some parts of the Lancer project). I have been able to use this with dBpoweramp on a Windows XP computer whose processor supports only up to SSE2.

Also, on checking dBpoweramp's website, it seems that they've updated the available Ogg Vorbis Lancer encoder installer for dBpoweramp to the b5.7 2009-03-03 version, from the previous b5 2006-10-24 version they had on the site a few months ago ( http://forum.dbpoweramp.com/showthread.php?t=18713 ). I do not know the source or compiler of this one, though.

Ogg Vorbis acceleration project

Reply #105
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong with it" message when I tried to use it in foobar2000 :-(


Try compatibility mode and set it to Vista; see if that helps.


I don't understand. Could you be more specific?

Two ways to do this are listed here:

http://www.sevenforums.com/tutorials/316-c...ility-mode.html

http://lifehacker.com/5466628/learn-to-use...with-older-apps

Hope this helps.

Ogg Vorbis acceleration project

Reply #106
I tried your build.

TEST SETUP:
CPU: AMD Athlon II X4 (208*14)
OS: Win7 64bit
Encoder: BS; (LancerMod [20100720](SSE3) based on aoTuV b5d [20090301])

I converted FLAC to Ogg at q5. It seems to generate correct files, but CPU usage is abnormal.
I ran 4 encoders simultaneously; each process consumed around 5% of CPU time.
So 4 processes consumed just 20% of CPU time; 80% was free.
It looks like a lack of I/O performance, but I don't think that is the cause, because I tested on an otherwise idle 7200 rpm HDD.
The SSE2 version also shows this problem.

In addition, I tested another encoder, 'BS; (LancerMod [20091214](SSE3) based on aoTuV b5d [20090301])'.
It works great and is faster than john's earlier build. Peak speed is up to 150x, fantastic!

I hope this helps john's work.

Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip

I have to say that for standard-length song tracks, i.e., approx. 4 minutes, there seems to be a negligible speed difference between them on a Q6600 @ 3.2 GHz with 8 GB DDR2, although any difference will no doubt be more apparent in a longer encoding run.

Feedback and experience with these would be welcome.

TIA.

Ogg Vorbis acceleration project

Reply #107
I've just tried accelerated oggenc on my new Core i3. Here are the short results:

Oggenc2.85 using aoTuVb5.7 P4 version - 36.79x
oggenc2.85-aoTuVb5.7-Lancer - 58.14x

Windows 7 32-bit, Core i3 530 @ 2.94 GHz, 2x2 GB DDR3-1333

Great speedup, thanks for your work
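
To put those speed factors in perspective, here is a small C sketch that converts a realtime factor into wall-clock encoding time. The 240-second track length is an assumption, taken from the "approx. 4 mins" figure mentioned earlier in the thread, and the helper names are made up for illustration.

```c
#include <assert.h>

/* Wall-clock seconds needed to encode `audio_seconds` of audio at a given
   speed factor (58.14 means 58.14 times faster than realtime). */
double encode_seconds(double audio_seconds, double speed_factor)
{
    assert(speed_factor > 0.0);
    return audio_seconds / speed_factor;
}

/* How many times faster the new build is than the old one. */
double relative_speedup(double new_factor, double old_factor)
{
    assert(old_factor > 0.0);
    return new_factor / old_factor;
}
```

With the figures above, `encode_seconds(240.0, 36.79)` is about 6.5 s and `encode_seconds(240.0, 58.14)` is about 4.1 s, a ratio of roughly 1.58. On a single short track the gain is only a couple of seconds, which would fit the earlier observation that differences between builds are hard to notice on 4-minute songs and show up mainly on longer jobs.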


P.S. Maybe this is a stupid question, but is it possible to use the SSE4.1/4.2 optimizations that are available on the latest Intel CPUs?
🇺🇦 Glory to Ukraine!

Ogg Vorbis acceleration project

Reply #108
Is there a multithreaded (MT) SSE3 build of aoTuV b5.7, either oggenc or vorbis.dll? I can only seem to find the normal version.


Ogg Vorbis acceleration project

Reply #110
I would love an updated enhanced ogg encoder too. The latest libogg and all that and SSE3 and SSE4. What would be even better would be a multicore & sse4 version. Regards

Ogg Vorbis acceleration project

Reply #111
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933

lvqcl builds have no issues.

Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

(I wonder why algorithms in this file are so sensitive to optimizations made by ICC)

 

Ogg Vorbis acceleration project

Reply #112
Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

Note 2: the problem was in this code:
Code: [Select]
    e->mdct_win[i]=sin(i/(n-1.)*M_PI);
    e->mdct_win[i]*=e->mdct_win[i];

ICC at the highest optimization level doesn't generate code for the second line... Replacing it with the following code solves this problem:

Code: [Select]
    float t = sin(i/(n-1.)*M_PI);
    e->mdct_win[i] = t*t;

Ogg Vorbis acceleration project

Reply #113
Is this a known bug in the Intel compiler? Did it print any warnings about unsafe optimizations being used, so that one has a chance to see the problem coming? I guess I'll have to do some code review of LAME, looking out for similar potential problems.

Ogg Vorbis acceleration project

Reply #114
Quote
Did it print some warnings about unsafe optimizations used

Don't see any.

But I also noticed that the "Interprocedural Optimization" option was set to Multi-File (/Qipo). Changing this option for envelope.c to Single-File (/Qip) solves this problem, too.

icl.exe: Version 12.0.2.154 Build 20110112

Added: http://software.intel.com/en-us/forums/sho...ead.php?t=62095 -- "Bug in Intel C++ compiler when using option /Qipo ... Intel C++ v11.0.066"


Added [20110505]: The bug still exists in Intel® C++ Composer XE 2011 Update 3 (icl.exe Version 12.0.3.175 Build 20110309)

Ogg Vorbis acceleration project

Reply #115
Hi everyone,

Sorry for bumping this nearly one-year-old thread, but I was wondering, in relation to this thread (aoTuV beta 6.02): where can I download the fastest (and latest) SSE3 (or SSE2, or even SSE4.1) 64-bit version of accelerated oggenc2? lvqcl's build (from this post) is no longer online. Has there been any new progress in this area?

Thanks anyway for all those interesting discussions.

Ogg Vorbis acceleration project

Reply #116
AoTuV b6.03 compiled with ICC 12.1: [attachment=6898:oggenc2_ICC12.1.7z]
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

[attachment=6899:sources_.7z]

Ogg Vorbis acceleration project

Reply #117
Thank you very much for the updated binaries. With the Win64 SSE3 binary under Linux with Wine, I get 59x, versus 37x with my natively compiled aoTuV binary.

Ogg Vorbis acceleration project

Reply #118
I don't know if this helps in any way, but here's a reply from Eric Gur (Processor Client Application Engineer @ Intel Corp.) to my message about an MT library:

Quote
For threading I recommend using Intel's free TBB library. It's very fast, cross platform, simple to use and has an important feature - malloc replacement.
I used it in a previous project - 1M lines of code, multithreaded application on Linux x64. Just the malloc replacement boosted performance by 3x without changing any code (1 line in the makefile).


BTW, there are a number of malloc replacements available, including this and one from Google...
F.O.R.A.R.T. npo

Ogg Vorbis acceleration project

Reply #119
AoTuV b6.03 compiled with ICC 12.1

Many thanks, lvqcl, for your quick and effective answer! I don't know anything about compiling yet, but I think I'll give it a shot... I've seen that you gave your optimization options in another thread, so I'll start with that :-)
Sorry to ask, but is there a storage site or FTP server where you upload your compiles, or do you do it on demand? :-)

Thanks again anyway; now I can encode an album to Ogg in no time, which was kind of a problem until now.
Keep up the good work, and cheers for the help!

Ogg Vorbis acceleration project

Reply #120
TWIMC -- aoTuV b5.7 compiled with ICC 12.1. [attachment=6938:oggenc2_..._aotuv57.7z]

Ogg Vorbis acceleration project

Reply #121
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?

Ogg Vorbis acceleration project

Reply #122
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?


Are you running a version of 7zip older than 9.04? If so, update, as that's when LZMA2 support was added.

Ogg Vorbis acceleration project

Reply #123
AoTuV b6.03 compiled with ICC 12.1: [attachment=6898:oggenc2_ICC12.1.7z]
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

[attachment=6899:sources_.7z]


Thanks, I managed to compile your sources using GCC with SSE3 acceleration as shared libraries (libvorbis, libvorbisenc and libvorbisfile) natively on my Linux box. Encoding a CD to Ogg now takes 30 seconds less on my old Athlon X2 4600XP. Unfortunately, oggdec and ogg123 can no longer decode with the new libvorbisfile library. After ov_raw_seek is called, the programs exit with a "Segmentation fault". Once this function has been called for the first time, the members of vorbis_info such as mode, rate, ... contain only junk values like "-1223863434", hence the programs crash...

Ogg Vorbis acceleration project

Reply #124
I am simply amazed about the speed gain!

I compared different encoders and, for Ogg Vorbis specifically, several builds, encoding a whole CD image (697 MB) on an AMD Phenom II X6 1045T at 2700 MHz. Times were taken with ptime; best of 3 consecutive runs.



Germany uses a decimal comma. Basic oggenc2 builds are from RareWares.

That shouldn't leave any doubt that Ogg Vorbis, fine-tuned by Lancer, is now probably the most efficient audio encoder in practice, weighing quality efficiency against speed efficiency. The FhG AAC encoder is close, but lacks bitrate tuning flexibility (there is quite a large gap between VBR presets 4 and 5, which target 128 and 192 kbps respectively).