HydrogenAudio

Lossy Audio Compression => Ogg Vorbis => Ogg Vorbis - General => Topic started by: Dukers on 2009-08-26 18:20:42

Title: Ogg Vorbis acceleration project
Post by: Dukers on 2009-08-26 18:20:42
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

The last release is based on aoTuV Beta5. Does anyone know the status of the project, and whether there are plans for a next release?
Title: Ogg Vorbis acceleration project
Post by: The_Sven on 2009-11-15 15:38:04
Quote
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

The last release is based on aoTuV Beta5. Does anyone know the status of the project, and whether there are plans for a next release?


As far as I know, it's dead.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-11-15 16:16:17
I wonder whether it is possible to identify the most time-consuming routines and manually port them from Lancer to the current aoTuV.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-11-17 17:41:09
I'm trying to mix the "aotuv5.7+bs1" codec with Lancer. So far only a 55-60% speed increase.

Several routines have changed since beta5, so I cannot accelerate them with the Lancer code.
Title: Ogg Vorbis acceleration project
Post by: IgorC on 2009-11-17 18:31:21
Quote
I'm trying to mix the "aotuv5.7+bs1" codec with Lancer. So far only a 55-60% speed increase.

Several routines have changed since beta5, so I cannot accelerate them with the Lancer code.

55-60% is a very good speed boost. Can you post your code and binaries?

Some results for dual core:
Lancer 20061110 sse3+MT: 10 seconds
aotuv5.7b-P4: 7 seconds

That's a 40-50% speed gain.

Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-11-17 19:33:57
Currently I'm trying to make OpenMP work. Some changes made between b5 and b5.7 interfere with the OpenMP support code, so my current version simply doesn't support multi-threading.

I tested several encoders with foobar2000 as a frontend (dual-core processor, but only 1 encoder process at a time).

Lancer SSE3:  60.79x realtime
Lancer SSE3+MT: 84.00x realtime

Oggenc2 (aotuv5.7) from RareWares: 29.16x realtime

b5.7+bs1 patch (MSVC): 26.11x realtime
b5.7+bs1 patch+lancer (MSVC): 41.07x realtime

I will upload the code and exe files one of these days.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-11-18 18:29:17
I decided not to add OpenMP support; it's too complicated for me. Besides, two parallel single-threaded encoders are still better than one multi-threaded encoder.

Patches and Win32 MSVC compiles: http://www.hydrogenaudio.org/forums/index....showtopic=76272 (http://www.hydrogenaudio.org/forums/index.php?showtopic=76272)
Title: Ogg Vorbis acceleration project
Post by: IgorC on 2009-11-19 03:35:55
I see a +45% speed gain here.
Thank you, lvqcl.

I haven't found any issues with this version so far.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-01 13:43:24
OK, folks, I finally got round to compiling under ICL 11.1.048. The results are from my development system: an E4300 CPU clocked to 3 GHz with 4 GB OCZ Reaper in a Gigabyte EP45-DS3, running XP Pro SP3.

Code: [Select]
F:\testogg>oggenc2 -q 7 10.wav
Opening with wav module: WAV file reader
Encoding "10.wav" to
         "10.ogg"
at quality 7.00
        [100.0%] [ 0m00s remaining] /

Done encoding file "10.ogg"

        File length:  4m 15.0s
        Elapsed time: 0m 07.0s
        Rate:         36.5048
        Average bitrate: 228.0 kb/s


F:\testogg>oggenc2 -q 7 10.wav
Opening with wav module: WAV file reader
Encoding "10.wav" to
         "10.ogg"
at quality 7.00
        [ 99.6%] [ 0m00s remaining] \

Done encoding file "10.ogg"

        File length:  4m 15.0s
        Elapsed time: 0m 04.0s
        Rate:         63.8833
        Average bitrate: 228.0 kb/s


F:\testogg>

The first run is using the regular P4 ICL compile of aoTuV5.7 that's on Rarewares, the second is using lvqcl's Lancer-style patch. I'll leave you to do the maths, but the speed gain is somewhat alarming!! 

So, I suppose you will want to try this out? You can d/l from here: http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip)

If this appears stable, and I'd like feedback regarding this, then I'll make it generally available on Rarewares.
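
For anyone who would rather not do the maths by hand, here is a minimal sketch in C. It assumes the "Rate" lines above give encoding speed as a multiple of realtime; the function names are made up for illustration and are not part of oggenc2.

```c
/* Percentage speed gain of a new encoder over a baseline, given each
   one's speed as a multiple of realtime (the "Rate" line in the logs). */
double speed_gain_percent(double baseline_rate, double new_rate)
{
    return (new_rate / baseline_rate - 1.0) * 100.0;
}

/* The same figure derived from elapsed times for the same input file:
   a shorter time means a proportionally higher speed. */
double speed_gain_from_times(double baseline_secs, double new_secs)
{
    return (baseline_secs / new_secs - 1.0) * 100.0;
}
```

With the two runs above, speed_gain_percent(36.5048, 63.8833) comes out at roughly 75%.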
Title: Ogg Vorbis acceleration project
Post by: Alexxander on 2009-12-01 14:49:35
Quick test on 64-bit Windows 7 with an E6600, using foobar2000 v1.0 beta2 with pipelining; times are as reported by foobar2000:

Converted 4 CD Box Set encoded at FLAC -8:
oggenc2.85-aoTuVb5.7-P4 with q5.0 -> 3:39 (219 secs)
oggenc2.85-aoTuVb5.7-Lancer with q5.0 -> 2:14 (134 secs)

Converted 5 CD Box Set encoded at FLAC -8:
oggenc2.85-aoTuVb5.7-P4 with q5.0 -> 5:11 (311 secs)
oggenc2.85-aoTuVb5.7-Lancer with q5.0 -> 3:23 (203 secs)

Thanks to all for this fantastic speedup!
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-01 17:46:04
Core2Duo E4300, --quality 5, 44.1/16/2ch:

unaccelerated version from RW: 29.9x realtime (=>100%);
accelerated (my MSVS9 compile): 44.0x realtime (~147%);
accelerated (ICL 11.1.048 by john33): 48.8x realtime (~163%);

//Lancer 20061110 (based on b5): 62.9x realtime (~210%).


Yes, the ICL compile is faster than the MSVS one. The old Lancer is faster still, but further acceleration is beyond my knowledge. Anybody who can is welcome to take it further.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-01 18:19:56
Two more suggestions:

1) For a more consistent date format, I think it is better to define the CPDATE macro as, e.g., "20091201".

2) oggenc2.exe says that it is "based on aoTuV exp-bs1" (and that isn't true). It was my mistake in the very first version of the patch... Current version reports "based on aoTuV b5d [20090301]". Please fix it.
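
Suggestion 1 could look roughly like the sketch below. The macro name CPDATE comes from the post; everything else (VERSION_SKETCH, version_string, the exact layout of the string) is an illustrative assumption and not the actual oggenc2 source. It also assumes the build date would otherwise come from the compiler's `__DATE__` macro, whose "Mmm dd yyyy" form changes with every build.

```c
/* Fixed, sortable build date (YYYYMMDD) instead of the compiler's __DATE__. */
#ifndef CPDATE
#define CPDATE "20091201"
#endif

/* Hypothetical version string. Adjacent string literals are concatenated
   by the compiler, so no runtime formatting is needed. */
#define VERSION_SKETCH \
    "LancerMod(SSE3) [" CPDATE "] (based on aoTuV b5d [20090301])"

/* Returns the version string assembled above. */
const char *version_string(void)
{
    return VERSION_SKETCH;
}
```

Defining the date by hand keeps the reported string stable across rebuilds, at the cost of having to bump it for each release.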
Title: Ogg Vorbis acceleration project
Post by: skamp on 2009-12-01 20:57:17
Is the patch supposed to make it faster on Windows only? Last time I compiled it under Linux, it didn't make Vorbis encoding any faster.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-01 21:01:50
Don't forget to define __SSE__, __SSE2__, __SSE3__ macros.

...At least The_Sven compiled  aoTuV-beta5 (http://www.hydrogenaudio.org/forums/index.php?showtopic=76184) for Linux with success.
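
A small sketch for checking which of these macros the compiler actually sees. GCC defines them itself when `-msse`, `-msse2` and `-msse3` are passed, while MSVC does not define them, so there they have to be supplied by hand. The function and flag names are made up for illustration.

```c
/* Bit flags reporting which SSE feature macros were visible at compile time. */
enum { HAS_SSE = 1, HAS_SSE2 = 2, HAS_SSE3 = 4 };

int sse_macros_defined(void)
{
    int flags = 0;
#ifdef __SSE__
    flags |= HAS_SSE;
#endif
#ifdef __SSE2__
    flags |= HAS_SSE2;
#endif
#ifdef __SSE3__
    flags |= HAS_SSE3;
#endif
    return flags;
}
```

If the result lacks a flag you expected, the accelerated code paths guarded by that macro are presumably being compiled out.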
Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-02 15:26:38
Quote
Two more suggestions:

1) For a more consistent date format, I think it is better to define the CPDATE macro as, e.g., "20091201".

2) oggenc2.exe says that it is "based on aoTuV exp-bs1" (and that isn't true). It was my mistake in the very first version of the patch... Current version reports "based on aoTuV b5d [20090301]". Please fix it.

Fixed with a new upload available on the same link.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-02 18:28:39
I found a strange problem... The difference between various compiles is usually insignificant, but not always.

When (input samplerate = 22050 or 24000 Hz) AND (encoding quality > 5.000), there is a noticeable difference between the ICL and MSVS compiles.

For example: input file is 22.05kHz/2ch; quality is 8. => "Nominal" bitrate is 138.4 kbps.

P3/P4/Lancer from RW (ICL): 116.1 kbps
venc from Aoyumi (MINGW32?): 116.1 kbps
generic version from RW (MSVS 9): 136.3 kbps
my compiles (MSVS 9): 136.3 kbps.

(For 24kHz/-q8  the difference is 123.0 kbps for ICL vs 144.5 kbps for MSVS).
Title: Ogg Vorbis acceleration project
Post by: skamp on 2009-12-02 23:59:26
Quote
Don't forget to define __SSE__, __SSE2__, __SSE3__ macros.

Code: [Select]
$ gcc -dumpversion
4.4.2
$ CFLAGS="-march=i686 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -pipe -D__SSE__ -D__SSE2__ -D__SSE3__" ./configure --prefix=/usr --disable-oggtest && make
Code: [Select]
make  all-recursive
make[1]: Entering directory `/tmp/aotuv-b5.7_20090301'
Making all in m4
make[2]: Entering directory `/tmp/aotuv-b5.7_20090301/m4'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/tmp/aotuv-b5.7_20090301/m4'
Making all in include
make[2]: Entering directory `/tmp/aotuv-b5.7_20090301/include'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301'
Making all in vorbis
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301/include/vorbis'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301/include/vorbis'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301/include'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[3]: Nothing to be done for `all-am'.
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301/include'
make[2]: Leaving directory `/tmp/aotuv-b5.7_20090301/include'
Making all in vq
make[2]: Entering directory `/tmp/aotuv-b5.7_20090301/vq'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/tmp/aotuv-b5.7_20090301/vq'
Making all in lib
make[2]: Entering directory `/tmp/aotuv-b5.7_20090301/lib'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301'
Making all in modes
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/modes'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/modes'
Making all in books
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/books'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301'
Making all in coupled
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/books/coupled'
make[5]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[5]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[4]: Nothing to be done for `all'.
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/books/coupled'
Making all in uncoupled
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/books/uncoupled'
make[5]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[5]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[4]: Nothing to be done for `all'.
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/books/uncoupled'
Making all in floor
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/books/floor'
make[5]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[5]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[4]: Nothing to be done for `all'.
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/books/floor'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301/lib/books'
make[5]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[5]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make[4]: Nothing to be done for `all-am'.
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/books'
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib/books'
make[3]: Entering directory `/tmp/aotuv-b5.7_20090301/lib'
make[4]: Entering directory `/tmp/aotuv-b5.7_20090301'
make[4]: Leaving directory `/tmp/aotuv-b5.7_20090301'
/bin/sh ../libtool --tag=CC  --mode=compile gcc -DHAVE_CONFIG_H -I. -I.. -I../include   -O20 -ffast-math -mno-ieee-fp -D_REENTRANT -fsigned-char -Wdeclaration-after-statement -march=i686 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -pipe -D__SSE__ -D__SSE2__ -D__SSE3__ -DUSE_MEMORY_H -MT mdct.lo -MD -MP -MF .deps/mdct.Tpo -c -o mdct.lo mdct.c
mkdir .libs
 gcc -DHAVE_CONFIG_H -I. -I.. -I../include -O20 -ffast-math -mno-ieee-fp -D_REENTRANT -fsigned-char -Wdeclaration-after-statement -march=i686 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -pipe -D__SSE__ -D__SSE2__ -D__SSE3__ -DUSE_MEMORY_H -MT mdct.lo -MD -MP -MF .deps/mdct.Tpo -c mdct.c  -fPIC -DPIC -o .libs/mdct.o
In file included from mdct.c:49:
xmmlib.h:54: error: expected '=', ',', ';', 'asm' or '__attribute__' before '_MM_ALIGN16'
xmmlib.h:63: warning: data definition has no type or storage class
xmmlib.h:71: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:72: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:73: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:74: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:75: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:76: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:77: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:78: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:79: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:80: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:81: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:82: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:83: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:84: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:85: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:86: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:87: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:88: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:89: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:90: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:91: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:93: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:94: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:95: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:96: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:97: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:98: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:99: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h:100: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'extern'
xmmlib.h: In function '_mm_todB_ps':
xmmlib.h:112: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'float'
xmmlib.h:115: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'float'
xmmlib.h:119: error: expected ';' before 'U'
xmmlib.h:120: error: 'U' undeclared (first use in this function)
xmmlib.h:120: error: (Each undeclared identifier is reported only once
xmmlib.h:120: error: for each function it appears in.)
xmmlib.h:120: error: 'PABSMASK' undeclared (first use in this function)
xmmlib.h: In function '_mm_untnorm_ps':
xmmlib.h:142: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
xmmlib.h:146: error: 'PCS_RRRR' undeclared (first use in this function)
mdct.c: In function 'mdct_init':
mdct.c:117: error: 'PCS_RNRN' undeclared (first use in this function)
mdct.c:143: error: 'PCS_RRNN' undeclared (first use in this function)
mdct.c:144: error: 'PCS_RNNR' undeclared (first use in this function)
mdct.c:165: error: 'PCS_NNRR' undeclared (first use in this function)
mdct.c:215: error: 'PCS_NRNR' undeclared (first use in this function)
mdct.c:267: error: 'PCS_RRRR' undeclared (first use in this function)
mdct.c: In function 'mdct_butterfly_8':
mdct.c:426: error: 'PCS_NRRN' undeclared (first use in this function)
mdct.c:427: error: 'PCS_NNRR' undeclared (first use in this function)
mdct.c: In function 'mdct_butterfly_16':
mdct.c:461: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:462: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:463: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:464: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c: In function 'mdct_butterfly_32':
mdct.c:535: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:536: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:537: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:538: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:539: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:540: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:541: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c:542: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'const'
mdct.c: In function 'mdct_butterfly_generic':
mdct.c:949: error: 'PCS_RNRN' undeclared (first use in this function)
mdct.c: In function 'mdct_bitreverse':
mdct.c:1194: error: 'PCS_RNRN' undeclared (first use in this function)
mdct.c:1202: error: 'PFV_0P5' undeclared (first use in this function)
mdct.c: In function 'mdct_backward':
mdct.c:1302: error: 'PFV_0' undeclared (first use in this function)
mdct.c:1440: error: 'PCS_RRRR' undeclared (first use in this function)
make[3]: *** [mdct.lo] Error 1
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/tmp/aotuv-b5.7_20090301/lib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/tmp/aotuv-b5.7_20090301'
make: *** [all] Error 2
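
The first error in the log points at `_MM_ALIGN16`. As far as I know, that macro is supplied by the MSVC and Intel headers but not by GCC's, so a header written for those compilers needs a fallback. A hedged sketch follows: the fallback definition is an assumption about what xmmlib.h expects, and `pfv_half` is a made-up table, not one of the real constants.

```c
#include <stddef.h>

/* Portable 16-byte alignment qualifier. Left alone if a compiler header
   already provides _MM_ALIGN16. */
#ifndef _MM_ALIGN16
#  if defined(_MSC_VER)
#    define _MM_ALIGN16 __declspec(align(16))
#  else
#    define _MM_ALIGN16 __attribute__((aligned(16)))
#  endif
#endif

/* Made-up constant table, aligned so that aligned SSE loads could use it. */
static _MM_ALIGN16 const float pfv_half[4] = { 0.5f, 0.5f, 0.5f, 0.5f };

/* Returns 1 if the table really sits on a 16-byte boundary. */
int table_is_aligned(void)
{
    return ((size_t)pfv_half % 16) == 0;
}
```

Most of the later "undeclared" errors look like fallout from the header failing to parse, so they may disappear once the first one is resolved.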
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-03 16:00:49
I must admit that Linux is not my favorite OS, and I don't have it installed on my HDD.

Maybe The_Sven can help you more than me... At least you can take his versions of xmmlib.c and xmmlib.h.
Here they are, extracted from vorbis-lancer-gcc-HEAD.tar.gz: [attachment=5532:xmmlib.zip]
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-03 17:56:32
Quote
Fixed with a new upload available on the same link.


Great. Yet it still writes "BS; LancerMod(SSE3) [Dec  2 2009] (based on aoTuV b5d [20090301])" to *.ogg files (although it's not a big problem).
Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-03 18:31:18
Quote
Fixed with a new upload available on the same link.


Great. Yet it still writes "BS; LancerMod(SSE3) [Dec  2 2009] (based on aoTuV b5d [20090301])" to *.ogg files (although it's not a big problem).

Ah, I changed it in oggenc2/main.c, but not in vorbis/lib/info.c!  I'll make a similar amendment.
Title: Ogg Vorbis acceleration project
Post by: skamp on 2009-12-04 00:46:35
After replacing xmmlib.*, I get this:

Code: [Select]
gcc -D_V_SELFTEST -O20 -ffast-math -mno-ieee-fp -D_REENTRANT -fsigned-char -Wdeclaration-after-statement -march=i686 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -pipe -D__SSE__ -D__SSE2__ -D__SSE3__ -DUSE_MEMORY_H -o test_sharedbook test_sharedbook-sharedbook.o  -lm  
test_sharedbook-sharedbook.o: In function `vorbis_book_clear':
sharedbook.c:(.text+0x346): undefined reference to `xmm_free'
sharedbook.c:(.text+0x355): undefined reference to `xmm_free'
sharedbook.c:(.text+0x364): undefined reference to `xmm_free'
sharedbook.c:(.text+0x373): undefined reference to `xmm_free'
sharedbook.c:(.text+0x382): undefined reference to `xmm_free'
test_sharedbook-sharedbook.o:sharedbook.c:(.text+0x3c1): more undefined references to `xmm_free' follow
test_sharedbook-sharedbook.o: In function `_make_words':
sharedbook.c:(.text+0x4e4): undefined reference to `xmm_malloc'
sharedbook.c:(.text+0x62b): undefined reference to `xmm_free'
sharedbook.c:(.text+0x72c): undefined reference to `xmm_free'
test_sharedbook-sharedbook.o: In function `_book_unquantize':
sharedbook.c:(.text+0x93a): undefined reference to `xmm_calloc'
test_sharedbook-sharedbook.o: In function `vorbis_book_init_decode':
sharedbook.c:(.text+0x125a): undefined reference to `xmm_malloc'
sharedbook.c:(.text+0x12b6): undefined reference to `xmm_free'
sharedbook.c:(.text+0x12d7): undefined reference to `xmm_malloc'
sharedbook.c:(.text+0x131a): undefined reference to `xmm_malloc'
sharedbook.c:(.text+0x13ac): undefined reference to `xmm_calloc'
test_sharedbook-sharedbook.o: In function `vorbis_staticbook_destroy':
sharedbook.c:(.text+0x4b9): undefined reference to `xmm_free'
collect2: ld returned 1 exit status
make[3]: *** [test_sharedbook] Error 1
make[3]: Leaving directory `/tmp/aotuv-b5.7_20090301-lancer/lib'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/tmp/aotuv-b5.7_20090301-lancer/lib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/tmp/aotuv-b5.7_20090301-lancer'
make: *** [all] Error 2

Note that The_Sven's version compiles successfully (with CFLAGS='-O3 -march=i686 -msse -mfpmath=sse -pipe -fno-strict-aliasing').
Title: Ogg Vorbis acceleration project
Post by: skamp on 2009-12-04 01:49:26
Even after fixing the compile errors, oggenc fails with "undefined symbol: PCS_NRNR". Recompiling vorbis-tools also fails with several "undefined reference to…" errors.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-04 12:50:18
Quote
Fixed with a new upload available on the same link.


Great. Yet it still writes "BS; LancerMod(SSE3) [Dec  2 2009] (based on aoTuV b5d [20090301])" to *.ogg files (although it's not a big problem).

Duly amended with a new version on the same link.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-04 15:57:05
Quote
Even after fixing compiling errors, oggenc fails with "undefined symbol: PCS_NRNR". Recompiling vorbis-tools also fails with several errors regarding "undefined reference to…".


Well, PCS_NRNR is defined in xmmlib.c and used in mdct.c and smallft.c... I think you should add "xmmlib.c" and "xmmlib.h" to the "libvorbis_la_SOURCES" variable in the aoTuV makefile (as is done in this compile (http://www.hydrogenaudio.org/forums/index.php?showtopic=76184)).

Quote
Duly amended with a new version on the same link.


Thanks!
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2009-12-10 10:48:39
Quote
Here is an x64 compile of oggenc2.85 with FLAC support. Currently this is based on the standard 1.2.3 libs, but if some kind people would like to test this and confirm whether it's OK, or not, I'll go ahead and build an aoTuV version.

http://www.rarewares.org/files/ogg/oggenc2.85-1.2.3-x64.zip (http://www.rarewares.org/files/ogg/oggenc2.85-1.2.3-x64.zip)

Any chance of an x64 compile of this Lancer mod too?

Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-11 12:05:32
Quote
Here is an x64 compile of oggenc2.85 with FLAC support. Currently this is based on the standard 1.2.3 libs, but if some kind people would like to test this and confirm whether it's OK, or not, I'll go ahead and build an aoTuV version.

http://www.rarewares.org/files/ogg/oggenc2.85-1.2.3-x64.zip (http://www.rarewares.org/files/ogg/oggenc2.85-1.2.3-x64.zip)

Any chance of an x64 compile of this Lancer mod too?

I'm afraid not, as it would require heavy modification of the Lancer code, which is something I have neither the time nor the skills to do!
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-11 17:01:41
I don't have an x64 OS yet, so I cannot even test my x64 compile, but here it is (MSVS9 compile, SSE3): [obsolete; removed]

Maybe it even works.  It lacks FLAC and SRC support, though.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2009-12-12 11:00:36
Quote
I don't have an x64 OS yet, so I cannot even test my x64 compile, but here it is (MSVS9 compile, SSE3): [attachment=5536:oggenc2_x64_test.7z]

Maybe it even works.  It lacks FLAC and SRC support, though.

Cool, it encodes correctly (on WinXP64 SP2, E2200 @ 2.35 GHz, 1 GB).

Now testing speeds...
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2009-12-12 12:29:39
A quick graphical summary of the tests (I'll explain in more detail tomorrow):

(http://www.forart.it/_audiotest/encoders.png)
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2009-12-13 20:25:06
OK, here's the explanation of the encoders:



Hope that inspires...
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2009-12-14 17:03:30
john33, I tested your Lancer compile with the following input:
8...48kHz/16 bit/mono or stereo, -q-2, -q-1, ... -q10;
44.1kHz/16bit/stereo, -q-2, -q-1.9, -q-1.8, ... -q9.9, -q10.

Everything is fine, but its ENCODE_VENDOR_STRING ends with "\n" again.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2009-12-14 17:18:41
Quote
john33, I tested your Lancer compile with the following input:
8...48kHz/16 bit/mono or stereo, -q-2, -q-1, ... -q10;
44.1kHz/16bit/stereo, -q-2, -q-1.9, -q-1.8, ... -q9.9, -q10.

Everything is fine, but its ENCODE_VENDOR_STRING ends with "\n" again.

Amended, recompiled and uploaded on same link. Hopefully all is now well!!
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2009-12-15 07:55:03
More details about my above encoding tests:


Feel free to ask if you have any other questions.
Title: Ogg Vorbis acceleration project
Post by: Compact Dick on 2009-12-19 08:55:52
Quote
I'll leave you to do the maths, but the speed gain is somewhat alarming!!

Alarming in the most pleasing manner possible

Quote
If this appears stable, and I'd like feedback regarding this, then I'll make it generally available on Rarewares.

I've been using your builds for a few weeks now, everything's performing as expected. Thanks, john33 [and lvqcl] for doing the hard work. I'm loving it
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-01-04 07:02:34
Can't wait to see somebody do an assembly-enhanced version for x64 CPUs; my Athlon II 435 @ Phenom II B35, 3.5 GHz, loves 64-bit.
Title: Ogg Vorbis acceleration project
Post by: Dukers on 2010-02-03 01:35:22
I completely forgot about this topic....

Fantastic job lvqcl and john33! 
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-06-12 08:10:34
lvqcl, if you still need an x64 OS, get hold of me; I may be able to help you out there. I would really like to see a nice, well-tested and tweaked x64 version of Vorbis (not that I don't like the x64 one we already have, but the more the better).
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-06-12 12:40:55
As you can see from the picture in post #29, Lancer doesn't benefit from 64-bitness.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-06-15 09:03:11
No Lancer x64 in the above tests!

BTW, according to my tests, an x64 build can achieve performance similar to Lancer's.
So it would be really interesting to encode with a Vorbis build that combines x64 and the Lancer optimizations...

It would also be very interesting to compare the compilers' performance, IMHO.
Title: Ogg Vorbis acceleration project
Post by: ilikedirtthe2nd on 2010-06-15 15:04:50
Nice improvements, but I had a slightly hard time figuring this thread out... Could you put the stuff that works up on rarewares.org? Maybe with a little explanation of the different versions.

thanks
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-06-15 15:30:07
Quote
No Lancer x64 in the above tests!

But... you said

Quote
oggenc264 is the last build posted here by lvqcl;

I think you mean the attachment in post #27. And that is a Lancer x64 build.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-06-16 11:22:07
Quote
I think you mean the attachment in post #27. And that is a Lancer x64 build.

I hadn't noticed that it was a Lancer build!
BTW, if so, it's not a good build, because it MUST achieve at least the same performance as the x86 build (Lancer has ASM optimizations...).

A question: was it built from the same patched oggenc2LM sources?
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-06-16 15:06:50
john33's oggenc2LM was compiled with the Intel (optimizing) compiler, and mine with the MSVS compiler. That's the only difference.

Also, the x64 MSVS compile (aka oggenc264) is slightly faster than the x86 MSVS compiles (oggenc2SSE2/oggenc2SSE3).
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-03 22:45:11
@john33: LAME x64 is officially available at RareWares; what about Vorbis?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-04 08:34:25
Quote
@john33: LAME x64 is officially available at RareWares; what about Vorbis?

Essentially because the compiles that I've produced work fine on XP x64, but instantly crash on Windows 7 x64 and I haven't yet managed to figure out why!! If anyone has any ideas, please fire away.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-07-04 09:00:34
Once I encountered a problem (with debug compile) with the following code in encode.c:

Code: [Select]
        /* Next 3 lines added to add padding bytes into comment header for tagging space.
         */
        header_comments.packet = realloc(header_comments.packet, header_comments.bytes + opt->padding);
        memset(header_comments.packet + header_comments.bytes, 0, opt->padding);
        header_comments.bytes += opt->padding;


And oggenc2 stopped crashing in debug mode after removing this code.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-04 15:02:40
Quote
Once I encountered a problem (with debug compile) with the following code in encode.c:

Code: [Select]
        /* Next 3 lines added to add padding bytes into comment header for tagging space.
         */
        header_comments.packet = realloc(header_comments.packet, header_comments.bytes + opt->padding);
        memset(header_comments.packet + header_comments.bytes, 0, opt->padding);
        header_comments.bytes += opt->padding;


And oggenc2 stopped crashing in debug mode after removing this code.

You're absolutely right!  Although, why this should be OK in XP x64 and not Windows 7 x64 beats me.

Anyway, the following download is a standard 1.3.1 lib compile for x64 and does work on XP x64 and Win 7 x64:

http://www.rarewares.org/files/oggenc2.87-1.3.1-x64.zip (http://www.rarewares.org/files/oggenc2.87-1.3.1-x64.zip)

Regarding the above, I have obviously removed the 'Comment Padding' option. Unless there are any tagging programs out there that will add/amend vorbis comments via update-in-place, and I don't believe there are, then this was always of little value anyway.

Edit: If there are no reports of any issues with this, then I'll add it to the normal Rarewares ogg page.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-07-04 15:28:32
IIRC, the address stored in header_comments.packet is also stored in some other variable. So if realloc doesn't change this address, oggenc works, but if it returns a different address, oggenc crashes.
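
The hazard described here can be sketched outside oggenc. The block below is a minimal illustration, assuming (as stated above) that a second owner keeps the original address; `pad_copy` is a hypothetical helper, not a function from oggenc2 or libvorbis.

```c
#include <stdlib.h>
#include <string.h>

/* Hazard: realloc() may move the block. Any other variable still holding
   the old address then dangles, and using or freeing it later is undefined
   behaviour. Whether the block moves depends on the allocator, which might
   explain a crash on one Windows version and not on another. */

/* Safer alternative: leave the original buffer to its owner and build the
   padded packet in a fresh allocation. Returns NULL on failure; the caller
   releases the result with free(). */
unsigned char *pad_copy(const unsigned char *packet, size_t bytes, size_t padding)
{
    unsigned char *out = malloc(bytes + padding);
    if (out == NULL)
        return NULL;
    memcpy(out, packet, bytes);        /* original packet contents */
    memset(out + bytes, 0, padding);   /* zero bytes reserved for tagging */
    return out;
}
```

The copy costs one extra allocation per file, which is negligible next to the encoding itself, and the original pointer stays valid for whoever else holds it.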
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-04 15:48:55
Quote
IIRC, the address stored in header_comments.packet is also stored in some other variable. So if realloc doesn't change the address, oggenc works, but if it returns a different address, oggenc crashes.

Makes sense. Thanks very much for the assist.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-08 13:27:38
Quote
If there are no reports of any issues with this, then I'll add it to the normal Rarewares ogg page.


Just tested under both x64 Windows (7 and XP), it works.

Another question: what about merging the Lancer optimizations?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-08 15:57:50
Quote
If there are no reports of any issues with this, then I'll add it to the normal Rarewares ogg page.


Just tested under both x64 Windows (7 and XP), it works.

Another question: what about merging the Lancer optimizations?

Thanks for letting me know.

Unfortunately, the Lancer mods require inline assembler support and that's a no-no in x64.
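
For context: MSVC does not accept `__asm` blocks when targeting x64, and compiler intrinsics are the usual replacement there. A minimal sketch of the intrinsic style follows; `scale_floats` and `SKETCH_USE_SSE` are made-up names, and this is not taken from the Lancer sources.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_USE_SSE 1
#else
#define SKETCH_USE_SSE 0
#endif

/* Multiplies n floats by a constant gain, four at a time where SSE is
   available. Unaligned loads and stores are used so any pointer works. */
void scale_floats(float *data, size_t n, float gain)
{
    size_t i = 0;
#if SKETCH_USE_SSE
    const __m128 g = _mm_set1_ps(gain);
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(data + i);
        _mm_storeu_ps(data + i, _mm_mul_ps(v, g));
    }
#endif
    for (; i < n; i++)   /* scalar tail, or the whole array without SSE */
        data[i] *= gain;
}
```

The same source builds for 32-bit and 64-bit targets, which is the main attraction over inline assembler; whether it matches hand-written assembler in speed would have to be measured.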
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-09 09:45:41
oggenc264 is the last build posted here by lvqcl;

I think you mean the attachment in the post #27. And this is -- Lancer x64 build.

...I'm confused... 
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-09 10:25:22
oggenc264 is the last build posted here by lvqcl;

I think you mean the attachment in the post #27. And this is -- Lancer x64 build.

...I'm confused... 

Well, using the Lancer mods as used for the 32 bit build, the builds fail for 64 bit with both the MSVC and Intel compilers for the reason stated. Quite what the Lancer x64 build really is, I'd be interested to know.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-07-09 17:00:47
Well, using the Lancer mods as used for the 32 bit build, the builds fail for 64 bit with both the MSVC and Intel compilers for the reason stated. Quite what the Lancer x64 build really is, I'd be interested to know.


64-bit patch was uploaded here: http://www.hydrogenaudio.org/forums/index....st&p=668288 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=76272&view=findpost&p=668288)
(it's patch64.7z (http://www.hydrogenaudio.org/forums/index.php?act=attach&type=post&id=5537) file)
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-09 17:23:52
Well, using the Lancer mods as used for the 32 bit build, the builds fail for 64 bit with both the MSVC and Intel compilers for the reason stated. Quite what the Lancer x64 build really is, I'd be interested to know.


64-bit patch was uploaded here: http://www.hydrogenaudio.org/forums/index....st&p=668288 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=76272&view=findpost&p=668288)
(it's patch64.7z (http://www.hydrogenaudio.org/forums/index.php?act=attach&type=post&id=5537) file)

Thank you kindly.  I'll get to this over the weekend, hopefully.

I guess I really should pay closer attention!!
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-10 09:11:48
All of these encoders fail on files that are 6+ hours long (not sure how far below 6 hours you need to go for them not to fail).

Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-15 17:21:23
Lancer x64 build now available here: http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancerx64.zip)

If there are no problems, I'll probably post these on Rarewares.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-18 00:53:52
Task Manager shows this exe as 32-bit.
Title: Ogg Vorbis acceleration project
Post by: LordWarlock on 2010-07-18 10:17:26
Yeah, it's 32bit. It runs on my 32bit Windows, so it's not just a bad detection.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-18 11:13:03
Yep, sorry guys!  Had the wrong file in it. Same link for the correct file now.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-18 20:51:58
Thanks john33. Sad to say, but at least for me this encoder crashes with dBpoweramp/MediaCoder/TextAloud 3 when attempting to test. Windows doesn't even say the encoder crashed, it just doesn't work... not sure why.

Went back to the other x64 Lancer build (by the other guy) and it works.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-18 21:26:17
Thanks john33. Sad to say, but at least for me this encoder crashes with dBpoweramp/MediaCoder/TextAloud 3 when attempting to test. Windows doesn't even say the encoder crashed, it just doesn't work... not sure why.

Went back to the other x64 Lancer build (by the other guy) and it works.

Hmmm, OK, odd, it works here under Windows 7 x64 Ultimate and XP Pro x64 but I'll take a look at it. Thanks for the feedback.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-18 22:58:08
Hmm, strange, 7 x64 here as well; with or without compat settings it gives me the same problem. Frustrating. I will mess with it some more soon. Quite odd that I'm having problems but you're not.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-19 00:42:40
Code:
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\02 Live For Him.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\02 Live For Him.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\03 You Should Know.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\03 You Should Know.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\01 Intro.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\01 Intro.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\07 Time to Play.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\07 Time to Play.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\08 Open Your Eyes.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\08 Open Your Eyes.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\04 Above.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\04 Above.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\05 Original Superman.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\05 Original Superman.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\06 Guess Who's Won.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\06 Guess Who's Won.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\09 Something Real.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\09 Something Real.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\11 Reaching Out.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\11 Reaching Out.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\13 All Day Everyday.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\13 All Day Everyday.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\10 Unity.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\10 Unity.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\12 Galactic Groove.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\12 Galactic Groove.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
 
Error converting to ogg vorbis (aoTuV SSE), 'C:\Users\Bain2k9\Desktop\Pillar - Above\14 Father.flac' to 'C:\Users\Bain2k9\Desktop\Pillar - Above\14 Father.ogg'
  Error writing audio data to StdIn Pipe  [clEncoder::EncodeBlock]
Same kind of issue with the other apps I try to use with it... any ideas?

Mind you, the "ogg vorbis (aoTuV SSE)" entry is the version I replace for testing.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-19 09:29:43
OK, while I can find nothing that would be causing a problem, there is a fresh compile on the same link. Nothing has changed except that I tidied up the compilation a little. This was compiled on a 32 bit XP Pro system with x64 target using the Intel 11.1.065 compiler. While the compiler is also installed on the 7 x64 system, it is not installed on the XP x64 system. There are no dependencies on the Intel dlls so I'm not sure where any problem may lie.

I have tested this on Win 7 x64 Ultimate and XP Pro x64 with wav input, direct flac input (this version does support direct flac input) and piped input from the flac decoder. It worked fine, gave the same output and ran at exactly the same speed on both systems.

Anyone else tried this with, or without, issues?
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-19 10:21:03
Anyone else tried this with, or without, issues?


Sorry, I can only test tomorrow...

BTW, I suggest you ask for testers at http://www.start64.com/forum/ (http://www.start64.com/forum/)
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-19 14:58:36
All of these encoders fail on files that are 6+ hours long (not sure how far below 6 hours you need to go for them not to fail).

At the risk of asking the obvious, I take it that you are using the "--ignorelength" option?
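
For context on the 6-hour figure: one plausible explanation (an assumption, not something confirmed in this thread) is the 32-bit size field in a classic RIFF/WAV header, which cannot describe more than about 4 GiB of audio data. Input longer than that arrives with a length that is wrong or has wrapped around, and ignoring the header length is the usual workaround. The sketch below works out where that limit falls; `wav_max_seconds` is a hypothetical helper, and the few header bytes are ignored.

```c
#include <stdint.h>

/* Longest duration, in seconds, whose PCM data still fits in the
 * 32-bit data-size field of a classic RIFF/WAV header. */
double wav_max_seconds(uint32_t sample_rate, uint32_t channels, uint32_t bits_per_sample)
{
    /* bytes of PCM produced per second of audio */
    double bytes_per_second = (double)sample_rate * channels * (bits_per_sample / 8);
    if (bytes_per_second <= 0.0)
        return 0.0;
    return (double)UINT32_MAX / bytes_per_second;
}
```

For 44.1 kHz/16-bit stereo, wav_max_seconds(44100, 2, 16) comes to roughly 24,348 seconds, i.e. about 6 hours 46 minutes, which is in the same range as the reported failures. For a 96 kHz/24-bit stereo source the limit is already reached after about 2 hours 4 minutes.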
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-19 15:32:12
Will check that when I get home from work. I was able to test the current build of your x64 encoder, and strangely it still fails. I will test more at home after work; I'm just not sure why this is failing. No, I'm not trying direct FLAC to Vorbis at the moment.

Will there be a build that can do direct FLAC to Vorbis in the future, though?


this version fails for me
http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancerx64.zip)

this version works fine
http://www.hydrogenaudio.org/forums/index....ost&id=5536 (http://www.hydrogenaudio.org/forums/index.php?act=attach&type=post&id=5536)
found in post #27
http://www.hydrogenaudio.org/forums/index....st&p=672818 (http://www.hydrogenaudio.org/forums/index.php?showtopic=74345&view=findpost&p=672818)


And yes, I have tested with direct .wav to Vorbis: same issue.
Not sure why. I won't be able to test at work (no encoding apps set up at the office).
Title: Ogg Vorbis acceleration project
Post by: [JAZ] on 2010-07-19 19:24:12
@john33: The file currently at http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancerx64.zip) , works here with a Windows 7 x64, and is correctly seen as a 64bit application (i.e. no "*32" shown in the task manager).

Code:
E:\>oggenc2.exe --help
OggEnc v2.85x64 (LancerMod [20100719](SSE3) based on aoTuV b5d [20090301])
(c) 2000-2005 Michael Smith <msmith@xiph.org>
& portions by John Edwards <john.edwards33@ntlworld.com>

Code:
E:\>oggenc2.exe PSY\DIGDREAMz.mod.wav
Opening with wav module: WAV file reader
Encoding "PSY\DIGDREAMz.mod.wav" to
         "PSY\DIGDREAMz.mod.ogg"
at quality 3,00
        [ 97,1%] [ 0m00s remaining] /

Done encoding file "PSY\DIGDREAMz.mod.ogg"

        File length:  1m 37,0s
        Elapsed time: 0m 04,0s
        Rate:         24,2700
        Average bitrate: 113,5 kb/s



Mmm..... "SSE3".  All x64 processors support SSE3, right?
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-07-19 19:42:23
Quote
Mmm..... "SSE3". All x64 processors support SSE3, right?


http://en.wikipedia.org/wiki/SSE3 (http://en.wikipedia.org/wiki/SSE3) : "In April 2005, AMD introduced a subset of SSE3 in revision E (Venice and San Diego) of their Athlon 64 CPUs."

=> Early Athlon 64s (prior to revision E) don't support SSE3.
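
Since x64 does not guarantee SSE3, a build can check for it at startup. The following is only a sketch of such a check, not code from oggenc or Lancer; `simd_level`, `simd_level_from_bits` and `detect_simd_level` are hypothetical names. It relies on CPUID leaf 1, where EDX bit 25 reports SSE, EDX bit 26 reports SSE2 and ECX bit 0 reports SSE3.

```c
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

enum simd_level { SIMD_NONE = 0, SIMD_SSE, SIMD_SSE2, SIMD_SSE3 };

/* Classify the CPU from the raw ECX/EDX values returned by CPUID leaf 1. */
enum simd_level simd_level_from_bits(unsigned ecx, unsigned edx)
{
    if (!(edx & (1u << 25))) return SIMD_NONE;   /* EDX bit 25: SSE  */
    if (!(edx & (1u << 26))) return SIMD_SSE;    /* EDX bit 26: SSE2 */
    if (!(ecx & (1u << 0)))  return SIMD_SSE2;   /* ECX bit 0:  SSE3 */
    return SIMD_SSE3;
}

/* Query the running CPU; returns SIMD_NONE on non-x86 targets. */
enum simd_level detect_simd_level(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];                      /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 1);
    return simd_level_from_bits((unsigned)regs[2], (unsigned)regs[3]);
#elif defined(__i386__) || defined(__x86_64__)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return SIMD_NONE;
    return simd_level_from_bits(ecx, edx);
#else
    return SIMD_NONE;
#endif
}
```

An encoder built for SSE3 could call detect_simd_level() first and exit with a clear message on an early Athlon 64 instead of crashing on an unsupported instruction.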
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-19 19:55:56
Hmmm, maybe that's the issue. Would it be worth compiling up to SSE2 and seeing if that resolves it?
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-20 03:56:51
I wouldn't think that would be the problem; my CPU supports SSE4a (Phenom II 1055T).
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-20 09:04:59
Just tested on my office PC (Intel E2200/XP 64) and it works perfectly with Foobar 1.1.
Title: Ogg Vorbis acceleration project
Post by: robert on 2010-07-20 10:29:15
Hmmm, maybe that's the issue. Would it be worth compiling up to SSE2 and seeing if that resolves it?



I wouldn't think that would be the problem; my CPU supports SSE4a (Phenom II 1055T).


Just a thought, maybe limiting to SSE3 will resolve the issue, as AMD's SSE4 instructions don't seem to have much in common with Intel's SSE4 instructions. (IIRC  POPCNT being the only one in common)
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-20 12:32:28
Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64.zip)
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE2.zip)
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE.zip)

I have to say that for standard-length song tracks, i.e. approx. 4 mins, there seems to be negligible speed difference between them on a Q6600 @ 3.2GHz with 8GB DDR2, although any difference will no doubt be more apparent on a longer encoding exercise.

Feedback and experience with these would be welcome.

TIA.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-20 14:42:05
OK, I just successfully tested ALL builds on both XP Pro and 7 Ultimate @ 64 bits.

CPU: Sempron 140 unlocked @ AMD Athlon II X2 440
XP64: foobar 1.0.3 / q5
7 64: foobar 1.1 / q0

Source information:
-----------------------
File: Krless - Ecce Krless.flac
Duration : 1:03:45.307 (168696024 samples)
Sample Rate : 44100 Hz
Channels : 2
Bits Per Sample : 16
Bitrate : 957 kbps
Codec : FLAC
Encoding : lossless
Tool : reference libFLAC 1.1.4 20070213
Embedded Cuesheet : no
Audio MD5 : BE924B523EF3E6281357C403BBA6650B

Faster version *seems* (human impression) SSE2 < SSE3 < SSE.

EDIT: OK, I just tested all 3 encoders on a 96 kHz / 24-bit source and they work on XP64/foobar 1.0.3/q5

Source information:
-----------------------
Duration : 31:46.000 (182976000 samples)
Sample Rate : 96000 Hz
Channels : 2
Bits Per Sample : 24
Bitrate : 4608 kbps
Codec : PCM
Encoding : lossless

Speed *seems* (human impression again) to be exactly the opposite (btw, the source is not compressed): SSE < SSE2=SSE3.

...I'll investigate this more thoroughly next week.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-20 14:55:27
Excellent, thanks.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-20 19:58:24
Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64.zip)
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE2.zip)
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE.zip)

I have to say that for standard-length song tracks, i.e. approx. 4 mins, there seems to be negligible speed difference between them on a Q6600 @ 3.2GHz with 8GB DDR2, although any difference will no doubt be more apparent on a longer encoding exercise.

Feedback and experience with these would be welcome.

TIA.


Will test when I get home.

Will these give the same quality as the aoTuV 5.7 builds, or would there need to be another build based on aoTuV 5.7?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-20 20:33:27
These are all based on 5.7.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-22 01:19:51
This is just very odd: every time I try to run them, either MediaCoder crashes or dBpoweramp errors out, yet the older compile still works fine.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-22 16:24:46
OK, I ran a quick test at q0 with sources of different bitrates (24/96 and 16/44); here are the results (on AMD):

SSE < SSE3 = SSE2

SSE3 is slightly faster than SSE2 (which is the slowest); both are slower than SSE, by 2-3 seconds per encode, and SSE is clearly the winner.

I'll test the same files on a different processor to find out whether the Intel compiler cheats.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-22 18:19:56
This is just very odd: every time I try to run them, either MediaCoder crashes or dBpoweramp errors out, yet the older compile still works fine.

It is, indeed. 

Given that the several versions do appear to run on AMD cpus, it's difficult to come to any conclusion. I'll produce a VC compile in the next day, or so, and it will be interesting to see whether that runs. If you also have problems with that, it would seem that the problem lies somewhere in your set up.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-22 18:24:57
OK, I ran a quick test at q0 with sources of different bitrates (24/96 and 16/44); here are the results (on AMD):

SSE < SSE3 = SSE2

SSE3 is slightly faster than SSE2 (which is the slowest); both are slower than SSE, by 2-3 seconds per encode, and SSE is clearly the winner.

I'll test the same files on a different processor to find out whether the Intel compiler cheats.

Thanks.  I know Intel is supposed to be removing that bias but I don't actually know whether it is supposed already to have happened, or whether it will be at some future time. 

As mentioned above, on my own systems (a variety of Intel CPUs - q6600s, e6600, e6420 and e4300, all overclocked to varying degrees), there is really little to choose between the compiles for speed.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2010-07-22 20:55:06
BTW there is a small utility called iccpatch that removes any check for intel/non-intel processor in executable files.

Found 3 versions of it:

http://freearc.org/download/testing/iccpatch.rar (http://freearc.org/download/testing/iccpatch.rar)
http://lunatics.kwsn.net/downloads/iccpatch-windows0.7z (http://lunatics.kwsn.net/downloads/iccpatch-windows0.7z)
and www . mediafire . com/?b7dprscd53sx4t3

But I don't expect significant difference...
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-23 05:18:56
BTW there is a small utility called iccpatch that removes any check for intel/non-intel processor in executable files.

Found 3 versions of it:

http://freearc.org/download/testing/iccpatch.rar (http://freearc.org/download/testing/iccpatch.rar)
http://lunatics.kwsn.net/downloads/iccpatch-windows0.7z (http://lunatics.kwsn.net/downloads/iccpatch-windows0.7z)
and www . mediafire . com/?b7dprscd53sx4t3

But I don't expect significant difference...


any instructions on how to use this?
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-23 09:22:51
any instructions on how to use this?


I found this "How I (an Idiot) applied the AMD Patch..." guide:

Quote
  • I opened explore and made new folder in C: called Patch.
  • I downloaded the ICC program and used the 7z unzipper from chickens site and unzipped it into C:\Patch
  • I downloaded the SaH_5.15_KWSN_SSE3-Intel_Ben-Joe_2.0_B.7z and unzipped it into C:\Patch
  • I clicked run and typed cmd.
  • Somehow I got it to C:
  • I typed CD patch and got to C:\Patch
  • I then typed iccpatch.exe SaH_5.15_KWSN_SSE3-Intel_Ben-Joe_2.0_B.exe (the space between the iccpatch.exe and SaH seems to be very important) then clicked enter.
  • it said something about 10 instances and I closed it.
  • I closed boinc and 2 instances of Cli.exe in task manager.
  • I renamed a copy of Boinc to Boinc1.
  • I opened Boinc\projects and deleted the app with the seti Icon.
  • I copied the SaH_5.15_KWSN_SSE3_Ben-Joe_2.0_B with the seti Icon, the SaH_5.15_KWSN_SSE3_Ben-Joe_2.0_B PDB file and one called SaH_5.15_KWSN_SSE3_Ben-Joe_2.0_B.exe~ into the original boinc folder and said YES to all.
  • closed all windows and restarted boinc with no problems.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-23 16:16:09
This version of the patcher comes with instructions, etc.

http://encode.ru/attachment.php?attachment...mp;d=1274432954 (http://encode.ru/attachment.php?attachmentid=1304&d=1274432954)

I have not tried this so cannot warrant whether it is helpful, or otherwise.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-24 11:00:36
OK, I did a quick test on an Intel E2200/XP64 (work PC) and the speed results seem more like what one would expect:

SSE3 < SSE2 < SSE

SSE3 is the winner, SSE2 is slightly slower, and SSE is the loser (8 sec. difference!).

I honestly don't know whether it's the compiler "bug", but I suggest a compiler shootout (GCC vs. ICC vs. MSC).
Title: Ogg Vorbis acceleration project
Post by: Steve Forte Rio on 2010-07-24 11:18:15
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-24 11:28:28
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip) 
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-24 12:21:16
MSVC SSE3 compile here: http://www.rarewares.org/files/ogg/oggenc2...ancerx64-VC.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-VC.zip)

This has no FLAC support, but otherwise runs OK. On one of my systems (XP x64), speedwise, it's mid way between the regular oggenc2x64 and the full ICL SSE3 Lancer x64 build.

Let me know if this runs, or not!
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-07-25 00:45:22
MSVC SSE3 compile here: http://www.rarewares.org/files/ogg/oggenc2...ancerx64-VC.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-VC.zip)


Quick test: it works (at least with a 16/44 source) and is slower, by about 8-9 secs., than ICL on my AMD.

Would love to test GCC (MinGW (http://www.mingw.org/)) build too !

...and "someone" uses Orc (http://code.entropywave.com/projects/orc/) too...
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-07-25 08:18:53
MSVC SSE3 compile here: http://www.rarewares.org/files/ogg/oggenc2...ancerx64-VC.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-VC.zip)

This has no FLAC support, but otherwise runs OK. On one of my systems (XP x64), speedwise, it's mid way between the regular oggenc2x64 and the full ICL SSE3 Lancer x64 build.

Let me know if this runs, or not!


Crashes with dBpoweramp even with .wav files. Again, no clue why; everything else has been working, and the older x64 builds work great (I just have to set them to Vista compat mode to keep them from crashing).
Title: Ogg Vorbis acceleration project
Post by: Lear on 2010-07-25 12:54:35
I made some speed tests on a full CD using the recently posted builds, as well as one without the Lancer optimizations (using aoTuV 5.7). The tests were done on a Core2 Quad Q9550 running 64-bit Vista:

Code:
Standard, x86             31,6343x
Lancer, x86               58,1845x
Lancer, x64, MSVC SSE3    55,2260x
Lancer, x64, SSE          55,2260x
Lancer, x64, SSE2         62,6603x
Lancer, x64, SSE3         63,8889x

The files from the x86 builds had a slightly higher bitrate compared to the x64 builds (108.8 vs 107 kb/s).
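
To put those rates in relative terms, here is a small sketch of the arithmetic; `percent_gain` is a hypothetical helper, and the inputs are simply the "x realtime" figures from the table above (with the decimal comma written as a point).

```c
/* Relative speed gain in percent of `accelerated` over `baseline`,
 * where both are encoding rates in "x realtime" (higher is faster). */
double percent_gain(double baseline, double accelerated)
{
    if (baseline <= 0.0)
        return 0.0;
    return (accelerated / baseline - 1.0) * 100.0;
}
```

With these numbers, percent_gain(31.6343, 58.1845) is about 84%, so the Lancer x86 build is roughly 1.84 times as fast as the standard x86 build. The x64 SSE3 build comes out about 102% faster than standard (roughly 2x), and about 10% faster than the Lancer x86 build.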
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-25 15:27:27
MSVC SSE3 compile here: http://www.rarewares.org/files/ogg/oggenc2...ancerx64-VC.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-VC.zip)

This has no FLAC support, but otherwise runs OK. On one of my systems (XP x64), speedwise, it's mid way between the regular oggenc2x64 and the full ICL SSE3 Lancer x64 build.

Let me know if this runs, or not!


Crashes with dBpoweramp even with .wav files. Again, no clue why; everything else has been working, and the older x64 builds work great (I just have to set them to Vista compat mode to keep them from crashing).

I'm sorry to see that you still have problems with these builds. I'd like to try to help more but I really am completely out of ideas. If anyone has any other suggestions or ideas, please let's hear them!
Title: Ogg Vorbis acceleration project
Post by: john33 on 2010-07-25 15:28:49
Thanks to those who have provided feedback and test results. Lear's results pretty much align with my own.
Title: Ogg Vorbis acceleration project
Post by: Lear on 2010-07-25 16:34:29
I'm sorry to see that you still have problems with these builds. I'd like to try to help more but I really am completely out of ideas. If anyone has any other suggestions or ideas, please let's hear them!

Finding the problem without a debugger can be difficult. Seems like you can get something similar to core dumps on Windows, using WinDbg (see here (http://wiki.zimbra.com/wiki/Creating_a_Core_Dump_from_a_Running_Process_using_WinDbg) for a short description). Haven't used WinDbg myself though.
Title: Ogg Vorbis acceleration project
Post by: RazorBoy143 on 2010-08-03 19:55:53
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip) 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong it" message when I tried to use it in foobar2000 :-(
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-08-05 02:10:55
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip) 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong it" message when I tried to use it in foobar2000 :-(


try compat mode and set it to vista, see if that helps.
Title: Ogg Vorbis acceleration project
Post by: RazorBoy143 on 2010-08-06 22:06:59
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip) 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong it" message when I tried to use it in foobar2000 :-(


try compat mode and set it to vista, see if that helps.


I don't understand. Could you be more specific?
Title: Ogg Vorbis acceleration project
Post by: RazorBoy143 on 2010-08-06 22:25:58
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's WinXP/SSE/SSE2/foobar2000 compatible. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build to use.
Title: Ogg Vorbis acceleration project
Post by: Fool_on_the_hill on 2010-08-07 05:50:17
Is it possible to make a universal encoder that could recognize which SSE instructions your processor supports?
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2010-08-12 12:28:20
Not only that!

DarkWave Studio (http://www.experimentalscene.com/software/darkwave-studio/) automatically selects x86/x64 and the correct SSE* instructions to use...
Title: Ogg Vorbis acceleration project
Post by: [JAZ] on 2010-08-12 19:27:49
@forart.eu: I guess you are aware that you cannot use an "if" statement every time you have to decide whether to use one technology or another, right?

By this I mean that for an application to switch properly and automatically (or even manually via a setting) between using SSE or not, without losing all the gains, the programmer has to write fully independent code paths for those operations. ("If" statements inside hot loops really do slow things down.)

Yet this also means a lot of work for the programmer, who has to do manually what a compiler would otherwise do automatically.
I guess the best way to do this for a multi-file program would be to have different DLLs, each with the same code but compiled (by the compiler) for a different processor. The main program could then choose which DLL to load depending on where it is run. Anyway, this is not a perfect solution.


Also, when you throw in x86/x64, are you talking about an installer or an application? If it is an installer, the point is moot, since here we were talking about an executable program.

The only program that I've seen doing a sort of "universal binary" for x86 and x64 is Microsoft's (or Mark Russinovich's) Process Explorer. Version 11 of this software, when run on an x64 machine, extracts an x64 binary from itself and then executes that file (the downloaded file is 3.7 MB; the x64 file is 950 KB). So if you add it up, that clearly uses more space than just having two separate files. It is just handier when you're dealing with small files.
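
A common middle ground between one "if" per operation and separate DLLs is to pick a function pointer once at startup and call through it per buffer. The sketch below only illustrates that idea and is not taken from Lancer or oggenc; `scale_fn`, `scale_scalar`, `scale_sse`, `select_scale` and `DEMO_HAVE_SSE_BUILD` are hypothetical names, and the `cpu_has_sse` flag is assumed to come from some CPUID check done elsewhere.

```c
#include <stddef.h>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define DEMO_HAVE_SSE_BUILD 1
#else
#define DEMO_HAVE_SSE_BUILD 0
#endif

typedef void (*scale_fn)(float *buf, size_t n, float gain);

/* Plain C path, usable on any CPU. */
static void scale_scalar(float *buf, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        buf[i] *= gain;
}

#if DEMO_HAVE_SSE_BUILD
/* SSE path: four floats per iteration, scalar tail for the remainder. */
static void scale_sse(float *buf, size_t n, float gain)
{
    const __m128 g = _mm_set1_ps(gain);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(buf + i, _mm_mul_ps(_mm_loadu_ps(buf + i), g));
    for (; i < n; i++)
        buf[i] *= gain;
}
#endif

/* Decide once which implementation to use; the hot loops inside each
 * variant contain no feature checks at all. */
scale_fn select_scale(int cpu_has_sse)
{
#if DEMO_HAVE_SSE_BUILD
    if (cpu_has_sse)
        return scale_sse;
#else
    (void)cpu_has_sse;   /* no SSE variant compiled in */
#endif
    return scale_scalar;
}
```

The decision then costs one indirect call per buffer rather than a branch per sample. In a real project each variant would normally live in its own source file compiled with the matching instruction-set flags, which is essentially the "different DLLs" idea at a smaller scale.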
Title: Ogg Vorbis acceleration project
Post by: galacticninja on 2010-08-14 15:21:33
Never mind. I just found out my system's processor doesn't support SSE3. I'd need a build that's WinXP/SSE/SSE2/foobar2000 compatible. If something like that isn't available, I know the guys over at dBpoweramp have their own aoTuV b5.7/Lancer builds for use with their program, so that would be the only way for me to get a stable, up-to-date Lancer build to use.

Try the SSE2 compile here: (oggenc2.7z - http://www.hydrogenaudio.org/forums/index....mp;#entry668288 (http://www.hydrogenaudio.org/forums/index.php?showtopic=76272&st=0&p=668288&#entry668288) ; aoTuV beta 5.7 vorbis encoder with some parts of Lancer project ). I have been able to use this with dBpoweramp in a Windows XP computer with only up to SSE2 processor support.

Also, upon checking dBpoweramp's website, it seems that they've updated the available Ogg Vorbis Lancer encoder installer for dBpoweramp to the b5.7 2009-03-03 version, from the previous b5 2006-10-24 version they had on the site a few months ago ( http://forum.dbpoweramp.com/showthread.php?t=18713 (http://forum.dbpoweramp.com/showthread.php?t=18713) ). I do not know the source or compiler of this one, though.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2010-08-17 17:55:29
Where can I download the fastest (and latest) SSE3 32-bit version of accelerated oggenc2?

It's still this one:
http://www.rarewares.org/files/ogg/oggenc2...b5.7-Lancer.zip (http://www.rarewares.org/files/ogg/oggenc2.85-aoTuVb5.7-Lancer.zip) 


I hope Steve had better luck getting this to work, because all I got was that infamous "Windows has to shut down this encoder because something's wrong it" message when I tried to use it in foobar2000 :-(


try compat mode and set it to vista, see if that helps.


I don't understand. Could you be more specific?

Two ways to do this are described here:

http://www.sevenforums.com/tutorials/316-c...ility-mode.html (http://www.sevenforums.com/tutorials/316-compatibility-mode.html)

http://lifehacker.com/5466628/learn-to-use...with-older-apps (http://lifehacker.com/5466628/learn-to-use-windows-7s-compatibility-mode-with-older-apps)

hope this helps.
Title: Ogg Vorbis acceleration project
Post by: demi on 2010-08-19 04:46:14
I tried ur build.

TEST SETUP:
CPU: AMD Athlon II X4 (208*14)
OS: Win7 64bit
Encoder: BS; (LancerMod [20100720](SSE3) based on aoTuV b5d [20090301])

I converted FLAC to Ogg at q5. It seems to generate correct files, but CPU usage is abnormal.
I ran 4 encoders simultaneously, and each process consumed around 5% of CPU time.
So the 4 processes consume just 20% of CPU time; 80% is free.
It looks like a lack of I/O performance, but I don't think that is the cause, because I tested on a free 7200 rpm HDD.
The SSE2 version also has this problem.

In addition, I tested another encoder, 'BS; (LancerMod [20091214](SSE3) based on aoTuV b5d [20090301])'.
It works great and is faster than john's earlier build. Peak speed is up to 150x, fantastic!

I hope this helps john's work.

Thanks for the feedback and suggestions. In the hope of resolving this, here are three compiles, this time with oggenc2.87:

SSE3 - http://www.rarewares.org/files/ogg/oggenc2...7-Lancerx64.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64.zip)
SSE2 - http://www.rarewares.org/files/ogg/oggenc2...cerx64-SSE2.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE2.zip)
SSE - http://www.rarewares.org/files/ogg/oggenc2...ncerx64-SSE.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb5.7-Lancerx64-SSE.zip)

I have to say that for standard length song tracks, ie., approx. 4 mins, there seems to be negligible speed difference between them on a q6600 @ 3.2GHz and 8GB DDR2 although any difference will no doubt be more apparent on a longer encoding exercise.

Feedback and experience with these would be welcome.

TIA.
Title: Ogg Vorbis acceleration project
Post by: Steve Forte Rio on 2010-09-16 20:45:40
I've just tried accelerated oggenc on my new Core i3. Here are the short results:

Oggenc2.85 using aoTuVb5.7 P4 version - 36.79x
oggenc2.85-aoTuVb5.7-Lancer - 58.14x

Windows 7 x32, Core i3 530 @ 2.94GHz, 2x2 GB DDR3-1333

Great speedup, thanks for your work


P.S. Maybe this is a stupid question but is it possible to use SSE4.1/4.2 optimizations that are available with latest Intel CPU's?
Title: Ogg Vorbis acceleration project
Post by: AlexDDR on 2010-10-07 23:29:11
Is there a version of aoTuV b5.7 (oggenc or vorbis.dll) with SSE3 MT (multi-thread)? I only seem to be able to find the normal version.
Title: Ogg Vorbis acceleration project
Post by: IgorC on 2011-01-02 20:39:12
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933 (http://www.hydrogenaudio.org/forums/index.php?showtopic=85933)

lvqcl builds have no issues.
Title: Ogg Vorbis acceleration project
Post by: punkrockdude on 2011-01-22 21:14:31
I would love an updated enhanced ogg encoder too. The latest libogg and all that and SSE3 and SSE4. What would be even better would be a multicore & sse4 version. Regards
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2011-03-05 14:52:26
Two samples have audible distortion with Lancer encoder (john33 builds)

http://www.hydrogenaudio.org/forums/index....showtopic=85933 (http://www.hydrogenaudio.org/forums/index.php?showtopic=85933)

lvqcl builds have no issues.

Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

(I wonder why algorithms in this file are so sensitive to optimizations made by ICC)
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2011-03-07 13:47:46
Note: this issue can be fixed if optimization parameter for envelope.c is set to O1 (tested on Intel C++ Compiler XE 12.0).

Note2: the problem was in the code
Code: [Select]
    e->mdct_win[i]=sin(i/(n-1.)*M_PI);
    e->mdct_win[i]*=e->mdct_win[i];

ICC at highest optimization level doesn't generate code for the second line... Replacing it with the following code solves this problem:

Code: [Select]
    float t = sin(i/(n-1.)*M_PI);
    e->mdct_win[i] = t*t;
Title: Ogg Vorbis acceleration project
Post by: robert on 2011-03-08 09:19:17
Is this a known bug of the Intel compiler? Did it print some warnings about unsafe optimizations used, so that one has the chance to see the problem coming? I guess, I'll have to do some code review of LAME, looking out for similar potential problems.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2011-03-08 10:05:12
Quote
Did it print some warnings about unsafe optimizations used

Don't see any.

But I also noticed that "Interprocedural Optimization" option was set to Multi-File (/Qipo). Changing this option for envelope.c  to Single-File (/Qip) solves this problem, too.

icl.exe: Version 12.0.2.154 Build 20110112

Added: http://software.intel.com/en-us/forums/sho...ead.php?t=62095 (http://software.intel.com/en-us/forums/showthread.php?t=62095) -- "Bug in Intel C++ compiler when using option /Qipo ... Intel C++ v11.0.066"


Added [20110505]: The bug still exists in Intel® C++ Composer XE 2011 Update 3 (icl.exe Version 12.0.3.175 Build 20110309)
Title: Ogg Vorbis acceleration project
Post by: Isayama on 2012-02-05 19:05:52
Hi everyone,

Sorry for re-upping this nearly one year old post, but I was wondering, related to this thread (aTuVbeta6.02 (http://www.hydrogenaudio.org/forums/index.php?showtopic=88342&st=25&p=754771&#entry754771)): where can I download the fastest (and latest) SSE3 (or SSE2, or even SSE4.1  ) 64-bit version of accelerated oggenc2? lvqcl's one is not online anymore (from this post (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=32932&view=findpost&p=753991)). Has any new advancement been realized in that field?

Thanks anyway for all those interesting discussions.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-02-05 20:53:44
AoTuV b6.03 compiled with ICC 12.1: [attachment=6898:oggenc2_ICC12.1.7z]
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

[attachment=6899:sources_.7z]
Title: Ogg Vorbis acceleration project
Post by: skamp on 2012-02-06 00:10:59
Thank you very much for the updated binaries. With the Win64 SSE3 binary under linux with wine, I get 59x, versus 37x with my natively compiled aotuv binary.
Title: Ogg Vorbis acceleration project
Post by: forart.eu on 2012-02-07 09:09:12
Dunno if it can help in any way, but here's a reply from Eric Gur (Processor Client Application Engineer @ Intel Corp.) to my message about an MT library:

Quote
For threading I recommend using Intel's free TBB library (http://threadingbuildingblocks.org/). It's very fast, cross platform, simple to use and has an important feature - malloc replacement.
I used it in a previous project - 1M lines of code, multithreaded application on Linux x64. Just the malloc replacement boosted performance by 3x without changing any code (1 line in the makefile).


BTW, there are a number of malloc replacements available, including this (http://sourceforge.net/projects/nedmalloc/) and one from Google (http://code.google.com/p/gperftools/?redir=1)...
Title: Ogg Vorbis acceleration project
Post by: Isayama on 2012-02-08 15:08:32
AoTuV b6.03 compiled with ICC 12.1

Many thanks lvqcl for your quick and effective answer! I don't know anything yet about compiling, but I think I'll start giving it a shot... I've seen you gave your optimization options in another thread, so I'll start with that :-)
Sorry to ask, but is there any storage site or ftp server where you upload your compiles, or do you do it on an on-demand basis? :-)

Thanks again anyway, now I can encode an album in ogg in no time, which was kind of a problem so far.
Good continuation and cheers for the help!
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-02-20 16:37:58
TWIMC -- aoTuV b5.7 compiled with ICC 12.1. [attachment=6938:oggenc2_..._aotuv57.7z]
Title: Ogg Vorbis acceleration project
Post by: vinnie97 on 2012-03-28 22:06:49
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?
Title: Ogg Vorbis acceleration project
Post by: saratoga on 2012-03-28 23:23:31
Not sure what I'm doing wrong, but 7zip (32-bit) is unable to open the above 7z archive in Win7 (64-bit).  I receive an "Unsupported Compression Method" error when attempting to decompress.  Any clues?


Are you running a version of 7zip before 9.04? If so, update, as that's when LZMA2 support was added.
Title: Ogg Vorbis acceleration project
Post by: OggY68 on 2012-04-05 20:51:21
AoTuV b6.03 compiled with ICC 12.1: [attachment=6898:oggenc2_ICC12.1.7z]
32 bit: SSE, SSE2, SSE3;
64 bit: SSE2, SSE3.

[attachment=6899:sources_.7z]


Thanks, managed to compile your sources using GCC with SSE3 acceleration as shared libraries (libvorbis, libvorbisenc and libvorbisfile) natively on my Linux box. Encoding a CD to Ogg now takes 30 seconds less on my old Athlon X2 4600XP. Unfortunately, oggdec and ogg123 can no longer decode with the new libvorbisfile lib. After the method ov_raw_seek is called, the programs exit with a "Segmentation Fault". After this method is called for the first time, the data members of vorbis_info like mode, rate, ... show only junk numbers like "-1223863434", hence the programs crash ...
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-04-28 12:55:30
I am simply amazed about the speed gain!

I compared different encoders, and for Ogg Vorbis, specifically, several specific builds, encoding a whole CD image (697 MB) on an AMD Phenom II X6 1045T, 2700 MHz. Times taken with ptime; best of 3 consecutive runs.

(http://www.ligh.de/pics/OggSpeed.png)

Germany uses a decimal comma. Basic oggenc2 builds are from RareWares.

That shouldn't leave any doubt that Ogg Vorbis, fine-tuned by Lancer, is now probably the most efficient audio encoder in practice, weighing quality efficiency against speed efficiency. The FhG AAC encoder is close, but lacks bitrate tuning flexibility (quite a large gap between VBR presets 4 and 5, targeting 128 or 192 kbps).
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-04-29 06:13:00
@ john33 & lvqcl:

Would you be able to release DLLs as well (both the one and the four)? There are people who still use e.g. BeSweet based tools.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-06-09 20:24:02
I don't know how to do them, especially 4-dlls version... I managed to compile libvorbis.dll but it just crashes.
Title: Ogg Vorbis acceleration project
Post by: punkrockdude on 2012-06-10 00:32:51
Is it possible to compile this in Linux/Ubuntu? I have used the following successfully in Ubuntu and is there a modification to include these optimizations?

Code: [Select]
sudo apt-get build-dep libvorbis

mkdir ogg_build && cd ogg_build

wget http://www.geocities.jp/aoyoume/aotuv/source_code/libvorbis-aotuv_b6.03.tar.bz2

tar -xvjf libvorbis-aotuv_b6.03.tar.bz2

cd ~/ogg_build/aotuv-b6.03_20110424 && chmod +x configure

./configure --disable-shared && make

sudo make install

Regards.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-01 12:15:40
Better late than never, I guess! 

Three sets of 'Lancer' builds of the various dlls - aoTuVb6.03. The '4' dlls, SSE3 version has been tested with CDex and appears to be OK but all the remainder are untested. Try at your own risk and while I would be interested in feedback from the point of view of generally making these available at Rarewares, I won't pretend to know where to look to fix any problem that may exist. 

OggVorbis dlls:
SSE: http://www.rarewares.org/files/ogg/oggvorb...3_LancerSSE.zip (http://www.rarewares.org/files/ogg/oggvorbis-dllsaoTuVb6.03_LancerSSE.zip)
SSE2: http://www.rarewares.org/files/ogg/oggvorb..._LancerSSE2.zip (http://www.rarewares.org/files/ogg/oggvorbis-dllsaoTuVb6.03_LancerSSE2.zip)
SSE3: http://www.rarewares.org/files/ogg/oggvorb..._LancerSSE3.zip (http://www.rarewares.org/files/ogg/oggvorbis-dllsaoTuVb6.03_LancerSSE3.zip)

Libvorbis.dll:
SSE: http://www.rarewares.org/files/ogg/libvorb...3_LancerSSE.zip (http://www.rarewares.org/files/ogg/libvorbisaoTuVb6.03_LancerSSE.zip)
SSE2: http://www.rarewares.org/files/ogg/libvorb..._LancerSSE2.zip (http://www.rarewares.org/files/ogg/libvorbisaoTuVb6.03_LancerSSE2.zip)
SSE3: http://www.rarewares.org/files/ogg/libvorb..._LancerSSE3.zip (http://www.rarewares.org/files/ogg/libvorbisaoTuVb6.03_LancerSSE3.zip)

Vorbis.dll (for HeadAC3he)
SSE: http://www.rarewares.org/files/ogg/vorbis_...3_LancerSSE.zip (http://www.rarewares.org/files/ogg/vorbis_dll-aoTuVb6.03_LancerSSE.zip)
SSE2: http://www.rarewares.org/files/ogg/vorbis_..._LancerSSE2.zip (http://www.rarewares.org/files/ogg/vorbis_dll-aoTuVb6.03_LancerSSE2.zip)
SSE3: http://www.rarewares.org/files/ogg/vorbis_..._LancerSSE3.zip (http://www.rarewares.org/files/ogg/vorbis_dll-aoTuVb6.03_LancerSSE3.zip)

Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 14:04:26
Thank you, I will report them privately to the interested person and reply soon...

I believe those versions "for HeadAC3he" are for a HeadAC3he version so outdated that no one still has it; the last "current" versions I have (0.24a8, 0.24a13, 0.25a3) should be compatible with either of the default builds...
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-01 14:19:16
Thank you, I will report them privately to the interested person and reply soon...

Thanks.
I believe those versions "for HeadAC3he" are for a HeadAC3he version so outdated that no one still has it; the last "current" versions I have (0.24a8, 0.24a13, 0.25a3) should be compatible with either of the default builds...

Ah, OK, perhaps it's time to consign them to history, then?
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 14:39:09
Ah, no, this "vorbis.dll" is compatible to HeadAC3he 0.24a8 (http://www.ligh.de/software/HeadAC3he_0.24a8.zip). All three variants are working well on a Phenom-II X6.

HeadAC3he 0.24a13 (http://www.ligh.de/software/HeadAC3he_0.24a13.rar) uses a "bridge" library (hVorbis.dll) to be compatible to the usual builds. It uses the "quartet". The SSE variant seems to work; the SSE2 variant doesn't start correctly due to a missing "svml_dispmd.dll" - same for the SSE3 variant. This DLL appears to be related to libmmd.

Same for HeadAC3he 0.25a3 (http://www.ligh.de/software/HeadAC3he_0.25a3.rar) (password to unpack = 'nodelay') - SSE works, SSE2 and SSE3 miss "svml_dispmd.dll".
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-01 14:56:21
Not only SVML_DISPMD.DLL but also LIBMMD.DLL.

BTW, 4-DLL version works with CDex correctly here. Also, libvorbis.dll works with AIMP (after renaming to aimp_libvorbis.dll)
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 15:00:23
BeSweet 1.5b31 supports both libvorbis.dll and the Ogg Vorbis quartet.

All three libvorbis variants work well with BeSweet. The SSE quartet works too; for the SSE2 and SSE3 quartets, the missing MD DLLs are complained about three times.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-01 15:31:19
I just noticed that Vorbis DLLs optimized for P4/Athlon64 (they are available at http://www.rarewares.org/ogg-libraries.php (http://www.rarewares.org/ogg-libraries.php)) also require libmmd.dll
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 15:38:53
The requirement of libmmd.dll is rather common for several builds on RareWares which are compiled with one of the Intel C++ compilers.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-01 15:49:51
Thanks for the feedback. I've just recompiled the HeadAC3he and libvorbis dlls to include the correct svml_dispmd.lib. I've uploaded on the same links.

Edit: Just recompiled and uploaded again.
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 16:42:46
Are you certain that you did not just do the opposite of what was wanted? What worked before does not work anymore.

In contrast to the "libmmd.dll", I don't have this "svml_dispmd.dll" anywhere. RareWares doesn't offer it for download on the "Others" page.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-01 16:57:50
Are you certain that you did not just do the opposite of what was wanted? What worked before does not work anymore.

In contrast to the "libmmd.dll", I don't have this "svml_dispmd.dll" anywhere. RareWares doesn't offer it for download on the "Others" page.

You may be right!  But, while I sort that out, the 32 bit ICL dlls required are here:

http://www.rarewares.org/files/ICL_dlls.zip (http://www.rarewares.org/files/ICL_dlls.zip)

EDIT: OK, I've recompiled and uploaded yet again!! Hopefully got it right this time.  All on the original links.
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-01 20:23:22
SSE DLLs require the libmmd.dll, SSE2 and SSE3 DLLs require the svml_dispmd.dll in addition.

If they are in system32, everything works well.
Title: Ogg Vorbis acceleration project
Post by: Raimu on 2012-07-02 19:05:07
I can confirm that SSE3 DLLs work fine but require svml_dispmd.dll!
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-02 19:15:56
That seems to be as a consequence of the optimisations used. If someone knows better, I'm more than willing to listen.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-02 20:13:38
Try "Multi-threaded (/MT)" option instead of "Multi-threaded DLL (/MD)" -- for all projects in your solution, of course. Then msvcr100, libmmd, and svml_disp libraries should be linked statically.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-02 21:44:41
Try "Multi-threaded (/MT)" option instead of "Multi-threaded DLL (/MD)" -- for all projects in your solution, of course. Then msvcr100, libmmd, and svml_disp libraries should be linked statically.

Thanks very much, I'll give it a try.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-02 22:14:53
OK, thanks to lvqcl, all new compiles of the dlls are now on the links above with the dependencies removed.
Title: Ogg Vorbis acceleration project
Post by: akapuma on 2012-07-02 23:40:24
Thank you very much.

Time for transcoding 1h30min (movie soundtrack) on my PC:

old libvorbis.dll: 4min57sec
new libvorbis.dll: 2min53sec

Best regards

akapuma
Title: Ogg Vorbis acceleration project
Post by: Raimu on 2012-07-03 13:00:30
Yes, thank you, everyone. For myself I confirm an encoding speed increase from rate 24.1x to 29.97x on my old AMD CPU when facing off the new x64/SSE3 builds against Rarewares' latest AoTuV6 x64 build encoding a longish film soundtrack. Speed increase for dll versions is in the same ballpark, though lower for my CPU.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-03 14:28:26
Thanks for all the feedback and assistance, I'll post these over at Rarewares in the next few hours.
Title: Ogg Vorbis acceleration project
Post by: Raimu on 2012-07-03 19:55:05
Thanks for all the feedback and assistance, I'll post these over at Rarewares in the next few hours.


@john33, any interest in also posting the cli encoder binary at Rarewares?
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2012-07-04 02:45:41
I was just wondering if we could get a compile using
the Open64 compiler found at http://developer.amd.com/tools/open64/pages/default.aspx (http://developer.amd.com/tools/open64/pages/default.aspx) for testing.
there is also another version at
http://www.open64.net/download/open64-4x-releases.html (http://www.open64.net/download/open64-4x-releases.html)

some reports put the code it produces as being on par with ICC.

I would try and compile with it, but, to be very honest.....I really really suck with programming and compiling....couldn't keep GCC working on Windows to save my life (kept breaking on me)

I would just like to see what a more CPU-agnostic compiler (that's rated pretty high by its users) would give us vs ICC's results.

thanks for your efforts, its good to see vorbis still getting some love
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 13:21:01
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.
Title: Ogg Vorbis acceleration project
Post by: punkrockdude on 2012-07-04 13:43:47
Anyone that can give some guidance on how to compile this under Linux (Ubuntu)? Regards.
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-04 14:20:29
A few more statistics, transcoding 01:42:28 h of a 5.1 AC3 on a Phenom-II X4 945 using BeSweet with DPL-II downmix and fixed gain (to avoid including the normalization pass):

Generic
06:42 (686)
06:04 (P4)

Lancer
04:30 (SSE)
03:51 (SSE2)
03:50 (SSE3)

The gap between generic and extreme optimization is quite impressive. And even the gap between SSE and SSE2 is still remarkable. But after all, decoding and downmixing take their time too, so a certain degree of saturation is to be expected.
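
To put a number on those gaps, here is a small sketch that turns the "mm:ss" times above into seconds and a speed-up factor. The helper names mmss_to_seconds and speedup are hypothetical, written only for this illustration, and the parser assumes plain "mm:ss" input.

```c
#include <stdio.h>

/* Parse a "mm:ss" string into seconds; returns -1 on malformed input.
   Hypothetical helper, written only for this illustration. */
static int mmss_to_seconds(const char *s)
{
    int m, sec;
    char extra;

    /* Exactly two integers separated by ':' and nothing after them. */
    if (sscanf(s, "%d:%d%c", &m, &sec, &extra) != 2)
        return -1;
    if (m < 0 || sec < 0 || sec > 59)
        return -1;
    return m * 60 + sec;
}

/* Speed-up factor of a faster run over a baseline (both in seconds,
   both assumed to be positive). */
static double speedup(int baseline_s, int faster_s)
{
    return (double)baseline_s / (double)faster_s;
}
```

With the times above, 06:42 (686) is 402 s and 03:50 (Lancer SSE3) is 230 s, so speedup(402, 230) comes out at about 1.75.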
Title: Ogg Vorbis acceleration project
Post by: Brazil2 on 2012-07-04 14:22:33
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.

Thanks but unfortunately, and unlike your previous (http://www.rarewares.org/files/ogg/oggenc2.87-1.3.3-P4.zip) builds (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb6.03-P4.zip), it's not running anymore on older OSes pre-XP SP2 on which VC2010 runtimes can't be installed

But this might be helpful: http://mulder.googlecode.com/svn/trunk/Uti...rLib/README.txt (http://mulder.googlecode.com/svn/trunk/Utils/EncodePointerLib/README.txt)
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-04 14:25:17
There are reasons why such old OS are deprecated. An excuse would be running them offline.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 14:25:32
Thanks but unfortunately, and unlike your previous (http://www.rarewares.org/files/ogg/oggenc2.87-1.3.3-P4.zip) builds (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb6.03-P4.zip), it's not running anymore on older OSes pre-XP SP2 on which VC2010 runtimes can't be installed

But this might be helpful: http://mulder.googlecode.com/svn/trunk/Uti...rLib/README.txt (http://mulder.googlecode.com/svn/trunk/Utils/EncodePointerLib/README.txt)

OK, what optimisation does your CPU support?
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-04 14:40:31
It's rather a question of PE-building and linking than of CPU optimizations, john33. Not the CPU is the limit, but the OS and its set of supported Windows API functions.
Title: Ogg Vorbis acceleration project
Post by: Brazil2 on 2012-07-04 14:43:32
OK, what optimisation does your CPU support?

MMX, SSE, SSE2, SSE3, SSSE3 and I'm usually using your P4 optimized builds.
Thank you
Title: Ogg Vorbis acceleration project
Post by: Steve Forte Rio on 2012-07-04 16:26:23
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.


Hi, John. But what is the difference between your new compiles and this (http://audiophilesoft.ru/commandline/oggenc_aotuv/oggenc_lancer_b6.03.7z)?

I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 16:49:15
OK, what optimisation does your CPU support?

MMX, SSE, SSE2, SSE3, SSSE3 and I'm usually using your P4 optimized builds.
Thank you

Try this: http://www.rarewares.org/files/ogg/oggenc2...cerSSE2_OLD.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb6.03-LancerSSE2_OLD.zip)
and perhaps you could let me know if it's OK?
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 16:52:54
@john33, any interest in also posting the cli encoder binary at Rarewares?

Done.


Hi, John. But what is the difference between your new compiles and this (http://audiophilesoft.ru/commandline/oggenc_aotuv/oggenc_lancer_b6.03.7z)?

I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?

I couldn't say with any certainty, but probably the only difference from looking at the size of the executables is that I don't think they were compiled with the libsamplerate resampler.
Title: Ogg Vorbis acceleration project
Post by: Brazil2 on 2012-07-04 17:05:29
Try this: http://www.rarewares.org/files/ogg/oggenc2...cerSSE2_OLD.zip (http://www.rarewares.org/files/ogg/oggenc2.87-aoTuVb6.03-LancerSSE2_OLD.zip)
and perhaps you could let me know if it's OK?

Brilliant! Works like a charm, thanks a lot
Code: [Select]
G:\Test\>oggenc2 -h
OggEnc v2.87 (LancerMod(SSE2) based on aoTuV b6.03 [20110424])
(c) 2000-2005 Michael Smith <msmith@xiph.org>
& portions by John Edwards <john.edwards33@ntlworld.com>
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-04 17:14:48
My versions of oggenc2.exe don't include SRC and FLAC libraries and I commented out all relevant options and calls.

@john33: in your compiles these options are disabled too    I think it's not what you want, and 3 source files with re-enabled options are attached to the post.
Title: Ogg Vorbis acceleration project
Post by: Raimu on 2012-07-04 18:08:40
Quote
Hi, John. But what is the difference between your new compiles and this (http://audiophilesoft.ru/commandline/oggenc_aotuv/oggenc_lancer_b6.03.7z)?
I don't remember where I got it, but it was more than one year ago and actually this is also OggEnc v2.87 LancerMod(SSE3) based on aoTuV b6.03 [20110424]. Could you clarify?

Some tests (out of interest) on my PC reveal that john33's current binaries are slightly but noticeably faster than the ones in your link, at the very least.
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 18:58:39
My versions of oggenc2.exe don't include SRC and FLAC libraries and I commented out all relevant options and calls.

@john33: in your compiles these options are disabled too    I think it's not what you want, and 3 source files with re-enabled options are attached to the post.

Thanks, but the versions at Rarewares have these enabled.

EDIT: I just realised that the options were disabled in the oggenc2 code!  I had enabled the inclusion of the libs in the compiles and hadn't checked the code!
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-04 19:27:04
All of the above oggenc2 compiles have been updated at Rarewares. Sorry for the confusion!
Title: Ogg Vorbis acceleration project
Post by: xconstellationx on 2012-07-07 10:46:16
All of the above oggenc2 compiles have been updated at Rarewares. Sorry for the confusion!


Great work, thanks a lot.

Although the version by lvqcl is still faster on my machine. I use oggenc2 32bit sse3 from here (index.php?act=findpost&pid=784966) and foobar converts a flac around 49x while your compile is at 42x.
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-07 10:52:28
Your machine. Aha.

We all know your machine.

Oh, no, this is your first post, so how could we?

Hint: http://hwinfo.com/ (http://hwinfo.com/)
Title: Ogg Vorbis acceleration project
Post by: xconstellationx on 2012-07-07 11:01:38
It's a core2duo laptop with a P8600+ 4gb ram on win7.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-07 14:03:35
Although the version by lvqcl is still faster on my machine. I use oggenc2 32bit sse3 from here (index.php?act=findpost&pid=784966) and foobar converts a flac around 49x while your compile is at 42x.


Try LancerSSE2_OLD build. It is faster than other versions (except x64).
Title: Ogg Vorbis acceleration project
Post by: xconstellationx on 2012-07-07 16:24:15
With john's Lancer SSE2 old I get the same speed as with your SSE3 version.
Title: Ogg Vorbis acceleration project
Post by: xconstellationx on 2012-07-08 10:53:16
Out of curiosity i tested all 32bit oggenc2 compiles again and here are the results:


John33:

sse  35.69x
sse2 38.40x
sse3 38.60x

sse2old 47.19x


lvqcl:

sse  38.80x
sse2 47.94x
sse3 47.73x


I'm not familiar with compiling, so i wonder why there is such a huge step in speed from sse to sse2 while sse2 and sse3 are on the same level?
Title: Ogg Vorbis acceleration project
Post by: LigH on 2012-07-08 11:55:01
This effect doesn't belong to the "Compiling" as such (the C compiler only translates the source routines which are not very CPU optimized; the in-depth CPU instruction set optimization is more efficiently done via manual Assembler code).

The efficiency boost between different instruction sets depends on the algorithm to be optimized and the differences between the instruction sets. So specifically for Vorbis encoding, SSE2 seems to introduce very useful new instructions (relative to SSE only), but the new instructions in SSE3 (relative to SSE2 only) are only marginal for the Vorbis algorithms.
Title: Ogg Vorbis acceleration project
Post by: xconstellationx on 2012-07-08 13:20:20
The efficiency boost between different instruction sets depends on the algorithm to be optimized and the differences between the instruction sets. So specifically for Vorbis encoding, SSE2 seems to introduce very useful new instructions (relative to SSE only), but the new instructions in SSE3 (relative to SSE2 only) are only marginal for the Vorbis algorithms.

Thanks for clarifying.

Is the reason there is no sse4 compile that it also introduces too few useful instructions compared to sse3?
Title: Ogg Vorbis acceleration project
Post by: Raimu on 2012-07-08 15:59:22
Quote
Is the reason there is no sse4 compile that it also introduces too few useful instructions compared to sse3?


I was under the impression the reason is more along the lines of SSE4* being an umbrella term for a clustermess of very different instruction sets some of which only work on newish Intel CPUs and others only on newish AMD CPUs and all of which only can be effectively optimized for on pretty new and specific compilers.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-08 17:56:38
I took my old compiles and also compiled the sources with SSE4.1 instruction set. Then I took an album (57m 37s) and encoded it. Results (encoding time, in seconds):

32 bit:
SSE  - 89.5 s
SSE2 - 72.2 s
SSE3 - 73.2 s
SSE4.1 - 72.1 s

64 bit:
SSE2 - 67.1 s
SSE3 - 66.5 s
SSE4.1 - 66.1 s

BTW, I also tested original Lancer from http://homepage3.nifty.com/blacksword/index.htm (http://homepage3.nifty.com/blacksword/index.htm)

SSE  - 57.7 s
SSE2 - 53.2 s
SSE3 - 53.1 s

SSE2 MT - 35.9 s (71.6 s total process time)
SSE3 MT - 36.2 s (72.3 s total process time)
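
For comparison with the "x realtime" figures other people post, here is a rough sketch of the two conversions involved. The function names realtime_factor and avg_cores_used are just illustrative, and both assume positive, non-zero times.

```c
#include <stdio.h>

/* Encoding speed as a multiple of realtime:
   audio duration divided by encoding (wall-clock) time. */
static double realtime_factor(double audio_seconds, double encode_seconds)
{
    return audio_seconds / encode_seconds;
}

/* Average number of busy cores for a multi-threaded run:
   total process (CPU) time divided by wall-clock time. */
static double avg_cores_used(double process_seconds, double wall_seconds)
{
    return process_seconds / wall_seconds;
}
```

The album is 57m 37s = 3457 s, so the 64-bit SSE4.1 build at 66.1 s works out to roughly 52x realtime, and the SSE2 MT run (35.9 s wall-clock, 71.6 s total process time) kept about two cores busy on average.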
Title: Ogg Vorbis acceleration project
Post by: eahm on 2012-07-12 00:22:05
Pink Floyd - The Wall 1:21:09 (CPU specs: http://i46.tinypic.com/110lytc.png (http://i46.tinypic.com/110lytc.png))

(q5.0)

oggenc2.87-1.3.3-generic: Total encoding time: 0:29.874, 162.99x realtime

oggenc2.87-1.3.3-x64: Total encoding time: 0:17.067, 285.30x realtime

oggenc2.87-aoTuVb6.03-generic: Total encoding time: 0:31.949, 152.40x realtime

oggenc2.87-aoTuVb6.03-x64: Total encoding time: 0:18.236, 267.01x realtime

oggenc2.87-aoTuVb6.03-LancerSSE: Total encoding time: 0:18.330, 265.64x realtime

oggenc2.87-aoTuVb6.03-LancerSSE2: Total encoding time: 0:18.096, 269.07x realtime

oggenc2.87-aoTuVb6.03-LancerSSE3: Total encoding time: 0:18.439, 264.07x realtime

oggenc2.87-aoTuVb6.03-LancerSSE3_x64: Total encoding time: 0:13.994, 347.95x realtime

-

qaac_1.38 (V82): Total encoding time: 0:20.374, 238.99x realtime

NeroAACCodec-1.5.1 (Q0.48): Total encoding time: 0:19.313, 252.12x realtime

lame3.99.5 (V2): Total encoding time: 0:19.344, 251.71x realtime

lame3.99.5-64 (V2): Total encoding time: 0:17.643, 275.98x realtime
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-14 12:47:39
It seems that older versions (that require Intel DLLs) were faster than current. Also, SSE2_OLD version is as fast as before. (misconfigured compiler?..)
Title: Ogg Vorbis acceleration project
Post by: john33 on 2012-07-14 14:06:08
It seems that older versions (that require Intel DLLs) were faster than current. Also, SSE2_OLD version is as fast as before. (misconfigured compiler?..)

The SSE2_OLD is actually configured in the same way as the SSE2 compile. Both are ICL 12.1 compiles. The only difference is that the OLD was compiled within VS2008 and the other within VS2010.
Title: Ogg Vorbis acceleration project
Post by: Brazil2 on 2012-07-15 11:20:19
The only difference is that the OLD was compiled within VS2008 and the other within VS2010.

Heh, newer doesn't always mean better
Title: Ogg Vorbis acceleration project
Post by: Destroid on 2012-07-22 10:07:31
After finally getting the 64-bit machine operational, I have some benches to add here (running i5-2600K [OC], 8GB, WinXP x64 SP2):

OggEnc 2.87 aoTuV b6.03 Lancer builds (-q 3 [126.2 kbps])
SSE2 ........ 74.13x
SSE2 old ... 98.84x
SSE3 ........ 74.13x
SSE3 x64 .. 107.83x
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-22 12:31:03
Could you test also SSE version?
Title: Ogg Vorbis acceleration project
Post by: IgorC on 2012-07-22 14:58:55
After finally getting the 64-bit machine operational, I have some benches to add here (running i5-2600K [OC], 8GB, WinXP x64 SP2):

i5 2500k  or i7 2600k?
Title: Ogg Vorbis acceleration project
Post by: Destroid on 2012-07-22 18:23:02
Ok, the results using Vorbis -q3 running on i5-2500K (OC'ed), 8GB, WinXP x64 SP2.

The encode rate and bitrate numbers come from the binaries themselves. To help make sense of which version I was running I included the binaries' date-stamps.
Code:
john33 (OggEnc 2.87 aoTuV b6.03, average bitrate 126.2 kbps, all builds Lancer except *)
----------------------------------------------------------------------------------------
Generic*     40.2072    05/04/2011
P4*          65.8952    05/04/2011
SSE          71.8857    07/04/2012
SSE2         74.1321    07/04/2012
SSE2 (old)   98.8428    07/04/2012
SSE3         74.1321    07/04/2012
SSE3 x64    107.8285    07/04/2012

lvqcl Lancer (OggEnc 2.87 aoTuV b5.7, average bitrate 118.6 kbps)
------------------------------------------------------------------
SSE          87.8602    02/20/2012
SSE2        118.6113    02/20/2012
SSE2 x64    131.7904    02/20/2012
SSE3        112.9632    02/20/2012
SSE3 x64    124.8540    02/20/2012

Blacksword Lancer (OggEnc 2.83, aoTuV b5, average bitrate 118.0 kbps)
---------------------------------------------------------------------
SSE         120.112743    11/09/2006
SSE2        131.673327    11/09/2006
SSE2 MT     200.035978    11/09/2006
SSE3        131.330713    11/09/2006
SSE3 MT     199.766456    11/09/2006
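
To make the john33 numbers easier to compare, here is a minimal sketch that expresses each build's encode rate relative to the Generic build. The rates are copied from the table above; the helper name `relative_to` is made up. Comparing across the three groups this way would not be like-for-like, since they use different aoTuV versions and end up at different bitrates.

```python
# Encode rates (x realtime) for the john33 builds, copied from the table above.
rates = {
    "Generic": 40.2072,
    "P4": 65.8952,
    "SSE": 71.8857,
    "SSE2": 74.1321,
    "SSE2 (old)": 98.8428,
    "SSE3": 74.1321,
    "SSE3 x64": 107.8285,
}


def relative_to(baseline, table):
    """Return each build's rate divided by the baseline build's rate."""
    base = table[baseline]
    return {name: rate / base for name, rate in table.items()}


for name, ratio in relative_to("Generic", rates).items():
    print(f"{name:<12}{ratio:.2f}x")
```

By this measure the SSE3 x64 build is roughly 2.7x the Generic one, and the "old" SSE2 build is about a third faster than the current SSE2 build.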
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-22 18:33:25
oggenc "02/20/2012" is based on aoTuV 5.7 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=74345&view=findpost&p=786849), not 6.03 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=74345&view=findpost&p=784966)
Title: Ogg Vorbis acceleration project
Post by: Destroid on 2012-07-22 18:47:48
 Fixed.

I got pretty confused with this many binaries.
Title: Ogg Vorbis acceleration project
Post by: Destroid on 2012-07-22 20:02:51
(Time expired editing above post while typing...)

FYI, I noticed the SSE binary in the aoTuV b5.7 bundle (post #121) declared itself b6.03. Because the bitrate is consistent with the other compiles in the bundle I'm guessing it's just a typo.

It's worth mentioning that Vorbis doesn't show drastic bitrate variance when compiler options change. The average bitrate I posted in the table above really means that each individual file had the same bitrate across all compiles from a given builder.
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-07-22 20:27:17
Quote
FYI, I noticed the SSE binary in the aoTuV b5.7 bundle (post #121) declared itself b6.03.

...That's strange. Thank you for the info.
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2012-10-15 19:50:46
I was wondering if somebody who's got gcc could give us a compile with AVX or AVX+sse4/4.2a to test on Bulldozer-based chips.

Apparently they fixed the AVX support in newer gcc versions: http://www.primegrid.com/forum_thread.php?id=3912 (http://www.primegrid.com/forum_thread.php?id=3912)

It would be interesting to see how AVX would affect encode speeds, if at all (can't wait for them to also have FMA support).

Thanks in advance to whoever gives us an AVX build to play with.
Title: Ogg Vorbis acceleration project
Post by: AwoK on 2012-10-21 06:09:41
Are the AoTuV b6.03 sources from post #117 (http://www.hydrogenaudio.org/forums/index.php?showtopic=74345&view=findpost&p=784966) mixed with Lancer?

For decoding only, are there major quality/bugfix differences between the current libvorbis/vorbisfile and what the latest "original" Lancer was based on (from late 2006)?
Title: Ogg Vorbis acceleration project
Post by: lvqcl on 2012-10-21 08:07:38
Are the AoTuV b6.03 sources from post #117 (http://www.hydrogenaudio.org/forums/index.php?showtopic=74345&view=findpost&p=784966) mixed with Lancer?


Yes, but I tested only the encoding part of these sources, and I'm not sure that decoding works properly.


For decoding only, are there major quality/bugfix differences between the current libvorbis/vorbisfile and what the latest "original" Lancer was based on (from late 2006)?


See http://svn.xiph.org/trunk/vorbis/CHANGES (http://svn.xiph.org/trunk/vorbis/CHANGES)  (original Lancer is based on libvorbis 1.1.2).
Title: Ogg Vorbis acceleration project
Post by: AwoK on 2012-10-21 17:32:57
It seems there have been decoding-related changes since then, plus things like "Corrections to the specification" or "Fix a numerical instability in the edge extrapolation filter", and I don't know whether those are relevant. So staying with an old version isn't a good idea.

How did you do the merge? By comparing Lancer against aoTuV 5, then aoTuV 5 to 6, or libvorbis (and co.) 1.1.2 to 1.3.3, and patching Lancer in where the code is unchanged?

Did you try creating new SSE code? If there are clear cases of calculations done on arrays, is it necessarily difficult?
Title: Ogg Vorbis acceleration project
Post by: AshenTech on 2013-05-28 06:49:06
Any chance of a compile with FMA support?

We can't get an Intel compiler build with AVX support, because Intel's compiler sadly blocks non-Intel chips from getting AVX code paths, but I would think FMA could have a beneficial effect on encode performance.
Title: Ogg Vorbis acceleration project
Post by: Steve Forte Rio on 2014-06-01 12:33:10
Can we hope to see an OggEnc Lancer build based on aoTuV beta6.03 unified with Xiph.Org's libvorbis1.3.4 (http://www.geocities.jp/aoyoume/aotuv/)?
Title: Ogg Vorbis acceleration project
Post by: hidn on 2014-06-01 14:49:12
Quote
Ogg Vorbis acceleration project, Is it dead?

Yes. As is Vorbis itself. Xiph is developing another codec, "Opus". Welcome, new abandonware. For the people who supported Vorbis, nothing remains except "thanks". The fate of many open source projects: lack of interest.

Tuning is boring? Start new project.
Title: Ogg Vorbis acceleration project
Post by: Steve Forte Rio on 2014-06-01 18:56:32
Quote
Ogg Vorbis acceleration project, Is it dead?

Yes. As is Vorbis itself. Xiph is developing another codec, "Opus". Welcome, new abandonware. For the people who supported Vorbis, nothing remains except "thanks". The fate of many open source projects: lack of interest.

Tuning is boring? Start new project.



How about wide hardware support? I'm not sure Opus will get it even in the next 2 years. Closing the Vorbis project is nonsense. Also, AFAIK Opus is targeted more at low latency and low bitrates (of course not without a cost in quality for simple file encoding at medium bitrates).
Title: Ogg Vorbis acceleration project
Post by: skamp on 2014-06-01 19:42:02
1) Ogg Vorbis works very well in its current iteration;
2) Codec support these days is probably more software than "hardware".
Title: Re: Ogg Vorbis acceleration project
Post by: hidn on 2020-10-29 18:35:00
Quote
Ogg Vorbis acceleration project, Is it dead?
Yes. As is Vorbis itself. Xiph is developing another codec, "Opus". Welcome, new abandonware. For the people who supported Vorbis, nothing remains except "thanks". The fate of many open source projects: lack of interest.

Tuning is boring? Start new project.

6 years later. What was done:
Code:
libvorbis 1.3.7 (2020-07-04) -- "Xiph.Org libVorbis I 20200704 (Reducing Environment)"

* Fix CVE-2018-10393 - out-of-bounds read encoding very low sample rates.
* Fix CVE-2017-14160 - out-of-bounds read encoding very low sample rates.
* Fix CVE-2018-10392 - out-of-bounds access encoding invalid channel count.
* Fix handling invalid bytes per sample arguments.
* Fix handling invalid channel count arguments.
* Fix invalid free on seek failure.
* Fix negative shift reading blocksize.
* Fix accepting unreasonable float32 values.
* Fix tag comparison depending on locale.
* Fix unnecessarily linking libm.
* Fix memory leak in test_sharedbook.
* Update Visual Studio projects for ogg library filename change.
* Distribute CMake build files with the source package.
* Remove unnecessary configure --target switch.
* Add gitlab CI support.
* Add OSS-Fuzz support.
* Build system and integration updates.

libvorbis 1.3.6 (2018-03-16) -- "Xiph.Org libVorbis I 20180316 (Now 100% fewer shells)"

* Fix CVE-2018-5146 - out-of-bounds write on codebook decoding.
* Fix CVE-2017-14632 - free() on uninitialized data
* Fix CVE-2017-14633 - out-of-bounds read
* Fix bitrate metadata parsing.
* Fix out-of-bounds read in codebook parsing.
* Fix residue vector size in Vorbis I spec.
* Appveyor support
* Travis CI support
* Add secondary CMake build system.
* Build system fixes

libvorbis 1.3.5 (2015-03-03) -- "Xiph.Org libVorbis I 20150105 (⛄⛄⛄⛄)"

* Tolerate single-entry codebooks.
* Fix decoder crash with invalid input.
* Fix encoder crash with non-positive sample rates.
* Fix issues in vorbisfile's seek bisection code.
* Spec errata.
* Reject multiple headers of the same type.
* Various build fixes and code cleanup.

libvorbis 1.3.4 (2014-01-22) -- "Xiph.Org libVorbis I 20140122 (Turpakäräjiin)"

* Reduce codebook footprint in library code.
* Various build and documentation fixes.