Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-04 19:11:58

Some Japanese developers are working on speed optimization of libvorbis using SSE. Blacksword (or 637) launched an Ogg Vorbis acceleration project (in Japanese only) (http://homepage3.nifty.com/blacksword/) and releases an oggenc binary and libvorbis patch (http://homepage3.nifty.com/blacksword/OggEnc_SSE_20041101ArcherB03.zip) based on libvorbis 1.1. The optimization includes SSE implementations of FFT, MDCT, windowing, channel coupling, sorting, psymodel, floor/residue encoding, and so on. On my computer (Pentium IV 2.4GHz), an ICL8.1-compiled oggenc binary of the optimized version (Archer Beta03) encodes at 23.4x, while the one without the optimization (ICL8.1-compiled but no SSE patches) encodes at 15.5x. Hence, this optimization achieves a speed gain of ca. 1.5x.
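
To make the arithmetic behind that figure explicit, here is a minimal Python sketch that turns the two oggenc rates quoted above into a speed-up factor and the matching reduction in encoding time. The function name `speedup` is only an illustration, not part of any tool.

```python
def speedup(optimized_rate, baseline_rate):
    """Ratio of two encoding rates (both in multiples of real time)."""
    return optimized_rate / baseline_rate

# Rates from the post: Archer Beta03 vs. the plain ICL8.1 build.
ratio = speedup(23.4, 15.5)
# Encoding time is inversely proportional to the rate.
time_saved = 1.0 - 1.0 / ratio

print(f"speed-up: {ratio:.2f}x")
print(f"encoding time saved: {time_saved:.0%}")
```

In other words, a 1.5x rate means roughly a third less time spent encoding the same file.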

Unlike GoGo-no-coder, it's not a fork: he releases a patch for the libvorbis source code that does not change any algorithm or data structure. This is very good for source code maintenance, since it is easy to keep up with the official libvorbis, but it limits the optimization possibilities to some degree. Actually, the author says in readme.txt that there's little room left for optimization. So I think it's time for quality evaluation, although this optimization is still in the development stage. After several bugs were found and fixed over the last week, bitrates are quite similar to the reference encoder for all quality values. If you find any bugs or quality regressions from the official 1.1 encoder, please tell us.

Contributors are:
- Blacksword (or 637)'s SSE optimization (Japanese only) (http://homepage3.nifty.com/blacksword/): A number of functions in libvorbis are vectorized to take advantage of the SSE instruction set, and Opt-Sort and wuvorbis are incorporated as well. For the complete list of optimized functions, see readme.txt (in Japanese, but you can easily find the list) included with the binary.
- Manuke's OptSort (http://www.cug.net/~manuke/vorbis-optsort-en.html): Optimization of the qsort function, which consumes 20% of the compression processing time, based on the assumption that the _vp_quantize_couple_sort and _vp_noise_normalize_sort functions in psy.c call qsort with 8 or 32 elements. This accelerates the whole compression process by 10%.
- W.Dee's wuvorbisfile (Japanese only?) (http://kikyou.info/tvp/#side_product): wuvorbis.dll is a fast Ogg Vorbis decoder with SSE and 3DNow! support, which is part of the KiriKiri software (useful for developing multimedia content or adventure games). wuvorbis.dll decodes 1.4x-1.8x faster (SSE) and 1.5x-1.9x faster (3DNow!) than the official libvorbis.
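
For readers wondering why knowing the element count helps, here is a rough Python sketch of the general idea: when the input size is fixed (8 or 32), a precomputed sequence of compare-exchange steps can replace a general-purpose sort. This is not Manuke's code, and I am assuming nothing about how OptSort is actually written; the odd-even transposition network and the names `build_network` and `fixed_size_sort` are purely illustrative.

```python
def build_network(n):
    """Odd-even transposition network: n rounds of neighbour compare-exchanges."""
    pairs = []
    for rnd in range(n):
        start = rnd % 2  # even rounds: (0,1),(2,3)...; odd rounds: (1,2),(3,4)...
        pairs.extend((i, i + 1) for i in range(start, n - 1, 2))
    return pairs

# Networks are built once, for the two sizes mentioned in the post.
NETWORKS = {8: build_network(8), 32: build_network(32)}

def fixed_size_sort(values):
    """Sort ascending, using a fixed network when the length is 8 or 32."""
    network = NETWORKS.get(len(values))
    if network is None:
        return sorted(values)  # generic fallback for any other size
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out

print(fixed_size_sort([5, 3, 8, 1, 9, 2, 7, 4]))
```

The point of a fixed sequence is that it has no data-dependent control flow beyond the swaps themselves, which is the kind of structure that tends to suit unrolled or vectorized code.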

Happy encoding!

Title: Ogg Vorbis optimized for speed
Post by: dev0 on 2004-11-04 19:37:26

fefe (http://www.fefe.de/diffs/) was working on an (apparently buggy) SSE optimization of libvorbis too.
Do the optimizations only affect encoding, or decoding as well?

Title: Ogg Vorbis optimized for speed
Post by: ilikedirtthe2nd on 2004-11-04 22:04:35

I achieved almost a 100% (well, more like 85%, actually) speed increase (compared to ICL 8.1, on an AMD Athlon XP 1800+).

ICL 8.1: 9,8x
Optimized 18,0x.

Pretty good

Title: Ogg Vorbis optimized for speed
Post by: TedFromAccounting on 2004-11-05 00:22:26

Wow! Now that is FAST. My results were similar to ilikedirtthe2nd's (actually a little better).

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-05 02:14:25

Quote

fefe (http://www.fefe.de/diffs/) was working on an (apparently buggy) SSE optimization of libvorbis too.
Do the optimizations only affect encoding, or decoding as well?

Oh, I didn't know about fefe's optimization. I'll check whether Blacksword's optimization can benefit from it.

IMHO this optimization affects both the encoding and decoding sides, although an optimized oggdec has not been tested or released. Several functions used for decoding (e.g., vorbis_synthesis_blockin, mapping0_inverse, mdct_backward, etc.) are optimized too.

Title: Ogg Vorbis optimized for speed
Post by: QuantumKnot on 2004-11-06 01:05:49

Whoa, it's really fast

On my P4 2.4 GHz:

ICL compiled oggenc from rarewares: 13.2x
SSE optimised oggenc: 20.5x

Title: Ogg Vorbis optimized for speed
Post by: Bonzi on 2004-11-06 07:02:04

Pretty nice speedup here too:
oggenc from rarewares 10.4x
SSE optimized 15.3x

Title: Ogg Vorbis optimized for speed
Post by: Music Mixer on 2004-11-06 07:10:42

Hello!

Well, I have an older machine (P3 700) and got a speedup from 4.4x to 9.3x realtime.

Have you guys tested the SSE2 optimized build at http://homepage3.nifty.com/blacksword/?

I wonder how big the speedup with this build is for P4 and AMD64 CPUs.

Title: Ogg Vorbis optimized for speed
Post by: Sebastian Mares on 2004-11-06 09:18:18

According to my tests...

ICL 8.1 Standard:

Code: [Select]

File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s

ICL 8.1 Pentium 4:

Code: [Select]

File length:  4m 58,0s
Elapsed time: 0m 17,0s
Rate:         17,5529
Average bitrate: 236,7 kb/s

SSE:

Code: [Select]

File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s

SSE2:

Code: [Select]

File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s

Tested with "Toto - Africa" on a Pentium 4 with 3.2 GHz, 512 MB RAM, running Windows XP Professional Service Pack 1.
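
For anyone who wants to compare such reports mechanically, here is a small Python sketch that parses oggenc-style summary lines (with either a comma or a dot as the decimal separator) and compares the rates. The helper names `parse_duration` and `parse_rate` are my own, and the regular expressions only assume the line layout shown above.

```python
import re

def parse_duration(line):
    """Convert an oggenc-style duration such as '4m 58,0s' to seconds."""
    match = re.search(r"(\d+)m\s+(\d+)[.,](\d+)s", line)
    if match is None:
        raise ValueError(f"no duration found in {line!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")

def parse_rate(line):
    """Extract the number from a 'Rate:' line, accepting ',' or '.' decimals."""
    match = re.search(r"Rate:\s+(\d+)[.,](\d+)", line)
    if match is None:
        raise ValueError(f"no rate found in {line!r}")
    whole, fraction = match.groups()
    return float(f"{whole}.{fraction}")

length = parse_duration("File length:  4m 58,0s")
elapsed = parse_duration("Elapsed time: 0m 18,0s")
sse_rate = parse_rate("Rate:         16,5778")
p4_rate = parse_rate("Rate:         17,5529")

print(f"rate recomputed from displayed times: {length / elapsed:.4f}")
print(f"Pentium 4 build vs. SSE build: {p4_rate / sse_rate:.3f}x")
```

The rate recomputed from the displayed times does not exactly match the printed Rate, presumably because the displayed times are rounded, so the printed Rate is the better number to compare.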

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2004-11-06 15:10:12

I got a good increase, too...

SSE2

Code: [Select]

        File length:  5m 23.0s
        Elapsed time: 0m 12.0s
        Rate:         26.9556
        Average bitrate: 175.3 kb/s

ICL 8.1

Code: [Select]

        File length:  5m 23.0s
        Elapsed time: 0m 19.0s
        Rate:         17.0246
        Average bitrate: 175.3 kb/s

But I can't seem to get it to work on FLAC files...

Code: [Select]

ERROR: Input file "01.flac" is not a supported format.

Am I missing something??

Thanks,

~esa

:edit: typo

Title: Ogg Vorbis optimized for speed
Post by: ilikedirtthe2nd on 2004-11-06 15:24:59

Quote

But I can't seem to get it to work on FLAC files...
Code: [Select]
ERROR: Input file "01.flac" is not a supported format.
Am I missing something??

Standard oggenc doesn't input lossless files directly. Only Oggenc2.3 from rarewares does.

Regards; ilikedirt

Title: Ogg Vorbis optimized for speed
Post by: dev0 on 2004-11-06 15:46:29

Quote

Quote
But I can't seem to get it to work on FLAC files...
Code: [Select]
ERROR: Input file "01.flac" is not a supported format.
Am I missing something??
Standard oggenc doesn't input lossless files directly. Only Oggenc2.3 from rarewares does.

Regards; ilikedirt

The standard oggenc supports FLAC input perfectly. It's a compile-time option AFAIK.

Title: Ogg Vorbis optimized for speed
Post by: john33 on 2004-11-06 15:52:14

Quote

The standard oggenc supports FLAC input perfectly. It's a compile-time option AFAIK.

It sure is.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2004-11-06 16:01:43

Quote

Standard oggenc doesn't input lossless files directly.

Quote

The standard oggenc supports FLAC input perfectly.

Well, I can't say that the issue is any clearer for me now...

Title: Ogg Vorbis optimized for speed
Post by: ilikedirtthe2nd on 2004-11-06 16:19:22

Quote

It's a compile-time option AFAIK.

That means, oggenc is able to input flac, if this is enabled when compiling. So: generally it is able to read flac, but this version is not.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2004-11-06 16:25:56

Quote

...oggenc is able to input flac, if this is enabled when compiling. So: generally it is able to read flac, but this version is not.

Ah... thank you for the clarification!

~esa

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-06 17:20:21

Quote

Have you guys tested the SSE2 optimized build at http://homepage3.nifty.com/blacksword/?

I wonder how big the speedup with this build is for P4 and AMD64 CPUs.

I could not find any speed difference between the SSE and SSE2 versions on my Pentium IV machine. Is there anybody who gets a speed increase? The author wants to know the effect in order to decide whether he should continue the SSE2 version or not.

Quote

According to my tests...

ICL 8.1 Standard:

Code: [Select]
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s
SSE:

Code: [Select]
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s

Are the SSE and SSE2 binaries your own builds? If so, don't forget to define the symbol __SSE__ to activate the optimization when compiling.

Title: Ogg Vorbis optimized for speed
Post by: Sebastian Mares on 2004-11-06 18:05:59

Quote

I got a good increase, too...

ICL 8.1

Code: [Select]

        File length:  5m 23.0s
        Elapsed time: 0m 12.0s
        Rate:         26.9556
        Average bitrate: 175.3 kb/s

SSE2

Code: [Select]

        File length:  5m 23.0s
        Elapsed time: 0m 19.0s
        Rate:         17.0246
        Average bitrate: 175.3 kb/s

But I can't seem to get it to work on FLAC files...

Code: [Select]

ERROR: Input file "01.flac" is not a supported format.

Am I missing something??

Thanks,

~esa

Huh? The ICL 8.1 compile is faster.

Quote

Quote
Have you guys tested the SSE2 optimized build at http://homepage3.nifty.com/blacksword/?

I wonder how big the speedup with this build is for P4 and AMD64 CPUs.

I could not find any speed difference between the SSE and SSE2 versions on my Pentium IV machine. Is there anybody who gets a speed increase? The author wants to know the effect in order to decide whether he should continue the SSE2 version or not.

Quote
According to my tests...

ICL 8.1 Standard:

Code: [Select]
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s
SSE:

Code: [Select]
File length:  4m 58,0s
Elapsed time: 0m 18,0s
Rate:         16,5778
Average bitrate: 236,7 kb/s
Are the SSE and SSE2 binaries your own builds? If so, don't forget to define the symbol __SSE__ to activate the optimization when compiling.

Nope, they're not my own compiles.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2004-11-06 18:13:56

Quote

Huh? The ICL 8.1 compile is faster.

Whoops! No, that's a typo... I'll edit immediately...

Title: Ogg Vorbis optimized for speed
Post by: kjoonlee on 2004-11-06 18:17:53

OK, here are some partial translations:

OggEnc_SSE_20041101ArcherB03.zip
Changes regarding/surrounding comments
Improved low-bitrate quality

Current problems are:

When encoding at low bitrates, treble quality suffers, and size bloat occurs.
Could hang immediately on running, depending on the environment
Bugs due to changes to comment handling?

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-06 19:31:09

Quote

OK, here are some partial translations:

OggEnc_SSE_20041101ArcherB03.zip
Changes regarding/surrounding comments
Improved low-bitrate quality

Current problems are:
When encoding at low bitrates, treble quality suffers, and size bloat occurs.

Could hang immediately on running, depending on the environment

Bugs due to changes to comment handling?


Thanks for the translation. I think all of the current problems listed above are solved in Archer B03. These problems existed in Archer B02.

Title: Ogg Vorbis optimized for speed
Post by: QuantumKnot on 2004-11-07 01:55:39

IIRC, SSE2 is optimised for double-precision floating point, so maybe there isn't that much difference from SSE, since libvorbis doesn't use many doubles?

Title: Ogg Vorbis optimized for speed
Post by: Benjamin Lebsanft on 2004-11-07 08:04:11

Tested on my AMD64 3400+, 1GB RAM

ICL 8.1:

File length: 4m 27.0s
Elapsed time: 0m 14.0s
Rate: 19.1190
Average bitrate: 132.9 kb/s

ICL 8.1 (John33):

File length: 4m 27.0s
Elapsed time: 0m 11.0s
Rate: 24.3333
Average bitrate: 132.9 kb/s

SSE/SSE2 Optimized:

File length: 4m 27.0s
Elapsed time: 0m 08.0s
Rate: 33.4583
Average bitrate: 132.9 kb/s

SSE2 optimization doesn't change encoding speed

Title: Ogg Vorbis optimized for speed
Post by: john33 on 2004-11-07 09:41:41

As QK says, there's very little use of double precision in libvorbis, so the use of SSE2 optimisation is virtually a waste of effort.
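
As a back-of-the-envelope illustration of that point: packed single-precision operations arrived with SSE and packed double-precision ones with SSE2, both working on 128-bit registers. The tiny Python sketch below (the names are mine) just counts how many values fit in one register.

```python
REGISTER_BITS = 128  # width of one XMM register used by SSE and SSE2

def lanes(element_bits):
    """How many elements of the given width fit in one 128-bit register."""
    return REGISTER_BITS // element_bits

float_lanes = lanes(32)   # single precision, handled by SSE
double_lanes = lanes(64)  # double precision, added by SSE2

print(f"floats per register:  {float_lanes}")
print(f"doubles per register: {double_lanes}")
```

Code that works almost entirely on 32-bit floats already gets its four-wide operations from SSE, so there is little left for the double-precision side of SSE2 to speed up.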

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-07 09:54:01

Quote

IIRC, SSE2 is optimised for double-precision floating point, so maybe there isn't that much difference from SSE, since libvorbis doesn't use many doubles?

Quote

As QK says, there's very little use of double precision in libvorbis, so the use of SSE2 optimisation is virtually a waste of effort.

Actually, he expects SSE2 to improve the quality (or speed) of float-to-integer conversion and vice versa but, at the same time, doubts the effect. I'll tell him about these results.

Title: Ogg Vorbis optimized for speed
Post by: Sebastian Mares on 2004-11-07 10:51:38

Quote

As QK says, there's very little use of double precision in libvorbis, so the use of SSE2 optimisation is virtually a waste of effort.

That explains why my SSE and SSE2 tests achieve the same result.

Title: Ogg Vorbis optimized for speed
Post by: Poromenos on 2004-11-08 10:31:12

OK, for the newb with no ability for critical thinking (me), would you recommend switching to this version from "OggEnc v2.3 (libvorbis 1.1.0)"? I'd like to have the extra speed, but if it introduces bugs I can wait.
I thought of encoding with both and then comparing the files, but the sizes differed by a few bytes and the files were not identical (there were 80ish different bytes every Y bytes). What's that about?

Title: Ogg Vorbis optimized for speed
Post by: QuantumKnot on 2004-11-08 10:56:57

Quote

OK, for the newb with no ability for critical thinking (me), would you recommend switching to this version from "OggEnc v2.3 (libvorbis 1.1.0)"? I'd like to have the extra speed, but if it introduces bugs I can wait.
I thought of encoding with both and then comparing the files, but the sizes differed by a few bytes and the files were not identical (there were 80ish different bytes every Y bytes). What's that about?

IMO, it's best to stick to the normal compile of oggenc. More testing is required.

Title: Ogg Vorbis optimized for speed
Post by: Sebastian Mares on 2004-11-08 12:26:00

I see no speed gain when compared to the Pentium 4 builds from RareWares.

Title: Ogg Vorbis optimized for speed
Post by: Jens Rex on 2004-11-08 13:20:06

I'd be more interested in decoder speedups - especially for portable devices. Vorbis playback in my Tungsten T3 eats battery like crazy.

Title: Ogg Vorbis optimized for speed
Post by: Gecko on 2004-11-11 21:49:45

Here's a late reply. I tested on two titles and the results look great. The sse2 version offers zero speed increase; the numbers are exactly the same. System: Athlon 64 3000. Turns out I've been previously using Ogg Vorbis 1.1 rc1 from rarewares. Oh well. Quality level is 5.

Die fantastischen Vier - Mein Schwert [hip hop-ish, CD rip]
1.1rc1 - 14,9936
sse/sse2 - 22,9893

G&M Project - Sunday Afternoon (Nu Nrg Mix) [trance, wav previously decoded from mpc q7]
1.1rc1 - 15,7454
sse/sse2 - 27,9919

I was evaluating whether I should use ogg or mp3 on my soon-to-be-shipped iRiver, so I will do a lot of transcoding. I don't know if the fact that I am using an already lossy source accounts for the speed increase.

These speeds even surpass mpc encoding (usually 22-23x)! Lame 3.96.1 clocks in at about 8x for aps and 17x for apfs.

BTW: version "Archer B04" is out, which is claimed to be even a bit faster.
edit2: well, not for me. Speeds are identical to B03.

Title: Ogg Vorbis optimized for speed
Post by: [solid] on 2004-11-12 00:05:16

how should i apply the patch? i get all hunks failed...
using linux, official libvorbis-1.1.0 and the same happens for both B03 and B04

Title: Ogg Vorbis optimized for speed
Post by: ak on 2004-11-12 09:36:07

I remember trying to apply it; there were a bunch of whitespace diffs, so try 'patch -l ...'

Oops, actually, it was the case with current svn.

For 1.1.0, running dos2unix on the patch should do.
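
If dos2unix isn't at hand, the conversion it would perform here is simple enough to sketch in Python: replace the Windows CRLF line endings in the patch with plain LF so that the context lines match the Unix source files. The function name `crlf_to_lf` and the sample patch bytes are made up for illustration; for a real patch you would read and write the file in binary mode.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")

# A made-up fragment standing in for a patch saved with DOS line endings.
dos_patch = b"--- lib/psy.c\r\n+++ lib/psy.c\r\n@@ -1,2 +1,2 @@\r\n"
unix_patch = crlf_to_lf(dos_patch)

print(unix_patch.decode("ascii"), end="")
```

With CRLF endings, every context line carries a trailing carriage return that the LF-only sources lack, which would explain hunks failing to match.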

Title: Ogg Vorbis optimized for speed
Post by: [solid] on 2004-11-12 09:58:10

Quote

For 1.1.0, running dos2unix on the patch should do.

oh crap it was that simple... haven't thought of that, thanks. compiling right now

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-12 20:44:56

Quote

I see no speed gain when compared to the Pentium 4 builds from RareWares.

Weird...

Quote

BTW: version "Archer B04" is out, which is claimed to be even a bit faster.
edit2: well, not for me. Speeds are identical to B03.

I got a slight speed increase with B04 (23.73x) over B03 (23.37x).

Title: Ogg Vorbis optimized for speed
Post by: Benjamin Lebsanft on 2004-11-12 21:15:20

on the first run i got 38.2381x, on the second run 33.4583 which is the same as B03

Title: Ogg Vorbis optimized for speed
Post by: jg123 on 2004-11-15 16:53:00

It looks like the resample option is broken? I get a crash using the resample option on B04. I'm trying to resample a 16 kHz stereo wav file to a -q0 44100 ogg.

Title: Ogg Vorbis optimized for speed
Post by: kuniklo on 2004-11-15 17:15:07

Does anyone have the sse optimizations in the form of a patch to 1.1?

I'd like to try building a linux binary of this.

Title: Ogg Vorbis optimized for speed
Post by: Bogalvator on 2004-11-15 18:02:27

The patch is the first file on the project web page:
http://homepage3.nifty.com/blacksword/

This is great stuff by the way, I hope that development / testing continues.

Title: Ogg Vorbis optimized for speed
Post by: maacruz on 2004-11-16 17:21:20

Quote

The patch is the first file on the project web page:
http://homepage3.nifty.com/blacksword/

This is great stuff by the way, I hope that development / testing continues.

I have tried it and it doesn't work for me.
It does compile after some editing, but both encoding and playback are badly broken.

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-17 08:18:39

Quote

It does compile after some editing, but both encoding and playback are badly broken.

Could you give a better description of "badly broken"? Actually I didn't compile B04, but my own compile (ICL8.1) of B03 worked fine.

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-17 15:30:38

Quote

It looks like the resample option is broken? I get a crash using the resample option on B04. I'm trying to resample a 16 kHz stereo wav file to a -q0 44100 ogg.

Archer Beta05 is released mainly to solve this problem.
- Use of libogg 1.1.2 (upgraded from 1.1.1)
- Fixed a crash (16-byte alignment exception) in the resample/downmix routines in audio.c (for oggenc and oggdropXPd)
- Updated build script for automake/autoconf
- Activated FLAC reading support in oggenc, using FLAC 1.1.1 (ICL compile)
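
Some background on the alignment item: aligned SSE loads and stores require the memory address to be a multiple of 16 bytes and raise an exception otherwise. The Python sketch below only illustrates the address arithmetic involved; the helper names are hypothetical and it says nothing about how audio.c was actually fixed.

```python
SSE_ALIGNMENT = 16  # bytes; aligned SSE loads/stores need addresses divisible by this

def is_aligned(address, alignment=SSE_ALIGNMENT):
    """True if the address is a multiple of the alignment."""
    return address % alignment == 0

def align_up(address, alignment=SSE_ALIGNMENT):
    """Round the address up to the next multiple of a power-of-two alignment."""
    return (address + alignment - 1) & ~(alignment - 1)

# An example buffer address that happens to be only 8-byte aligned.
raw_address = 0x1008
print(is_aligned(raw_address))       # this address would fault on an aligned load
print(hex(align_up(raw_address)))    # next usable 16-byte boundary
```

A common remedy in C code is to over-allocate a buffer slightly and start using it at the rounded-up address, or to fall back to unaligned loads where alignment cannot be guaranteed.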

Quote

Quote
It does compile after some editing, but both encoding and playback are badly broken.

Could you give a better description of "badly broken"? Actually I didn't compile B04, but my own compile (ICL8.1) of B03 worked fine.

One thing I forgot to mention: it is strongly recommended to use gcc 3.3. The patch does not work with gcc 3.4 and other versions.

Title: Ogg Vorbis optimized for speed
Post by: Benjamin Lebsanft on 2004-11-17 17:05:12

Could anybody please provide a linux binary? As my box is using gcc 3.4.3, I am not able to compile it on my own.
Thanks

Title: Ogg Vorbis optimized for speed
Post by: maacruz on 2004-11-17 17:59:19

Quote

Quote
It looks like the resample option is broken? I get a crash using the resample option on B04. I'm trying to resample a 16 kHz stereo wav file to a -q0 44100 ogg.

Archer Beta05 is released mainly to solve this problem.
- Use of libogg 1.1.2 (upgraded from 1.1.1)
- Fixed a crash (16-byte alignment exception) in the resample/downmix routines in audio.c (for oggenc and oggdropXPd)
- Updated build script for automake/autoconf
- Activated FLAC reading support in oggenc, using FLAC 1.1.1 (ICL compile)

Quote
Quote
It does compile after some editing, but both encoding and playback are badly broken.

Could you give a better description of "badly broken"? Actually I didn't compile B04, but my own compile (ICL8.1) of B03 worked fine.

One thing I forgot to mention: it is strongly recommended to use gcc 3.3. The patch does not work with gcc 3.4 and other versions.

Hi nyaochi

I have just tested B05 and it applied and compiled cleanly, but it has the same problem as B04.
It encodes, but the result is a big file which sounds like noise (using normal oggenc castanets2.ogg is 97247 bytes, using oggenc-sse it is 221705 bytes).
Playing normal ogg files doesn't work either: it sounds like noise too, and segfaults when reaching the end of the file. Vorbisgain also segfaults when reaching the end of a file.

I'm on a SuSE 9.1 linux system, gcc 3.3.3, glibc 2.3.3, libogg 1.1.2, Athlon XP 2600 (Barton core).

This is the gdb output
(gdb) run castanets2.ogg
Starting program: /usr/bin/ogg123 castanets2.ogg
Reading symbols from /usr/lib/libvorbisfile.so.3...(no debugging symbols found)...done.
...
Dispositivo de sonido: Advanced Linux Sound Architecture (ALSA) output

[New Thread 1087495088 (LWP 27008)]
Reproduciendo: castanets2.ogg
Ogg Vorbis stream: 2 channel, 44100 Hz
Tiempo: 00:06,63 [00:00,00] de 00:06,63 ( 0,0 kbps) Búfer de Salida 0,0% (EOS (Fin de flujo))
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1077510816 (LWP 27005)]
0x402e51bd in _int_free () from /lib/tls/libc.so.6
(gdb) bt
#0 0x402e51bd in _int_free () from /lib/tls/libc.so.6
#1 0x402e55fb in free () from /lib/tls/libc.so.6
#2 0x400552ed in vorbis_comment_clear () from /usr/lib/libvorbis.so.0
#3 0x00000000 in ?? ()
#4 0x400386f0 in ?? () from /usr/lib/libvorbisfile.so.3
#5 0x08070590 in ?? ()
#6 0x00000001 in ?? ()
#7 0x40036a96 in ov_clear () from /usr/lib/libvorbisfile.so.3
...

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2004-11-17 19:53:08

Quote

Hi nyaochi

I have just tested B05 and it applied and compiled cleanly, but it has the same problem as B04.
It encodes, but the result is a big file which sounds like noise (using normal oggenc castanets2.ogg is 97247 bytes, using oggenc-sse it is 221705 bytes).
Playing normal ogg files doesn't work either: it sounds like noise too, and segfaults when reaching the end of the file. Vorbisgain also segfaults when reaching the end of a file.

I'm on a SuSE 9.1 linux system, gcc 3.3.3, glibc 2.3.3, libogg 1.1.2, Athlon XP 2600 (Barton core).

Thank you for the detailed information. I've just got an email from the author. He found a bug around the ov_read function that probably causes your crash. He also told me that he doesn't use the Makefile generated by the configure script; instead he uses the Makefile in Win32_MinGW, which is based on a project converted from MSVC, and compiles it with gcc version 3.3.1 (mingw special 20030804-1).

I suppose the linux support in B05 is not adequate at present, so we have to find out what causes the bitrate-bloat/noise problem. Although I have Fedora Core 1 with gcc 3.3.1, unfortunately I'm not familiar with linux programming and have little time to debug it now. The author is aware of this problem, but can anyone solve it?

Title: Ogg Vorbis optimized for speed
Post by: Sebastian Mares on 2004-11-17 20:50:43

Quote

Quote
I see no speed gain when compared to the Pentium 4 builds from RareWares.

Weird...

In fact, the SSE/SSE2 optimized versions are about 1x slower than the Pentium 4 build, as seen in my earlier test results in this thread (16.58x versus 17.55x).

Title: Ogg Vorbis optimized for speed
Post by: vearutop on 2004-12-09 04:27:16

does anyone have binary aotuvb3 oggenc w/ sse patch applied?

Title: Ogg Vorbis optimized for speed
Post by: skamp on 2004-12-11 05:48:43

Quote

does anyone have binary aotuvb3 oggenc w/ sse patch applied?

I've uploaded linux binaries in this thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=29974).

Title: Ogg Vorbis optimized for speed
Post by: vearutop on 2004-12-15 04:15:23

thank you

do you have one for windows?

Title: Ogg Vorbis optimized for speed
Post by: QuantumKnot on 2004-12-15 04:22:18

Quote

thank you

do you have one for windows?

Have a look at this page

http://homepage3.nifty.com/blacksword/

Title: Ogg Vorbis optimized for speed
Post by: vearutop on 2004-12-15 04:32:57

thnx

something wrong was with my eyes... i visited that page earlier, but haven't seen subj

Title: Ogg Vorbis optimized for speed
Post by: vearutop on 2004-12-15 04:45:22

strange thing...
i compressed a track with standard aotuvb3 and with the sse optimized one. then i decompressed them to waves. the waves were not the same...

maybe this is not a very fair optimisation?

Title: Ogg Vorbis optimized for speed
Post by: rjamorim on 2004-12-15 12:51:00

Quote

i compressed a track with standard aotuvb3 and with the sse optimized one. then i decompressed them to waves. the waves were not the same...

maybe this is not a very fair optimisation?

Even when you use different compilers for the same source code you can get different vorbis streams. So, it's to be expected that assembly optimizations will introduce differences.

It's up to the users to test and see if these differences are noticeable.
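
If you want a number rather than just "the files differ", one option is to measure how far apart the decoded samples are instead of checking for byte equality. Below is a minimal Python sketch for 16-bit PCM WAV data; the helper names are mine, and the two tiny in-memory WAVs only stand in for real decoded files. It is no substitute for a listening test.

```python
import io
import struct
import wave

def read_samples(wav_file):
    """Return all 16-bit PCM samples from a WAV path or file-like object."""
    with wave.open(wav_file, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        frames = w.readframes(w.getnframes())
    return struct.unpack("<%dh" % (len(frames) // 2), frames)

def max_abs_difference(a, b):
    """Largest sample-wise difference over the common length of a and b."""
    return max((abs(x - y) for x, y in zip(a, b)), default=0)

def make_wav(samples):
    """Build a mono 44.1 kHz 16-bit WAV in memory (stand-in for a decoded file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    buf.seek(0)
    return buf

reference = read_samples(make_wav([0, 100, -100, 32767]))
optimized = read_samples(make_wav([0, 101, -98, 32767]))
print("max sample difference:", max_abs_difference(reference, optimized))
```

With real files you would pass the two decoded WAV paths to `read_samples`. A small maximum difference suggests rounding-level deviations, while large values (as in the noisy linux builds reported earlier) point to an actual bug.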

Title: Ogg Vorbis optimized for speed
Post by: bluesky on 2005-02-06 02:05:48

I used the build from this url (http://homepage3.nifty.com/blacksword/OggEnc_SSE_20041213ArcherB10.zip).

Here are my results with an Athlon XP 3200+:

Code: [Select]

ArcherB10 oggenc:

        File length:  5m 05.0s
        Elapsed time: 0m 11.0s
        Rate:         27.7879
        Average bitrate: 161.5 kb/s

rarewares icl oggenc:
        File length:  5m 05.0s
        Elapsed time: 0m 20.0s
        Rate:         15.2833
        Average bitrate: 152.5 kb/s

I didn't notice other people getting different bitrates out of their tests. I did a simple:

Code: [Select]

oggenc -q5 testl.wav

Ideas?

Title: Ogg Vorbis optimized for speed
Post by: QuantumKnot on 2005-02-06 02:11:55

Seems like a significant difference. Which specific ICL oggenc from rarewares did you use? There are a couple there.

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2005-02-06 10:02:14

Quote

Ideas?

The optimized binary (Archer Beta10) is based on aoTuV b3. Did you use aoTuV b3 binary at rarewares for comparison?

Title: Ogg Vorbis optimized for speed
Post by: bluesky on 2005-02-06 18:41:15

My mistake... correct data:

Code: [Select]

Done encoding file "testl.ogg"

        File length:  5m 05.0s
        Elapsed time: 0m 12.0s
        Rate:         25.4722
        Average bitrate: 161.5 kb/s


Done encoding file "testl.ogg"

        File length:  5m 05.0s
        Elapsed time: 0m 20.0s
        Rate:         15.2833
        Average bitrate: 161.5 kb/s

Title: Ogg Vorbis optimized for speed
Post by: Toe on 2005-02-06 21:00:52

Has any testing been done on these builds with regard to output quality? (i.e. no noticeable differences vs regular 2.1)

Title: Ogg Vorbis optimized for speed
Post by: DarkAvenger on 2005-02-21 11:46:37

BTW, GCC 4.0 alpha snapshot from yesterday compiles the SSE version fine.

BTW, encoding time for a test file went down from 5.5 to 3.0 seconds on my Athlon XP...
Hmm, but on the other hand a non-SSE version compiled with gcc-3.4 only needs 4.3 seconds... so GCC 4.0 still needs a lot of work before it is ready for prime time.

OK, it seems using more conservative flags is better for gcc 4.0: Using

Code: [Select]

CFLAGS="-O2 -fweb -frename-registers -mno-ieee-fp -D_REENTRANT -fsigned-char -march=athlon-xp -mfpmath=sse -fomit-frame-pointer"

gcc4.0 is about as fast as gcc 3.4.3 w/o SSE.

Title: Ogg Vorbis optimized for speed
Post by: Emanuel on 2005-02-21 13:01:28

Do I dare asking John33 for an english OggdropXPd SSE optimized version when the time is ready? Would be like a dream for an Iriver H140 owner

Title: Ogg Vorbis optimized for speed
Post by: rjamorim on 2005-02-21 13:25:34

Quote

Do I dare asking John33 for an english OggdropXPd SSE optimized version when the time is ready? Would be like a dream for an Iriver H140 owner

I wonder if it would be a good idea, considering we still haven't seen any listening tests comparing the optimized version with the official one.

Title: Ogg Vorbis optimized for speed
Post by: john33 on 2005-02-21 13:35:05

Quote

Do I dare asking John33 for an english OggdropXPd SSE optimized version when the time is ready? Would be like a dream for an Iriver H140 owner

I'm not averse to the idea. I simply haven't managed to get a clean compile to work with yet!! I keep taking another look, and I'll continue to do so, but until then, it's a 'no can do'!

Title: Ogg Vorbis optimized for speed
Post by: Emanuel on 2005-02-21 14:02:44

Quote

I wonder if it would be a good idea, considering we still haven't seen any listening tests comparing the optimized version with the official one.

You're right. Still, after the short ABX tests I did yesterday, I am very satisfied with the quality of the SSE version vs aoTuV b3 and official 1.1, so I would definitely use it. The speed gain, weighed against a possible difference between the versions (which I couldn't ABX), is reason enough. EDIT: for me

Quote

I'm not averse to the idea. I simply haven't managed to get a clean compile to work with yet!! I keep taking another look, and I'll continue to do so, but until then, it's a 'no can do'!

Fully understandable, John. You're already doing a great job.

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-02-23 19:29:57

Quote

I'd be more interested in decoder speedups - especially for portable devices. Vorbis playback in my Tungsten T3 eats battery like crazy.

Quote

Quote
fefe (http://www.fefe.de/diffs/) was working on an (apparently buggy) SSE optimization of libvorbis too.
Do the optimizations only affect encoding, or decoding as well?

Oh, I didn't know about fefe's optimization. I'll check whether it can benefit Blacksword's optimization.

IMHO this optimization affects both the encoding and decoding sides, although an optimized oggdec has not been tested or released. Several functions used for decoding (e.g., vorbis_synthesis_blockin, mapping0_inverse, mdct_backward, etc.) are optimized too.

Hello,
I'm a newbie here, so please be patient.

OK, my question is:
Can using these compiles (actually OggEnc_SSE_20041213ArcherB10) instead of the "normal" one (I'm using aoTuV b3, the OggEnc Win32 version from Aoyumi's pages) speed up decoding of Vorbis on a portable player (an iAudio M3 in my case) and lower its energy consumption?
An increase in encoding speed isn't important to me, but if it happens as well...

One off-topic subquestion: MP3 files play almost gaplessly compared to Vorbis ones on my player (even when encoded with this SSE compile). Is this because of slow decoding (with the Tremor decoder?), a hardware issue (slow processor / Vorbis requirements?), or a firmware issue (...)?

So maybe I'm totally off topic, maybe not? THX for your reactions.

Title: Ogg Vorbis optimized for speed
Post by: miscellanea on 2005-03-12 10:41:24

Quote

OK, my question is:
Can using these compiles (actually OggEnc_SSE_20041213ArcherB10) instead of the "normal" one (I'm using aoTuV b3, the OggEnc Win32 version from Aoyumi's pages) speed up decoding of Vorbis on a portable player (an iAudio M3 in my case) and lower its energy consumption?
An increase in encoding speed isn't important to me, but if it happens as well...

Encoding and decoding are different processes (different engines), so I think decoding speed is a matter of the player.

Quote

One off-topic subquestion: MP3 files play almost gaplessly compared to Vorbis ones on my player (even when encoded with this SSE compile). Is this because of slow decoding (with the Tremor decoder?), a hardware issue (slow processor / Vorbis requirements?), or a firmware issue (...)?

Hmm, Tremor is a Vorbis decoder, so it may be because of the MP3 decoder.
MP3 by itself can't play gaplessly (some players cut the gap during playback, so it sounds like gapless playback).

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-12 12:13:48

Archer Release-Candidate 1 is out.

Title: Ogg Vorbis optimized for speed
Post by: miscellanea on 2005-03-12 12:17:57

Thanks. Now is the time to test again.

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-03-12 19:29:13

Quote

Archer Release-Candidate 1 is out.

By accident I found a very strange (rare) bug in Archer RC1:
With one sample (Laurie Anderson / Big Science / song no. 08 - Let X=X) the encoder fails, but only when -q4 is used (other settings, e.g. -q4,1, -q3, -q5, cause no problem).
The length of the song is 3:54; when encoding fails, it stops at 3:27 (the whole song up to that point is encoded and tags are properly added). It doesn't matter whether the source is WAV or FLAC. It happens with EAC as well as with Foobar:

Code: [Select]

INFO (foo_clienc) : CLI encoder: C:\Program Files\Eac\Encoders\Vorbis\OggEnc_SSE_20050312ArcherRC1\oggenc.exe
INFO (foo_clienc) : Destination file: file://C:\Documents and Settings\Martin Radimecky\My Documents\My Music\OGG\Rock\Anderson, Laurie\Big Science\08 - Let X=X.ogg
INFO (foo_clienc) : Source file: file://C:\Documents and Settings\Martin Radimecky\My Documents\My Music\FLAC\Rock\Anderson, Laurie\Big Science\08 - Let X=X.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
ERROR (foo_clienc) : Writing to encoder failed
INFO (foo_clienc) : Encoding took 10828 milliseconds, speed 19.24x
INFO (CORE) : attempting to edit file info : file://C:\Documents and Settings\Martin Radimecky\My Documents\My Music\OGG\Rock\Anderson, Laurie\Big Science\08 - Let X=X.ogg
INFO (CORE) : file info update successful on : file://C:\Documents and Settings\Martin Radimecky\My Documents\My Music\OGG\Rock\Anderson, Laurie\Big Science\08 - Let X=X.ogg
ERROR (foo_diskwriter) : Conversion failed.

Encoding does not fail when other compiles (e.g. oggenc2.41-aoTuVb3P3 from RW or Aoyumi's reference compile) are used.

The strangest thing is that only the full-length WAV triggers the failure. I tried to isolate just the part of the sample that causes the failure so I could upload it here, but it encodes without problems. Even when a small part is cut off from the very beginning of the WAV, it encodes fine. But when the whole WAV is re-saved, the problem remains.

(The whole sample in FLAC is 20 MB, so I can't upload it here.)

Edit: oops, I forgot this: OggEnc_SSE_20041213ArcherB10 performs without problems on this sample !?!

Edit 2: Anyway, encoding speed is amazing

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-03-12 21:31:39

I also have a WAV which fails to encode with RC1 (it doesn't print any error message, just creates a dummy 0-byte OGG file), but it encodes just fine with previous versions.

Title: Ogg Vorbis optimized for speed
Post by: Zoom on 2005-03-12 22:22:29

I can confirm the bug here too:

Code: [Select]

Opening with wav module: WAV file reader
Encoding "Wilco - Spiders (Kidsmoke).wav" to
         "Wilco - Spiders (Kidsmoke).ogg"
at quality 4.00
        [ 52.3%] [ 0m08s remaining] \

The encoder cuts out at exactly that spot every time. The file is fine and playable even, however the encoder stops at that point. I would assume it is a sample problem, as the part of the song it fails in would probably pose a problem to the encoder. I'm not sure though. I did test the encoder on about 10 other files of varying length and genre. All of the other files encoded without fail.

I agree though, the speedup of this encoder over the standard ICL aoTuV encode is amazing on my A64 3500+:

Code: [Select]

Opening with wav module: WAV file reader
Encoding "Death Cab for Cutie - Stability.wav" to
         "Death Cab for Cutie - Stability.ogg"
at quality 4.00
        [100.0%] [ 0m00s remaining] /

Done encoding file "Death Cab for Cutie - Stability.ogg"

        File length:  12m 21.0s
        Elapsed time: 0m 20.0s
        Rate:         37.0800
        Average bitrate: 116.6 kb/s

20 seconds for a twelve-and-a-half-minute song, nice!

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-03-12 22:32:14

Quote

20 seconds for a twelve-and-a-half-minute song, nice!

Well, but it's not usable (especially for batch encoding) with such unpredictable results as posted above.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-18 13:41:51

Archer RC2 is out.

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-03-18 14:32:58

Quote

Archer RC2 is out.

Regrettably, exactly the same problem as in RC1 (with the same sample) shows up here.
(The RC1 bug report can be found here (http://www.hydrogenaudio.org/forums/index.php?showtopic=29161&view=findpost&p=281593))

Title: Ogg Vorbis optimized for speed
Post by: rjamorim on 2005-03-18 15:36:32

Quote

Regrettably, exactly the same problem (with the same sample) as RC1 detected here
(RC1 bug report can be found here (http://www.hydrogenaudio.org/forums/index.php?showtopic=29161&view=findpost&p=281593))

What's the point of posting the report at a forum the developer probably doesn't read?

If I were you I would send him an e-mail, and hope that he speaks at least some english.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-18 15:42:56

Quote

If I were you I would send him an e-mail, and hope that he speaks at least some english.

I just sent him an email referencing this thread and that specific post. It would probably be helpful if someone could supply a test file.

Edit: I finally found a file that I have that crashes the encoder. It's track 14 off the Pain of Salvation - Be album. Let's see.. crash at around 65,3% completed....

Edit 2: Very tricky to pin down. This track only crashes at -q 3 of the different qualities I tried.

Edit 3: Okay, running under debugger:

"oggenc_archer.exe The instruction at 0x0042D568 referenced memory at 0xBF4EB730. The memory could not be read."

Code: [Select]

.text:0042D512 cvtss2si ecx, [eax+edi*4+0Ch]
.text:0042D518 cvtss2si ebx, [eax+edi*4+8]
.text:0042D51E cvtss2si esi, [eax+edi*4+4]
.text:0042D524 cvtss2si eax, [eax+edi*4]
.text:0042D529 mov     edi, [esp+50h+var_20]
.text:0042D52D add     ecx, edi
.text:0042D52F add     ebx, edi
.text:0042D531 add     esi, edi
.text:0042D533 add     edi, eax
.text:0042D535 mov     eax, [esp+50h+var_18]
.text:0042D539 imul    eax, [edx+8]
.text:0042D53D mov     [esp+50h+var_20], edi
.text:0042D541 mov     edi, [esp+50h+var_14]
.text:0042D545 add     eax, [edi+ecx*4]
.text:0042D548 imul    eax, [edx+8]
.text:0042D54C add     eax, [edi+ebx*4]
.text:0042D54F imul    eax, [edx+8]
.text:0042D553 add     eax, [edi+esi*4]
.text:0042D556 imul    eax, [edx+8]
.text:0042D55A mov     edx, [esp+50h+var_20]
.text:0042D55E add     eax, [edi+edx*4]
.text:0042D561 mov     edx, [esp+50h+var_10]
.text:0042D565 mov     edx, [edx+8]
.text:0042D568 cmp     dword ptr [edx+eax*4], 0          <--------------
.text:0042D56C jle     loc_42D6FE
.text:0042D572 mov     edx, [ebp+arg_C]
.text:0042D575 mov     ecx, [edx+10h]

It's a function that starts at 0x42D2FC and takes four parameters. It seems to be called explicitly from only one place, but its address is taken twice, so it could be called through a function pointer too. I'm not familiar enough with the code to identify it any further, and I don't think I even have the tools to build the source.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-18 19:29:38

Alright, the author got back to me. I'm going to assume he won't mind me posting his reply here:

"This BBS is seen.
This problem occurs in local_book_besterror_dim8.
I received the data which a problem occurs by RC1.
The data can be normally encoded by RC2.

The samples of the data which a problem occurs are insufficient. "

I'm going to try and figure out whether he's got the bandwidth to receive "samples" from us or not, though it doesn't seem _too_ hard to reproduce if you've got one or two whole albums to encode at different quality settings.

Edit: Some potentially good news:

"Probably, it will be unnecessary.
I found the clear problem in local_book_besterror_dimX. "

Looking forward to RC3.

Edit 2: Got to run a pre-release of RC3 where the bug seems to have been fixed --- at least my only test-case is working now. Furthermore, this does not seem to have impaired encoding speed in the least. (I do however detect a slight change in file size)

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-03-19 00:01:36

Quote

What's the point of posting the report at a forum the developer probably doesn't read?

My point was to warn other people against using the compile, because I think finding this bug was pure luck (with RC1 I had encoded 8 albums without any problems before), and to show that RC2 does not solve the problem (it was easy for me to check this because I know the "wrong" sample). Maybe I'm naive, but it's so tempting to use a compile like this...

Quote

If I were you I would send him an e-mail, and hope that he speaks at least some english.

Of course you are right. I wasn't quick enough (as posted above)

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-03-19 00:26:32

Quote

Edit 2: Got to run a pre-release of RC3 where the bug seems to have been fixed --- at least my only test-case is working now. Furthermore, this does not seem to have impaired encoding speed in the least. (I do however detect a slight change in file size)

Here the problem seems to be fixed too (with Oggenc_rc3_pre01.exe on the same sample as before). I've sent an email to blacksword about this result too.
I'm looking forward to RC3

Title: Ogg Vorbis optimized for speed
Post by: DreamTactix291 on 2005-03-19 05:53:18

Archer RC3 is out.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-19 10:30:07

F:\wav\archer>oggenc_archer -v
OggEnc v1.1 (Archer RC1 based on AoTuV Beta03)

Still displaying the wrong version, but at least the files are tagged correctly. Odd that these are different strings.

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-03-19 11:51:47

Well, bad news I think - my WAV still doesn't encode with RC3 (it outputs a 0-byte dummy OGG). What I noticed is that the sampling rate of that WAV is 32KHz; I tried another 32KHz file and it didn't encode either, so it looks like a problem with this particular sampling rate. I also tried some 22KHz, 44KHz, and 48KHz files and they encode fine.

EDIT: I checked with -q-2 only.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-19 12:52:22

I can confirm that 32KHz files don't work at all at negative quality settings:

Code: [Select]

.text:0041A53D mov     eax, [esp+60h+var_24]
.text:0041A541 mov     edi, [eax+esi*4]              ; <-------- crash at negative quality, 32KHz
.text:0041A544 movss   xmm1, dword ptr [ebx+edi*4]
.text:0041A549 mov     edx, [ebx+edi*4]
.text:0041A54C movss   xmm0, xmm1
.text:0041A550 mulss   xmm0, xmm0

Looks like the base register isn't set up correctly (eax).

Hopefully it's just a problem with the loader.

edit:

F:\wav\archer>oggenc_archer --resample 32000 -q -1 Posbe14.wav
Opening with wav module: WAV file reader
Resampling input from 44100 Hz to 32000 Hz
Encoding "Posbe14.wav" to "Posbe14.ogg" at quality -1,00
<crash>

No such luck.

Samplerates < 26000 and > 39999 == "works" (encoder doesn't crash).

Edit 2:

"Root cause has become clear.
I was not testing 32KHz wav file.

In this case, (loop count mod 16) was not zero in
_vp_noise_normalize.

This question is corrected by RC4." -- Mebius1
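
The failure mode described in that reply, a vectorized loop that assumes its trip count is a multiple of the block size, can be illustrated with a small sketch. This is not the actual _vp_noise_normalize code; `sum_squares_blocked`, `BLOCK` and the sum-of-squares workload are hypothetical and chosen only for illustration. The point is that a loop processed in blocks of 16 needs a scalar tail for the leftover (n mod 16) elements.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 16  /* hypothetical block width, matching the "mod 16" in the quote */

/* Sum of squares over n floats, processed in blocks of BLOCK elements.
 * The block loop stands in for a hand-vectorized SSE loop; the tail loop
 * handles the n % BLOCK elements that a block-only loop would either skip
 * or read past the end of the buffer to cover. */
float sum_squares_blocked(const float *x, size_t n)
{
    assert(x != NULL || n == 0);

    float acc = 0.0f;
    size_t full = n - (n % BLOCK);   /* largest multiple of BLOCK <= n */
    size_t i;

    for (i = 0; i < full; i += BLOCK) {
        for (size_t j = 0; j < BLOCK; j++)   /* one "vector" block */
            acc += x[i + j] * x[i + j];
    }
    for (; i < n; i++)                       /* scalar tail */
        acc += x[i] * x[i];

    return acc;
}
```

If only 44.1 kHz and 48 kHz inputs are tested and their loop counts happen to be multiples of 16, a missing tail like this would go unnoticed until another sampling rate produces a different count.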

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-03-19 15:14:50

... and RC4 is out (http://homepage3.nifty.com/blacksword/).

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-03-19 19:17:15

Seems to work fine now

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2005-05-28 07:45:00

Bump for the new version of Lancer, the 20050528 release (based on aotuv-pb4_20050412).

Title: Ogg Vorbis optimized for speed
Post by: rudefyet on 2005-05-28 08:00:47

oh great....you made me wet my pants again

EDIT: Encoding from a pipe appears to be broken in this release

Title: Ogg Vorbis optimized for speed
Post by: ilikedirtthe2nd on 2005-05-28 13:25:20

Speed increased slightly on my system (AMD XP 1800+):

from 19.8x to 20.4x (3% speedup).

("Archer RC4" against "Lancer")

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2005-05-28 14:20:00

Run with the input file disk-cache hot.

Archer -q 6: 22,8144x, Average bitrate: 199,3 kb/s
Lancer -q 6: 23,2464x, Average bitrate: 195,5 kb/s

These speeds are wicked fast, so fast that any improvement is basically unnecessary.

I'd be more interested in what could be done to the decoder. I fear that the next generation sound cards and game consoles will have _greatly_ accelerated hardware decoding and mixing of "mp3", which might slow down or even _revert_ vorbis adoption by game devs -- which I consider today the largest and most important "market" where vorbis is successfully competing.

It would be a shame to see that happen. :-(

Title: Ogg Vorbis optimized for speed
Post by: Latexxx on 2005-05-28 14:37:34

The next generation consoles won't be pushing mp3. Microsoft will certainly push wma for Xbox titles and Sony its own Atrac3.

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2005-05-28 21:31:40

Quote

Speed increased slightly on my system (AMD XP 1800+):

from 19.8x to 20.4x (3% speedup).

("Archer RC4" against "Lancer")

Why is the bitrate different? Rounding errors? If so - are these versions safe to use?

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-05-28 23:10:55

Quote

Quote
Speed increased slightly on my system (AMD XP 1800+):

from 19.8x to 20.4x (3% speedup).

("Archer RC4" against "Lancer")

Why is the bitrate different? Rounding errors? If so - are these versions safe to use?

If you mean the bitrate difference between Archer and Lancer, the reason is of course the different encoder version (aoTuV b3 vs aoTuV pb4). Otherwise I didn't find any bitrate or file size difference between Lancer [20050528] and the original aoTuV pb4 [20050412] (on which Lancer is based).
BTW I love it. I didn't expect them to release it so quickly. WONDERFUL !!!

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-05-29 02:48:47

Quote

are these versions safe to use?

We would need some listening tests to be sure...

Title: Ogg Vorbis optimized for speed
Post by: Bonzi on 2005-05-29 03:10:38

Quote

Quote
are these versions safe to use?

We would need some listening tests to be sure...

Not really, if you can determine that it produces identical output as AoTuv pb4 then you only really need to perform listening tests on AoTuv pb4 or Lancer.

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-05-29 05:02:27

Quote

EDIT: Encoding from a pipe appears to be broken in this release

Pipe seems to work fine here.

Quote

Quote
Quote
are these versions safe to use?

We would need some listening tests to be sure...

Not really, if you can determine that it produces identical output as AoTuv pb4 then you only really need to perform listening tests on AoTuv pb4 or Lancer.

Yeah, the point is that AoTuv & Archer/Lancer outputs are not identical. I don't know if due to different compilers or just the fact that SSE instructions are used, but they never were identical IIRC.

Title: Ogg Vorbis optimized for speed
Post by: rudefyet on 2005-05-29 05:04:04

the bitrates are identical

but the resulting files differ in filesize by a few bytes (between lancer and aotuv pb4), i can't explain why

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2005-05-29 07:42:57

Could the different vendor strings be the cause of it (the few bytes' difference)?
And yeah, piping works well in this version.

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-05-29 09:18:29

Quote

Could the different vendor strings be the cause of it (the few bytes' difference)?

Nope, actual audio data differs too.

Title: Ogg Vorbis optimized for speed
Post by: Gecko on 2005-05-29 10:18:26

But the differences are only sporadic.
If you do a wave subtraction, you will see large amounts of absolute silence and a number of spikes.
I raised the question here (http://www.hydrogenaudio.org/forums/index.php?showtopic=32764) but there was no final answer.
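
The wave subtraction mentioned here can be sketched as follows, assuming both encodes have already been decoded to 16-bit PCM buffers of equal length. `diff_stats` and `pcm_diff_stats` are hypothetical names, and reading the WAV files is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Result of subtracting two decoded PCM streams sample by sample. */
typedef struct {
    size_t differing;   /* number of samples where a[i] != b[i] */
    int    max_abs;     /* largest absolute difference seen */
} diff_stats;

diff_stats pcm_diff_stats(const int16_t *a, const int16_t *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);

    diff_stats s = {0, 0};
    for (size_t i = 0; i < n; i++) {
        int d = abs((int)a[i] - (int)b[i]);  /* widen to int before subtracting */
        if (d != 0)
            s.differing++;
        if (d > s.max_abs)
            s.max_abs = d;
    }
    return s;
}
```

Long stretches of absolute silence with a few spikes would show up as a small `differing` count together with a comparatively large `max_abs`.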

Title: Ogg Vorbis optimized for speed
Post by: Vax on 2005-06-07 21:14:52

The size of aoTuV pre-beta4 [20050412] is 1.36 MB
and the size of Lancer [20050528] is 401 KB.
Why is there such a big difference in size?

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2005-06-07 23:27:10

Lancer is probably packed with UPX or something and aoTuV is not.

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2005-06-08 01:40:54

Quote

Lancer is probably packed with UPX or something and aoTuV is not.

Lancer's DLL's are pretty big too, 3 meg unpacked, 450kb zipped. Beats me why they're that inflated.

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2005-06-16 22:13:09

Is the current Lancer (Archer) version a stable release? Why was the name of this tuning changed?

Title: Ogg Vorbis optimized for speed
Post by: jorsol on 2005-06-16 22:47:25

The name changed from Archer to Lancer because it now uses aoTuV pre-beta 4, whereas the Archer versions used Beta 3... that's why the name changed, or at least I suppose so... It is pretty stable, but it may still have various bugs to be fixed... and on the other hand it is based on a pre-beta, which may have some problems of its own...

Title: Ogg Vorbis optimized for speed
Post by: yong on 2005-06-21 16:27:21

new version is out,
Lancer 20050621(Based on aotuv-b4_20050617)

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2005-06-21 18:45:51

Many thanks for the heads up. =)
Will test it out in a moment...

Title: Ogg Vorbis optimized for speed
Post by: Biont on 2005-06-27 09:39:12

Almost a week has passed. Any test results?

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2005-06-27 11:58:54

Quote

Almost a week has passed. Any test results?

I use it regularly without any problems (mostly q3 & q4); the speed is fantastic. I've also noticed that it uses only about 90% CPU when running on my AMD 2700+ machine, whereas Archer used the full CPU.

Title: Ogg Vorbis optimized for speed
Post by: aspifox on 2005-06-27 12:45:29

Quote

I use it regularly without any problems (mostly q3 & q4); the speed is fantastic. I've also noticed that it uses only about 90% CPU when running on my AMD 2700+ machine, whereas Archer used the full CPU.

Ooh. That probably implies that it's fast enough that it's actually sometimes blocking on IO. Impressive!

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2005-06-27 14:21:59

Quote

Almost a week has passed. Any test results?

Well, compared to the Lancer that uses aoTuV pre-beta 4, no real noticeable difference.
U might wanna check that out a few posts (or pages?) back about previous Lancer's performance.
To the Archer, it's just like Josef said.

Anyway, the speed gain (compared to the aoTuVb4 compiles found at Rarewares.org) on slower PIII systems is adequate.
(I tested it on a PIII 600MHz)
No numbers yet (since I kinda forgot...), but I think it was about 1.15x to 1.30x faster.
As always, cmiiw. =)

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2005-06-28 12:17:15

From http://www.tom.womack.net/x86FAQ/faq_features.html

Quote

For the P3, Intel skimped somewhat on the implementation, using only a two-wide ALU, so the average performance of SSE and 3DNow will be the same - I've constructed sequences of instructions which are faster on 3DNow. It's possible they'll use a four-wide one on later chips, which would make SSE roughly twice as fast as 3DNow.

This is also why P3's are notoriously poor on certain games such as UT2003/4.

Title: Ogg Vorbis optimized for speed
Post by: Tropican on 2005-07-09 17:39:05

New Lancer build based on the aoTuVb4 library merged with libvorbis 1.1.1. Previous was based on aoTuVb4 with libvorbis 1.1.0

http://homepage3.nifty.com/blacksword/index_e.htm as always

Title: Ogg Vorbis optimized for speed
Post by: wjdashwood on 2005-08-07 23:04:57

Just tried the latest version to replace the built in Foobar and it's well over twice as fast, especially when converting from FLAC. Amazing!

Title: Ogg Vorbis optimized for speed
Post by: judfilm on 2005-08-10 14:53:25

Just finished a test with besweet. Encoding time dropped from 1:54 to 1:13 (mins:secs).

Title: Ogg Vorbis optimized for speed
Post by: judfilm on 2005-08-15 03:24:40

FYI - New 'Lancer' builds of oggdropXPd v1.8.6 and libvorbis.dll

http://homepage3.nifty.com/blacksword/index_e.htm as always

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2005-09-04 21:14:37

I have a question. Will Lancer and P-III optimized (from rarewares.org) versions work and have any gain on Celeron 128kb cache (not Tualatin)?

Title: Ogg Vorbis optimized for speed
Post by: jorsol on 2005-09-05 01:16:44

Quote

I have a question. Will Lancer and P-III optimized (from rarewares.org) versions work and have any gain on Celeron 128kb cache (not Tualatin)?

Only if your Celeron supports SSE.... try a program like cpuz (http://www.cpuid.org/cpuz.php) to see whether it has the SSE instructions.

Title: Ogg Vorbis optimized for speed
Post by: zver on 2005-09-05 01:39:45

Got a question guys
I did some encodings with foobar 0.8.3 on about 25 different songs, using the built-in Vorbis (which is 1.1) and Lancer (which is the merged 1.1.1 and aoTuV b4), and I'm getting a size increase of 5-10% on all samples; actually 20 samples were at 10% and the rest were between 5 and 10%. They were classic rock songs, encoded on a P4 under XP SP2.
It is quite a bit faster, which is nice, but what confuses me is that at the beginning of the thread all tests show the same bitrate. I encoded at q5; everything else was default from foobar.

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2005-09-05 04:01:19

By 'built-in', did u mean the official vorbis libraries?
The difference is probably caused by the different tunings used.
The official one doesn't use AoTuV b4 tunings yet.
(is it still b2 or something...? I forgot...)

Title: Ogg Vorbis optimized for speed
Post by: zver on 2005-09-06 00:09:40

Quote

By 'built-in', did u mean the official vorbis libraries?
The difference is probably caused by the different tunings used.
The official one doesn't use AoTuV b4 tunings yet.
(is it still b2 or something...? I forgot...)

I meant the one that comes with foobar by default, already configured in the diskwriter.
mrq and foobar report it as 1.1.
Both were encoded with default preferences: q5 and no other parameters.

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2005-09-06 01:02:29

The bitrate difference you're seeing is aotuv vs 1.1, lancer may change bitrates compared to the regular aotuv, but only by a tiny bit.

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2005-09-06 08:33:46

Quote

I meant the one that comes with foobar by default, already configured in the diskwriter.
mrq and foobar report it as 1.1.
Both were encoded with default preferences: q5 and no other parameters.

Exactly.
I don't think that version (1.1.0) already uses the aoTuV b4 tunings.
So if the resulting file size difference is quite big, it's probably 'cause of different tunings used.

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2005-09-08 18:20:32

Quote

Quote
I have a question. Will Lancer and P-III optimized (from rarewares.org) versions work and have any gain on Celeron 128kb cache (not Tualatin)?

Only if your Celeron supports SSE.... try a program like cpuz (http://www.cpuid.org/cpuz.php) to see whether it has the SSE instructions.

Yes, CPUZ says my CPU has SSE. I tried 'Lancer oggenc' and it is really faster than the P-III compile.
However, I also tried 'Lancer OggDropXPd' and it doesn't work.
When I drop WAVs onto it, nothing happens. Does anybody know why?
My PC is an Intel Celeron 1000 MHz (not Tualatin) with Windows 98SE. If anybody has Windows 98SE installed, please check: does 'Lancer OggDropXPd' work?
Thanks.

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2005-09-09 09:19:51

Quote

My PC is Intel Celeron 1000 MHz (not Tualatin) with Windows 98SE.

Try these updates (http://www.msfn.org/board/index.php?showforum=91)...

Title: Ogg Vorbis optimized for speed
Post by: toot on 2005-09-15 11:40:11

Nice speed increase here on AMD x2 4400+

aoTuVb4 - no enhancements (oggenc)
File length: 72m 28.0s
Elapsed time: 5m 07.0s
Rate: 14.1643
Average bitrate: 192.2 kb/s

aoTuVb4 - SSE version (oggenc2)
File length: 72m 28.0s
Elapsed time: 4m 16.0s
Rate: 16.9861
Average bitrate: 192.2 kb/s

aoTuVb4 - SSE2 version (oggenc2)
File length: 72m 28.0s
Elapsed time: 3m 30.0s
Rate: 20.7068
Average bitrate: 192.2 kb/s

lancer20050709 (oggenc2)
File length: 72m 28.00s
Elapsed time: 2m 3.66s
Rate: 35.1655
Average bitrate: 192.2 kb/s
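
The Rate figures in these listings appear to be the audio duration divided by the elapsed wall-clock time, so they can be cross-checked by hand. A small sketch of that arithmetic (the helper names are hypothetical; the durations are the ones from the listing above):

```c
#include <assert.h>

/* Convert minutes + seconds to seconds. */
double to_seconds(int minutes, double seconds)
{
    assert(minutes >= 0 && seconds >= 0.0);
    return minutes * 60.0 + seconds;
}

/* Encoding rate: seconds of audio encoded per second of wall-clock time. */
double encode_rate(double audio_s, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return audio_s / elapsed_s;
}

/* Speedup of one build over another, as the ratio of their rates. */
double speedup(double rate_new, double rate_old)
{
    assert(rate_old > 0.0);
    return rate_new / rate_old;
}
```

For example, `encode_rate(to_seconds(72, 28.0), to_seconds(5, 7.0))` gives roughly 14.2 for the plain aoTuVb4 build and `encode_rate(to_seconds(72, 28.0), to_seconds(2, 3.66))` roughly 35.2 for Lancer, which works out to a speedup of about 2.5x and is consistent with the printed rates.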

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2005-10-22 00:08:46

Is this optimization done only by using SSE and SSE2, or also by writing some parts of the code in assembly?
Could the same kind of optimization be done for the Ogg Vorbis decoder?

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2005-10-22 20:06:46

I recall reading vorbis was all x87 code, that is, floating point, but not accelerated.

What the ICC compiler does is a process called autovectorisation, it's a very clever piece of software that examines routines and attempts to implement them using the faster SSE(2) instructions. At least, that is how i understand it.

What lancer does is replace certain standard routines in vorbis with hand written SSE implementations. This is not assembly (i think), but it is vectorisation (making use of SSE) done by a human.

The SSE instruction set works at a lower precision than the regular x87 instructions, but i don't think that's ever reduced sound quality in a noticeable way.

I'm not an expert on this, but i hope this explanation is accurate enough to answer your questions.

There is also an accelerated vorbis decoder, look at the first post of this topic.

Quote

- W.Dee's wuvorbisfile (Japanese only?) (http://kikyou.info/tvp/#side_product): wuvorbis.dll is a fast Ogg Vorbis decoder with SSE and 3DNow!, which is a part of KiriKiri software (useful for developing multi-media contents or adventure games). wuvorbis.dll decodes 1.4x-1.8x faster (SSE) and 1.5x-1.9x faster (3DNow!) than official libvorbis.

Babelfish translation (http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=ja_en&url=http%3A%2F%2Fkikyou.info%2Ftvp)
I'll see if i can get this to work and bench it.
EDIT: can't make much sense of the japanese even with babelfish, the .dll supplied at least doesn't work as a regular vorbisfile.dll or vorbis.dll.

@toot - very impressive speedup, imagine if it were multithreading!

Title: Ogg Vorbis optimized for speed
Post by: yong on 2005-11-18 15:37:28

Lancer 20051118 is out
http://homepage3.nifty.com/blacksword/

Title: Ogg Vorbis optimized for speed
Post by: Garf on 2005-11-18 15:48:48

Quote

I recall reading vorbis was all x87 code, that is, floating point, but not accelerated.

What the ICC compiler does is a process called autovectorisation, it's a very clever piece of software that examines routines and attempts to implement them using the faster SSE(2) instructions. At least, that is how i understand it.

What lancer does is replace certain standard routines in vorbis with hand written SSE implementations. This is not assembly (i think), but it is vectorisation (making use of SSE) done by a human.

The SSE instruction set works at a lower precision than the regular x87 instructions, but i don't think that's ever reduced sound quality in a noticeable way.

I'm not an expert on this, but i hope this explanation is accurate enough to answer your questions.

Basically nothing you said was correct.

1) autovectorisation is not the same as using SSE or SSE2 instructions
2) hand written SSE implementations are assembly or intrinsics
3) hand written (or automatically generated) SSE does *not* imply vectorisation
4) SSE or SSE2 does not automatically imply lower precision than floating point.

Title: Ogg Vorbis optimized for speed
Post by: Gecko on 2005-11-18 16:24:59

Quote

Lancer 20051118 is out
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

Fantastic! Thanks to all people involved!

Title: Ogg Vorbis optimized for speed
Post by: yong on 2005-11-18 17:04:47

Here is a small Ogg Vorbis CLI encoder speed comparison between John33 and Lancer builds:

Code: [Select]


oggenc2.6-aoTuVb4.5generic.exe
Elapsed time: 0m 11.0s
Rate:         10.0169
Average bitrate: 151.0 kb/s

oggenc2.6-aoTuVb4.5P4.exe
Elapsed time: 0m 07.0s
Rate:         15.7409
Average bitrate: 151.0 kb/s

OggEnc_SSE_20041213ArcherB10.exe
Elapsed time: 0m 05.0s
Rate:         22.0373
Average bitrate: 148.3 kb/s
 
OggEnc_SSE_20050320ArcherRC4.exe
Elapsed time: 0m 05.0s
Rate:         22.0373
Average bitrate: 148.3 kb/s
 
oggenc2_lancer20050528_1.exe
Elapsed time: 0m 4.44s
Rate:         24.8335
Average bitrate: 141.0 kb/s

oggenc2_lancer20050621.exe
Elapsed time: 0m 4.30s
Rate:         25.6426
Average bitrate: 151.0 kb/s

oggenc2_lancer20050709.exe
Elapsed time: 0m 4.23s
Rate:         26.0242
Average bitrate: 151.0 kb/s

oggenc2_lancer20051118.exe
Elapsed time: 0m 4.27s
Rate:         25.8290
Average bitrate: 151.0 kb/s

Test environment:
Pentium 4 2.4 GHz, Windows XP SP2, 512 MB DDR266 SDRAM.
Tested with an 18.5 MB, 44.1 kHz, stereo, 1 min 50 sec audio file and the -q4 switch.

NOTE: the results above might not be accurate...
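
For anyone comparing the builds, the relative speedup follows directly from the "Rate" figures oggenc prints. A minimal sketch in C; the helper name `speedup` is hypothetical and the numbers are simply copied from the list above:

```c
/* Relative speedup of one encoder build over another, computed from the
 * "Rate" figures (multiples of real time) printed by oggenc.
 * The helper name is hypothetical; it is not part of oggenc. */
double speedup(double rate_new, double rate_old)
{
    return rate_new / rate_old;
}
```

With the figures above, `speedup(25.8290, 10.0169)` comes out at roughly 2.58, so on this machine the 20051118 Lancer build is a bit more than two and a half times as fast as the generic build.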

Title: Ogg Vorbis optimized for speed
Post by: Garf on 2005-11-18 19:20:29

Quote

Quote
I recall reading vorbis was all x87 code, that is, floating point, but not accelerated.

What the ICC compiler does is a process called autovectorisation, it's a very clever piece of software that examines routines and attempts to implement them using the faster SSE(2) instructions. At least, that is how i understand it.

What lancer does is replace certain standard routines in vorbis with hand written SSE implementations. This is not assembly (i think), but it is vectorisation (making use of SSE) done by a human.

The SSE instruction set works at a lower precision than the regular x87 instructions, but i don't think that's ever reduced sound quality in a noticeable way.

I'm not an expert on this, but i hope this explanation is accurate enough to answer your questions.

Basically nothing you said was correct.

1) autovectorisation is not the same as using SSE or SSE2 instructions
2) hand written SSE implementations are assembly or intrinsics
3) hand written (or automatically generated) SSE does *not* imply vectorisation
4) SSE or SSE2 does not automatically imply lower precision than floating point.

To explain:

3DNow, SSE, SSE2 are alternate instruction sets for floating point processing. These instruction sets have some major advantages over the old x87 mode:

1) They have register based access, instead of stack based
2) They have the *possibility* to operate on 2 or 4 values at the same time (vectorisation)

SSE and 3DNow have 32 bit accuracy, SSE2 has 64 bit accuracy. x87 has 32 or 64 bit accuracy and a possibility (that shouldn't be used and I'm pretty sure vorbis doesn't use it!) to do 80 bit accuracy arithmetic.

Using these instruction sets can be done in the following manner: code for them manually (in assembler or with intrinsics), use a compiler that can use the SSE(2) instructions for floating point instead of x87, or use a compiler that can *vectorize* computations for SSE/SSE2.

Currently (besides manually writing in assembly), ICC is the best at vectorization, and some very recent GCC's have the possibility too. MSVC2005 and older GCC's have the possibility to generate SSE(2) floating point instructions (without vectorisation).
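
To make the distinction concrete, here is a rough sketch in C of the same loop written three ways: plain C (which a compiler may turn into x87, scalar SSE, or auto-vectorised SSE code), SSE intrinsics used one value at a time (SSE without vectorisation), and SSE intrinsics used four values at a time (hand vectorisation). The function names are made up for illustration and none of this is taken from libvorbis or Lancer. The `__SSE__` check is an assumption that GCC/Clang-style macros are available; on other compilers the sketch falls back to the plain loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics */
#endif

/* Plain C. Depending on compiler and flags this becomes x87 code,
 * scalar SSE code, or auto-vectorised SSE code. */
void scale_plain(float *dst, const float *src, float gain, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] * gain;
}

/* SSE instructions, one float per operation: SSE, but not vectorised. */
void scale_sse_scalar(float *dst, const float *src, float gain, size_t n)
{
#if defined(__SSE__)
    __m128 g = _mm_set_ss(gain);
    size_t i;
    for (i = 0; i < n; i++)
        _mm_store_ss(&dst[i], _mm_mul_ss(_mm_load_ss(&src[i]), g));
#else
    scale_plain(dst, src, gain, n);
#endif
}

/* SSE instructions, four floats per operation: hand vectorisation. */
void scale_sse_packed(float *dst, const float *src, float gain, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    __m128 g = _mm_set1_ps(gain);
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(&dst[i], _mm_mul_ps(_mm_loadu_ps(&src[i]), g));
#endif
    /* Leftover elements (or everything, without SSE) are done one by one. */
    for (; i < n; i++)
        dst[i] = src[i] * gain;
}
```

All three compute the same thing; only the packed version does four multiplications per instruction, which is where most of the speed gain from hand-written SSE tends to come from.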

Title: Ogg Vorbis optimized for speed
Post by: DreamTactix291 on 2005-11-19 04:44:19

Quote

Lancer 20051118 is out
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

For some reason on the site right now the links are crossed out and removed.

EDIT: I see why now. aoTuV b4.51 came out.

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2005-11-19 07:40:09

Quote

Lancer 20051118 is out
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

really ?

Ogg Vorbis acceleration project (http://homepage3.nifty.com/blacksword/index_e.htm)

Title: Ogg Vorbis optimized for speed
Post by: vinnie97 on 2005-11-19 07:56:46

they must've pulled it, maybe since 4.51 bugfix was released almost simultaneously.

Title: Ogg Vorbis optimized for speed
Post by: toot on 2005-11-19 09:48:09

Quote

they must've pulled it, maybe since 4.51 bugfix was released almost simultaneously.

It looks like it.. according to google's surprisingly legible translation..

November of 2005 19th

Release is discontinued to completion of the aoTu V beta4.51 base.

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2005-11-19 13:34:02

Thanks for the explanation, Garf. Reading mostly about the use of 3DNow/SSE with regard to 3D work i didn't realise vectorisation was only a possibility. Or that x87 had precisions other than 80 bit.

Title: Ogg Vorbis optimized for speed
Post by: suur13 on 2005-11-21 21:59:52

OK, lancer_20051121 patches against aotuv4.51 are out.

Can I patch the source under Linux?

What would be the exact command and what I need (besides aotuv source) ?

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2005-11-25 01:09:21

I have downloaded Lancer_20051121 and tested the OggEnc2.exe. Here's the log of what I have done (_j33 is John33's compiled, _lancer is Lancer version):

Code: [Select]

D:\Music\!Reprocess>oggenc_j33 -q 2 --output=Mamma_Mia_j33.ogg "ABBA - Mamma Mia
.wav"
Opening with wav module: WAV file reader
Encoding "ABBA - Mamma Mia.wav" to
         "Mamma_Mia_j33.ogg"
at quality 2.00
        [ 99.7%] [ 0m00s remaining] -

Done encoding file "Mamma_Mia_j33.ogg"

        File length:  3m 32.0s
        Elapsed time: 0m 16.0s
        Rate:         13.3078
        Average bitrate: 101.4 kb/s


D:\Music\!Reprocess>oggenc_lancer -q 2 --output=Mamma_Mia_lancer.ogg "ABBA - Mam
ma Mia.wav"
Opening with wav module: WAV file reader
Encoding "ABBA - Mamma Mia.wav" to
         "Mamma_Mia_lancer.ogg"
at quality 2.00
        [ 99.7%] [ 0m00s remaining] -

Done encoding file "Mamma_Mia_lancer.ogg"

        File length:  3m 32.00s
        Elapsed time: 0m 8.78s
        Rate:         24.2484
        Average bitrate: 101.4 kb/s

Wow! It's amazingly fast (I use my brother's AthlonXP 2400+). However, the next step I took made me pause:

Code: [Select]

D:\Music\!Reprocess>dir M*.ogg
 Volume in drive D is Data
 Volume Serial Number is 20E6-C9A1

 Directory of D:\Music\!Reprocess

2005-11-25  01:10         2,701,832 Mamma_Mia_j33.ogg
2005-11-25  01:10         2,701,784 Mamma_Mia_lancer.ogg
               2 File(s)      5,403,616 bytes
               0 Dir(s)   2,493,083,648 bytes free

Whoa! Significant difference? Can't be because of different comment, no? I go check with EditPlus, and I think the files mostly are identical. So I decode both to WAVs and got the same size:

Code: [Select]

D:\Music\!Reprocess>dir *.wav
 Volume in drive D is Data
 Volume Serial Number is 20E6-C9A1

 Directory of D:\Music\!Reprocess

2005-11-15  02:42        37,560,040 ABBA - Mamma Mia.wav
2005-11-25  01:18        37,560,040 Mamma_Mia_j33.wav
2005-11-25  01:18        37,560,040 Mamma_Mia_lancer.wav
               4 File(s)    253,711,964 bytes
               0 Dir(s)   2,493,083,648 bytes free

Same as original. Not knowing what else to do, I try EAQUAL:

Code: [Select]

D:\Music\!Reprocess>eaqual -fref Mamma_Mia_j33.wav -ftest Mamma_Mia_lancer.wav

EAQUAL - Evaluation of Audio Quality
Version:        0.1.3alpha
Author:         Alexander Lerch, zplane.development
_______________________________________________________
Reference File:         Mamma_Mia_j33.wav
Test File:              Mamma_Mia_lancer.wav
Sample Rate:            44100
Number of Channels:     2

Press Escape to cancel...

Processed:              212.93 seconds of audio file
Time elapsed:   82.25

Resulting ODG:   0.11
Resulting DIX:   3.64

BandwidthRef    16082.5596
BandwidthTest   16082.5192
NMR             -34.2508
WinModDiff1     0.3679
ADB             -0.1596
EHS             0.0345
AvgModDiff1     0.1880
AvgModDiff2     0.3213
NoiseLoud       0.0132
MFPD            0.9995
RDF             0.0000

And it seems there are differences.

I tried listening to the results but to my ears they sound the same.

Anyone can shed a light as to why they differ?

EDIT: Changed CODE to CODEBOX, that's all

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2005-11-25 01:34:37

Quote

Anyone can shed a light as to why they differ?

Read this page with machine translation if you really want to know the reason (the first item in Frequently Asked Questions)
http://homepage3.nifty.com/blacksword/readme_j.htm (http://homepage3.nifty.com/blacksword/readme_j.htm)

In short, SSE arithmetic has 32bit precision while FPU (i.e., without SSE optimization/compile) arithmetic has 80bit precision. The computational error in floating point arithmetic may make the difference but is so small that you probably cannot hear the difference. I bet you also get a difference between John33's compile and reference binary distributed by Aoyumi.

Title: Ogg Vorbis optimized for speed
Post by: Garf on 2005-11-25 06:56:29

Quote

In short, SSE arithmetic has 32bit precision while FPU (i.e., without SSE optimization/compile) arithmetic has 80bit precision.

SSE2 has 64 bit accuracy, and the FPU is generally used with only 64 bit accuracy (using 80 bit mode is not possible in a portable way, and as I said, vorbis is not doing it).

Note that AMD64/EM64T use SSE/SSE2 exclusively instead of the FPU.

But yes, in this case the difference is likely just minor rounding error. Note that positive ODG means that there is no audible difference (actually: encoded sample is better than the original, but that's a limitation in the way EAQUAL works).
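
As a small illustration of how the width of the arithmetic alone can change a result, here is a sketch in C that sums the same data once with a 32-bit accumulator and once with a 64-bit one. The function names are hypothetical and this is not code from libvorbis or Lancer; the `volatile` is only there so that each partial sum really is rounded to the stated width, whatever registers the compiler uses.

```c
#include <stddef.h>

/* Sum with a 32-bit accumulator. 'volatile' forces every partial sum to
 * be rounded to float, even if the compiler keeps wider x87 registers. */
float sum_float(const float *x, size_t n)
{
    volatile float acc = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        acc = acc + x[i];
    return acc;
}

/* Same data, 64-bit accumulator. */
double sum_double(const float *x, size_t n)
{
    volatile double acc = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        acc = acc + x[i];
    return acc;
}
```

For the input `{16777216.0f, 1.0f, 1.0f, 1.0f, 1.0f}` the float version returns 16777216 (each added 1 is rounded away) while the double version returns 16777220. That is a deliberately extreme case; in an encoder the discrepancies would normally sit in the last bits, which fits output files that differ by only a few bytes.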

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2005-11-25 12:47:25

Whoa! Thanks for the clarification. I was afraid that the Lancer optimizations were buggy and would degrade the output, but this puts my fear to rest. I am very amazed at the encoding speed increase and will change over to Lancer (oggenc, oggdrop, and libvorbis.dll).

One question: How do I decode the result of EAQUAL? Any pointer will be appreciated. Thanks a lot.

Title: Ogg Vorbis optimized for speed
Post by: suur13 on 2005-11-25 13:48:03

Quote

Can I patch the source under the Linux ?

What would be the exact command and what I need (besides aotuv source) ?

Title: Ogg Vorbis optimized for speed
Post by: Garf on 2005-11-25 14:56:46

Quote

One question: How do I decode the result of EAQUAL? Any pointer will be appreciated. Thanks a lot.

ODG = Objective difference grade

From memory

Code: [Select]

 0 = Imperceptible
-1 = Perceptible but not annoying
-2 = Slightly annoying
-3 = Annoying
-4 = Very annoying

Positive value = better than perfect
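
If you want to turn the number into one of these labels in a script or tool, a minimal sketch in C could look like this. Rounding to the nearest grade is an assumption made purely for illustration (EAQUAL itself just prints the number), and `odg_label` is a hypothetical name:

```c
/* Map an ODG value to the nearest grade of the scale above.
 * Rounding to the nearest grade is an assumption for illustration;
 * positive values are treated like 0 (imperceptible). */
const char *odg_label(double odg)
{
    if (odg > -0.5) return "Imperceptible";
    if (odg > -1.5) return "Perceptible but not annoying";
    if (odg > -2.5) return "Slightly annoying";
    if (odg > -3.5) return "Annoying";
    return "Very annoying";
}
```

The ODG of 0.11 reported earlier in the thread falls into the first bucket.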

Title: Ogg Vorbis optimized for speed
Post by: suur13 on 2005-12-03 19:38:17

OK, managed to patch the aoTuV sources with the Lancer diff (don't know what went wrong last time), but now oggenc segfaults. Switching back to the original aoTuV helps.

Any comments ?

My box is amd64 Gentoo with gcc 3.4.4.

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2005-12-23 11:50:57

Try to compile by gcc 3.3.x.

For me, gcc 3.4 can't compile the sources; 4.0 compiles, but oggenc displays a mysterious error on start; 3.3 compiles, and oggenc works afterwards.

I'm using Ubuntu 5.10 with gcc 3.3.6, 3.4.4 and 4.0.1 (actually 4.0.2 pre) on an Athlon XP 2200+. libvorbis 1.1.2 compiled by gcc 4 with default package options (--host=i486-linux-gnu) gives ~11x, with -march=athlon-xp -mfpmath=sse about 14x, and with the Lancer patches compiled by gcc 3.3 with -march=athlon-xp about 17x.

Title: Ogg Vorbis optimized for speed
Post by: ckjnigel on 2006-01-01 08:00:41

With the late November Lancer oggenc2 , I get MediaCoder encoding speeds from Flacs around 29x on my AMD 3300+ Win X64 system (q 6.16).
I know this thread is about speed, but I wonder if others disagree with my perception that quality now is comparable to MPC at rates around 200 kbps.

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2006-01-01 15:56:57

The quality at which -q setting...?

Title: Ogg Vorbis optimized for speed
Post by: ckjnigel on 2006-01-01 20:43:29

Quote

The quality at which -q setting...?

Say, in the range of nominal 200 kbps, which is q 6.16 to 6.24.
I recall that MPC was considered near as dammit to transparent at q 8. So, I'm getting at whether there's a sweet spot in the latest Japanese tweaked Ogg Vorbis encoders in that 6 to 8 range.
I'm pretty sure that that glitch on the 6.0 boundary for the official release that made those just north sound much better has been solved...

Title: Ogg Vorbis optimized for speed
Post by: vinnie97 on 2006-01-02 05:21:13

I know that I've ditched mpc "insane" for ogg q7. The poor seeking and limited hardware support for mpc and the improvements in Vorbis are what convinced me.

I don't think Guru has tested beyond the 170 to 180 range yet, which showed ogg to be on par with (and in some cases better than) mpc.

Title: Ogg Vorbis optimized for speed
Post by: HotshotGG on 2006-01-02 08:33:29

Quote

I don't think Guru has tested beyond the 170 to 180 range yet, which showed ogg to be on par with (and in some cases better than) mpc.

No need to; it's a waste of time IMO. Most people, with the exception of a few like GuruB, can't tell the difference; I can't. If it was a low-bitrate test, then sure, why not

Title: Ogg Vorbis optimized for speed
Post by: vinnie97 on 2006-01-02 20:18:18

Quote

Quote
I don't think Guru has tested beyond the 170 to 180 range yet, which showed ogg to be on par with (and in some cases better than) mpc.

No need to; it's a waste of time IMO. Most people, with the exception of a few like GuruB, can't tell the difference; I can't. If it was a low-bitrate test, then sure, why not

I agree that it's probably futile...but when I still see mpc recommended and touted as the best quality codec in the higher bitrate range, I have to wonder if some more tests are needed to debunk the myth. Finding a significant number of golden ears and "artifact professionals" would be difficult, though.

I'm referring above to claims like the following:

"Highest quality lossy codec at high bitrates" at dbpoweramp's codec central

and the general exuberance and confidence on display for mpc at the musepack forums when it hasn't seen any real quality improvements since its superiority was initially discovered...they even discount Guru's recent 170-180 kbps test as significant and make broad-sweeping claims that 128 kbps is not transparent (and that such bitrate testing is not interesting), which I think the latest 128 test will prove otherwise.

Title: Ogg Vorbis optimized for speed
Post by: ckjnigel on 2006-01-05 06:59:05

OK... thanks to HotshotGG and Vinnie97. I do suspect that the inherent superiority of MPC at 192+ kbps is shibboleth unquestioned because only those able to read Japanese can keep up with what those codec developers have been doing.
I'm delighted by how very simple and quick it has been to convert my FLACs for use in my 60 Gb iAudio -- one week to get it two thirds full. It was because that player has a 9,999 file limit that I decided to go up to around 200 kbps *.ogg . I was surprised that I so readily detected improvement over nominal 160 kbps LAME VBR.

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-01-10 05:21:05

Yes, that reminds me... an MPC fan said that the result is "inconclusive" as it is low bitrate and MPC achieves transparency at high bitrate...

I mean if Vorbis already achieves transparency at -q 5 or -q 6, what's the point of using -q 7 and higher?

If I need exact transparency I'll use FLAC. For my daily use, -q 2 suffices

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2006-02-01 03:21:05

New version seems to be out.

Title: Ogg Vorbis optimized for speed
Post by: Zoom on 2006-02-01 04:20:17

Not sure what the differences are, but I encoded a couple dozen files with the older version and the new one. I noted an increase of speed of about one percent.

36.28 seconds average for the old version and 36.64 for the new version. Significant or not? I dunno, but hey one percent is one percent

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-02-01 04:41:41

Quote

New version seems to be out.

New version of what? Where?

Title: Ogg Vorbis optimized for speed
Post by: Zoom on 2006-02-01 04:56:52

The new version of the encoder discussed in this thread, Archer/Lancer. The new version has a build date of January 31st 2006.

http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2006-02-01 05:11:27

Previous version was based on OggDropXPd 1.8.6. This one is based on 1.8.7. I think it is the only difference.

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2006-02-01 05:13:50

From google-translated page I think that there are some new optimizations and/or bug-fixes.

Title: Ogg Vorbis optimized for speed
Post by: Emanuel on 2006-02-01 11:56:42

Via the Google Japan-English (beta) translation:

Quote

Libogg in 1.1.3 update
Oggenc in 2.8 update
OggdropXPd in 1.8.7 version rise
With SSE optimization of mapping_forward and _2class analysis of ICL imperfectly correspondence

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-02-03 06:18:19

Quote

Via the Google Japan-English (beta) translation:
Quote
Libogg in 1.1.3 update
Oggenc in 2.8 update
OggdropXPd in 1.8.7 version rise
With SSE optimization of mapping_forward and _2class analysis of ICL imperfectly correspondence


It seems that Lancer 20060131 oggenc2.exe piping from STDIN is broken. Can anyone test it?

Title: Ogg Vorbis optimized for speed
Post by: Zoom on 2006-02-03 06:59:37

Quote

Quote
Via the Google Japan-English (beta) translation:
Quote
Libogg in 1.1.3 update
Oggenc in 2.8 update
OggdropXPd in 1.8.7 version rise
With SSE optimization of mapping_forward and _2class analysis of ICL imperfectly correspondence


It seems that Lancer 20060131 oggenc2.exe piping from STDIN is broken. Can anyone test it?

Works here...

"You can specify taking the file from stdin by using - as the input filename.
In this mode, output is to stdout unless an output filename is specified
with -o"

Title: Ogg Vorbis optimized for speed
Post by: sh1leshk4 on 2006-02-03 09:24:29

Quote

It seems that Lancer 20060131 oggenc2.exe piping from STDIN is broken. Can anyone test it?

What did you do or use (software and command line arguments) when you had the problem?

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-02-03 16:31:33

Quote

Quote
It seems that Lancer 20060131 oggenc2.exe piping from STDIN is broken. Can anyone test it?

What did you do or use (software and command line arguments) when you had the problem?

I use it for the dMC compressor, actually I don't test it by using command line.

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-02-05 12:59:54

Quote

Quote
Quote
It seems that Lancer 20060131 oggenc2.exe piping from STDIN is broken. Can anyone test it?

What did you do or use (software and command line arguments) when you had the problem?

I use it for the dMC compressor, actually I don't test it by using command line.

I tested Lancer 20060131 oggenc2.exe today. It seems that feeding a complete wave file from the CLI or through a pipe to oggenc2 will encode. *BUT* I don't know what dMC / fb2k will feed to oggenc2. The only thing I know is that oggenc2_lancer20051121 doesn't have such an issue.

Title: Ogg Vorbis optimized for speed
Post by: rutra80 on 2006-02-05 19:52:38

Works fine here with fb2k & piping.

Title: Ogg Vorbis optimized for speed
Post by: Skymmer on 2006-02-06 14:12:14

People, I need the previous Lancer 20051121, both oggenc2 and OggDropXPd, but the links from the official site do not work. Can anybody give working links? Big thanks in advance !

Title: Ogg Vorbis optimized for speed
Post by: [solid] on 2006-02-06 14:44:46

could anyone provide a static linux oggenc with lancer?
please?

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-02-06 19:14:49

Quote

People, I need previous Lancer 20051121, both oggenc2 and OggDropXPd, but links from official site does not work. Can anybody give working links? Big thanks in advance !

If you will wait I will upload this morning. In RAR format okay?

Title: Ogg Vorbis optimized for speed
Post by: Skymmer on 2006-02-07 05:15:22

Yeah. Or 7z.
By the way I already have oggenc2 so you can post OggDropXPd only. Thanks !!!

Title: Ogg Vorbis optimized for speed
Post by: ckjnigel on 2006-02-07 07:07:34

Quote

Works fine here with fb2k & piping.

Yup, I simply replaced my old oggenc2 and foobar2000 did its thing on my FLACs. But I did see that it was faster.
The new oggenc2 works even faster with Stanley Hwang's MediaCoder using mplayer. But my output Oggs have Genre:Unknown and track numbers without preceding 0, e.g., 01. That was so with the November oggenc2, also; can anybody suggest CLI for MediaCoder?

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-02-07 17:24:20

Quote

Yeah. Or 7z.
By the way I already have oggenc2 so you can post OggDropXPd only. Thanks !!!

Uploaded. Get the OggDropXPd Lancer 20051121 here (http://marupa.mine.nu/EquityLinks/oggdropxpd_lancer20051121.rar).

Just in case, OggEnc 2.6 Lancer 20051121 you can also get here (http://marupa.mine.nu/EquityLinks/oggenc_lancer20051121.rar).

Title: Ogg Vorbis optimized for speed
Post by: gameplaya15143 on 2006-02-07 21:07:44

I just tried it with dbpoweramp... older version pipes just fine.. this one doesnt

i'll see if i can get it to work....

Title: Ogg Vorbis optimized for speed
Post by: gameplaya15143 on 2006-02-08 17:50:58

dbpoweramp error messages..

Code: [Select]

Encoding standard input to
         "D:\~dmcout.ogg"
at quality 3.00
Internal error: attempt to read unsupported bitdepth 16


Done encoding file "D:\~dmcout.ogg"

        File length:  0m 00.0s
        Elapsed time: 0m 00.0s
        Rate:         0.0000
        Average bitrate: 1.$ kb/s

^^ without any extra options, just ' - --output=D:\~dmcout.ogg'

after I removed this from the options file....
--raw --raw-chan=[Channels] --raw-bits=[BitsPerSample] --raw-rate=[SamplesPerSec]
I got this error
ERROR: Input file "(stdin)" is not a supported format

I also tried using [WriteWaveRIFF], but gave me the same error as above (and other junk), and it made my computer 'beep' at me

even though piping works fine with it in foobar2000, it is obvious to me that something is wrong

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-02-09 12:11:43

Quote

dbpoweramp error messages..

Code: [Select]
Encoding standard input to
         "D:\~dmcout.ogg"
at quality 3.00
Internal error: attempt to read unsupported bitdepth 16


Done encoding file "D:\~dmcout.ogg"

        File length:  0m 00.0s
        Elapsed time: 0m 00.0s
        Rate:         0.0000
        Average bitrate: 1.$ kb/s
^^ without any extra options, just ' - --output=D:\~dmcout.ogg'

after I removed this from the options file....
--raw --raw-chan=[Channels] --raw-bits=[BitsPerSample] --raw-rate=[SamplesPerSec]
I got this error
ERROR: Input file "(stdin)" is not a supported format

I also tried using [WriteWaveRIFF], but gave me the same error as above (and other junk), and it made my computer 'beep' at me

even though piping works fine with it in foobar2000, it is obvious to me that something is wrong

I also tested it with a .pcm file saved from CoolEdit Pro and the same error occurred.

Title: Ogg Vorbis optimized for speed
Post by: chapas on 2006-02-10 00:43:40

I'd like someone experienced to make a static linux build too, as I failed every time trying to do one

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-02-10 11:19:17

Quote

Quote
dbpoweramp error messages..

Code: [Select]
Encoding standard input to
         "D:\~dmcout.ogg"
at quality 3.00
Internal error: attempt to read unsupported bitdepth 16


Done encoding file "D:\~dmcout.ogg"

        File length:  0m 00.0s
        Elapsed time: 0m 00.0s
        Rate:         0.0000
        Average bitrate: 1.$ kb/s
^^ without any extra options, just ' - --output=D:\~dmcout.ogg'

after I removed this from the options file....
--raw --raw-chan=[Channels] --raw-bits=[BitsPerSample] --raw-rate=[SamplesPerSec]
I got this error
ERROR: Input file "(stdin)" is not a supported format

I also tried using [WriteWaveRIFF], but gave me the same error as above (and other junk), and it made my computer 'beep' at me

even though piping works fine with it in foobar2000, it is obvious to me that something is wrong
I also test it with a .pcm file saved from CoolEdit Pro and same error occured.

From the author's reply (http://64.233.179.104/translate_c?hl=en&sl=ja&u=http://pc7.2ch.net/test/read.cgi/software/1136822006/266&prev=/search%3Fq%3Dhttp://pc7.2ch.net/test/read.cgi/software/1136822006/%26hl%3Den%26hs%3DEOq%26lr%3D%26client%3Dfirefox%26rls%3Dorg.mozilla:en-US:unofficial%26sa%3DG), it is a bug with oggenc 2.8.

Title: Ogg Vorbis optimized for speed
Post by: R.A.F. on 2006-02-10 14:08:38

If you are looking for a (more or less) perfect frontend for transcoding your FLAC, Monkey's Audio or WavPack files to Ogg Vorbis, including perfect transfer of the tags from the original and ReplayGain scanning (automatically after the encoding process), just stick with this frontend:

Universal-Front -All-In-One- (http://home.pages.at/thursdaychild/Universal-Front.7z)(7-zip-packed, 3,29 MB)
(Note: All necessary codecs are already included.)

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-03-01 15:02:44

Lancer 20060301 is out.
http://homepage3.nifty.com/blacksword/ (http://homepage3.nifty.com/blacksword/)
Now SSE and SSE2 optimizations!

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-01 15:24:59

Translated by excite.co.jp
--
2006/03/01 Lancer 20060301

The optimization option when compiling is reexamined.
Oggenc is renewed to Ver.2.81.
Because the project management function of Visual Studio is unstable, the development environment is completely shifted to the command line.
The optimization code for SSE2 is implemented.
The optimization code that uses an in-line assembler for bark_noise_hybridmp and seed_curve is implemented.
The SSE optimization of mdct_forward is changed.
Double-Step Bresenham algorithm is implemented on render_line and render_line0.
AMD CodeAnalyst is introduced into the code analysis.

[EDIT]fine tuning translation.[/EDIT]
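
For readers unfamiliar with the "Double-Step Bresenham" item: Bresenham-style algorithms walk along a line using only integer additions and comparisons. As a point of reference only, here is a single-step integer line stepper sketched in C; a double-step variant would produce two points per loop iteration. This is not Lancer's code, and the name `line_steps` and its signature are hypothetical.

```c
#include <stdlib.h>  /* abs */

/* Fill y_out[x0 .. x1-1] with integer y values along the line from
 * (x0, y0) towards (x1, y1), using only integer adds and compares per
 * step. Requires x1 > x0. Name and signature are hypothetical. */
void line_steps(int x0, int x1, int y0, int y1, int *y_out)
{
    int dy   = y1 - y0;
    int adx  = x1 - x0;
    int base = dy / adx;                        /* step taken at every x */
    int sy   = (dy < 0) ? base - 1 : base + 1;  /* step when error wraps */
    int ady  = abs(dy) - abs(base * adx);       /* remainder per x step  */
    int err  = 0;
    int x    = x0;
    int y    = y0;

    y_out[x] = y;
    while (++x < x1) {
        err += ady;
        if (err >= adx) {   /* accumulated remainder reached one full step */
            err -= adx;
            y += sy;
        } else {
            y += base;
        }
        y_out[x] = y;
    }
}
```

For example, stepping from (0, 0) towards (4, 2) fills the output with 0, 0, 1, 1.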

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-03-01 15:36:40

Test on Pentium M 715 (2MB L2 Cache, 1.5GHz, 400MHz FSB)

Code: [Select]

c:\Temp\test>oggenc_aotuv_451.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 3,00
        [ 99,6%] [ 0m00s remaining] /

Done encoding file "test.ogg"

        File length:  2m 46,0s
        Elapsed time: 0m 20,0s
        Rate:         8,3069
        Average bitrate: 105,0 kb/s


c:\Temp\test>oggenc2_lancer_20060131.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 3,00
        [ 99,6%] [ 0m00s remaining] /

Done encoding file "test.ogg"

        File length:  2m 46,0s
        Elapsed time: 0m 07,9s
        Rate:         21,1372
        Average bitrate: 105,0 kb/s


c:\Temp\test>oggenc2_lancer_20060209_sse2test.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 3,00
        [ 99,6%] [ 0m00s remaining] /

Done encoding file "test.ogg"

        File length:  2m 46,0s
        Elapsed time: 0m 07,8s
        Rate:         21,3080
        Average bitrate: 105,0 kb/s

c:\Temp\test>oggenc2_lancer_20060301_p3.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 3,00
        [ 99,6%] [ 0m00s remaining] /

Done encoding file "test.ogg"

        File length:  2m 46,0s
        Elapsed time: 0m 07,9s
        Rate:         21,0970
        Average bitrate: 105,0 kb/s


c:\Temp\test>oggenc2_lancer_20060301_p4.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 3,00
        [ 99,6%] [ 0m00s remaining] /

Done encoding file "test.ogg"

        File length:  2m 46,0s
        Elapsed time: 0m 07,6s
        Rate:         21,8345
        Average bitrate: 105,0 kb/s


c:\Temp\test>

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-03-02 10:24:33

May I post bugreport here? Thanks!

I can't build the new Lancer code with gcc on Linux. The error is

Code: [Select]

../../lib/psy.c: In function 'seed_curve':
../../lib/psy.c:745: error: 'post05' undeclared (first use in this function)
../../lib/psy.c:745: error: (Each undeclared identifier is reported only once
../../lib/psy.c:745: error: for each function it appears in.)
../../lib/psy.c:768: error: 'post06' undeclared (first use in this function)

I think it's because of the declarations inside the #if at line 635:

Code: [Select]

#if     defined(_MSC_VER)
                int post07       = ((post1-i)&(~1));
                int post06       = (post07&(~3));   
                int post05       = (post06&(~7));

and post05, post06 and post07 are then used in the #else block.
So I moved these declarations before "#if defined(_MSC_VER)", and the code compiles.
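
For reference, a minimal sketch of that workaround. Only the three post0x expressions are taken from the snippet above; the split_points struct and the split_loop_range() wrapper are hypothetical names made up for illustration, and the real seed_curve() in psy.c does much more than this.

```c
/* Sketch only: split_points and split_loop_range() are hypothetical;
   the three post0x expressions are the ones quoted from psy.c. */
typedef struct {
    int post05;  /* span rounded down to a multiple of 8 */
    int post06;  /* span rounded down to a multiple of 4 */
    int post07;  /* span rounded down to a multiple of 2 */
} split_points;

static split_points split_loop_range(int post1, int i)
{
    /* Declared before any #if, so both compiler branches can see them. */
    int post07 = ((post1 - i) & (~1));
    int post06 = (post07 & (~3));
    int post05 = (post06 & (~7));
    split_points s;

#if defined(_MSC_VER)
    /* The MSVC-specific path would go here. */
#else
    /* The path for other compilers would go here. It failed to build
       while post05..post07 were declared only in the branch above. */
#endif

    s.post05 = post05;
    s.post06 = post06;
    s.post07 = post07;
    return s;
}
```

With post1 - i = 13, for example, this yields post07 = 12, post06 = 12 and post05 = 8.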

But with gcc 4.0.2 I get

Code: [Select]

% oggenc test.wav -o /dev/null
Opening with wav module: WAV file reader
Encoding "test.wav" to 
         "/dev/null" 
at quality 3,00
Mode initialisation failed: invalid parameters for quality

With gcc 3.4.5 I get a compile-time error again, but a more mysterious one:

Code: [Select]

../../lib/floor1.c: In function `floor1_encode':
../../lib/floor1.c:2333: internal compiler error: in trunc_int_for_mode, at explow.c:54
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
For Debian GNU/Linux specific bug reporting instructions,
see <URL:file:///usr/share/doc/gcc-3.4/README.Bugs>.

OK... Finally I tried gcc 3.3.6, and the code compiles fine, as with gcc 4.0.2. And oggenc works! But the bitrate is not the same as with generic aoTuV 4.51: for aoTuV it's 115.3 kb/s, but for Lancer it's 111.8 kb/s. And the difference between the tracks in Audacity is up to 0.2 of the amplitude range.

Is this code untested with any compiler other than MS Visual C? Is GCC support planned? (An optimization of up to 2x-3x is a very good thing!)

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-02 14:11:29

Lancer 20060302 is released!

Lancer 20060302 was released because there's a problem with the decoding function of the SSE2 edition of Lancer 20060301.

A fixed-point straight-line drawing algorithm based on the "Extremely Fast Line Algorithm" was improved and implemented.
It is a little faster, because it became easier to optimize with SIMD.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-03-02 16:05:36

Quote

Lancer 20060302 is released!

It doesn't seem to be working...

In foobar (0.8.3), I get the following error:

Code: [Select]

INFO (foo_clienc) : CLI encoder: C:\Program Files\Codec\Vorbis\Lancer 2.81 2006 03-02\oggenc.exe
INFO (foo_clienc) : Destination file: file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
INFO (foo_clienc) : Source file: file://F:\Albums\Bad Company\Bad Company (AF Gold)\05 Bad Company.wv
INFO (foo_clienc) : 44100Hz 24bps 2ch
INFO (foo_clienc) : Encoding took 62 milliseconds, speed 1.68x
ERROR (foo_input_std) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_speex) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_input_std) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_speex) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
INFO (CORE) : attempting to edit file info : file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
WARNING (CORE) : file info update failure on : file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_diskwriter) : Conversion failed.

...and from the command line:

Code: [Select]

Encoding standard input to "C:\Documents and Settings\372\Desktop\05 Bad Company.ogg" at quality 6.00

can't write .WAV data, disk probably full!

** ERRORS:
General errors: 1
Press any key to continue...

I'm using oggenc281_p4_lancer20060302 on a P4 3.2GHz system with 1G of RAM.
The old oggenc28_p4_lancer20060131 works fine.

I notice in the foobar error report it says:

Code: [Select]

INFO (foo_clienc) : 44100Hz 24bps 2ch

...but the file is 16bps, not 24.

Any suggestions?

Thanks!

~esa

Title: Ogg Vorbis optimized for speed
Post by: Tim Mervielde on 2006-03-02 16:24:14

@esa372

Set "Maximum bitdepth" to 16bits, works here with fb2k 0.9RC

Cheers,

Tim

Title: Ogg Vorbis optimized for speed
Post by: Tim Mervielde on 2006-03-02 16:27:10

Quote

Set "Maximum bitdepth" to 16bits, works here with fb2k 0.9RC

I was wrong, on a second file, foobar reports:
"Error writing to file (Encoder has terminated prematurely with code -1073741819; please re-check parameters)"

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-02 16:41:41

Quote

Lancer 20060302 is released!

It doesn't seem to be working...

In foobar (0.8.3), I get the following error:

Code: [Select]

INFO (foo_clienc) : CLI encoder: C:\Program Files\Codec\Vorbis\Lancer 2.81 2006 03-02\oggenc.exe
INFO (foo_clienc) : Destination file: file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
INFO (foo_clienc) : Source file: file://F:\Albums\Bad Company\Bad Company (AF Gold)\05 Bad Company.wv
INFO (foo_clienc) : 44100Hz 24bps 2ch
INFO (foo_clienc) : Encoding took 62 milliseconds, speed 1.68x
ERROR (foo_input_std) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_speex) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_input_std) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_speex) : Ogg stream is corrupted : C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
INFO (CORE) : attempting to edit file info : file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
WARNING (CORE) : file info update failure on : file://C:\Documents and Settings\372\Desktop\05 Bad Company.ogg
ERROR (foo_diskwriter) : Conversion failed.

...and from the command line:

Code: [Select]

Encoding standard input to "C:\Documents and Settings\372\Desktop\05 Bad Company.ogg" at quality 6.00

can't write .WAV data, disk probably full!

** ERRORS:
General errors: 1
Press any key to continue...

I'm using oggenc281_p4_lancer20060302 on a P4 3.2GHz system with 1G of RAM.
The old oggenc28_p4_lancer20060131 works fine.

I notice in the foobar error report it says:

Code: [Select]

INFO (foo_clienc) : 44100Hz 24bps 2ch

...but the file is 16bps, not 24.

Any suggestions?

Thanks!

~esa

What's your CPU?
There's a report that this build doesn't work on an Athlon64 3000+ but does work on a P4 2.6G.

EDIT: Typo.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-03-02 16:56:54

Quote

Quote
Set "Maximum bitdepth" to 16bits, works here with fb2k 0.9RC
I was wrong, on a second file, foobar reports:
"Error writing to file (Encoder has terminated prematurely with code -1073741819; please re-check parameters)"

Too bad... thanks anyway.

Quote

What's your CPU?

Intel P4 3.2GHz

Title: Ogg Vorbis optimized for speed
Post by: Tim Mervielde on 2006-03-02 17:09:00

The good news is that today's P3 build works here (P4 2,66). I just encoded 4 cd's to be sure (Yesterday's P4 build worked too, today's P4 build gives me strange errors... ).

Cheers,

Tim

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-03 00:28:58

Mr 637 said that there are some problems with the straight-line drawing algorithm and the compiler. A bug-fix version will be released next Monday.

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-03-03 01:50:42

Quote

Mr 637 said that there are some problems with the straight-line drawing algorithm and the compiler. A bug-fix version will be released next Monday.

Good to know - thanks!

Title: Ogg Vorbis optimized for speed
Post by: gameplaya15143 on 2006-03-03 03:12:56

cool... this one (march 2, 2006 for those reading this a long time from now) works with piping in dbpoweramp... but it's slower than the november 05 release

edit: actually, the new one is faster... silly me must have been running too many other programs at the time of my initial test. nov. lancer went 20x in dbpoweramp, and the new one at 24x

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-03 11:13:02

Quote

Quote
Mr 637 said that there are some problems with the straight-line drawing algorithm and the compiler. A bug-fix version will be released next Monday.
Good to know - thanks!


Mr. 637 always comes as a surprise!
Lancer 20060303 is released!
--
Mr 637 said on the 2ch board:
"In the end, it was a misjudgment of operator precedence. It is shameful."

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-03-03 11:49:00

The bug with the variable declarations in psy.c is gone, but the resulting bitrate still differs from the one generated by generic aoTuV 4.51 (compiled with gcc 3.3.6 with -march=athlon-xp, run on an Athlon XP 2200+ under Ubuntu 5.10).

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-03-03 15:47:06

Quote

Mr. 637 always comes as a surprise!
Lancer 20060303 is released!

Excellent - thank you!

It's working fine on my system now.

Test @ -q6:
Bad Company (2006 AudioFidelity remaster)

aoTuV b4.51 P4 - 2:15 encode time
Lancer 20060303 P4 - 1:27 encode time

Identical bitrates and file sizes.

Title: Ogg Vorbis optimized for speed
Post by: Farch on 2006-03-04 11:54:24

Athlon64 X2 3800+ E3 (512 L2 Cache, 2x2000, 200Mhz FSB)
oggenc281_p4_lancer20060303
D:\test>oggenc2.exe test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
"test.ogg"
at quality 3,00
[100,0%] [ 0m00s remaining] |

Done encoding file "test.ogg"

File length: 73m 11,0s
Elapsed time: 2m 15,9s
Rate: 32,3169
Average bitrate: 114,6 kb/s

So, what about dual core?

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-03-04 23:20:42

Launching two encodes in parallel, without setting the affinity by hand, like so:

Code: [Select]

C:\temp>type code.bat
@echo off
start codeone 1
start codeone 2

C:\temp>type codeone.bat
@echo off
oggenc2 -q 5 %1.wav
pause

Results in:

Code: [Select]

Done encoding file "1.ogg"

        File length:  4m 16,0s
        Elapsed time: 0m 07,9s
        Rate:         32,4412
        Average bitrate: 171,7 kb/s

Done encoding file "2.ogg"

        File length:  4m 16,0s
        Elapsed time: 0m 08,0s
        Rate:         32,0600
        Average bitrate: 171,7 kb/s

They start and complete in almost exactly the same time, so the effective rate is about 64x real-time!

(The source file(s) are "Reflect the storm" off the new In Flames album, which I made disk-cache hot by checksumming them prior to starting the script. The CPU is an AMD X2 3800+ at 2.0GHz (the default) with cool'n'quiet enabled)
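
The "about 64x" figure can be checked with a small calculation: total audio duration divided by the wall-clock time until the last job finishes. A minimal sketch, assuming both jobs start at the same moment; aggregate_rate() is a made-up helper, not part of oggenc2.

```c
/* Aggregate throughput of several encodes that run at the same time:
   total audio duration divided by the wall-clock time until the last
   one finishes. aggregate_rate() is a hypothetical helper. */
static double aggregate_rate(const double *audio_seconds,
                             const double *elapsed_seconds,
                             int count)
{
    double total_audio = 0.0;
    double wall_clock = 0.0;
    int k;

    for (k = 0; k < count; k++) {
        total_audio += audio_seconds[k];
        /* Jobs start together, so the slowest one sets the wall-clock time. */
        if (elapsed_seconds[k] > wall_clock)
            wall_clock = elapsed_seconds[k];
    }
    return wall_clock > 0.0 ? total_audio / wall_clock : 0.0;
}
```

With the figures above (two files of 4m 16,0s = 256 s each, finishing in 7,9 s and 8,0 s) this gives 512 / 8.0 = 64x real-time.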

Title: Ogg Vorbis optimized for speed
Post by: Garf on 2006-03-05 09:35:06

You can test this easily with foobar2000 0.9 Release Candidate: it will automagically encode in parallel on multiprocessor machines.

Title: Ogg Vorbis optimized for speed
Post by: Farch on 2006-03-05 11:16:31

But what if I have one really big file? 700 MB of audio in a single track or so...
I remember there was a nice app out there, gogo2. It now runs at 250x on my PC, and I think that is a tiny int counter limitation.
Its homepage was also on homepage1.nifty.com/herumi
So I would like Ogg to be faster. By the way, multicore LAME 3.97 beta 2 gives me only 45x, so a possible 64x for Ogg on an x64 platform would be an amazing thing, wouldn't it?

update:
I wrote to the author about multi-core and got this:

Quote

Hello,

The MultiThreading function is certainly cool. But I do not have
dual-core machine. I have Athlon-XP 1700+ and Pentium4M only.

It should be detailed to the algorithm of vorbis to do the work
and requires more memory in process.
Therefore, it is not easy. However, if a new machine is obtained,
I will challenge it.

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-10 17:14:55

Lancer 20060310 is released.

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-03-17 13:37:24

Lancer 20060317 is released!

Translated by excite.co.jp, fine tuned by me
--
2006/03/15 Lancer 20060317

Fixed a hang that occurred when the DownMix function of oggdropXPd was used.
Oggpack_write new is added to the vorbis side to improve the performance of the DLL.
The SSE optimization code of _ve_amp is updated.
Unnecessary SSE optimization code in floor1.c is deleted.
Code that works around a GCC bug in inspect_error is added.
Loop unrolling and register renaming are applied to the code related to mdct_forward, bark_noise_hybridmp, and fft.
The loop split points of bark_noise_hybridmp are now calculated beforehand.

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-03-31 16:14:42

Lancer 20060331 is released! (http://translate.google.com/translate?u=http%3A%2F%2Fhomepage3.nifty.com%2Fblacksword%2F&langpair=ja%7Cen&hl=en&newwindow=1&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools)

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-03-31 17:42:44

Quote

Lancer 20060331 is released! (http://translate.google.com/translate?u=http%3A%2F%2Fhomepage3.nifty.com%2Fblacksword%2F&langpair=ja%7Cen&hl=en&newwindow=1&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools)

Thank you!

Title: Ogg Vorbis optimized for speed
Post by: yong on 2006-05-07 15:01:24

Lancer 20060506 (http://homepage3.nifty.com/blacksword/)
Now supports SSE3 and multithreading too.

Title: Ogg Vorbis optimized for speed
Post by: Tiis on 2006-05-07 15:22:40

Quote from: yong on 2006-05-07 15:01:24

Lancer 20060506 (http://homepage3.nifty.com/blacksword/)
Now supports SSE3 and multithreading too.

Nice! I love Lancer's Vorbis tunings. SSE2 and MT for me.

Title: Ogg Vorbis optimized for speed
Post by: toot on 2006-05-07 16:19:48

Some encode times with an AMD X2 4400+.. MT is very nice

oggenc283_sse3mt_lancer20060506
File length: 75m 58.0s
Elapsed time: 1m 28.0s
Rate: 51.8139
Average bitrate: 192.8 kb/s

oggenc283_sse2mt_lancer20060506
File length: 75m 58.0s
Elapsed time: 1m 32.4s
Rate: 49.3090
Average bitrate: 192.8 kb/s

oggenc283_sse3_lancer20060506
File length: 75m 58.0s
Elapsed time: 2m 00.8s
Rate: 37.7426
Average bitrate: 192.8 kb/s

oggenc283_sse2_lancer20060506
File length: 75m 58.0s
Elapsed time: 2m 01.9s
Rate: 37.3896
Average bitrate: 192.8 kb/s

oggenc283_sse_lancer20060506
File length: 75m 58.0s
Elapsed time: 2m 01.4s
Rate: 37.5532
Average bitrate: 192.8 kb/s

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-08 12:23:46

Uhhh... so which one is which?

I have an AthlonXP 2400+ (IIRC Barton core). Which one should I get?

Sorry my mind is a bit swimming at the moment...

Title: Ogg Vorbis optimized for speed
Post by: ilikedirtthe2nd on 2006-05-08 13:06:13

Quote from: pepoluan on 2006-05-08 12:23:46

Uhhh... so which one is which?

I have an AthlonXP 2400+ (IIRC Barton core). Which one should I get?

Sorry my mind is a bit swimming at the moment...

SSE Version

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-05-08 17:20:14

Lancer's DLL has crashed my Winamp+Oddcast3 since the 20060310 build.
It is still not fixed.

Title: Ogg Vorbis optimized for speed
Post by: foxyshadis on 2006-05-09 01:41:19

Hmm, I guess this is still based off the old code and not Aoyumi's recent tunings? Multithreading is so cool, the problem is that it takes longer to read off the hard drive (or decode a flac) than to encode now. XD

Title: Ogg Vorbis optimized for speed
Post by: jetpower on 2006-05-09 02:05:01

From the site:

Quote

Based on aotuv-b4.51_20051117

4.51beta is the latest version from Aoyumi.
http://www.geocities.jp/aoyoume/aotuv/index.html
No need to worry

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-05-13 17:41:06

Lancer 20060512 (only MT) Released!
Home page (http://translate.google.com/translate?u=http%3A%2F%2Fhomepage3.nifty.com%2Fblacksword%2Findex.htm&langpair=ja%7Cen&hl=en&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools)

Title: Ogg Vorbis optimized for speed
Post by: Cartman_Sr on 2006-05-13 19:02:40

Hey thanks for that link. I tried going there last night but couldn't figure out a single thing. I'm just getting into ogg vorbis now...

But I just tried those (oggenc2) and I got a Fatal error: This program is not designed to run on this machine. I have a P4, 2.1 (Dell), with Windows XP sp2. Are these new builds meant for AMD processors only?

edit: Oh wait a minute, I was trying the SSE3 version, give me a few minutes to try the SSE2 version. BTW, what is SSE2/3 anyway? I think I'm in way over my head here.

edit 2: The SSE2 one does work. Ok, I'm a hoser

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-13 19:49:09

Quote from: Cartman_Sr on 2006-05-13 19:02:40

Hey thanks for that link. I tried going there last night but couldn't figure out a single thing. I'm just getting into ogg vorbis now...

But I just tried those (oggenc2) and I got a Fatal error: This program is not designed to run on this machine. I have a P4, 2.1 (Dell), with Windows XP sp2. Are these new builds meant for AMD processors only?

edit: Oh wait a minute, I was trying the SSE3 version, give me a few minutes to try the SSE2 version. BTW, what is SSE2/3 anyway? I think I'm in way over my head here.

edit 2: The SSE2 one does work. Ok, I'm a hoser

Hey I got confused too for a moment (see my post up there). I should've checked wikipedia first... it lists processors with SSE, SSE2, and SSE3.

What are these SSE-thingies? In a nutshell, they are special instructions that enable CPUs to perform exotic calculations faster. SSE2 adds some instructions to SSE. SSE3 adds more instructions to SSE2. Of course there is CPU architecture evolution too, but let's KISS.

Soooo... I put in a (very very very) simplified guide on which version of Lancer you should use, in the Lancer page of HA Wiki (http://wiki.hydrogenaudio.org/index.php?title=Lancer).

Title: Ogg Vorbis optimized for speed
Post by: Cartman_Sr on 2006-05-13 21:45:23

Hey thanks for adding that wiki page, makes way more sense now! But does the type of processor you have (32 bit vs. 64 bit) factor into it in any way? I know I have a 32 bit processor (does that sound right?), and a single core computer. The SSE2MT version does work on my computer.

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-15 18:47:57

Quote from: Cartman_Sr on 2006-05-13 21:45:23

Hey thanks for adding that wiki page, makes way more sense now! But does the type of processor you have (32 bit vs. 64 bit) factor into it in any way? I know I have a 32 bit processor (does that sound right?), and a single core computer. The SSE2MT version does work on my computer.

Um, the bit-ness of your processor is not strictly related to SSEx. For instance, take Intel: the P4 is 32-bit, yet it supports SSE2 instructions. AMD did not get the opportunity to embed SSE2 instructions into their 32-bit line, and opted to add SSE2 to their 64-bit line.

So, whether your processor supports a certain version of SSEx or not, depends more on its release date than its bit-ness.

Edit: Updated the wiki page above slightly to explain the (theoretical) benefit of MT versions.

Title: Ogg Vorbis optimized for speed
Post by: HotshotGG on 2006-05-15 22:45:07

Quote

What are these SSE-thingies? In a nutshell, they are special instructions that enable CPUs to perform exotic calculations faster. SSE2 adds some instructions to SSE. SSE3 adds more instructions to SSE2. Of course there is CPU architecture evolution too, but let's KISS.

Yeah that really needs to be clarified for a lot of folks. I gathered some information about them and rewrote that section in the wiki. It does help in the long run though.

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-16 07:58:39

Quote from: HotshotGG on 2006-05-15 22:45:07

Quote
What are these SSE-thingies? In a nutshell, they are special instructions that enable CPUs to perform exotic calculations faster. SSE2 adds some instructions to SSE. SSE3 adds more instructions to SSE2. Of course there is CPU architecture evolution too, but let's KISS.
Yeah that really needs to be clarified for a lot of folks. I gathered some information about them and rewrote that section in the wiki. It does help in the long run though.

The most sure-fire way to know which SSEx version your processor supports is to download all 5 Lancer OggEnc2 encoders and run them one by one. If your processor does not support the SSEx, OggEnc2 will exit gracefully, informing you so.

I've added this to the Lancer wiki page (http://wiki.hydrogenaudio.org/index.php?title=Lancer). Hope it helps.

Edit: stupid typo. Note to self: don't type something long while holding a lighted cigarette.

Title: Ogg Vorbis optimized for speed
Post by: Mr_Rabid_Teddybear on 2006-05-18 13:56:07

There are also Windows programs that easily tell you the processor instructions for your system; wcpuid (http://hp.vector.co.jp/authors/VA002374/src/download.html) and cpu-z (http://www.cpuid.com/cpuz.php)

Does anyone know of programs for Linux and Mac with a similar function?

Title: Ogg Vorbis optimized for speed
Post by: toot on 2006-05-18 14:02:39

Quote from: Mr_Rabid_Teddybear on 2006-05-18 13:56:07

There are also Windows programs that easily tell you the processor instructions for your system; wcpuid (http://hp.vector.co.jp/authors/VA002374/src/download.html) and cpu-z (http://www.cpuid.com/cpuz.php)

Does anyone know of programs for Linux and Mac with a similar function?

cat /proc/cpuinfo will work on Linux at least.. and maybe on Mac too since the newer OSs are Unix based I think..
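
On Linux, the "flags" line of /proc/cpuinfo lists sse and sse2 by name, while SSE3 shows up there as "pni". Note that /proc/cpuinfo is a Linux feature and is not available on Mac OS X. Below is a minimal sketch of checking one such line for a flag; has_cpu_flag() is a made-up helper, and reading the file itself is left out.

```c
#include <string.h>

/* Returns 1 if `name` appears as a whole word in a "flags" line taken
   from /proc/cpuinfo, 0 otherwise. has_cpu_flag() is a hypothetical helper. */
static int has_cpu_flag(const char *flags_line, const char *name)
{
    size_t len = strlen(name);
    const char *p = flags_line;

    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        int ends_word = p[len] == '\0' || p[len] == ' ' ||
                        p[len] == '\t' || p[len] == '\n';
        if (starts_word && ends_word)
            return 1;
        p += 1;   /* keep searching; "sse" also occurs inside "sse2" */
    }
    return 0;
}
```

The whole-word match matters: a plain substring search for "sse" would also succeed on a line that only contains "sse2".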

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-18 18:46:27

Info about wcpuid and cpu-z is now part of the Lancer page (http://wiki.hydrogenaudio.org/index.php?title=Lancer). I'm not into Linux or any Unix, so please complement the info there if need be. Thanx.

Title: Ogg Vorbis optimized for speed
Post by: tgb on 2006-05-25 06:16:58

I have an Athlon 64 X2 3800.
This new sse3 version is the cat's meow for fully utilizing both cores.
On comparison, I did find a strange phenomenon though.
Encode a whole album with 10 songs and the total time for the sse3mt version
took 10 seconds longer than if I boot two 2006/03/31 versions and oggdrop
5 songs in each simultaneously. Can anyone else reproduce this?
Is the threading overhead higher in the sse3mt version?
Not that I'm complaining mind you. The convenience factor is great with the sse3mt version!
Excellent work. The encoding speed is typically over 50x regardless!
My idea of a "killer app" here!

minor edit for grammar.

Title: Ogg Vorbis optimized for speed
Post by: toot on 2006-05-25 09:56:13

Quote from: tgb on 2006-05-25 06:16:58

I have an Athlon 64 X2 3800.
This new sse3 version is the cat's meow for fully utilizing both cores.
On comparison, I did find a strange phenomenon though.
Encode a whole album with 10 songs and the total time for the sse3mt version
took 10 seconds longer than if I boot two 2006/03/31 versions and oggdrop
5 songs in each simultaneously. Can anyone else reproduce this?
Is the threading overhead higher in the sse3mt version?
Not that I'm complaining mind you. The convenience factor is great with the sse3mt version!
Excellent work. The encoding speed is typically over 50x regardless!
My idea of a "killer app" here!

minor edit for grammar.

It's normal that using a multi-threaded app over two cores won't give you a 100% boost over using one core.. It's usually something like 70% faster. In fact, a mere 10 seconds difference for a whole album is very good!

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2006-05-25 16:25:42

Of course double speed is seldom possible using multithreading since many problems can't be parallelised (or whatever it's called) completely. But in the case of encoding a batch of files with OggdropXpd, wouldn't it make more sense then to run one normal encoding thread per core, because that's 100% parallelised? Of course, when only one file is encoded, the multithreaded version is (if above post is correct) only slightly slower, but I think if it gives a speed advantage (which is what Lancer builds are all about I believe) one could implement this parallel encoding into the frontend.
Hope I'm making sense, I need some sleep...

MedO
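
The point about incomplete parallelisation is usually expressed with Amdahl's law. A minimal sketch follows, assuming that simple model applies to the encoder at all; both function names are made up for illustration.

```c
/* Amdahl's law: speedup on `cores` cores when a fraction `p` (0..1)
   of the work can run in parallel. */
static double amdahl_speedup(double p, int cores)
{
    return 1.0 / ((1.0 - p) + p / (double)cores);
}

/* Inverse: the parallel fraction implied by a measured speedup.
   Requires cores > 1 and speedup > 0. */
static double parallel_fraction(double speedup, int cores)
{
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / (double)cores);
}
```

Plugging in toot's X2 4400+ numbers from earlier in the thread (2m 00.8s for sse3 versus 1m 28.0s for sse3mt, a speedup of about 1.37), parallel_fraction() gives roughly 0.54 for two cores. Two independent encoder processes on separate files correspond to p close to 1, which is why running one encoder per core can come out ahead for batches.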

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-05-29 16:04:40

Here we go: Lancer 20060529 Release (http://homepage3.nifty.com/blacksword/index.htm)

Changelog (by babelfish):

- Correcting the trouble of the decoding section.

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-05-30 06:40:07

Is there a chance that some version in the near future will work correctly when compiled with GCC (preferably the 4.x branch)?
The latest version that works correctly when compiled with gcc 3.3.6 is 20051121 (tested on an Athlon XP 2200+, which supports SSE only, under Ubuntu Linux). All versions after that one give a different bitrate in the generated .ogg (compared to the 'standard', aoTuV b4.51).
Ogg Vorbis is the standard lossy codec in the Linux world, and an SSE(2,3)-optimized version for Linux would be a great help to the community.

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2006-05-30 09:28:24

If the bitrate difference is only slight, this is probably normal. IIRC, even the P3-Optimised version of the original AoTuV-encoder gives slightly different results than the generic build.

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-05-30 12:35:03

The difference is significant, about 2-3 kbps for -q 3 (on my test.wav it gives 112.3 kbps instead of the 115.3 kbps of aoTuV 4.51).
I've compiled aoTuV with gcc 3.3/3.4/4.0, with and without compiler SSE optimization (-march=athlon-xp -mfpmath=sse), and the bitrate differed only in the hundredths. So this is definitely a bug.

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-05-30 14:02:01

Remember that standard aoTuV uses the FPU and Lancer uses SSE. They use different bit lengths to represent real numbers, and may thus produce different compression results.

If in doubt, ABX.
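
To illustrate the precision point: the x87 FPU can keep intermediate results in up to 80 bits, while SSE single-precision code works on 32-bit floats. The sketch below is only a portable stand-in for that effect (it compares a float accumulator with a double one, not x87 with SSE), and it says nothing about whether this explains the bitrate gap reported above. Both function names are made up.

```c
/* Add `value` to an accumulator `count` times, rounding the running
   total to single precision after every step (volatile forces the store). */
static float sum_single(float value, long count)
{
    volatile float total = 0.0f;
    long k;

    for (k = 0; k < count; k++)
        total = total + value;
    return total;
}

/* Same sum, but the running total is kept in double precision. */
static double sum_double(float value, long count)
{
    double total = 0.0;
    long k;

    for (k = 0; k < count; k++)
        total += (double)value;
    return total;
}
```

Summing 0.1f ten million times with each function gives clearly different totals, because the single-precision accumulator loses low-order bits once the total grows large.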

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-05-30 15:22:09

The already-compiled oggenc2/oggDropXPd builds for Windows give the same bitrate as unoptimized aoTuV, but they were compiled with MSVC or similar, not with GCC. I think GCC is simply untested with the new versions of Lancer.

Title: Ogg Vorbis optimized for speed
Post by: haregoo on 2006-06-15 16:47:00

Lancer 20060616 released.

Edit: Fixed bug in decoding(SSE2)

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-06-17 09:53:56

Translated by google:
2006/06/16 Lancer 20060616
In one for AMD CPU replacing the CPU distinction processing of the DLL file
Optimizing vorbis_oggpack_look with the inline assembler
Adding SSE3 optimization processing to _mm_add_horz*
SSE optimizing oggdec, it adds
Correcting the trouble of the SSE2 optimization of ov_read_float2pcm
The decoding section of oggdropXPd SSE optimization
Optimization profile for multithread operation for single thread and joint ownership conversion

Title: Ogg Vorbis optimized for speed
Post by: sony666 on 2006-07-14 11:20:25

Didn't use Vorbis for some time but now I needed an encoder for some previews and tried the Lancer (2006 06 16th) one.

The speed is just sick, thanks to all involved in that
Works great on my Athlon XP, normal SSE version.

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-07-14 19:14:45

LOL yeah I still got the warm-fuzzy-feeling every time I encode using Lancer

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-07-22 19:24:55

New release out today.

Code: [Select]

Changes:
* inline assembly replaces as much as possible to intrinsic
* abolish original memory transfer code in block.c
* bitreverse use looking up table
* fix speed down vorbis_book_decodevv_add's regression in lancer
  20060529
* remove optimization prevention code in vorbis_book_decodevv_add
* pre-calculate tables for triggers in mdct
* simplifying a code in which high frequency removed by mdct_backward
* add decode only funcs: mdct_butterflies_backward,
  dct_butterfly_first_backward
* improve SSE optimization: bark_noise_hybridmp
* add SSE optimization: render_line, vorbis_noise_normalize,
  _vp_noise_normalize
* add SSE3 optimization: mdct_bitreverse
* add pre-calculation code: seed_loop, max_seeds
* optimize: seed_chase
* add SORT16 to psy.c
* auto loop unrolling: SORT8, SORT32 in psy.c
* use lddqu in non SSE environment for unaligned memory load
* improve loop condition code in inline assembly code
* add t option for oggdec benchmarks (without outputting file)

(courtesy of pub at cyanet.jp)

Good to see the asm being replaced with intrinsics.
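
As an illustration of the "bitreverse use looking up table" item: the general technique is to precompute the reversal of every byte once and then assemble wider reversals from table lookups. The sketch below shows that idea only; it is not Lancer's code, and rev8_table, rev8_init() and reverse_bits32() are made-up names.

```c
#include <stdint.h>

static uint8_t rev8_table[256];   /* rev8_table[v] = v with its 8 bits reversed */
static int rev8_ready = 0;

/* Fill the table once; each entry is built bit by bit. */
static void rev8_init(void)
{
    int v, bit;

    for (v = 0; v < 256; v++) {
        uint8_t r = 0;
        for (bit = 0; bit < 8; bit++) {
            if (v & (1 << bit))
                r = (uint8_t)(r | (0x80 >> bit));
        }
        rev8_table[v] = r;
    }
    rev8_ready = 1;
}

/* Reverse all 32 bits: reverse each byte via the table and swap byte order. */
static uint32_t reverse_bits32(uint32_t x)
{
    if (!rev8_ready)
        rev8_init();
    return ((uint32_t)rev8_table[x & 0xFF] << 24) |
           ((uint32_t)rev8_table[(x >> 8) & 0xFF] << 16) |
           ((uint32_t)rev8_table[(x >> 16) & 0xFF] << 8) |
           (uint32_t)rev8_table[(x >> 24) & 0xFF];
}
```

The lazy initialisation here is not thread-safe; a multithreaded encoder would want to fill the table before starting its worker threads.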

Title: Ogg Vorbis optimized for speed
Post by: HotshotGG on 2006-07-22 19:57:08

Quote

* add SSE optimization: render_line, vorbis_noise_normalize,
_vp_noise_normalize

SSE optimizations to the noise normalization code ey? that's interesting. Must be very fast

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-07-24 21:44:53

Hope this guy will work on Theora or Dirac in the future!
I also hope to see the SSE/SSE2/SSE3 builds merged together, with the optimizations auto-selected on the fly (like FLAC...)
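
A merged build of that kind would need to check the CPU at startup and pick a code path. Here is a minimal sketch of that selection, assuming GCC or Clang on x86, where the __builtin_cpu_supports builtin is available in reasonably recent versions; other compilers would need their own CPUID query. The CPU_* constants and both function names are made up, and nothing here reflects how Lancer or FLAC actually do it.

```c
/* Hypothetical feature bits for this sketch. */
enum {
    CPU_SSE  = 1 << 0,
    CPU_SSE2 = 1 << 1,
    CPU_SSE3 = 1 << 2
};

/* Query the running CPU. On other compilers or architectures this
   sketch simply reports no features. */
static unsigned detect_features(void)
{
    unsigned f = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))  f |= CPU_SSE;
    if (__builtin_cpu_supports("sse2")) f |= CPU_SSE2;
    if (__builtin_cpu_supports("sse3")) f |= CPU_SSE3;
#endif
    return f;
}

/* Pick the most capable code path the CPU supports. */
static const char *pick_code_path(unsigned features)
{
    if (features & CPU_SSE3) return "sse3";
    if (features & CPU_SSE2) return "sse2";
    if (features & CPU_SSE)  return "sse";
    return "generic";
}
```

A real encoder would map the result to function pointers for its optimized routines rather than to a string; pick_code_path(detect_features()) is the call a program would make once at startup.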

Title: Ogg Vorbis optimized for speed
Post by: haregoo on 2006-08-01 22:09:27

Lancer 20060722 is temporarily unavailable due to a memory issue (unconfirmed).

Title: Ogg Vorbis optimized for speed
Post by: rudefyet on 2006-08-02 08:53:20

What kind of memory issue? I've been using 20060722 with no problems.

Title: Ogg Vorbis optimized for speed
Post by: Josef K. on 2006-08-02 11:59:54

Quote from: rudefyet on 2006-08-02 08:53:20

What kind of memory issue? I've been using 20060722 with no problems.

It's impossible to download from the page.
If someone could post Lancer 20060722 release (at least "oggenc2.83"), that would be great. Or just a link for dl, of course.

Title: Ogg Vorbis optimized for speed
Post by: haregoo on 2006-08-02 12:09:14

Lancer 20060802 (http://homepage3.nifty.com/blacksword/exprimental/index.htm) released.

This is a bug-fix release.
20060722 had a memory leak, according to the author.

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-08-03 17:36:29

And the crash issue of Lancer DLL is going to be fixed.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-05 18:23:34

Experimental (http://homepage3.nifty.com/blacksword/exprimental/index.htm) Lancer 20060806 is up.

Altavista says:
"Being heap memory access error occurs with vorbis_oggpack_write it abolishes, the optimization module in oggpack_write movement
oggpack_look SSE optimization of optimization
_ve_amp cash control processing of modification
accumulate_fit being imperfect with correction
MDCT-RELATED cash control rearranging unnecessary zero data exception processing the SSE optimization description section of deletion _encodepart with correction inspect_error"

Err.. right.

Title: Ogg Vorbis optimized for speed
Post by: skelly831 on 2006-08-05 19:09:10

Wow! Optimization of optimization, that's gotta be fast!

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2006-08-05 19:27:09

Quote from: skelly831 on 2006-08-05 19:09:10

Wow! Optimization of optimization, that's gotta be fast!

None of the many "recent" optimizations provided a major speedup for me. Maybe it's 20x encoding with an old lancer and 22x with the sse2-optimized version. I have a Celeron M 1400Mhz, so the MT speedups don't help here. Still, the speed is great. What are your experiences/speeds/setups? Just curious...

Title: Ogg Vorbis optimized for speed
Post by: moozooh on 2006-08-05 21:31:34

Quote from: MedO on 2006-08-05 19:27:09

What are your experiences/speeds/setups? Just curious...

I'm experiencing a speedup of about ~1.7x with Lancer 20060806 compared to the latest OggEnc from rarewares (average speed is something around 29x vs 17x, significantly depending on the sound material).

EDIT: My system is WinXP SP2 on an Athlon 64 3400+ (Venice).

Title: Ogg Vorbis optimized for speed
Post by: PrakashP on 2006-08-05 23:05:54

If anybody has a Core 2 Duo, I would be interested in how fast the SSE2 version is, as this new CPU doesn't break up SSE2 instructions into 2 parts and thus can compute them directly.

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2006-08-05 23:44:34

Athlon XP 3200+ (2.2GHz), SSE:

Lancer 20060616 - 37.85x
Lancer 20060805 - 38.56x

So it's about 1.9% faster, nice but hardly significant.
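For reference, the percentage follows directly from the two rates. A minimal Python sketch (the helper name `percent_faster` is made up for illustration, it is not part of any encoder tool):

```python
def percent_faster(old_rate: float, new_rate: float) -> float:
    """Relative speed gain of new_rate over old_rate, in percent."""
    return (new_rate / old_rate - 1.0) * 100.0

# Rates reported above for the Athlon XP 3200+ (SSE builds).
gain = percent_faster(37.85, 38.56)
print(f"{gain:.1f}% faster")
```

The same helper works for any pair of "Rate" values, as long as both come from the same file and settings.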

I also recall that a previous version of Lancer was about 0.5x faster than 06/16. As with Linux kernels, not every new version is faster on every system.

The SSE3 multithreading version should fly on Core2. I wouldn't be surprised if it encodes over 100x.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-06 15:09:32

Quote from: MedO on 2006-08-05 19:27:09

What are your experiences/speeds/setups? Just curious...

Machine: AMD Athlon64 X2 3800+ (2GHz) @ 2.4GHz
Track: Machinae Supremacy - Elite.wav (4m 24.0s)
Options: -q 5

OggEnc (vorbis-tools Rev.10381): 13.9284x (19s)
OggEnc v2.83 (Lancer [20060805](SSE3MT) based on aoTuV b4b): 57.605572x (4.594s)

Title: Ogg Vorbis optimized for speed
Post by: R.A.F. on 2006-08-06 18:29:37

It seems that something was wrong with the latest Lancer version (Lancer 20060802 (Based on aotuv-b4.51_20051117)) for Vorbis, because all of its files have been taken off the download server (struck out). See his page (http://homepage3.nifty.com/blacksword/).

Title: Ogg Vorbis optimized for speed
Post by: rudefyet on 2006-08-07 21:38:08

20060807 has been released on the main page

Quote

2006/08/07 Lancer 20060807

Correcting the SSE optimization of mdct_forward and mdct_backward
Only static edition reviving vorbis_oggpack_write
Correcting the problem of local_book_besterror_dim1x4

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-09 17:37:11

... and now it too is "crossed out".

"2006 August 9th

Continuing with oggdropXPd of Lancer, 20060807 when you encode, because the problem which becomes output of abnormal bit rate was discovered it stops release at one time."

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-08-11 12:17:37

Quote from: eloj on 2006-08-09 17:37:11

... and now it too is "crossed out".

"2006 August 9th

Continuing with oggdropXPd of Lancer, 20060807 when you encode, because the problem which becomes output of abnormal bit rate was discovered it stops release at one time."

Try a newer build.
http://homepage3.nifty.com/blacksword/exprimental/index.htm (http://homepage3.nifty.com/blacksword/exprimental/index.htm)

Title: Ogg Vorbis optimized for speed
Post by: rudefyet on 2006-08-12 06:02:42

And now 20060811 is released on the main page.

Quote

2006/08/11 Lancer 20060811

Correcting the SSE optimization of mdct_backward

2006/08/10 Lancer 20060810 (bit rate abnormal problem, for multithread operation problem evaluation)

Correcting the problem of the SSE optimization of _ve_amp
Correcting the problem where the pattern which each time differs in multithread operation edition is output

Title: Ogg Vorbis optimized for speed
Post by: sony666 on 2006-08-14 09:08:06

Maybe this was suggested before, but it would be beneficial to the project if he/she did the page in English, or, if that's a problem, asked a friend to translate it.

ありがとう - それは非常に速くある (Thank you - it is very fast)

Title: Ogg Vorbis optimized for speed
Post by: jarsonic on 2006-08-14 12:53:42

Quote from: sony666 on 2006-08-14 09:08:06

Maybe this was suggested before, but it would be beneficial to the project if he/she did the page in English, or, if that's a problem, asked a friend to translate it.

ありがとう - それは非常に速くある

http://translate.google.com/ (http://translate.google.com/)

Title: Ogg Vorbis optimized for speed
Post by: jarsonic on 2006-08-15 00:40:39

I think there's a problem with the multi-threaded versions of 2006/08/11 Lancer 20060811.

I'm on a Core Duo, and I've never had problems with the multithreaded releases up to this one. Now I'm seeing speeds drop to 1.x and 3.x, when it was going 40-50x before. The non-multithreaded compiles work fine (40-50x); it's only the SSE2 and SSE3 multithreaded builds that are screwed up. Can anyone confirm?

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-08-15 01:03:07

Quote from: jarsonic on 2006-08-15 00:40:39

Can anyone confirm?

On my system, the "sse3_mt_lancer20060811" just freezes up.

I've gone back to using the "sse3_mt_lancer20060807", which is working fine.

Title: Ogg Vorbis optimized for speed
Post by: jarsonic on 2006-08-15 01:07:35

Quote from: esa372 on 2006-08-15 01:03:07

Quote from: jarsonic on 2006-08-15 00:40:39
Can anyone confirm?
On my system, the "sse3_mt_lancer20060811" just freezes up.

I've gone back to using the "sse3_mt_lancer20060807", which is working fine.

is there an online archive of past releases, or did you just already have it on your system?

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-08-15 01:20:23

Quote from: jarsonic on 2006-08-15 01:07:35

is there an online archive of past releases, or did you just already have it on your system?

I keep several "layers" of past releases on my computer. I don't know if there's an online archive.

Here's a link for the August 07 release, if you need it:
(right-click -> "Save Target As...")

oggenc283_sse3mt_lancer20060807.zip (http://66.49.140.133/assets/ha/oggenc283_sse3mt_lancer20060807.zip)

Title: Ogg Vorbis optimized for speed
Post by: Yamabushi on 2006-08-15 04:55:12

I had problems with the latest MT as well and have reverted to an earlier version.

Cheers,
Pete

Title: Ogg Vorbis optimized for speed
Post by: bukem on 2006-08-15 12:38:36

Lancer 20060815 Experimental (http://homepage3.nifty.com/blacksword/exprimental/index.htm)

Babelfish translation:

Quote

It released started Lancer 20060815 for multithread operation problem verification at the laboratory.

Title: Ogg Vorbis optimized for speed
Post by: Patsoe on 2006-08-15 12:52:25

Quote from: sony666 on 2006-08-14 09:08:06

Maybe this was suggested before, but it would be beneficial to the project if he/she did the page in English, or, if that's a problem, asked a friend to translate it.

Quote

It released started Lancer 20060815 for multithread operation problem verification at the laboratory.

Lol, so much for Babelfish - Google Translate gives something similar.

It's really a problem that there's no translation available... this way we can't give the guy (girl?) feedback on how it runs/crashes on our systems, nor tell him he's doing cool and appreciated work (although he may guess that from the download numbers).

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-16 15:11:05

I think the author should spend some time, once the code is stable again, on getting it working on modern GCCs, for instance getting clean compiles under Linux w/ GCC 4.1.

I'd do it myself if I had the mad skillz, but I don't.

Title: Ogg Vorbis optimized for speed
Post by: Franklin on 2006-08-18 13:57:37

Lancer 20060818 (MT only) is out.

Changes:

Quote

It improves the multithread operation processing of mapping0_forward, increases the parallel processing section and accelerates
In order with coodbook.* to make the parallel processing of floor1_encode possible, mounting the delay collective entry function of the Ogg stream
Way floor1_encode can be executed while parallel processing, modification
_vp_couple it corresponds to parallel processing
At the time of profile measurement way it does not enter into the infinite loop, modification

I will encode my whole FLAC archive (400 CDs, about 150 GB) this weekend on my AMD X2 4400+ to Ogg at q6 or q7.

Best regards
Franklin

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-08-18 17:17:27

Quote from: PatchWorKs on 2006-08-18 17:15:07

I think that a bi-directional 2-pass MT encoder would be great.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-24 02:42:05

I talked a guy with a Core 2 Duo E6600 @ 3.1GHz into running an encode, and here's the result he reported (for -q 5?):

Code: [Select]

File: "M - Pop Muzik"

OggEnc v2.83 (Lancer [20060818](SSE3MT) based on aoTuV b4b)
 
File length: 5m 01,0s
Elapsed time: 0m 3,432s
Rate: 87,950145
Average bitrate: 193,7 kb/s

So these new CPUs seem to be quite the little SSE monsters.

Title: Ogg Vorbis optimized for speed
Post by: skelly831 on 2006-08-24 03:06:45

EDIT: that makes me feel ashamed of my newly acquired mid-range PC

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2006-08-24 03:09:39

Eloj, that's very interesting, awesome performance, shame it falls short of my 100x prediction. Perhaps at -q2.

Yet looking at the numbers I get the feeling the multithreading doesn't speed it up that much. Is it possible for you to test the non-MT version and see how much slower it is?

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2006-08-24 11:50:10

Quote from: HbG on 2006-08-24 03:09:39

Eloj, that's very interesting, awesome performance, shame it falls short of my 100x prediction. Perhaps at -q2.

Yet looking at the numbers I get the feeling the multithreading doesn't speed it up that much. Is it possible for you to test the non-MT version and see how much slower it is?

At these speeds I think Disk I/O can be a bottleneck...

Title: Ogg Vorbis optimized for speed
Post by: HbG on 2006-08-24 14:15:32

See the command I used for testing on the previous page. It doesn't write an output file to disk, and if you run it multiple times Windows will buffer the input file in memory. For me it leads to accurately repeatable results when you discount the first run.

Also, 100x is only 17.2 MB/s, which any vaguely modern hard drive can easily keep up with.
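The figure is easy to reproduce, assuming CD-format input (44.1 kHz, 16-bit, stereo), which the post does not state explicitly. It comes out at roughly 17 MB/s, with the exact digits depending on whether you count decimal or binary megabytes. A small Python sketch (`wav_read_rate` is a hypothetical helper name):

```python
def wav_read_rate(speed_factor, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Bytes of PCM input consumed per wall-clock second at a given encode speed."""
    return speed_factor * sample_rate * channels * bytes_per_sample

rate = wav_read_rate(100)
print(f"{rate / 1e6:.2f} MB/s")     # decimal megabytes
print(f"{rate / 2**20:.2f} MiB/s")  # binary mebibytes
```

Either way the input rate is well below what a sequential disk read delivers, which supports the point that the CPU, not the disk, is the limit here.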

Title: Ogg Vorbis optimized for speed
Post by: Franklin on 2006-08-24 14:16:59

Hi,

new releases out: 20060824

Recently I converted 400 CDs with Lancer at a speed of about 50x on my X2 4400+.

Best regards
Franklin

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-08-24 15:33:22

The difference seems to be that it's built on aotuv Release 1.

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-08-28 08:23:12

(babelfished) ChangeLog:

Quote

2006/08/24 Lancer 20060824

Based cord/code modification to aotuv-r1_20051117
Adding SSE optimization to _vp_couple
Adding the cord/code for multi channel processing divisions to xmmlib.h
At the time of OpenMP use the singles lead-lead _vp_quantize_couple_memo and _vp_quantize_couple_sort which are operational modification to multithread operation operation

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-09-03 19:47:38

Lancer 20060903 is out.

Title: Ogg Vorbis optimized for speed
Post by: Franklin on 2006-09-04 07:52:26

Changelog

Quote

2006/09/03 Lancer 20060903

Efficiency of the inline assembler cord/code for ICL detailed survey, deleting the slow part
Efficiency of cash control-related cord/code detailed survey, deleting the slow part
Efficiency of memory transfer type cord/code detailed survey and SSE optimization cord/code part revival
Improving the SSE optimization of bark_noise_hybridmp
Knocking down the renewal frequency of the lapse indication of oggenc2.

Regards
Franklin

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-09-07 11:00:56

Awesome... as always !

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-09-08 19:14:03

trying to use the latest & best Lancer is like joining your Build-of-the-day club...

Title: Ogg Vorbis optimized for speed
Post by: esa372 on 2006-09-17 00:33:53

Quote from: pepoluan on 2006-09-08 19:14:03

trying to use the latest & best Lancer is like joining your Build-of-the-day club...

You ain't kiddin'!

Lancer 2006 09-15 (http://homepage3.nifty.com/blacksword/) is out.

Title: Ogg Vorbis optimized for speed
Post by: jarsonic on 2006-09-17 01:54:25

Quote

2006/09/15 Lancer 20060915

Because binary for multithread operation from profile edition usually modification (the profile optimization effect at the time of MT is low in edition,)
Correcting the description mistake of the cord/code for MT of mapping0_forward
Executing loop unrolling with mdct_forward, mdct_backward and mdct_butterfly_generic under multithread operation environment

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2006-09-17 06:44:19

Quote from: esa372 on 2006-09-17 00:33:53

Quote from: pepoluan on 2006-09-08 19:14:03
trying to use the latest & best Lancer is like joining your Build-of-the-day club...
You ain't kiddin'!

Lancer 2006 09-15 (http://homepage3.nifty.com/blacksword/) is out.

uhhh ...

I haven't even yet unzipped the previous version... and now a new build...

*dies*

Not that I despise BlackSword and his (her?) attempts... domo arigato gozaimasu !

Title: Ogg Vorbis optimized for speed
Post by: skelly831 on 2006-09-17 06:56:58

Quote from: pepoluan on 2006-09-17 06:44:19

I haven't even yet unzipped the previous version... and now a new build...

LOL

Title: Ogg Vorbis optimized for speed
Post by: Squeller on 2006-09-17 10:56:12

No sse build?

Title: Ogg Vorbis optimized for speed
Post by: rt87 on 2006-09-17 11:02:56

Quote from: Squeller on 2006-09-17 10:56:12

No sse build?

It looks like the 20060915 build is an MT-only bugfix build.

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2006-10-05 16:18:15

2006/10/05 Lancer 20061005:
- Updated ICL to 9.1.030
- Improved MT optimization code for mapping0_forward
- Tweaked compile options
- Suppressed some compiler warnings
- Discontinued GCC support

This release is memorable for me, as this binary (with -q4) runs faster than 100x on my new machine.
http://nyaochi.sakura.ne.jp/encoder-benchm...t-20061005.html (http://nyaochi.sakura.ne.jp/encoder-benchmark/result-20061005.html)

Many thanks to 637 (Blacksword) for the brilliant achievement!

Title: Ogg Vorbis optimized for speed
Post by: guruboolez on 2006-10-05 16:33:17

This is really impressive. I remember the old days (pre-RC3 encoder) when Vorbis was painfully slow: 1.5x max on my Duron 800, 3 to 4 times slower than Musepack (not present in this big benchmark), and the same speed as LAME --alt-preset extreme.

Title: Ogg Vorbis optimized for speed
Post by: iGold on 2006-10-05 17:35:51

Quote from: nyaochi on 2006-10-05 16:18:15

2006/10/05 Lancer 20061005:
- Discontinued GCC support

Sad to read, but it's more honest, since a number of previous versions did not work correctly when built with GCC.
But oggenc2.exe will work under Wine anyway.

Title: Ogg Vorbis optimized for speed
Post by: eloj on 2006-10-05 18:13:59

Man, that truly sucks. This should be written with GCC intrinsics, not ICC. Anyone tried building it with the linux version of ICC?

Title: Ogg Vorbis optimized for speed
Post by: VEG on 2006-10-06 11:09:34

Quote

This release is memorable for me, as this binary (with -q4) runs faster than 100x on my new machine.

My congratulations to you!

Title: Ogg Vorbis optimized for speed
Post by: Franklin on 2006-10-16 11:37:07

2006/10/13 Lancer 20061013
Correcting the problem of the memory management cord/code

Regards
Franklin

Title: Ogg Vorbis optimized for speed
Post by: de Mon on 2006-10-20 16:08:34

Hmm. On my AMD 2400 it is slower (by 1x-2x) than the 2005 11 21 version.
Is that normal?

Title: Ogg Vorbis optimized for speed
Post by: maacruz on 2006-10-28 20:33:32

Quote from: eloj on 2006-10-05 18:13:59

Man, that truly sucks. This should be written with GCC intrinsics, not ICC. Anyone tried building it with the linux version of ICC?

Agreed

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-11-03 11:52:27

New version out (20061103), here's the -babelfished- changelog:

Quote

Based cord/code modification to aotuv-b5_20061024
Modifying the SSE optimization of _vp_offset_and_mix, _vp_noise_normalize_sort and _vp_couple
ICL in 9.1.032 version rise
Correcting the description mistake of the optimization cord/code

website (http://homepage3.nifty.com/blacksword/index.htm)

Title: Ogg Vorbis optimized for speed
Post by: Franklin on 2006-11-11 13:47:09

Release 20061110 is out:

Quote

2006/11/03 Lancer 20061110

Correcting the trouble which cannot encode the monaural sound source in multithread operation edition.
Improving the SSE optimization of _vp_couple.
Modifying the calculation which disperses the load at the time of the multithread operation of _vp_couple.
Reducing the cord/code of _vp_offset_and_mix.

Regards
Franklin

Title: Ogg Vorbis optimized for speed
Post by: dariju on 2006-11-11 14:28:58

I've made a little comparison between standard

OggEnc Win32 aoTuV beta5 2006/11/11
and
oggenc283_sse3mt_lancer20061110

Results are quite impressive:

Code: [Select]

c:\ogg>oggenc -q2 "thom yorke - harrowdown hill.wav"
Opening with wav module: WAV file reader
Encoding "thom yorke - harrowdown hill.wav" to
         "thom yorke - harrowdown hill.ogg"
at quality 2,00
        [100,0%] [ 0m00s remaining] -

Done encoding file "thom yorke - harrowdown hill.ogg"

        File length:  4m 41,0s
        Elapsed time: 0m 15,0s
        Rate:         18,7956
        Average bitrate: 92,6 kb/s


c:\ogg>oggenc2 -q2 "thom yorke - harrowdown hill.wav"
Opening with wav module: WAV file reader
Encoding "thom yorke - harrowdown hill.wav" to
         "thom yorke - harrowdown hill.ogg"
at quality 2,00
        [100,0%] [ 0m00s remaining] \

Done encoding file "thom yorke - harrowdown hill.ogg"

        File length:  4m 41,0s
        Elapsed time: 0m 2,834s
        Rate:         99,482475
        Average bitrate: 92,6 kb/s

Lancer is about 5.3 times faster...
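To compare two runs without eyeballing the numbers, the "Rate:" lines can be pulled out of the two summaries and divided. A small Python sketch: `parse_rate` is a made-up helper, and the comma handling only reflects the decimal commas seen in the output above, not any documented oggenc behaviour.

```python
import re

def parse_rate(summary: str) -> float:
    """Extract the 'Rate:' value from an oggenc summary, accepting ',' or '.' decimals."""
    match = re.search(r"Rate:\s*([0-9]+(?:[.,][0-9]+)?)", summary)
    if match is None:
        raise ValueError("no 'Rate:' line found")
    return float(match.group(1).replace(",", "."))

# Summaries copied from the two runs above.
stock = """
        File length:  4m 41,0s
        Elapsed time: 0m 15,0s
        Rate:         18,7956
        Average bitrate: 92,6 kb/s
"""
lancer = """
        File length:  4m 41,0s
        Elapsed time: 0m 2,834s
        Rate:         99,482475
        Average bitrate: 92,6 kb/s
"""

ratio = parse_rate(lancer) / parse_rate(stock)
print(f"Lancer/stock speed ratio: {ratio:.2f}")
```

The match is case-sensitive on purpose, so the "Average bitrate:" line is not picked up by mistake.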

Title: Ogg Vorbis optimized for speed
Post by: Seimour on 2006-11-12 11:51:20

Does anyone know where to get one of these optimized Vorbis builds for Linux? Can this patch (http://homepage3.nifty.com/blacksword/patch_vorbis_lancer20061110.zip) be applied to Vorbis' source code and then be built? Using which compiler?

Thanks in advance!

Title: Ogg Vorbis optimized for speed
Post by: nyaochi on 2006-11-13 12:21:45

Quote from: dariju on 2006-11-11 14:28:58

Lancer is about 5.3 times faster...

Yeah, I wanted to modify the title of this thread but couldn't.

Quote from: Seimour on 2006-11-12 11:51:20

Does anyone know where to get one of these optimized Vorbis builds for Linux? Can this patch (http://homepage3.nifty.com/blacksword/patch_vorbis_lancer20061110.zip) be applied to Vorbis' source code and then be built? Using which compiler?

Intel C/C++ compiler. I'm not sure whether it can be compiled with the Linux version of the compiler.

Title: Ogg Vorbis optimized for speed
Post by: Firon on 2006-11-13 21:38:57

He dropped GCC support a while back, didn't he? Or was it just code that's untested on gcc?

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2006-11-29 09:42:19

Quote from: Gabriel on 2006-11-29 09:02:58

Current Lame versions (3.98) can be compiled in 64bits mode, that is how I am using it most of the time.
Using VC8 as a compiler, it increases encoding speed by about 20% compared to 32bits mode.

Title: Ogg Vorbis optimized for speed
Post by: Josiah McGuckin on 2006-12-22 23:48:22

While trying to sound as un-redundant as possible, I observed speeds as high as 24.2x (primarily in the range of 16-20x), compared to only 10.0'ish from what I remember last... and while I didn't run actual comparisons of the files using any kind of sophisticated process, I did ABX and compare bitrates/qualities of files using both aotuv-b5 and the latest lancer build and was unable to distinguish between the two. Beautiful ... thanks guys!

Title: Ogg Vorbis optimized for speed
Post by: ckjnigel on 2007-01-14 23:06:34

Steve Jobs disappointed me by not coming out with a 100 GB iPod to hold the 9,800 M4As I'd spent since Christmas transcoding from FLAC to Nero M4As.
So I've set about re-encoding to Ogg Vorbis at q 4.5 in the hope that this will make them fit on my 60 GB Cowon iAudio.
It's proceeding right now on the Sony Vaio 2 GHz Core Duo notebook I bought Tuesday.
OMG, is it ever fast with the SSE3 MT build! I'm surprised how much faster it is than using SSE2 optimization on the Athlon64 3300+ desktop under Win X64.
Thanks so much to all the developers!

Title: Ogg Vorbis optimized for speed
Post by: vinnie97 on 2007-01-14 23:57:23

Good choice, more people should give Steve Jobs the finger.

Title: Ogg Vorbis optimized for speed
Post by: PatchWorKs on 2007-02-09 14:39:02

New Vorbis optimization project here (http://softlab.technion.ac.il/project/OggVorbis/html/index.htm), check it out !

Title: Ogg Vorbis optimized for speed
Post by: Firon on 2007-02-09 21:25:23

Someone test it and tell us how it compares to Lancer.

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2007-02-09 23:26:03

Quote from: Firon on 2007-02-09 21:25:23

Someone test it and tell us how it compares to Lancer.

They tuned Xiph Vorbis 1.0.1 and boast a performance increase of 18%. I can't seem to find a date on their page or in the documentation. Also, the download size for the "binaries" package is an impressive 90MiB. Nothing to see here, move along...

Title: Ogg Vorbis optimized for speed
Post by: Mangix on 2007-02-10 00:21:02

I tested it, and with their Intel-optimized binary the included radio.wav file took 45 seconds to encode at -q10. The non-Intel binary took 49.

I also tested Lancer's builds (the SSE2 threaded one) and got 22 seconds.

As MedO said, nothing to see here.

Title: Ogg Vorbis optimized for speed
Post by: rjamorim on 2007-02-10 00:37:16

Quote from: ckjnigel on 2007-01-14 23:06:34

Steve Jobs disappointed me by not coming out with a 100 GB iPod to hold the 9,800 M4As I'd spent since Christmas transcoding from FLAC to Nero M4As.

Honestly, do you need to carry 9800 songs with you?

Title: Ogg Vorbis optimized for speed
Post by: vinnie97 on 2007-02-10 16:48:08

Need isn't as significant as the ability to do so. It gives one the greatest variety of music when not tethered to their PC.

Title: Ogg Vorbis optimized for speed
Post by: singaiya on 2007-02-10 16:59:16

Some people travel and are not at home for weeks or months at a time. iTunes says my 55 GB "checked playlist" lasts 30.7 days. It was enough the last time I was out of town for 3 weeks, and it's nice to know I didn't run out of music.

Title: Ogg Vorbis optimized for speed
Post by: HydroFred on 2007-02-11 12:00:05

How much faster are SSE2 and SSE3 versions of lancer compared to SSE? My CPU only supports SSE, and I would like to know how much boost I can expect from a CPU upgrade.

Are the MT versions ~twice as fast on a dual core cpu, e.g. Core2Duo / Athlon64 X2 ?

FLAC -> OGG conversion runs at 21x on my system (Athlon XP-M 2600+ @ 10x200), how much can I expect from an Athlon64 X2 3800+ ?

Title: Ogg Vorbis optimized for speed
Post by: haregoo on 2007-02-11 12:47:51

Quote from: HydroFred on 2007-02-11 12:00:05

How much faster are SSE2 and SSE3 versions of lancer compared to SSE?

SSE2/3 matter less for Lancer, but MT makes encoding ~1.4 times faster.
Benchmark on Athlon64 X2 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=29161&view=findpost&p=390175) and Core2Duo (http://nyaochi.sakura.ne.jp/encoder-benchmark/result-20061103.html) FYI.

Lancer's MT makes use of up to 2 cores per encode. If you have a quad core, you have to run 2 instances at a time, though Lancer is fast enough anyway.
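One way to read the ~1.4x figure is through Amdahl's law. This is only a back-of-the-envelope model and not something the Lancer author states: if a fraction p of the work runs in parallel on n cores, the speedup is 1 / ((1 - p) + p / n). Inverting that for 1.4x on 2 cores gives a rough idea of how much of the encoder is parallelized (function names below are illustrative):

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

def parallel_fraction_from_speedup(speedup: float, cores: int) -> float:
    """Invert Amdahl's law: fraction of work that must be parallel to see `speedup`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

p = parallel_fraction_from_speedup(1.4, cores=2)
print(f"parallel fraction ~ {p:.2f}")
print(f"check: {amdahl_speedup(p, 2):.2f}x on 2 cores")
```

Under this model a bit more than half of the per-file work would be running in parallel, with the rest still serial.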

Title: Ogg Vorbis optimized for speed
Post by: MedO on 2007-02-11 13:25:23

Quote from: haregoo on 2007-02-11 12:47:51

Quote from: HydroFred on 2007-02-11 12:00:05

How much faster are SSE2 and SSE3 versions of lancer compared to SSE?

SSE2/3 matter less for Lancer, but MT makes encoding ~1.4 times faster.
Benchmark on Athlon64 X2 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=29161&view=findpost&p=390175) and Core2Duo (http://nyaochi.sakura.ne.jp/encoder-benchmark/result-20061103.html) FYI.

Lancer's MT makes use of up to 2 cores per encode. If you have a quad core, you have to run 2 instances at a time, though Lancer is fast enough anyway.

I kind of wonder why you don't just run two single-threaded encoders instead of the multithreaded one. Usually you'll encode multiple files anyway. That way it'd be ~2x as fast instead of 1.4x...
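A rough sketch of that approach in Python, assuming a single-threaded `oggenc2` binary is on the PATH and takes `-q` as in the runs posted earlier in this thread. The function names and the injectable `run` parameter are illustrative only, not part of any existing tool:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_command(wav_path, quality=5, encoder="oggenc2"):
    """Command line for one encode; the encoder name is an assumption."""
    return [encoder, "-q", str(quality), wav_path]

def encode_all(wav_paths, workers=2, quality=5, run=subprocess.run):
    """Run one encoder process per file, at most `workers` at a time.

    Threads are enough here: each one only waits on an external process.
    Returns the commands that were run, in input order.
    """
    commands = [build_command(path, quality) for path in wav_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any encoder failure.
        list(pool.map(lambda cmd: run(cmd, check=True), commands))
    return commands

# Example (not executed here): encode_all(["a.wav", "b.wav"], workers=2)
```

With one process per core, each file still encodes at single-threaded speed, but total throughput scales with the number of files in flight.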

Title: Ogg Vorbis optimized for speed
Post by: pepoluan on 2007-02-12 16:03:13

Quote from: MedO on 2007-02-11 13:25:23

I kind of wonder why you don't just run two single-threaded encoders instead of the multithreaded one. Usually you'll encode multiple files anyway. That way it'd be ~2x as fast instead of 1.4x...

We techheads will do that. But simpler users (i.e. the overwhelming majority of PC users) tend to encode one at a time.

Title: Ogg Vorbis optimized for speed
Post by: Farch on 2007-08-28 12:39:23

Yeah, new problems are coming up: there are 4-core CPUs on the market, and soon there will be 8. We need a solution, something like:
1. Lancer needs to check how many cores are in the system
2. use them all
3. it would be a bad idea to limit it to 8 cores... (maybe that is not the end?)
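Until the encoder itself does this, steps 1 and 2 can be approximated from the outside by a wrapper script. A minimal Python sketch, assuming (as stated earlier in the thread) that one MT encode keeps up to 2 cores busy; `instances_to_run` is a hypothetical helper and says nothing about how Lancer works internally:

```python
import os

def instances_to_run(cores_per_encode=2, total_cores=None):
    """How many encoder processes to start so that all cores are busy."""
    if total_cores is None:
        total_cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, total_cores // cores_per_encode)

print(instances_to_run())               # depends on the machine
print(instances_to_run(total_cores=4))  # quad core, 2 cores per MT encode
```

Nothing in this is capped at 8 cores; with single-threaded builds, pass `cores_per_encode=1` instead.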

Title: Ogg Vorbis optimized for speed
Post by: slav!x on 2007-08-28 15:19:22

Quote from: Farch on 2007-08-28 12:39:23

Yeah, new problems are coming up: there are 4-core CPUs on the market, and soon there will be 8. We need a solution, something like:
1. Lancer needs to check how many cores are in the system
2. use them all
3. it would be a bad idea to limit it to 8 cores... (maybe that is not the end?)

and add SSE5 and SSE6 support!!!

Title: Ogg Vorbis optimized for speed
Post by: MickeyP on 2008-02-20 16:40:10

Does anyone know if there are lancer static-built binaries for windows I can download anywhere? This would be very useful, thanks.

Title: Ogg Vorbis optimized for speed
Post by: dutch109 on 2008-02-20 18:52:15

Quote from: MickeyP on 2008-02-20 16:40:10

Does anyone know if there are lancer static-built binaries for windows I can download anywhere? This would be very useful, thanks.

http://homepage3.nifty.com/blacksword/index.htm (http://homepage3.nifty.com/blacksword/index.htm)

http://translate.google.com/translate?u=ht...Flanguage_tools (http://translate.google.com/translate?u=http%3A%2F%2Fhomepage3.nifty.com%2Fblacksword%2Findex.htm&langpair=ja|en&hl=en&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools) (same page translated by Google)

Title: Ogg Vorbis optimized for speed
Post by: MickeyP on 2008-02-20 20:10:30

Can someone who owns the Intel compiler please build and make available Windows static library binaries (static-link libraries (*.lib) that do not depend on any ogg/vorbis DLL at run time)? Ideally these would be compiled using the release multi-threaded version of MSVCRT, thanks!

Title: Re: Ogg Vorbis optimized for speed
Post by: VEG on 2016-08-16 16:29:43

Lancer's web page is no longer available, but you can use a copy on the Web Archive:
https://web.archive.org/web/20100217183320/http://homepage3.nifty.com/blacksword/
The fastest versions:
https://web.archive.org/web/20100217183320/http://homepage3.nifty.com/blacksword/oggenc283_sse3_lancer20061110.zip (SSE3, single threaded)
https://web.archive.org/web/20100217183320/http://homepage3.nifty.com/blacksword/oggenc283_sse3mt_lancer20061110.zip (SSE3, multi threaded)

Still the fastest Ogg Vorbis encoder. What a pity that the author abandoned this project.

Title: Re: Ogg Vorbis optimized for speed
Post by: saratoga on 2016-08-16 18:03:56

I think very few people care about audio encoder speed in the era of fast multicore processors.

Title: Re: Ogg Vorbis optimized for speed
Post by: bat_guano on 2016-08-17 09:39:25

Quote from: saratoga on 2016-08-16 18:03:56

I think very few people care about audio encoder speed in the era of fast multicore processors.

Hi
Currently I'm using...

Quote

$ oggenc -v
OggEnc v2.83 (Lancer [Nov 14 2009](SSE) based on aoTuV b5)

With a single-core machine.
Is there anything faster out there - for LINUX?

Title: Re: Ogg Vorbis optimized for speed
Post by: VEG on 2016-08-17 10:44:03

Quote from: saratoga on 2016-08-16 18:03:56

I think very few people care about audio encoder speed in the era of fast multicore processors.

I'd rather wait 3-4 times less for almost the same quality.

oggenc_general_x64 - http://www.rarewares.org/files/ogg/oggenc2.88-1.3.5-x64.zip
oggenc_lancer_sse3mt - https://web.archive.org/web/20100217183320/http://homepage3.nifty.com/blacksword/oggenc283_sse3mt_lancer20061110.zip

Code: [Select]

C:\Users\VEG\Desktop>oggenc_general_x64 -q 0 test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 0.00
        [100.0%] [ 0m00s remaining] -

Done encoding file "test.ogg"

        File length:  9m 51.0s
        Elapsed time: 0m 12.0s
        Rate:         49.2648
        Average bitrate: 51.8 kb/s


C:\Users\VEG\Desktop>oggenc_lancer_sse3mt -q 0 test.wav
Opening with wav module: WAV file reader
Encoding "test.wav" to
         "test.ogg"
at quality 0.00
        [100.0%] [ 0m00s remaining] -

Done encoding file "test.ogg"

        File length:  9m 51.0s
        Elapsed time: 0m 3.841s
        Rate:         153.912430
        Average bitrate: 57.6 kb/s

12.0 seconds vs. 3.8 seconds for one song.

I'm using it to encode music for my phone.

Title: Re: Ogg Vorbis optimized for speed
Post by: saratoga on 2016-08-17 15:54:45

You mean from multithreading? I can get an 8-10x speed up using foobar and the stock encoder with no loss of quality. I don't think using the mt version makes sense.

Anyway, if you are interested in encoding speed, you should work on it. Multithreading may not make sense, but further x64 asm or SSE intrinsics would likely help a lot.

Title: Re: Ogg Vorbis optimized for speed
Post by: dutch109 on 2016-08-17 22:54:55

If you are interested in an up to date & fast (faster than stock libvorbis) Vorbis encoder, check out https://hydrogenaud.io/index.php/topic,109766.0.html

Title: Re: Ogg Vorbis optimized for speed
Post by: bat_guano on 2016-08-17 23:56:22

Quote from: bat_guano on 2016-08-17 09:39:25

Quote from: saratoga on 2016-08-16 18:03:56
I think very few people care about audio encoder speed in the era of fast multicore processors.
Hi
Currently I'm using...
Quote
$ oggenc -v
OggEnc v2.83 (Lancer [Nov 14 2009](SSE) based on aoTuV b5)

With a single-core machine.
Is there anything faster out there - for LINUX?

Bump

Title: Re: Ogg Vorbis optimized for speed
Post by: bat_guano on 2016-08-29 21:18:31

Quote from: bat_guano on 2016-08-17 23:56:22

Quote from: bat_guano on 2016-08-17 09:39:25
Quote from: saratoga on 2016-08-16 18:03:56
I think very few people care about audio encoder speed in the era of fast multicore processors.
Hi
Currently I'm using...
Quote
$ oggenc -v
OggEnc v2.83 (Lancer [Nov 14 2009](SSE) based on aoTuV b5)

With a single-core machine.
Is there anything faster out there - for LINUX?
Bump

Has this thread died, or is it a very very difficult question?

Title: Re: Ogg Vorbis optimized for speed
Post by: saratoga on 2016-08-29 21:42:35

I think dutch109 answered that question anyway with a link to faster builds for single core machines.

Quote from: bat_guano on 2016-08-29 21:18:31

Quote from: bat_guano on 2016-08-17 23:56:22
Quote from: bat_guano on 2016-08-17 09:39:25
Quote from: saratoga on 2016-08-16 18:03:56
I think very few people care about audio encoder speed in the era of fast multicore processors.
Hi
Currently I'm using...
Quote
$ oggenc -v
OggEnc v2.83 (Lancer [Nov 14 2009](SSE) based on aoTuV b5)

With a single-core machine.
Is there anything faster out there - for LINUX?
Bump
Has this thread died, or is it a very very difficult question?

Did you see dutch109's post above? It has a link to newer, more optimized builds than you are running.

Title: Re: Ogg Vorbis optimized for speed
Post by: bat_guano on 2016-08-29 22:56:37

Quote from: saratoga on 2016-08-29 21:42:35

Did you see dutch109's post above? It has a link to newer, more optimized builds than you are running.

I don't see a link to optimized linux builds, it is a thread about patches.

Title: Re: Ogg Vorbis optimized for speed
Post by: greynol on 2016-08-29 23:11:23

https://www.freac.org/en

Title: Re: Ogg Vorbis optimized for speed
Post by: VEG on 2016-09-15 06:56:20

Quote from: saratoga on 2016-08-29 21:42:35

It has a link to newer, more optimized builds than you are running.

Unfortunately, it only uses the Lancer name; it runs much slower than the original Lancer. Quality is almost the same, as all Vorbis encoders provide almost the same quality. libvorbis doesn't use patches from the latest aoTuV versions because the improvements are barely perceptible, if they are perceptible at all. I don't think the libvorbis authors would ignore aoTuV patches if the improvements were indisputable.

Title: Re: Ogg Vorbis optimized for speed
Post by: saratoga on 2016-09-15 16:22:14

What is the difference in speed between the current and original builds?

Title: Re: Ogg Vorbis optimized for speed
Post by: dutch109 on 2016-09-16 20:24:23

Quote from: VEG on 2016-09-15 06:56:20

libvorbis doesn't use patches from the latest aoTuV versions because the improvements are barely perceptible, if they are perceptible at all. I don't think the libvorbis authors would ignore aoTuV patches if the improvements were indisputable.

They might also ignore aoTuV changes because they disagree with the results they produce; see the changelog from libvorbis 1.3.2:

Quote

...
* vorbisenc: Back out an [old] AoTuV HF weighting that was
first enabled in 1.3.0; there are a few samples where I
really don't like the effect it causes.
...

https://github.com/xiph/vorbis/blob/master/CHANGES#L36

Title: Re: Ogg Vorbis optimized for speed
Post by: ani_Jackal3 on 2020-10-21 13:29:40

Quote from: dutch109 on 2016-09-16 20:24:23

They might also ignore aoTuV changes because they disagree with the results they produce; see the changelog from libvorbis 1.3.2

I noticed bitrate bloat too: some samples average 200 to 385 kbps on aoTuV, yet libvorbis stays within 100 to 190 kbps with zero effect on sound quality?
