I noticed something weird today: encoding the same FLAC audio file with opusenc, once on x86_64 and once on an ARM platform, gives very different results:
opusenc --vbr --bitrate 128
x86:
Encoding using libopus 1.2 (audio)
-----------------------------------------------------
Input: 44.1kHz 2 channels
Output: 2 channels (2 coupled)
20ms packets, 128kbit/sec VBR
Preskip: 356
Encoding complete
-----------------------------------------------------
Encoded: 6 minutes and 27.4 seconds
Runtime: 4 seconds
(96.85x realtime)
Wrote: 6616545 bytes, 19370 packets, 390 pages
Bitrate: 135.597kbit/s (without overhead)
Instant rates: 1.2kbit/s to 254.4kbit/s
(3 to 636 bytes per packet)
Overhead: 0.76% (container+metadata)
ARM:
Encoding using libopus 1.2 (audio)
-----------------------------------------------------
Input: 44.1kHz 2 channels
Output: 2 channels (2 coupled)
20ms packets, 128kbit/sec VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 6 minutes and 27.38 seconds
Runtime: 52 seconds
(7.45x realtime)
Wrote: 6024359 bytes, 19369 packets, 390 pages
Bitrate: 123.401kbit/s (without overhead)
Instant rates: 1.2kbit/s to 257.2kbit/s
(3 to 643 bytes per packet)
Overhead: 0.813% (container+metadata)
I can maybe understand that behaviour is not entirely deterministic, but is it normal that the rate-control decisions are so very different?
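As a sanity check on the numbers above, the "without overhead" bitrates can be reproduced from the other summary lines. A minimal sketch in Python; the function name is made up for illustration, and it assumes that every packet is 20 ms long and that the reported overhead percentage applies to the total file size. The percentages are rounded in the output, so the result is only approximate.

```python
FRAME_SECONDS = 0.020  # opusenc reported "20ms packets"


def payload_kbps(total_bytes, packets, overhead_percent):
    """Approximate the 'without overhead' bitrate from opusenc's summary lines."""
    duration = packets * FRAME_SECONDS                   # seconds of audio
    payload_bytes = total_bytes * (1 - overhead_percent / 100)
    return payload_bytes * 8 / duration / 1000           # kbit/s


# Figures copied from the two runs above.
x86 = payload_kbps(6616545, 19370, 0.76)
arm = payload_kbps(6024359, 19369, 0.813)
gap_percent = 100 * (x86 - arm) / x86

print(f"x86: {x86:.1f} kbit/s, ARM: {arm:.1f} kbit/s, gap: {gap_percent:.1f}%")
```

Both results land within a few bit/s of what opusenc printed, so the gap of roughly 9% is in the encoded audio itself and not in the container or metadata.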
I assume the difference is due to fixed-point vs floating-point, but that being said it still looks a little large... Can you upload the file you're encoding somewhere? Or at least a short segment that clearly shows the behaviour?
Sure, I'll upload the song somewhere... I'll send you the link via PM
Where did you get your ARM build? Can you try both --complexity 9 and --complexity 10 on ARM?
Compiled it myself directly from source (with gcc 6.3.0)
This is the result with complexity 9:
Notice: Using resampling with complexity<10.
Opusenc is fastest with 48, 24, 16, 12, or 8kHz input.
Encoding using libopus 1.2.1 (audio)
-----------------------------------------------------
Input: 44.1kHz 2 channels
Output: 2 channels (2 coupled)
20ms packets, 128kbit/sec VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 6 minutes and 27.38 seconds
Runtime: 55 seconds
(7.043x realtime)
Wrote: 6024359 bytes, 19369 packets, 390 pages
Bitrate: 123.401kbit/s (without overhead)
Instant rates: 1.2kbit/s to 257.2kbit/s
(3 to 643 bytes per packet)
Overhead: 0.813% (container+metadata)
Is this difference audible? If so, is there any way to test the fixed-point version of the Opus encoder on an x64 CPU? I'd like to listen to the fixed-point output so that I can confirm that there are no serious artifacts. I have more than 1000 test tracks.
I'm not sure... probably not at these bitrates. I don't have good enough equipment or hearing in any case (never been good at ABX-ing)... maybe if I find a similar disparity at much lower bitrates (I'll check it out later).
I'm not making any claims about quality, just wanted to report on this oddity that I noticed.
I don't know much about ARM, but why isn't the ARM implementation using floating point? Don't most modern ARM chips have hardware floating-point operations?
Well, many people compile fixed-point on ARM, but by default (no configure option), it's going to be floating point. Can you give me the exact options you used for both libopus and opus-tools? Also, just check --complexity 10 in case opusenc somehow defaulted to 9.
I think the complexity setting works fine. The output is identical with --comp 10 and without it (in terms of file size and bitrate... I think the md5sum only differs because of the encoding options stored in the comment tag)
Are you talking about the configure options? It's a Gentoo distribution, so I gotta check the ebuild files:
for libopus:
INTRINSIC_FLAGS="cpu_flags_x86_sse cpu_flags_arm_neon"
<snip>
for i in ${INTRINSIC_FLAGS} ; do
use ${i} && myeconfargs+=( --enable-intrinsics --enable-update-draft)
opus-tools:
src_configure() {
econf $(use_with flac)
Just flac support... are there any other relevant configure options here? I don't think opus-tools is invoking
Or did you mean the compiler options? Here they are anyway:
CHOST="armv7a-hardfloat-linux-musleabi"
CC=gcc
CXX=g++
CFLAGS="-O2 -pipe -fomit-frame-pointer -march=native -mtune=native"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,--hash-style=gnu,-O1"
It sounds to me like you can't reproduce this behaviour? I forgot to mention (sorry!) this system is running musl libc... maybe that's causing it somehow?
I attached two builds. Same packages; the only difference is libopus-0.dll.
Yeah, I'm having problems reproducing... Can you try converting the file to a 48 kHz raw file and using opus_demo to encode it? See the same problem?
Many thanks. I will test this.
Interesting... in order to investigate this further I compiled opus-tools and all dependencies with minimal CFLAGS, debug optimizations/symbols and linked it statically: the behaviour disappeared. Now I've just got to bisect the issue until I find which option causes this strange behaviour... It's probably going to take a while.
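The bisection described above could be organised along these lines. This is only a sketch in Python under two assumptions: a single flag is responsible on its own, and `shows_bug` is a hypothetical callback that rebuilds with the given flags on top of a known-good baseline, reruns the encode, and reports whether the bitrate is off again. The flag list and the fake callback in the demo are purely illustrative.

```python
def find_culprit(flags, shows_bug):
    """Return the one flag that triggers the bug by halving the candidate set.

    Assumes exactly one flag is responsible and that it misbehaves on its own.
    `shows_bug(subset)` is a hypothetical callback: rebuild with `subset`,
    rerun the encode, return True if the problem is back.
    """
    candidates = list(flags)
    if not shows_bug(candidates):
        raise ValueError("bug does not reproduce with the full flag set")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the problem.
        candidates = half if shows_bug(half) else candidates[len(half):]
    return candidates[0]


# Illustrative demo with a fake callback instead of real rebuilds.
flags = ["-O2", "-pipe", "-fomit-frame-pointer", "-march=native",
         "-mtune=native", "-funsafe-math-optimizations"]
culprit = find_culprit(flags, lambda subset: "-funsafe-math-optimizations" in subset)
print(culprit)
```

With n flags this needs about log2(n) rebuilds instead of n, which matters when each test means recompiling libc.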
Well... this is embarrassing. Turns out that compiling your entire system with "-funsafe-math-optimizations" is actually... for lack of a better term... unsafe! Who could have known?! Certainly not this idiot trying to nickel-and-dime his CPU for every last bit of performance.
This mode enables optimizations that allow arbitrary reassociations and transformations with no accuracy guarantees. It also does not try to preserve the sign of zeros.
libopus is actually fine with this optimization, but my libc (when it is compiled with this flag) - I guess - returns wrong results for some math operations invoked by opus. Lesson learned!
One more thing: Now the bitrates are almost identical, but the preskip values are still different. Is this normal?
libopus uses transcendental functions (exp, sin/cos, ...) that are implemented in libm and rely on IEEE math even though libopus itself is designed to be safe with unsafe optimizations.
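To illustrate why code that relies on IEEE semantics can go wrong once reassociation is allowed and signed zeros are ignored, here is a small sketch. It uses Python as a stand-in for IEEE double arithmetic and writes both evaluation orders out by hand; it is not code from libopus or musl, and the values are chosen only to make the effect obvious.

```python
import math

a, b, c = 1e16, -1e16, 1.0

left_to_right = (a + b) + c   # the order the source code asks for
reassociated = a + (b + c)    # a rewrite the compiler may pick once reassociation is allowed

# The two orders disagree because b + c rounds back to -1e16.
print(left_to_right, reassociated)

# Signed zero: -0.0 compares equal to 0.0 but still carries a sign,
# and functions such as atan2 give different answers depending on it.
neg_zero = -0.0
print(neg_zero == 0.0)
print(math.copysign(1.0, neg_zero))
print(math.atan2(0.0, 0.0), math.atan2(0.0, neg_zero))
```

A math library whose internals depend on a particular evaluation order or on the sign of zero can therefore return different results once the compiler is free to ignore both.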
No, that's not normal. What preskip values are you getting with the different setups?
Strange... this is what I'm seeing now after fixing (I hope) the math problems:
x86_64:
Encoding using libopus 1.2.1 (audio)
-----------------------------------------------------
Input: 44.1kHz 2 channels
Output: 2 channels (2 coupled)
20ms packets, 128kbit/sec VBR
Preskip: 356
Encoding complete
-----------------------------------------------------
Encoded: 6 minutes and 27.4 seconds
Runtime: 6 seconds
(64.57x realtime)
Wrote: 6618802 bytes, 19370 packets, 390 pages
Bitrate: 135.644kbit/s (without overhead)
Instant rates: 1.2kbit/s to 254.4kbit/s
(3 to 636 bytes per packet)
Overhead: 0.759% (container+metadata)
ARM:
Encoding using libopus 1.2.1 (audio)
-----------------------------------------------------
Input: 44.1kHz 2 channels
Output: 2 channels (2 coupled)
20ms packets, 128kbit/sec VBR
Preskip: 312
Encoding complete
-----------------------------------------------------
Encoded: 6 minutes and 27.38 seconds
Runtime: 53 seconds
(7.309x realtime)
Wrote: 6620372 bytes, 19369 packets, 390 pages
Bitrate: 135.683kbit/s (without overhead)
Instant rates: 1.2kbit/s to 254.8kbit/s
(3 to 637 bytes per packet)
Overhead: 0.759% (container+metadata)
356 vs 312 samples. All with the same track that I sent you earlier.
Oh that... make sure your versions match. The exact behaviour of the resampler (chopping samples in the encoder vs in the decoder) changed recently, and for 44.1 kHz, 356 is what you get before the change and 312 is what you get after it. So you're probably just using different versions.
Eh, it's both libopus 1.2.1 though?! ... *shrug*
Edit: Oh, wait, you're talking about opus-tools aren't you? I didn't even notice that I was running different versions of that...
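For checking the preskip of an existing file without re-encoding, the value can be read straight from the OpusHead packet on the first Ogg page. A minimal sketch in Python, assuming the layout given in RFC 7845 (pre-skip is a little-endian 16-bit count of samples at 48 kHz); the function names are made up for illustration, and the demo uses synthetic bytes instead of a real file.

```python
import struct


def first_ogg_packet(data: bytes) -> bytes:
    """Return the payload of the first Ogg page (the OpusHead packet in Ogg Opus)."""
    if data[:4] != b"OggS":
        raise ValueError("not an Ogg stream")
    n_segments = data[26]                     # segment count follows the 26 fixed header bytes
    lacing = data[27:27 + n_segments]         # one length byte per segment
    start = 27 + n_segments
    return data[start:start + sum(lacing)]


def parse_opus_head(packet: bytes) -> dict:
    """Decode the fixed part of an OpusHead packet (RFC 7845, section 5.1)."""
    if len(packet) < 19 or packet[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    version, channels, pre_skip, input_rate, gain, mapping = struct.unpack_from(
        "<BBHIhB", packet, 8)
    return {
        "version": version,
        "channels": channels,
        "pre_skip": pre_skip,                 # samples at 48 kHz
        "pre_skip_ms": pre_skip / 48.0,
        "input_sample_rate": input_rate,
        "output_gain": gain,
        "mapping_family": mapping,
    }


# Synthetic first page carrying a header like the ARM run above (preskip 312).
head = b"OpusHead" + struct.pack("<BBHIhB", 1, 2, 312, 44100, 0, 0)
page = b"OggS" + bytes([0, 2]) + bytes(20) + bytes([1, len(head)]) + head
info = parse_opus_head(first_ogg_packet(page))
print(info["pre_skip"], info["pre_skip_ms"])
```

On a real file the same two calls would be applied to the first few hundred bytes read from disk. In these terms 312 samples is 6.5 ms and 356 samples is about 7.4 ms.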