Personal listening test at 48 kbps: Opus, Exhale, HE-AAC


2020-10-20 18:10:29

RESULTS

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 60
Critical significance:  0.05
Significance of data: 0.00E+000 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total              299         455.02
Testers (blocks)    59          57.64
Codecs eval'd        4         299.02   74.75   179.36  0.00E+000
Error              236          98.36    0.42
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.232

Means:

HIGH     USAC     OPUS     HE-AAC   LOW      
  4.74     3.08     3.07     2.69     1.64   

---------------------------- p-value Matrix ---------------------------

         USAC     OPUS     HE-AAC   LOW      
HIGH     0.000*   0.000*   0.000*   0.000*   
USAC              0.910    0.001*   0.000*   
OPUS                       0.001*   0.000*   
HE-AAC                              0.000*   
-----------------------------------------------------------------------

HIGH is better than USAC, OPUS, HE-AAC, LOW
USAC is better than HE-AAC, LOW
OPUS is better than HE-AAC, LOW
HE-AAC is better than LOW
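The derived columns of the ANOVA table above can be rechecked from the printed sums of squares. The snippet below is only a sketch of that arithmetic, not the FRIEDMAN tool's own code; the variable names are mine, and it assumes that the 60 "listeners" are really the 60 samples of this one-person test (40 music + 20 speech), each acting as a block.

```python
# Recompute the derived columns of the overall ANOVA table from the
# sums of squares printed in the FRIEDMAN output above.
n_blocks = 60    # "listeners" in the output; assumed here to be the 60 samples
n_codecs = 5     # HIGH, USAC, OPUS, HE-AAC, LOW

ss_total = 455.02
ss_blocks = 57.64
ss_codecs = 299.02

# Degrees of freedom for a randomized block design.
df_total = n_blocks * n_codecs - 1
df_blocks = n_blocks - 1
df_codecs = n_codecs - 1
df_error = df_total - df_blocks - df_codecs

# The error term is whatever blocks and codecs do not explain.
ss_error = ss_total - ss_blocks - ss_codecs
ms_codecs = ss_codecs / df_codecs
ms_error = ss_error / df_error
f_stat = ms_codecs / ms_error

print("df:", df_total, df_blocks, df_codecs, df_error)
print("SS error:", round(ss_error, 2))
print("F:", round(f_stat, 2))
```

The degrees of freedom and the F value should land on the figures in the table (299 / 59 / 4 / 236 and roughly 179), up to rounding of the printed sums of squares.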

• HE-AAC : Statistically inferior to both OPUS and USAC (Exhale). Generally speaking, it suffers from SBR artifacts (sizzling noise on sibilants, tonal or high-pitched sounds) and from smearing (pre-echo). Good stereo image. The sound can be very pleasing, but SBR is both the cure and the poison: it saves a lot of bits which can be spent elsewhere, but it introduces those SBR artifacts.

• OPUS : Tied for best with USAC. It suffers from a puffy sound: coarse, hissy, messy. It also has a narrowed stereo image, audible on many samples, and it introduces some artifacts of its own. It performed very well with audiobooks and really poorly with classical music (an issue I also noticed in the 64 kbps test a few months ago).

• USAC: The Exhale encoder (32000 Hz) doesn't support SBR. It seems a bit naked to handle such strong bitrate starvation, but surprisingly it performs very well and competes with OPUS. It also sounded better than HE-AAC in each of the six sample groups (Billboard charts, classical music, HA.org, problem samples, audiobooks and movies). IgorC reached the same conclusion a few weeks ago. USAC has some issues at this bitrate: a slightly narrowed stereo image on some samples, a metallic sound, which was more irritating, smearing, and other common lossy artifacts.

• LOW ANCHOR: I was looking for something that wasn't too lowpassed. HE-AACv2 (with parametric stereo) at 24 kbps has a 15 kHz lowpass and seemed to be a good choice. I chose the FDK encoder, which is CBR only in this mode and at this bitrate. I made a mistake and used it at 16 kbps for the 20 speech files. The main quality issues were narrow and sometimes fanciful stereo imaging. It also adds a metallic sound, and the effect increased a lot with the 16 kbps encoding (lowpass at 12 kHz).

• HIGH ANCHOR: I used MP3 LAME VBR ~130 kbps for my last comparison at 64 kbps, and it wasn't strong enough to play the high anchor role. This time I chose something more robust and used the winner of my AAC test: Apple's encoder, set to Constrained VBR (CVBR) targeting ~128 kbps. Overall quality is very good, not always fully transparent but rarely annoying.

Now let's look in detail at how the encoders perform by sample genre.

MUSIC ONLY (40 samples)

SPEECH ONLY (20 samples)


•••••••••••••••••••••••••
•MUSIC ONLY (40 samples)•
•••••••••••••••••••••••••

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 40
Critical significance:  0.05
Significance of data: 0.00E+000 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total              199         294.66
Testers (blocks)    39          50.23
Codecs eval'd        4         177.09   44.27   102.55  0.00E+000
Error              156          67.34    0.43
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.290

Means:

HIGH     USAC     OPUS     HE-AAC   LOW      
  4.65     3.06     2.88     2.56     1.78   

---------------------------- p-value Matrix ---------------------------

         USAC     OPUS     HE-AAC   LOW      
HIGH     0.000*   0.000*   0.000*   0.000*   
USAC              0.229    0.001*   0.000*   
OPUS                       0.031*   0.000*   
HE-AAC                              0.000*   
-----------------------------------------------------------------------

HIGH is better than USAC, OPUS, HE-AAC, LOW
USAC is better than HE-AAC, LOW
OPUS is better than HE-AAC, LOW
HE-AAC is better than LOW



•••••••••••••••••••••••••••
•AUDIOBOOKS & MOVIES GROUP•
•••••••••••••••••••••••••••

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 20
Critical significance:  0.05
Significance of data: 0.00E+000 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total               99         158.45
Testers (blocks)    19           5.50
Codecs eval'd        4         129.51   32.38   105.00  0.00E+000
Error               76          23.44    0.31
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.350

Means:

HIGH     OPUS     USAC     HE-AAC   LOW      
  4.91     3.45     3.13     2.93     1.36   

---------------------------- p-value Matrix ---------------------------

         OPUS     USAC     HE-AAC   LOW      
HIGH     0.000*   0.000*   0.000*   0.000*   
OPUS              0.077    0.004*   0.000*   
USAC                       0.247    0.000*   
HE-AAC                              0.000*   
-----------------------------------------------------------------------

HIGH is better than OPUS, USAC, HE-AAC, LOW
OPUS is better than HE-AAC, LOW
USAC is better than LOW
HE-AAC is better than LOW

• MUSIC: OPUS and USAC are tied. Both are better than HE-AAC, which is better than the LOW ANCHOR (HE-AACv2 at 24 kbps).

• SPEECH: OPUS and USAC are tied, but only OPUS is better than HE-AAC. HE-AAC and USAC are tied in this group. As a reminder, the LOW ANCHOR here is HE-AACv2 at only 16 kbps (my mistake).
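The "is better than" lines in the three outputs can be reproduced from the means and Fisher's protected LSD alone: two codecs are separated when their means differ by more than the LSD. Below is a rough sketch of that rule, not the FRIEDMAN tool itself; the `better_than` helper and the group labels are my own names, and because it works on the rounded means printed above, a borderline pair could in principle come out differently from the tool's own result.

```python
from itertools import combinations

# Means and Fisher's protected LSD copied from the three FRIEDMAN outputs.
groups = {
    "ALL": ({"HIGH": 4.74, "USAC": 3.08, "OPUS": 3.07, "HE-AAC": 2.69, "LOW": 1.64}, 0.232),
    "MUSIC": ({"HIGH": 4.65, "USAC": 3.06, "OPUS": 2.88, "HE-AAC": 2.56, "LOW": 1.78}, 0.290),
    "SPEECH": ({"HIGH": 4.91, "OPUS": 3.45, "USAC": 3.13, "HE-AAC": 2.93, "LOW": 1.36}, 0.350),
}

def better_than(means, lsd):
    """Map each codec to the codecs it beats by more than the LSD."""
    ranked = sorted(means, key=means.get, reverse=True)
    wins = {name: [] for name in ranked}
    for a, b in combinations(ranked, 2):  # a is always ranked above b
        if means[a] - means[b] > lsd:
            wins[a].append(b)
    return wins

results = {label: better_than(means, lsd) for label, (means, lsd) in groups.items()}

for label, wins in results.items():
    print(label)
    for codec, beaten in wins.items():
        if beaten:
            print(f"  {codec} is better than {', '.join(beaten)}")
```

For the speech group this gives OPUS ahead of HE-AAC and LOW but USAC ahead of LOW only, which is the tie between USAC and HE-AAC described above.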

DETAILED RESULTS FOR MUSIC

TEST FAQ

Are these samples difficult ones?

Some of them. The 10 classical samples are musical parts I really enjoy. There was no selection based on difficulty. The 10 “Billboard” samples were picked indiscriminately: 30 seconds taken from the same range of each track (1 min 00 sec to 1 min 30 sec). But the HA.org samples may have been selected by members for their possible difficulty. And a final group of 10 samples is intentionally a possible “killer-samples” group. These samples and this selection were already used for previous tests I made this year.
For this test I added 20 more samples: 12 for audiobooks (6 female, 6 male: German, Italian, Spanish, English, French, Korean, Dutch) and 8 for movies. Audiobooks are lossless sourced (no MP3 CD). Movies are Blu-ray sourced (48000 Hz, 24 bit, lossless), downmixed to stereo. They were converted to 16 bit for the blind test (ABC/HR refused to open 24 bit WAV/FLAC).

All samples can be downloaded at: https://www.dropbox.com/sh/zjephy3g54j4gur/AACjGhM9tabl26n7s4ihYl2Ra?dl=0

Methodology

Java ABC/HR is my software of choice. Volume was normalized and delay removed within foobar2000. Everything was decoded and converted to WAV files using foobar2000, with its resampler set to 48000 Hz. It’s a blind test but with no ABX sessions. I’d rather test a wide set of samples non-extensively and let small ranking errors average out statistically than spend a lot of time on a smaller set of samples. My hardware setup is very basic: laptop headphone output (no DAC), AKG Q701 headphones, moderate listening volume.

EDIT: I ranked 265 files (score < 5.0) and made no mistakes (i.e. I never ranked the reference instead of the encoded file). The final score is therefore 265/265, and the probability of guessing is very close to zero.
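To put a number on "very close to zero", here is a back-of-the-envelope sketch. It assumes that a pure guesser would pick the encoded file over the hidden reference with probability 1/2 on each of the 265 rated files, independently; that model is my assumption, not something ABC/HR reports.

```python
from fractions import Fraction

n_rated = 265  # files given a score below 5.0, all correctly identified

# Chance of getting every one right by coin-flip guessing (assumed model).
p_guess_all = Fraction(1, 2) ** n_rated

print(f"{float(p_guess_all):.2e}")
```

The result is on the order of 1e-80, so chance can safely be ruled out as an explanation for the 265/265 score.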

Encoders setup

• HE-AAC: encoded through foobar2000's converter (AAC Apple Graphical Interface): CVBR 48 in HE profile. Metadata: qaac 2.67, CoreAudioToolbox 7.10.9.0, AAC-HE Encoder, CVBR 48kbps, Quality 96
• HIGH ANCHOR: encoded through foobar2000's converter (AAC Apple Graphical Interface): CVBR 128 in LC profile. Metadata: qaac 2.67, CoreAudioToolbox 7.10.9.0, AAC-LC Encoder, CVBR 128kbps, Quality 96
• LOW ANCHOR: encoded through foobar2000's converter (AAC FDK Graphical Interface): CBR 24 and accidentally 16. Metadata: fdkaac 1.0.0, libfdk-aac 4.0.0, CBR 24kbps (and 16kbps)
• OPUS: encoded through foobar2000 command line encoder: VBR 48 (--quiet --bitrate 48 --vbr --ignorelength - %d). Metadata: libopus 1.3-26-ge85ed772, libopusenc 0.2.1-2-g9cb17c6
• USAC: encoded through foobar2000 command line encoder: mode 0 (0 %d). Metadata: none. Encoder is exhale-1.0.7-9323a9d0 from 2020.09.28
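For anyone wanting to repeat the two command-line encodes outside foobar2000, the parameter strings above can be expanded as in the sketch below. The parameter strings are copied from this post, where `%d` is foobar2000's placeholder for the destination file. The executable names `opusenc` and `exhale`, the `build_command` helper and the output file names are assumptions for illustration. The sketch only builds the argument lists; it does not run anything, and the WAV data would still have to be piped to the encoder's standard input, as foobar2000 does.

```python
import shlex

# Parameter strings copied from the encoder setup above.
PARAMS = {
    "opus": "--quiet --bitrate 48 --vbr --ignorelength - %d",
    "usac": "0 %d",
}

# Assumed executable names; adjust them to your own installation.
EXECUTABLES = {"opus": "opusenc", "usac": "exhale"}

def build_command(codec, destination):
    """Expand the foobar2000-style parameter string into an argv list."""
    tokens = shlex.split(PARAMS[codec])
    args = [destination if tok == "%d" else tok for tok in tokens]
    return [EXECUTABLES[codec]] + args

print(build_command("opus", "sample.opus"))
print(build_command("usac", "sample.m4a"))
```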

TABLE

(with the nice help of kamedo2 and his fantastic tool!!! thanks again)

Re: Personal listening test at 48 kbps: Opus, Exhale, HE-AAC

Reply #1 – 2020-10-20 21:02:22

Quote from: fabiorug on 2020-10-20 20:42:47

How good Opus 1.3.1-26 is? Can you send me a build, I want to compare to opus 1.3.1-78 latest release appveyor Rillian opustools and opusrug (I'm the one who edited the code).

I've attached the version I used for this test.

Re: Personal listening test at 48 kbps: Opus, Exhale, HE-AAC

Reply #2 – 2020-10-20 21:12:32

Quote from: fabiorug on 2020-10-20 21:11:35

RP have you a link of your test with 1.3.1 stock at 96 kbps or other bitrates?

Sorry, I don't understand.

Re: Personal listening test at 48 kbps: Opus, Exhale, HE-AAC

Reply #3 – 2020-10-20 21:46:25

Quote from: fabiorug on 2020-10-20 21:26:14

try opusrug

From your topic:

Quote

Opusrug uses lowest possible bitrate while having highest sound quality, good for high quality melodic music. I don't guarantee high quality, you should use for music at least 64 kbps (though it's optimized for 56-80 kbps),

Quote

this would be unfortunately my last update, as I don't plan changing the quality, add psychoacoustic. Also it isn't meant for low bitrates like 46 kbps.

I don't see why I should test Opusrug, which is not intended to be useful at 48 kbps, which is not intended for voice, and which is no longer developed anyway. It's an already dead project and was never appropriate for what I have tested here. These are your words (which I'm far from understanding properly, I must confess).

Re: Personal listening test at 48 kbps: Opus, Exhale, HE-AAC

Reply #4 – 2020-10-20 22:55:57

@guruboolez
Now that's a really useful test.
Glad to see, for the first time in years, a new encoder (exhale) which reaches the quality of Opus and has every chance of getting even better in the future.

Quote from: fabiorug on 2020-10-20 21:29:46

it's deprecated on the opustools appveyor, 6 months have passed.

You are imagining things.
I've already told you that the last changes related to audio quality of Opus were done in 2018. No quality changes since then.
Cut your cheap talk.

Re: Personal listening test at 48 kbps: Opus, Exhale, HE-AAC

Reply #5 – 2020-11-14 09:58:56

Hi folks, very good tests! It is nice to see we have a future king of the AAC family.

One note from my side. I must say I am very far from being a guru, but I like music and have good equipment to test some samples here. The most interesting sample for me was "changes", because I have ABXed similar samples before ("changes" reminds me of the well-known Eig sample). I tested this sample with recent opus and exhale compiles (1.0.8, mode 0, 32 kHz). Guru rated OPUS at 1.5 and exhale at 3.5, but to my ears the "sand fall" artifact produced by exhale is much more irritating than the OPUS distortions.

If anyone is interested in my encoded samples, they are here
Thanks
