
Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

15 years ago I made a huge listening test at ∼130 kbps, which guided me to Apple’s AAC for my portable usage. Those results are now totally outdated. AAC was a state-of-the-art audio format in 2005, but it’s now an old one. In the meantime, OPUS was born and immediately appeared as the most efficient encoder in public listening tests. And for my own needs (I mainly listen to classical music), xHE-AAC has already shown its strength over OPUS at 64 kbps!

My knowledge was never properly updated since that big test. I made a few sparse blind tests but the results were not helpful. As I no longer knew what to use, I often kept untouched lossless encodings for my digital devices these last years, or some high-bitrate AAC encodings to reduce size a bit. I was even caught sending high-resolution lossless files at 3000 kbps to my phone!
 
Of course, storage space is obviously not the same issue in 2020 as it was in 2005. While my small DAP and then my iPod Classic were entirely filled with tiny MP3 and AAC files only, my memory cards must now host HD movies, complete series, Netflix’s cache, hundreds of 12- to 48-megapixel pictures, games and applications. More storage today, but so much digital stuff to manage… For this reason, I may be inclined to stick again with an efficient audio encoder and save space rather than lazily opt for lossless or an unnecessarily high bitrate setting. But that means going back to work and doing a big blind listening test again :)
 
The questions I’m trying to answer are: what is the most suitable format for my own needs, and what are the most efficient settings that are still enjoyable? How far can I go below ∼130 kbps and still stay close to transparency?
 
For this test I compared AAC and OPUS at different bitrates: around 80, 96, 112 and 128 kbps. I even added AAC at ∼144 kbps. And to these nine encodings I also added a low anchor. That makes ten encodings per sample! Ideally, I’d like to put exhale xHE-AAC in the same pool but that would make too many things to test at the same time. Moreover, as I said, this test has a “practical” purpose and at the moment xHE-AAC is not really usable on an Android/portable platform. I’ll nevertheless try a similar comparison with exhale next year.
 
TESTED SAMPLES

The full picture will consist of a 150-sample listening test: 75 samples of classical music (a set I already used this year for a multiformat test at 64 kbps and for a 128 kbps AAC test) and 75 additional samples of other kinds of music. The first part is already completed. I expect to perform the second part in the next few weeks. In the meantime, I decided to reveal the results of the first round.
These samples are well-known musical moments or parts I really enjoy. Some are very easy to encode and could be transparent even at 64 kbps, some others are much harder. I have deliberately chosen to not overrepresent problem samples and stay closer to average music I listen to on a daily basis.

The tested samples are available >here<.
 
ENCODERS AND SETTINGS
The last official releases of Apple AAC and OPUS were used:
Apple AAC 7.10.9.0: -q36, -q45, -q54, -q64, -q73 [expected bitrate: 80, 96, 112, 128 and 144 kbps]
Opus 1.3.1 from official website: --bitrate 80, 96, 112, 128
FDK 4.0.0 HE-AACv2 at CBR 32 kbps as low anchor

I used True VBR (TVBR) mode instead of Constrained VBR (CVBR) for both formats (TVBR is recommended for both: >source< and >source<). I chose Apple’s AAC implementation over Fraunhofer’s according to the final result of my previous listening test.
 
 
BITRATE TABLE

Before beginning this test, I had to measure the average bitrate of each VBR setting. From experience I know that average bitrate may vary from one musical genre to another: the average values I get with my classical library are radically different from those of other musical genres. With FLAC, the bitrate goes from ∼580 kbps with classical music to often beyond 1000 kbps with louder hard rock/metal music. While variations are not so huge with lossy encoders, they exist and may affect the fairness of the comparison.

Therefore, I built a large test library of more than 240 albums: 91 classical music albums, a personal selection of 25 jazz albums, 25 metal albums and 25 electronic albums. I also added all 50 best-selling albums in the USA and the first 25 discs from the Billboard hottest sales of the last decade. My data now includes a lot of popular stuff and is therefore more reliable and, I hope, more useful than ever. The whole set now makes 255 CDs for 247 hours of music.
A full detailed list is available as a work-in-progress datasheet:

https://docs.google.com/spreadsheets/d/18lGNoBB0ZB2A4SGAw3_z-5L7gzl54MG3D_V7tf2jII8/edit?usp=sharing
 
For more comfort, here is a summary of the tested settings with 247 hours of music:

(average bitrate in kbps)
GENRE            AAC q36  AAC q45  AAC q54  AAC q64  AAC q73  OPUS 80  OPUS 96  OPUS 112  OPUS 128
AVERAGE          76.7     96.6     111.8    126.4    143.9    83.5     100.4    117.1     133.8
CLASSICAL        75.6     96.8     112.7    127.2    142.2    85.8     103.4    119.3     136.8
JAZZ             80.8     100.5    115.7    131.0    149.4    85.4     102.1    119.3     135.8
METAL            76.0     96.6     111.7    126.7    145.5    77.1     93.3     109.2     125.2
ELECTRONIC       77.9     97.1     112.1    126.5    144.2    87.5     105.5    123.2     140.7
50 BEST-SELLING  76.6     96.6     111.7    126.7    145.4    81.9     98.5     115.2     131.7
25 BILLBOARD     73.4     91.6     106.0    119.4    136.3    83.1     99.7     116.4     132.8

As you can see, bitrate is globally comparable from one setting to another but OPUS uses 4 to 7 kbps more at similar VBR settings. Unlike Apple’s AAC, OPUS VBR settings can be finely adjusted and could match its competitor more closely. But I decided to keep the most usual values and not lower the OPUS bitrate (e.g. --bitrate 128 instead of --bitrate 122). This slight surplus also convinced me to add AAC at ∼144 kbps, which is sometimes close to OPUS 128 (142 kbps for AAC and 137 for OPUS with classical music, and 144 vs 141 with electronic music). I also expect OPUS to be more efficient than AAC, so comparing it at ~130 kbps vs AAC at 144 kbps makes sense to me.
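For anyone who wants to check the "4 to 7 kbps" figure, here is a minimal Python sketch that recomputes the OPUS surplus from the AVERAGE row of the bitrate table. The pairing of settings (q36 with 80, q45 with 96, and so on) follows the nominal targets used in this test; the variable and function names are only illustrative.

```python
# Average bitrates (kbps) over the whole 247-hour library, copied from the AVERAGE row.
aac_avg = {"q36": 76.7, "q45": 96.6, "q54": 111.8, "q64": 126.4, "q73": 143.9}
opus_avg = {80: 83.5, 96: 100.4, 112: 117.1, 128: 133.8}

# Settings paired by nominal target bitrate, as in the test.
pairs = [("q36", 80), ("q45", 96), ("q54", 112), ("q64", 128)]


def surplus(pairs, aac, opus):
    """Return (aac_setting, opus_setting, extra kbps, extra percent) for each pair."""
    rows = []
    for q, b in pairs:
        diff = opus[b] - aac[q]
        rows.append((q, b, round(diff, 1), round(100 * diff / aac[q], 1)))
    return rows


rows = surplus(pairs, aac_avg, opus_avg)
for q, b, diff, pct in rows:
    print(f"AAC {q} vs OPUS {b}: OPUS uses {diff:+.1f} kbps ({pct:+.1f}%)")
```

Swapping in another row of the table (CLASSICAL, METAL…) shows how much the surplus depends on the genre.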

 
RESULTS

 
LOW ANCHOR  AAC q36  AAC q45  AAC q54  AAC q64  AAC q73  OPUS 80  OPUS 96  OPUS 112  OPUS 128
1.59        3.01     3.94     4.33     4.66     4.85     3.40     4.02     4.33      4.61

 
Beautiful graph, isn’t it :)

I start with an issue: the venerable ff123’s Friedman analysis tool only works with up to 8 competitors. Fortunately, kamedo2’s brilliant graph generator handles this without an issue! I simply drew some lines in a picture editor to see the confidence limits at the top of the graph better:
 

AAC 144 is better than OPUS 128 and AAC 128 (confidence >95%)
AAC 128 is better than OPUS 112 and AAC 112 (confidence >95%)
OPUS 128 is still tied with OPUS 112 and AAC 112 from a statistical point of view (95% confidence is not reached but it seems to be between 90 and 95%, so I guess it’s safe to claim that OPUS 128 is superior to OPUS 112 and AAC 112).
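For readers unfamiliar with the Friedman analysis mentioned above, here is a rough Python sketch of the basic Friedman statistic: each sample's ratings are turned into ranks, and the rank sums per encoder are compared. This is not the code of ff123's tool or of kamedo2's generator, it applies no tie correction, and the pairwise confidence comparisons are not reproduced. The ratings in `demo` are invented for illustration and are not data from this test.

```python
def average_ranks(scores):
    """Rank one sample's scores (1 = lowest), giving tied scores their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(ratings):
    """ratings: one row per sample, one column per encoder.

    Returns the Friedman chi-square statistic (without tie correction).
    """
    n = len(ratings)       # number of samples
    k = len(ratings[0])    # number of encoders
    rank_sums = [0.0] * k
    for row in ratings:
        for col, r in enumerate(average_ranks(row)):
            rank_sums[col] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# HYPOTHETICAL ratings (5 samples x 3 encoders), invented for illustration only.
demo = [
    [3.0, 4.0, 4.5],
    [2.5, 4.2, 4.8],
    [3.2, 3.9, 5.0],
    [2.8, 4.1, 4.6],
    [3.5, 4.0, 4.9],
]
q = friedman_statistic(demo)
print(f"Friedman chi-square = {q:.2f} with {len(demo[0]) - 1} degrees of freedom")
```

The statistic is then compared with a chi-square distribution with k - 1 degrees of freedom; a large value means at least one encoder is ranked consistently differently from the others.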

 
AAC performs surprisingly well here with classical music. Except at 80 kbps, Apple’s AAC encoder offers performance comparable to OPUS for my taste and for these first 75 samples dedicated to classical music. And AAC gets this excellent performance with a slightly lower bitrate! I must say that I’m a bit disappointed by OPUS, which I expected to perform better, but not totally surprised either: I already noticed that OPUS at 64 kbps was much less impressive with classical than with other musical genres (score: 3.31 for classical and 4.06 otherwise). I’m quite sure that OPUS will end with a better score (and AAC with a lesser one) with the second group of samples.

What also surprises me is AAC 144 kbps being statistically better than AAC 128. The latter being close to transparency on almost everything in daily playback, I couldn’t imagine a consistently better performance. But those extra bits are most often noticeable, on direct comparison at least. I’m sure that this difference should be much more subtle in daily listening but it can’t be fully ignored. From a practical point of view, AAC -q73 (≈144 kbps) and OPUS 128 (≈137 kbps here) have nearly the same bitrate but AAC’s score is 0.25 higher. In other words, Apple’s AAC can appear a bit more efficient than OPUS at mid bitrates, with classical music at least.

This part of the test also shows how quickly and strongly AAC suffers from bitrate starvation. While both OPUS and AAC get the exact same score at 112 kbps, AAC starts to show some limits at 96 kbps and then quickly drops from that point. The gap between 96 kbps and 80 kbps is huge with AAC (from 3.94 to 3.01). And during my previous test at 64 kbps with the same samples, LC-AAC fell very low, at 2.20. So OPUS is clearly much stronger at lower bitrates than LC-AAC and is considerably more acceptable at 80 kbps… and even at 64 kbps according to my previous test (score=3.31 at 64 kbps).

As an aside, I checked the results of kamedo2’s test. He tested different AAC encoders (FDK, FAAC and ffmpeg) and different samples, so the results are not strictly comparable. The gaps he got between 96 and 128 kbps are:
• FAAC = 0.56 (3.76-3.20)
• ffmpeg = 0.47 (3.44-2.97)
• FDK = 0.78 (4.27-3.49)
and my result with Apple’s AAC:
• Apple = 0.72 (4.66-3.94)
The gap obtained with the two advanced and mature AAC encoders (FDK and Apple) is nearly the same and is rather huge. It confirms that the AAC-LC sweet spot seems to be higher than 100 kbps and probably lies in the 112 to 128 kbps range.
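The gaps listed above are simple differences between the 128 kbps and 96 kbps scores. A small Python sketch to recompute them from the quoted scores (the dictionary layout is just one way to write it):

```python
# (score at 96 kbps, score at 128 kbps) as quoted above.
# FAAC, ffmpeg and FDK come from kamedo2's test; Apple comes from this test.
scores = {
    "FAAC": (3.20, 3.76),
    "ffmpeg": (2.97, 3.44),
    "FDK": (3.49, 4.27),
    "Apple": (3.94, 4.66),
}

# Gap between 96 and 128 kbps for each encoder, rounded to two decimals.
gaps = {name: round(hi - lo, 2) for name, (lo, hi) in scores.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f}")
```

Keep in mind that the two tests used different samples, so only the size of the gaps is worth comparing, not the absolute scores.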

From an audible point of view, there is not much to say at 112 and 128 kbps: both OPUS and AAC are very similar and don’t have strong flaws. The differences are easier to catch below those bitrates. Smearing is sometimes higher with AAC, sometimes higher with OPUS. AAC suffers from traditional lossy flaws which are most often obvious at 80 kbps: distortions, like missing parts of the spectrum, sometimes sounding like noise reduction filters. It’s very “MPEGish” I would say, because it’s very similar to MP3 (and, as far as I remember, to WMA, which doesn’t come from MPEG). On the other side, OPUS has far fewer of these artifacts. The encoder replaces details with noise, often giving a coarse, hissy sound. At 80 kbps it sometimes feels a bit narrower, with some alteration of the stereo imaging. The sound is therefore confusing and a bit tiring. But at 80 kbps those issues are less irritating than what AAC produces. At 96 kbps AAC produces a cleaner output but can suffer from other distortions. I suspect the OPUS algorithms are more efficient with louder music, so I have to go back to ABC/HR and test the 75 other samples to confirm it or not :)

A few words about the low anchor: it’s HE-AAC with parametric stereo. I don’t know yet how it sounds with pop/rock music but with classical the main issue is stereo imaging. It’s out of phase, wrong, and gives me a headache in less than 30 seconds. I often checked the jack of my headphones, thinking it was partially out. I suspect it gives a better illusion without headphones and on a small device like a radio.
 




To answer my questions: now what can I use for daily listening?

Opus 96 seems very efficient and it ends with a score slightly higher than 4 (4.02), which literally means perceptible but not annoying. It should be good enough, no? Well, I’m not so sure. Let’s take another look at the results, and put some markers on the graph:


• With OPUS --bitrate 96, 36 samples out of 75 were rated ≤4.0. In other words, 48% of the tested samples showed some real annoyance. It’s not very good and it confirms a basic experience: listen to a (classical) disc encoded at 96 kbps, and audible distortions and artifacts are constantly spoiling the pleasure.

• At 112 kbps, both OPUS and AAC end with the exact same score. 26 samples out of 75 encoded with OPUS 112 were still more or less annoying (35% of the tested samples). AAC at the same bitrate is slightly worse: 29 samples rated ≤4.0 (39%). Again, it’s not satisfying.

• 128 kbps encoding is more reassuring. Both Opus and AAC suffered a bit with 14 samples out of 75 (still, 19%…). Only 3 samples are between 3.00 and 3.50 with Apple’s AAC while OPUS still has 8 samples in the same range. In other words, Apple’s AAC -q64 has a better score and fewer strong issues (and a smaller bitrate: about 7.5 kbps less).

• AAC 144 kbps offers an interesting performance: it only scores 4 times at 4.00 or below and the lowest score is 3.7. This performance is much closer to what I expect from an annoyance-free setting. The final score of 4.85 is also quite high!

=>  so my choice would be either AAC at 128 kbps or AAC at 144 kbps
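The shares quoted in the list above can be re-derived from the counts of samples rated 4.0 or below. A minimal Python sketch, assuming 75 samples per encoder as in this first part of the test (the names are only illustrative):

```python
TOTAL_SAMPLES = 75  # classical samples in this first part of the test

# Number of samples rated <= 4.0 for each setting, as quoted in the list above.
annoying = {
    "OPUS 96": 36,
    "OPUS 112": 26,
    "AAC 112": 29,
    "OPUS 128": 14,
    "AAC 128": 14,
    "AAC 144": 4,
}

# Share of samples (in percent, one decimal) that were rated <= 4.0.
share = {name: round(100 * count / TOTAL_SAMPLES, 1) for name, count in annoying.items()}

for name, pct in share.items():
    print(f"{name}: {annoying[name]}/{TOTAL_SAMPLES} samples rated <= 4.0 ({pct}%)")
```

Only AAC 144 stays well under 10% here, which is in line with the choice made above.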



One more thing: how did subjective performance change in 15 years? My hearing is probably not so good anymore, the sample database is also different, and the encoders have improved. Direct comparison between the two tests is therefore questionable. The 2005 graph for classical music (150 samples) is the following one:


Apple's AAC was rated at 4.22. Remember, the encoder just implemented VBR (which seemed to be CVBR at the time because bitrate didn't go below 128 kbps). Today it's 4.66!
The high anchor was LAME --preset standard (or -V2 now) and was rated at 4.58, being statistically better than competitors at 128 kbps. Today both AAC and OPUS at ∼130 seem to perform as well as MP3 VBR at 192 kbps. This assumption was also somehow confirmed recently by another listening test made last year by kamedo2:


OPUS also seems to outperform its ancestor, Ogg Vorbis.



So, the first part of my test is now over… I'll take a short break and then I’ll begin the second part :) It will probably take a few weeks.
In the meantime, comments are welcome!







Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #1
Thank you very much for your efforts! Posts like these are the reason I still keep returning to HA after all these years.
The results are very interesting and useful for choosing a good setting for everyday listening. I also listen mainly to classical  music and your findings agree with the (very limited) testing I did myself.
Proverb for Paranoids: "If they can get you asking the wrong questions, they don't have to worry about answers."
-T. Pynchon (Gravity's Rainbow)

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #2
Your tests seem to fit my views for electronic music. With AAC on its own, 160k = 4.80, Vorbis = 4.50. Not to get off topic, but has there ever been a one-off 256 kbps test, or even a silly 480 kbps test?
Got locked out on a password i didn't remember. :/

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #3
@guruboolez, for clarification's sake, when you said "VBR" in
Quote
Today both AAC and OPUS at ∼130 seem to perform as well as MP3 VBR at 192 kbps
 
 you actually meant CBR, right?

Also, congratulations on another meticulously executed test!
Listen to the music, not the media it's on.

Musepack --quality 6
Wavpack -hb4.55x5cvm

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #4
Thank you all for the reply :) It's nice to hear that!

@guruboolez, for clarification's sake, when you said "VBR" in
Quote
Today both AAC and OPUS at ∼130 seem to perform as well as MP3 VBR at 192 kbps

 you actually meant CBR, right?
No, I really meant VBR.

It's based on a risky comparison (2005 results vs 2020 results: different samples, and a listener who is now 15 years older). OPUS and AAC ended today with ~4.50. And during my old test I marked VBR MP3 ~4.50 (it was LAME 3.98 alpha: there were improvements since). On that fragile basis, I'm assuming that both OPUS and AAC could offer a similar quality at ~128 kbps to MP3 VBR at ~192 kbps.

On the other side, Kamedo2 recently made a listening test, and it appeared that OPUS 96 is very close to the up-to-date LAME 3.100 CBR 192. My test also shows that a 0.5 gap exists between OPUS 96 and OPUS 128, and I'm pretty sure that the gap between MP3 VBR and MP3 CBR is not greater than 0.5. Therefore it strengthened my assumption that OPUS and AAC at ~128 kbps operate at the same quality level as LAME VBR --standard (V2, ~192 kbps).

I hope I made it clearer :)

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #5
You certainly did. Thank you!

Edit: So nice to see you guys still at it, proving palpable progress has been made (still is, as xHE-AAC/USAC is proving it now)  in lossy audio coding - a field many a self-proclaimed golden ear have keenly dismissed as being stuck ages ago.
Listen to the music, not the media it's on.

Musepack --quality 6
Wavpack -hb4.55x5cvm

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #6
...and to think most people outside communities like HA, showing a stiff upper lip towards lossy audio formats, haven't the faintest idea it's all they've been listening to all this time, whenever playing their beloved videos - on and offline - or watching digital TV.
Listen to the music, not the media it's on.

Musepack --quality 6
Wavpack -hb4.55x5cvm

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #7
...and to think most people outside communities like HA, showing a stiff upper lip towards lossy audio formats, haven't the faintest idea it's all they've been listening to all this time, whenever playing their beloved videos - on and offline - or watching digital TV.

Oh, I had a few people on another site (ASR forums) attack me when I asked whether they had any proof of telling raw video files from H.264, MP2 or HEVC, and when I said that HEVC could allow broadcast-tier 720p video at 1 Mbit. Meanwhile they were unironically flaming me about how the ER4 has 0.65 ~ 1.7% THD, after I had shown a <0.4% ER4SR in another thread.
Got locked out on a password i didn't remember. :/

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #8
Oh, I had a few people on another site (ASR forums) attack me when I asked whether they had any proof of telling raw video files from H.264, MP2 or HEVC, and when I said that HEVC could allow broadcast-tier 720p video at 1 Mbit. Meanwhile they were unironically flaming me about how the ER4 has 0.65 ~ 1.7% THD, after I had shown a <0.4% ER4SR in another thread.
 
That's also the same sort of obnoxious behaviour you get from placebophiles on Twitter, whenever asking them to prove whatever wacky claim they happen to be directing at (usually) lossy audio, at any particular moment.
Listen to the music, not the media it's on.

Musepack --quality 6
Wavpack -hb4.55x5cvm

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #9
Oh, I had a few people on another site (ASR forums) attack me when I asked whether they had any proof of telling raw video files from H.264, MP2 or HEVC, and when I said that HEVC could allow broadcast-tier 720p video at 1 Mbit. Meanwhile they were unironically flaming me about how the ER4 has 0.65 ~ 1.7% THD, after I had shown a <0.4% ER4SR in another thread.

That's also the same sort of obnoxious behaviour you get from placebophiles on Twitter, whenever asking them to prove whatever wacky claim they happen to be directing at (usually) lossy audio, at any particular moment.

I've had them lash out when I asked how they can diss the ER4XR when they struggle to tell apart speakers with 10% in the bass and 2% above 1 kHz. I needed to tone myself down when they went "screw lossy, use lossless" while I was happy with a 160k AAC + Vorbis mix, with no room to counter-argue.
Got locked out on a password i didn't remember. :/

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #10
Sometimes I think this "quest for the best" culture has become, thanks to the power Google affords us all, inherent to the rather instantaneous results we, as a society, have grown used to aim for lately - as if an end-all panacea were actually a thing!

Proof of that is the usual load of what's the best this or that... questions pervading well meaning communities like Quora.

As a simple search result can attest, such questions usually come from complete newbies to something - who don't want to "waste" their time, patience or money (from day one) by (God forbid!) learning/adopting something "wrong":
be it something as complex and time-demanding as a programming language, or more trivial decisions such as which Linux distro to adopt for personal use or even a simple calculator phone app... or lossy audio, for that matter - which, were they to pick the, again, wrong choice, might make them look like they were on some sort of losing side later on.

So much so that, back to the Linux distro option, it strikes me as odd the distro community hub (distrowatch), in good olde "snake eats own tail" fashion, relies, of all things, on how much page views within their website each distro has been getting from its own visitors to rank them up on their home page as if in some sort of (sigh!) best of list!

With such rankings, one could be forgiven to think fiddling with new stuff or being a trail blazer these days has become something rather out of fashion.




Listen to the music, not the media it's on.

Musepack --quality 6
Wavpack -hb4.55x5cvm

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #11
@Guruboolez
Excellent work and really useful results. :)
Probably anything above 200k should be very high quality with Apple encoder.
Do you think that worst samples would go from 3.7 to 4.0 or above with 256k (cvbr/tvbr - apple)??

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #12
Do you think that worst samples would go from 3.7 to 4.0 or above with 256k (cvbr/tvbr - apple)??
I have no doubt about that. The distortions will decrease as the setting gets higher. At ~200 kbps those issues should vanish or become barely audible, and only in critical listening conditions (direct A/B comparison).

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #13
Guruboolez, it seems your listening tests are getting larger and larger! Truly impressive work, very thoroughly done!

Just one question since I'm not familiar with Apple's AAC encoder: would it have been possible to use mode "q37" or "q38" instead of the "q 36" at the low-rate end? It seems (according to your 200+-hour bit-rate test) that this mode rarely reaches your target rate while Opus almost always exceeds that target rate. So part of the reason that AAC scores relatively low at 80 kbps could be that Opus uses ~10% more bit-rate there.

Chris
If I don't reply to your reply, it means I agree with you.

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #14
Just one question since I'm not familiar with Apple's AAC encoder: would it have been possible to use mode "q37" or "q38" instead of the "q 36" at the low-rate end? It seems (according to your 200+-hour bit-rate test) that this mode rarely reaches your target rate while Opus almost always exceeds that target rate. So part of the reason that AAC scores relatively low at 80 kbps could be that Opus uses ~10% more bit-rate there.
Hi Chris, thanks for your kind words.
Apple's AAC works with only a few VBR steps: q36, q45, q54… There's nothing in between to adjust the target bitrate more precisely with VBR. -q36 is the closest to 80 kbps with VBR enabled.
There's indeed a true penalty here for AAC compared to OPUS with the classical music set at ~80kbps. The gap in quality should probably be a bit lower at identical bitrate.

Re: Personal Blind Listening Test: AAC vs OPUS from 80 to 140 kbps • PART I

Reply #15
I used True VBR (TVBR) mode instead of Constrained VBR (CVBR) for both formats (TVBR is recommended for both: >source< and >source<).

If I remember correctly CVBR was found to be slightly superior to TVBR for Apple's AAC encoder at 96-128 kbps in listening tests from around 2013-2014.

 