HydrogenAudio

Hydrogenaudio Forum => Listening Tests => Topic started by: dsimcha on 2019-11-16 22:14:05

Title: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: dsimcha on 2019-11-16 22:14:05
In my own listening tests (admittedly on only a few songs), I've found Opus 96 kbps reliably transparent.  Others' observations on HA seem to agree.  Yet, in the public 96 kbps listening test (https://listening-test.coresv.net/results.htm), most samples were not found to be transparent at this bitrate.  I also have found Vorbis near-transparent at this bitrate--I can sometimes pick out subtle artifacts in critical listening but never notice anything obvious.  Yet, Vorbis scored worse than Opus in the same listening test.

Similarly, 128 kbps AAC (Fhg) seems transparent to me even with CBR and even 96 is close.  Yet, the most recent listening test (http://listening-tests.hydrogenaud.io/sebastian/mf-128-1/results.htm) at this bitrate suggests various AAC codecs perform similarly to Opus@96 kbps.

Why do public listening tests seem so much more pessimistic than my experiences or, in the case of Opus, others' experiences on this forum?
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: itisljar on 2019-11-17 12:25:54
Opus from 5 years ago and Opus today are two different beasts. Optimizations were made, so sound quality is even better. As far as I am aware, there have been no recent codec tests.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: IgorC on 2019-11-17 14:06:15
Opus from 5 years ago and opus today are two different beasts. Optimizations were made, so sound quality is even better.
Original poster talks about 96 kbps. 

During the last 5 years of development of Opus:
5-48  kbps - big quality improvements
56-80 kbps - very small improvements
80-500 kbps - microscopic improvements which are hard to detect  + bugfixes

  • Are exceptionally-hard samples typically selected for listening tests?
It was a mixed bag.  Though hard samples were well represented.

  • Do listening test participants typically have an exceptionally good ear for subtle artifacts?
Yes, more than half of the results came from well-trained listeners. In real-life scenarios such listeners represent a low percentage of all people.

  • Have codecs gradually improved with time such that the listening tests I cite are outdated?
No. Last audible improvements were made in:

Apple AAC  - approx. 10 years ago or so
Vorbis - 2011 (last version of Aotuv Beta 6)
MP3 LAME  - 2011 (last version was 3.99 which contained quality improvement and not just bugfixes and misc stuff). 3.100, 3.100.1 are here just for maintenance and bugfixes.
Opus - last measurable improvements for bitrate higher than 80 kbps were made in version 1.1, December 2013. Since then only lower bitrates were improved.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: IgorC on 2019-11-17 14:33:02
In my own listening tests (admittedly on only a few songs), I've found Opus 96 kbps reliably transparent.  Others' observations on HA seem to agree.  Yet, in the public 96 kbps listening test (https://listening-test.coresv.net/results.htm), most samples were not found to be transparent at this bitrate.  I also have found Vorbis near-transparent at this bitrate--I can sometimes pick out subtle artifacts in critical listening but never notice anything obvious.  Yet, Vorbis scored worse than Opus in the same listening test.

The thing about such public tests (96 kbps and higher) is that the participants are mainly trained listeners, not the average user.
To obtain more realistic results by including more average listeners, tests should be conducted at lower bitrates such as 48-64 kbps.

For example, this one: http://www.mp3-tech.org/tests/aac_48/results.html . Look how well MP3 LAME 128 kbps performed: it was used as the high anchor, and because it was a low-bitrate test there were plenty of average users.

Considering this, anything that performs as well as LAME MP3 128k VBR will be in the transparent zone for the large mass of people.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: dsimcha on 2019-11-17 18:51:37

For example, this one: http://www.mp3-tech.org/tests/aac_48/results.html . Look how well MP3 LAME 128 kbps performed: it was used as the high anchor, and because it was a low-bitrate test there were plenty of average users.

Considering this, anything that performs as well as LAME MP3 128k VBR will be in the transparent zone for the large mass of people.

Interesting!  I've definitely successfully ABX'd LAME 128 VBR (-V5) on some songs where Opus 96 is perfectly transparent to me.  I listen to a lot of rock with lots of cymbals.  At non-transparent bitrates, LAME can produce some very obvious non-linear/robotic artifacts there.  When Opus becomes non-transparent, the artifacts sound more like subtle additive noise and stereo image distortion.  They're harder to describe and harder to pick out by listening hard to a specific instrument. 

Also, Opus 96 scores ~0.4 MOS points higher than LAME 128 in the public 96 kbps listening test.  I guess the tl;dr is that Opus 96 is non-transparent on killer samples and/or to people who really know what subtle artifacts to look for, but probably transparent in most other cases.
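
For anyone wanting to put a number on "successfully ABX'd": a minimal sketch, assuming the usual convention that each ABX trial is a 50/50 guess when no difference is audible. The function name `abx_p_value` and the trial counts are made up for illustration and are not taken from any particular ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX run.

    Returns the probability of getting at least `correct` answers right
    out of `trials` by pure guessing (success probability 0.5 per trial).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more hits, divide by all 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical results, for illustration only.
    for correct, trials in [(12, 16), (14, 16), (9, 16)]:
        print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

With 16 trials, 12 correct gives p of roughly 0.038, while 9 correct is indistinguishable from guessing. A low p-value only says the listener heard *some* difference on that sample; it says nothing about how annoying the difference is.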
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: IgorC on 2019-11-18 18:33:51
I guess the tl;dr is that Opus 96 is non-transparent on killer samples and/or to people who really know what subtle artifacts to look for, but probably transparent in most other cases.
Agree
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ani_Jackal3 on 2020-10-17 20:33:45
Pretty much how 144kbps Vorbis is enough for me 99% of the time. Yet AAC at same setting sounds horrid with Ambient music(some need 224kbps to fix it!), FHG is better but seems to really hate Doom metal like wut?. And both encoders seem to have killer samples that fail to respond to 256 ~ 500kbps unlike Vorbis/Opus.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: includemeout on 2020-10-17 21:28:28
Watch it! (https://hydrogenaud.io/index.php?topic=3974#post_tos8)
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ani_Jackal3 on 2020-10-17 21:42:04
Watch it! (https://hydrogenaud.io/index.php?topic=3974#post_tos8)

No idea why bother here. Lmao
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: C.R.Helmrich on 2020-10-18 00:00:48
Because you may learn a thing or two about e.g. the need for double-blind listening tests and the difference between a coding standard (e.g. AAC) and manufacturers of encoders generating files compliant with that standard (e.g. FhG). After that, I suggest you read your own sentence above again.
Yet AAC at same setting sounds horrid with Ambient music (some need 224kbps to fix it!), FHG is better but seems to really hate Doom metal like wut?. And both encoders seem to have killer samples that fail to respond to 256 ~ 500kbps unlike Vorbis/Opus.
Chris
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: includemeout on 2020-10-18 01:16:15
Watch it! (https://hydrogenaud.io/index.php?topic=3974#post_tos8)
  
No idea why bother here. Lmao
 
All that Chris said, plus the fact that by ignoring such precepts, your statements - regardless of how well-meaning they are - fall squarely into the anecdotal evidence trap mentioned in this thread's title.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ani_Jackal3 on 2020-10-18 07:56:33
Watch it! (https://hydrogenaud.io/index.php?topic=3974#post_tos8)
 
No idea why bother here. Lmao

All that Chris said, plus the fact that by ignoring such precepts, your statements - regardless of how well-meaning they are - fall squarely into the anecdotal evidence trap mentioned in this thread's title.

I posted a sample here before where Vorbis needs Q9.5 on Merzbow. There are old samples I made for AAC that suck at 144k but don't respond to 320k. Why should I care if you're not even going to reply back when I spent 30 mins wasting time making 30 sec samples with the same fucking replies? Is this bait? Are you that unaware of how this place just reeks of AAC fanboyism? Even the GB 128k face-off on the problem sample page shows AAC struggling on some content in ways I don't get with Vorbis/Opus (1.3).

I was going to make a thread about my views on multi codecs but might do it else where.

Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: C.R.Helmrich on 2020-10-18 14:46:49
... old samples i made for AAC that suck at 144k but don't respond to 320k
It would be very helpful if you could point us to those samples so we don't have to search for them ourselves. And can you clarify what you mean by "don't respond to..."? As in "they become transparent at..."?
Quote
Even GB 128k face off on problem sample page shows AAC struggling on some content i don't get with Vorbis/Opus(1.3)?.
And Igor's recent test (https://hydrogenaud.io/index.php?topic=120007.0) shows that Vorbis at 192 kbps struggles on some content that AAC (iTunes encoder) at 192 kbps doesn't struggle on.

By the way, the test done by GB (I assume that's guruboolez) also shows the shockingly different quality different AAC encoders produce at the same bit-rate.

Chris
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: Klimis on 2020-10-18 20:48:44
It also doesn't help that all codecs seem to "break" in very different ways, which makes it hard to rank them by the same criteria, because different listeners get "triggered" by different flaws. For example, for a long time I felt that 96kbps Opus was my transparency threshold, until I felt that something was off. After a lot of research I found out that in a lot of material I listen to often, where I had trained my ears to expect elements to be heard in a certain way in the stereo field, those elements were being centered, and I have not been able to "unhear" that since then. Now, other than that I don't think any other codec can rival the quality of Opus at this bitrate, but it's still a dealbreaker for me. So how do I quantify that in a test?
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ani_Jackal3 on 2020-10-19 08:02:43
It also doesn't help that all codecs seem to "break" in very different ways, which makes it hard to rank them by the same criteria, because different listeners get "triggered" by different flaws. For example, for a long time I felt that 96kbps Opus was my transparency threshold, until I felt that something was off. After a lot of research I found out that in a lot of material I listen to often, where I had trained my ears to expect elements to be heard in a certain way in the stereo field, those elements were being centered, and I have not been able to "unhear" that since then. Now, other than that I don't think any other codec can rival the quality of Opus at this bitrate, but it's still a dealbreaker for me. So how do I quantify that in a test?

Sounds like me. I used to stick with Musepack/Ogg at 160kbps, but after a while I could tell it was adding noise and weird stuff to the stereo field. I stopped using them and went back to LAME V2 ~ V0, and just used WavPack hybrid at 384kbps to squish any issues.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: C.R.Helmrich on 2020-10-19 13:10:21
It doesn't help also the fact that all codecs seem to "break" in a very different way so it kind of makes it hard to rank them in the same kind of criteria because different listeners get "triggered" by different flaws. ... So how do I quantify that in a test?
You're absolutely right, different listeners perceive different artifacts with different severity, similar to how different people may react differently to the same medication in medical tests. That's why it's called a subjective test, and that's why those tests are typically conducted with at least 10 or 20 people.

How to quantify the distortions you hear is (mostly) up to you. The only thing that a typical blind listening test (with a reference = uncoded version you compare against) requires you to do is to consider any difference you hear, between a coded version and the uncoded reference, as a degradation. How severe that degradation is (perceptible, but not annoying, ... very annoying) is entirely up to you. But sometimes you get a "low quality anchor" as a reference, which makes it a bit easier for you to assign a score.
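
As a rough illustration of why several listeners are needed: a minimal sketch that averages per-listener grades on the five-point impairment scale (5 = imperceptible, 4 = perceptible but not annoying, ... 1 = very annoying) and attaches an approximate 95% confidence interval. The scores are invented, `mean_with_ci` is a hypothetical helper, and the normal approximation is an assumption; with only 10 or 20 listeners a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, confidence=0.95):
    """Return (mean, lower, upper) for a list of listener grades.

    Uses a normal approximation for the interval, which is an assumption
    made for brevity; it slightly understates the width for small panels.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    m = mean(scores)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(scores) / sqrt(n)
    return m, m - half_width, m + half_width


if __name__ == "__main__":
    # Invented grades from ten listeners for two hypothetical encodes.
    codec_a = [5, 4, 5, 4, 4, 5, 3, 4, 5, 4]
    codec_b = [4, 3, 4, 4, 2, 4, 3, 3, 4, 3]
    for name, scores in [("codec A", codec_a), ("codec B", codec_b)]:
        m, low, high = mean_with_ci(scores)
        print(f"{name}: mean {m:.2f}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

If the two intervals overlap heavily, the panel has not really separated the codecs, however strongly any single listener feels about one of them.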

And yes, if you're doing a lot of audio codec comparisons, your perception of artifacts is likely to change over time. So the statements
Quote
and I have not been able to "unhear" that since then.
after a while i could tell it was adding noise and weird stuff to the stereo field.
are very reasonable. Being a codec developer, I've experienced such things a long time ago, and sometimes (when listening to Youtube music, for example) it's like a curse having to perceive all those coding artifacts (even more so in the video, by the way).

Chris
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: includemeout on 2020-10-20 01:58:38
I posted a sample here before where Vorbis needs Q9.5 on Merzbow. There are old samples I made for AAC that suck at 144k but don't respond to 320k. Why should I care if you're not even going to reply back when I spent 30 mins wasting time making 30 sec samples with the same fucking replies?
 
Yeah, right. As if this was a social media platform and we were your followers, to be aware of your previous posts! If you want to be taken seriously here, and not just as another crackpot who happens to be fiercely defending his favourite codec, have at least the decency of referring to your own posts and not expecting us to be able to quote you by heart!

Not replying back!? What the heck are you talking about?

Quote
Is this bait are you that self unaware on how this place just reeks of AAC fanboyism
 
Man, you're definitely seeing things! What bait? That was just a reminder that, by now, you should've known better and kept your eye on the ball whenever making such apparently unsupported claims over here - you even seemed to have taken that with a light heart at first, till I merely complemented what someone else said and you started shit-throwing and using innuendos - which didn't do your claims any favours, as they came out rather hollow - and from what I learned hanging out here since 2001, such statements are actually a blatant tell-tale sign of said fanboyism - and TBH, these "mine is better than yours" statements are soo 2002, 2003, BTW!

Quote
I was going to make a thread about my views on multi codecs but might do it else where.
 
With that 'holier-than-thou' attitude, you talk as if your choosing not to "enlighten us with your wisdom" would be a massive blow to this community! ::)
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ani_Jackal3 on 2021-03-18 10:10:03
Imagine melting down when caught ignoring samples fuck me you're such a joke, fuck off already.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: Porcus on 2021-03-18 12:16:43
Imagine melting down when caught ignoring samples fuck me you're such a joke, fuck off already.

Yelling "fuck off" when asked for terms-of-service compliant evidence, while refusing to link to evidence that you claim to already possess?
You know, a link wouldn't cost you more than a link to this thread - offered as evidence that you should never be taken seriously, eh?

Now playing: https://esotericuk.bandcamp.com/track/the-laughter-of-the-ignorant
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: DARcode on 2021-03-20 18:06:06
ani_JackASS3 a total class act.

Esoteric kinda reminded me of Autopsy, not bad.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: peskypesky on 2021-05-11 23:06:44

Similarly, 128 kbps AAC (Fhg) seems transparent to me even with CBR and even 96 is close.  Yet, the most recent listening test (http://listening-tests.hydrogenaud.io/sebastian/mf-128-1/results.htm) at this bitrate suggests various AAC codecs perform similarly to Opus@96 kbps.


I'm the same as you. I've done extensive ABX testing, and Core Audio AAC is transparent for me at 128kbps using Low Complexity and VBR.

But...I'm 55, so my ears are older and my high frequency hearing tapers off at about 11.5 kHz.  Also, I don't have audiophile-level equipment. Maybe if I had a stereo system like my brother's, which cost about $40k, I'd need higher bitrates for transparency.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: binaryhermit on 2021-05-12 02:56:56
Maybe if I had a stereo system like my brother's, which cost about $40k, I'd need higher bitrates for transparency.
Isn't the conventional wisdom that poor quality systems actually tend to make transparency more difficult since they tend to respond in ways that violate the assumptions that encoders make about frequency response and stuff like that?

(Also, I assume that $40k system is speaker-based, and headphones are generally recommended for ABX tests since they make differences more obvious, I believe?)
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: peskypesky on 2021-05-12 03:01:51
(Also, I assume that $40k system is speaker-based, and headphones are generally recommended for ABX tests since they make differences more obvious, I believe?)

If a person usually listens to music over speakers, it would make more sense for that person to determine transparency with those speakers.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: NateHigs on 2021-05-12 12:21:15
Maybe if I had a stereo system like my brother's, which cost about $40k, I'd need higher bitrates for transparency.
Isn't the conventional wisdom that poor quality systems actually tend to make transparency more difficult since they tend to respond in ways that violate the assumptions that encoders make about frequency response and stuff like that?

(Also, I assume that $40k system is speaker-based, and headphones are generally recommended for ABX tests since they make differences more obvious, I believe?)

I would not agree with that assumption at all. There's no way to tune a codec unless you assume a good level of transparency from the playback system (IMHO). I also believe that a poor quality playback system should be less revealing in most (if not all) cases.

This comment really piqued my interest - if you have examples to support it please let me know!
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: NateHigs on 2021-05-12 12:23:28
(Also, I assume that $40k system is speaker-based, and headphones are generally recommended for ABX tests since they make differences more obvious, I believe?)

If a person usually listens to music over speakers, it would make more sense for that person to determine transparency with those speakers.

There's a theory (entirely my own, I must admit) that all music is designed to be played back on speakers. Headphones / earphones are attempting to replicate that playback. Therefore transparency on good speakers is likely the priority.

HOWEVER, in most cases I think, people now have much better headphones than speakers (if they own speakers at all).
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: peskypesky on 2021-05-12 20:21:50
There's a theory (entirely my own, I must admit) that all music is designed to be played back on speakers. Headphones / earphones are attempting to replicate that playback. Therefore transparency on good speakers is likely the priority.

HOWEVER, in most cases I think, people now have much better headphones than speakers (if they own speakers at all).

It would be interesting to survey music engineers and producers to see how they are dealing with this transition from speakers to headphones, earbuds, and IEM's by consumers.  Because it does seem to be true that a lot of consumers no longer buy stereo systems on which to listen to their music. They are listening on their cellphones with IEM's and earbuds  a lot of the time. But there are the bluetooth speakers being used too.

My "theory" is that engineers and producers would be emphasizing low frequencies more now in their mixes and mastering, as smaller speakers are less capable of producing those bass frequencies.

I myself listen to music about 50% on headphones and 50% on speakers, so I have done ABX testing with both. My brother listens about 95% of the time over his speakers, so if he were to do ABX testing, I think that's what he should use.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: NateHigs on 2021-05-13 10:43:14
There's a theory (entirely my own, I must admit) that all music is designed to be played back on speakers. Headphones / earphones are attempting to replicate that playback. Therefore transparency on good speakers is likely the priority.

HOWEVER, in most cases I think, people now have much better headphones than speakers (if they own speakers at all).

It would be interesting to survey music engineers and producers to see how they are dealing with this transition from speakers to headphones, earbuds, and IEM's by consumers.  Because it does seem to be true that a lot of consumers no longer buy stereo systems on which to listen to their music. They are listening on their cellphones with IEM's and earbuds  a lot of the time. But there are the bluetooth speakers being used too.

My "theory" is that engineers and producers would be emphasizing low frequencies more now in their mixes and mastering, as smaller speakers are less capable of producing those bass frequencies.

I myself listen to music about 50% on headphones and 50% on speakers, so I have done ABX testing with both. My brother listens about 95% of the time over his speakers, so if he were to do ABX testing, I think that's what he should use.

Anecdotally, the trend towards a more bass heavy music does seem to be in place.

I have never spoken with an engineer who cares what things sound like on headphones (be it a mix or mastering engineer). All professionals seem to 100% rely upon their monitors, which is why they invest heavily in them. Bedroom producers I would argue are in the opposite end of the spectrum.

I very, very rarely use my headphones as I just can't hear the sub bass in them (HD600), which is an important part of the experience for me. I also don't like things being strapped to my head for hours on end :-D
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: kode54 on 2021-05-14 01:53:03
Whereas I never use speakers. Ever.

1) I live with other people. They care about noise.
2) I am in a video call for most of my waking hours. Desktop computer video apps don't have acoustic echo cancellation, so I'd be broadcasting my speakers to the other end of the call, the entire time.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: ThaCrip on 2021-05-14 09:06:50
I myself listen to music about 50% on headphones and 50% on speakers, so I have done ABX testing with both. My brother listens about 95% of the time over his speakers, so if he were to do ABX testing, I think that's what he should use.

I use speakers more than headphones over an average month. At home I largely use my Klipsch ProMedia speakers, but on the go I just use my nothing-fancy headphones (Sony MDR-NC7, which I think I've had since 2010) connected to a digital audio player (an AGPTEK-U3 as of Mar 2021; prior to that I had been using a Sandisk Sansa e250 since 2008, but I pretty much retired it since you can't find quality replacement batteries for it, which is why I ultimately got the AGPTEK-U3, and I am hoping it will last).

Pretty much all of the ABX stuff I did over the years was done on my Klipsch PC speakers. So while I can't say anything definitively (since I never thoroughly tested), my general hunch is that if something sounds good on my Klipsch speakers, then a random set of headphones would probably be similar enough. Or to put it another way: if I am happy with the sound quality on the Klipsch, I can't imagine I'll notice any obvious flaws on a random set of headphones when just sitting back and enjoying my music.
Title: Re: Transparency: Public Listening Tests vs Anecdotes and My Experiences
Post by: peskypesky on 2021-05-14 09:40:57
I very, very rarely use my headphones as I just can't hear the sub bass in them (HD600), which is an important part of the experience for me. I also don't like things being strapped to my head for hours on end :-D

oh yeah, there's no doubt that my subwoofer moves air in a way that no headphones or IEM's can dream of matching....so I prefer speakers...but when I lived in NYC (for 23 years) there was no way I could blast music over speakers. So I got in the habit of using headphones.

To me: speakers > headphones > IEMs.