


Science and preferences in audio quality

Interesting white paper by noted loudspeaker researcher Floyd E. Toole here:

I think it has some sections which are applicable to audio codec listening tests.



Reply #1
Some highlights from the paper [my comments bracketed]:

Some assert that it is a matter of personal taste, that our opinions of sound quality are as variable as our tastes in "wine, persons or song".

[However,] in repeated judgments, listeners with normal hearing exhibited standard deviations small enough for a high statistical significance to be associated with small (about 0.5 scale unit) [on a fidelity rating scale of 0 to 10] rating differences between products.  This important observation is paralleled by another, perhaps even more important one:  that groups of such listeners closely agreed with each other.
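[To make the statistics point concrete, here is a minimal sketch of a paired t-test on two products rated by the same listeners. The ratings are made-up illustrative numbers, not data from the paper, and the name `paired_t` is hypothetical. The point is only that when the spread of the differences is small, a mean difference of 0.5 scale units gives a large t value.]

```python
import math
import statistics


def paired_t(ratings_a, ratings_b):
    """Return (mean difference, t statistic) for paired ratings.

    Each position in the two lists is the same listener rating
    product A and product B on the 0-10 fidelity scale.
    """
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    t = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t


# Hypothetical ratings from 8 listeners (illustration only, not real data).
product_a = [7.5, 7.0, 7.5, 8.0, 7.5, 7.0, 7.5, 8.0]
product_b = [7.0, 6.5, 7.0, 7.0, 7.0, 7.0, 7.0, 7.5]

mean_diff, t = paired_t(product_a, product_b)
print(f"mean difference = {mean_diff:.2f}, t = {t:.2f}")
```

[With 8 listeners there are 7 degrees of freedom, and the two-tailed 5% critical value of t is about 2.365. A t value above that would count as significant.]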

[Broadband hearing loss increases the variability of quality judgments].  The hearing loss, in this case, was defined by the average threshold elevation at frequencies below 1 kHz.  Those exhibiting this form of hearing loss, also tended to have loss at high frequencies.  High-frequency loss, by itself, was not a clearly correlated factor.

Listeners with hearing loss not only exhibit high judgment variability, they can also exhibit strong individualistic biases in their judgments.

The conclusion is clear.  If there is any desire to extrapolate the results of a listening evaluation to the population at large, it is essential to use representative listeners.  In this context, it appears to be adequate to employ listeners with broadband hearing levels within about 20 dB of audiometric zero.  According to some large surveys, this is representative of about 75%, or more, of the population.
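[A rough sketch of what such a screening rule might look like in code. It assumes an audiogram given as a frequency-to-threshold table in dB HL, and it combines the "average threshold below 1 kHz" definition with the "about 20 dB" limit quoted above. That combination is my reading, not a procedure spelled out in the paper. The function names and the example audiograms are hypothetical.]

```python
def broadband_hearing_level(audiogram, cutoff_hz=1000):
    """Average threshold elevation (dB HL) over frequencies below cutoff_hz.

    audiogram maps frequency in Hz to threshold in dB HL.
    """
    low = [db for hz, db in audiogram.items() if hz < cutoff_hz]
    if not low:
        raise ValueError("audiogram has no measurements below the cutoff")
    return sum(low) / len(low)


def is_representative(audiogram, limit_db=20.0):
    """True if the low-frequency average is within limit_db of audiometric zero."""
    return broadband_hearing_level(audiogram) <= limit_db


# Hypothetical audiograms (illustration only).
listener_ok = {125: 10, 250: 15, 500: 5, 1000: 20, 4000: 40}
listener_loss = {250: 30, 500: 25, 2000: 10}

print(is_representative(listener_ok), is_representative(listener_loss))
```

[Note that the first example listener passes despite a 40 dB threshold at 4 kHz, which matches the observation that high-frequency loss by itself was not a clearly correlated factor.]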

[On the importance of using trained listeners]:
Probably the single most apparent deficiency of novice listeners was the lack of a vocabulary to describe what they heard.  Without such descriptions, most listeners found it difficult to be analytical in forming their judgments, and to remember how various test products sounded.  It was also clear that, without the prompting of a well-designed questionnaire, not all listeners paid attention to all perceptual dimensions, resulting in judgments that were highly selective.

As the relationships between technically-measurable parameters and their audible importance became clear, it was possible to design training sessions that focused on improving the ability of listeners to hear and to identify specific classes of problems in loudspeakers. [Note:  replace loudspeakers with codecs, for the interests of this forum!]  With the aid of computers, this training has been refined to a self-administered procedure, which keeps track of the student's progress.  [Reference paper:  "A Method for Training of Listeners and Selecting Program Material for Listening Tests", S.E. Olive, 97th Convention, Audio Eng. Soc., Preprint No. 3893 (1994 November)]

[On the importance of Anchors]:
When reminders of bad sound are removed from the tests, an interesting thing happens:  listeners spontaneously expand the scaling of their responses to fill more of the range.  In the absence of reminders about how bad things can really be, we become more critical of the relatively good sounds we are evaluating.


I'd love to see how they performed listener training, and to find out how to best implement such training in the service of testing audio codecs.

Audiometric tests to separate out those with broadband hearing loss, though, seem to be a pipe dream for Internet-based testing.


Reply #2
To pick up on your last point:

I wonder - is it really beyond us to figure this out?

For a single combination of sound card and headphones (with Windows mixer settings at some pre-set level) it would be possible to relate the digital signal to the SPL at the listener's ear. OK, so we have to find someone with the equipment to test this, but it's not impossible.

It would also be possible to calculate/measure the output level of a sound card for a given digital signal level SEPARATELY from a measurement of the efficiency of a given pair of headphones.

It's not hard to imagine a database or simple table giving parameters for the most common sound cards with direct headphone output. I bet 4 or 5 sound cards would cover 80% of users.

Measuring the SPL generated by various headphones for a given input voltage is a little harder, and it varies from listener to listener, due to the different shape of each listener's ear - it makes a difference in how well the headphones "couple" to the listener's ear. However, again, this isn't impossible. Sadly, it will be frequency dependent, but this only complicates things a little.

With these two factors known for a user's audio set-up, it's possible to know exactly how loud the sound is at the user's ear for a given digital signal RMS level.
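As a back-of-envelope sketch of that calculation: it assumes the sound card is described by its RMS output voltage at digital full scale, and the headphones by a single sensitivity figure in dB SPL at 1 V RMS. It ignores the frequency dependence, the output impedance and the ear-coupling variation mentioned above. The function names and the example figures are made up for illustration, not measurements of any real hardware.

```python
import math


def output_voltage_rms(level_dbfs, full_scale_vrms):
    """RMS output voltage of the sound card for a signal at level_dbfs.

    full_scale_vrms is the card's RMS output voltage at 0 dBFS
    (an assumed, per-card measured parameter).
    """
    return full_scale_vrms * 10 ** (level_dbfs / 20)


def spl_at_ear(level_dbfs, full_scale_vrms, sensitivity_db_spl_per_volt):
    """Estimated SPL (dB) at the ear for a given digital RMS level.

    sensitivity_db_spl_per_volt is the headphone output in dB SPL
    for a 1 V RMS input (an assumed, per-headphone measured parameter).
    """
    volts = output_voltage_rms(level_dbfs, full_scale_vrms)
    return sensitivity_db_spl_per_volt + 20 * math.log10(volts)


# Hypothetical set-up: 1.0 V RMS at full scale, 100 dB SPL at 1 V RMS.
estimate = spl_at_ear(-20.0, full_scale_vrms=1.0, sensitivity_db_spl_per_volt=100.0)
print(f"-20 dBFS is roughly {estimate:.1f} dB SPL at the ear")
```

With those example figures a -20 dBFS signal comes out at about 80 dB SPL. A frequency-dependent version would need a sensitivity table per frequency band instead of a single number.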

So you could test broadband (as well as frequency selective) hearing loss over the net, IF you had knowledge of the user's equipment.

If there's an analogue volume control in the way, it throws it all out - but most modern sound cards don't have this, do they?




Reply #3

You're right, it's not impossible.  Just highly difficult.

BTW, I found a document by Sean Olive which gives a better idea of how listeners are trained.  Sean Olive is a well-known name in the audio industry and is Harman International's manager of subjective evaluation, a unique position.

Interesting tidbits from the document:

Listeners are trained to recognize different equalization profiles.
"1 trained listener = 7 untrained listeners"
"performance improves monotonically with an increase in number of training sessions."
"Low frequency boosts are often confused with high frequency cuts..."
"Program is the single largest factor affecting listener performance."
"Pink noise and Tracy Chapman were best signals for identifying spectral distortions."

