
Problems with Blind ABX testing - advice needed

Hello Hydrogen Audio - I joined because a search turned up some discussion here about ABX testing, which I am researching... I wondered if I could get some input.

I am a pro audio enthusiast and music producer, and am involved in many a discussion on a well-known pro audio forum regarding equipment. The gold standard test is of course the ABX, run for at least 10 trials - where studio engineers who wish to be honest with themselves will test their beliefs and their equipment.

As someone who was educated in science I recommend ABX testing to anyone. However, it has come to my attention that there may be issues with the way people perform ABX tests, and that these could be giving people an inaccurate picture. Many people I hear discussing a blind test say they performed an ABX and failed it, therefore they could not hear the difference. In fact - in pro audio I feel that because some software is now so convenient compared to analogue hardware - people are actually quite ready to be proved wrong - or they have become cynical about their own ability to hear, because of this possibility of imagination or bias. In some ways - which I understand is perhaps different to the hi-fi world - people in the studio world may have become 'overly' cynical about their own hearing - which is a shame IMO, especially when they rely on it professionally. Hence my research.

My question is really - what are the potential pitfalls of blind ABX testing?

Personally I have noticed that because there is so much information in a full track of audio, it is easy to become overwhelmed and miss the subtle differences. One person explained to me that it can be similar to looking down a microscope - if you focus your attention on one part of the audio, you can miss the rest. I have found the solution to this is to listen to the test audio many times, in order to get used to it. My results improve dramatically in that case - going from failure to 10/10. I suspect many people do not do this - which could lead to them getting worse results.

Also - I find I need to loop very short segments of audio - and I understand this is because the brain cannot hold audio excerpts in memory for very long (?)... I'm talking around 2-3 second clips.

What about with music - which is something people often 'feel' subconsciously? When put into a position where they must consciously evaluate, are they able to translate the subconscious feeling into a concept they can then consciously and correctly identify? There are considerations about how our brains work that may impact the results too, I think...

Any thoughts most welcome


Reply #1
If you subconsciously feel differently about two pieces of music then you are able to choose one over the other. Reliably, consistently, and without other cues to prompt your thinking. You don't have to consciously spot an artefact at 3.456 seconds into the track to be able to ABX the difference. If you can't reliably and reproducibly (most of the time, at least) spot a difference, whether explicitly or emotionally, then you can't say there is a difference (and should carefully reconsider your decisions if you had previously thought there was a difference). Maybe you aren't trying hard enough, maybe you're just tired at that moment, but maybe there isn't a difference that is audible to you. An ABX fail doesn't mean there is no difference, just that you didn't spot one. Someone else might, or you might on a different day or with a different set of headphones.


Reply #2
Quote
If you subconsciously feel differently about two pieces of music then you are able to choose one over the other. Reliably, consistently, and without other cues to prompt your thinking. You don't have to consciously spot an artefact at 3.456 seconds into the track to be able to ABX the difference. If you can't reliably and reproducibly (most of the time, at least) spot a difference, whether explicitly or emotionally, then you can't say there is a difference (and should carefully reconsider your decisions if you had previously thought there was a difference). Maybe you aren't trying hard enough, maybe you're just tired at that moment, but maybe there isn't a difference that is audible to you. An ABX fail doesn't mean there is no difference, just that you didn't spot one. Someone else might, or you might on a different day or with a different set of headphones.


Yes. But what I have noticed is that - let's say comparing two processors - I heard the difference in real-world listening, then in some ABX tests I initially failed and could not hear the difference. However - what I 'think' then happened is that my brain was forced to consciously examine specifically 'what' it was that I was hearing as the difference - what were the attributes? Once I had defined that, passing the tests became much easier. The problem I suspect is that very fine differences can be swamped by the content of the audio in blind comparisons. This is just a theory... I suppose what I'm suggesting is that perhaps it's not as simple as some people make out - failure = cannot hear.


Reply #3
Quote
But what I have noticed is that - let's say comparing two processors - I heard the difference in real-world listening, then in some ABX tests I initially failed and could not hear the difference. However - what I 'think' then happened is that my brain was forced to consciously examine specifically 'what' it was that I was hearing as the difference - what were the attributes? Once I had defined that, passing the tests became much easier.
Of course, if you know what to listen for and when to listen for it, it's easier to hear. You've been "trained".

ABX isn't always the 1st step... It's often a case of "I think I hear a difference"; let's see if I can really hear that difference in a proper scientific, level-matched, blind test.

Or, somebody claims there's a "night and day" difference between 24/96 and 16/44.1. Then you do an ABX test and maybe those night and day differences magically go away. That's what "gets to" most of us... People who claim to hear a difference in sighted listening then start telling you how the blind test is flawed. If you can hear a difference in a sighted test, there's no way that blind testing makes the experiment less reliable!




...When you've got your music producer hat on, you can't get too hung up on this stuff. You've got to use your ears and your judgement. You are making changes that you can obviously hear, and you don't have to ABX, or even A/B, everything! Most mixing engineers A/B against a reference recording - of course there is a difference, but the idea is just to have a reference to "keep their ears calibrated".

Your listeners can't A/B or ABX what you are doing, and they are all listening on different equipment in a different environment.


Reply #4
Quote
My question is really - what are the potential pitfalls of blind ABX testing?


One is that people assume that the huge differences they hear/imagine (due to cognitive bias) will persist once you eliminate the bias. When they don't, audiophiles often get angry and find excuses to dismiss honesty controls such as DBTs.

The way to prevent this is to explain to these people the effects of biases and that listening (with your ears, not eyes!) for small differences is NOT easy. They think it is (due to bias), just like a dowser thinks it's trivial to find water, but in reality (when you check in a proper test) it is not and people often fail horribly.
Take your time before finishing the first trial. Compare A/B. Then A/X or B/X. Switch as many times as necessary to try to find differences. Start the trials once you're reasonably confident.


Quote
The problem I suspect is that very fine differences can be swamped by the content of the audio in blind comparisons. This is just a theory... I suppose what I'm suggesting is that perhaps it's not as simple as some people make out - failure = cannot hear.

Two things:
1) No evidence has been put forward that blind testing cannot reveal "fine differences". Actually, we have good evidence against that claim (though none would be needed in order not to accept it): positive tests with small differences, inattentional deafness (e.g. due to visual distraction), and everything we know about cognitive bias and the distortion of perception.

2) Failure to reject the null hypothesis ("audibly identical") never means that we accept it.
Here's a nice analogy:
Quote
Look at it in terms of "innocent until proven guilty" in a courtroom: As the person analyzing data, you are the judge. The hypothesis test is the trial, and the null hypothesis is the defendant. The alternative hypothesis is like the prosecution, which needs to make its case beyond a reasonable doubt (say, with 95% certainty).

If the evidence presented doesn't prove the defendant is guilty beyond a reasonable doubt, you still have not proved that the defendant is innocent. But based on the evidence, you can't reject that possibility.

So how would that verdict be announced? It enters the court record as "Not guilty." 

That phrase is perfect: "Not guilty" doesn't mean the defendant is innocent, because that has not been proven. It just means the prosecution couldn't prove its case to the necessary, "beyond a reasonable doubt" standard. It failed to convince the judge to abandon the assumption of innocence.

Btw I treat your "fine differences" claim from 1) similarly.
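To make the arithmetic behind "failure to reject" concrete, here is a minimal sketch (Python, standard library only) of how an ABX score is usually judged against the null hypothesis of pure guessing; the 16-trial numbers are just an example:

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of getting at least
    `correct` answers right out of `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12/16 gives p ~= 0.038, below the usual 0.05 bar: the null
# ("audibly identical") is rejected.
print(abx_p_value(12, 16))  # ~0.0384
# 10/16 gives p ~= 0.227: "not guilty" - but innocence is not proven.
print(abx_p_value(10, 16))  # ~0.2272
```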
"I hear it when I see it."


Reply #5
Quote
Yes. But what I have noticed is that - let's say comparing two processors - I heard the difference in real-world listening, then in some ABX tests I initially failed and could not hear the difference...


You thought you heard a difference.  You may or may not have heard an actual difference.  The ABX test provides no support for a belief that you did hear a difference.  It does not prove, by itself, that you didn't hear a difference.

But if the measured differences are such that one would not expect a human being to be able to hear it, why would you continue to believe that you did, given the evidence that you can't?

Ed Seedhouse
VA7SDH


Reply #6
Quote
My question is really - what are the potential pitfalls of blind ABX testing?


The largest and most obvious pitfall of any scientifically valid testing scheme (not necessarily limited to blind testing) is false negatives - situations where an audible difference is known to exist by some reliable and valid means but is somehow missed by some other means that generally has a useful degree of reliability and validity. The complementary situation, false positives, is often systematically impossible with most valid means of conducting a listening test.

One of the necessary conditions for a valid test is that it is falsifiable - that is, capable of producing a negative result when appropriate.

The largest source of misapprehensions about scientific listening tests is, IME, the degree to which the general public - including many working professionals - are apparently satisfied with means of testing that are actually very flawed in some situations.

IOW the thing that many people miss about listening tests relates to the selection of the proper tool for the job, and then properly applying that tool when the rubber hits the road.

The most common technique for performing listening tests is the time-hallowed (in some circles) sighted, non-level-matched open evaluation. This methodology is appropriate when audible differences are relatively large. For example, if I am applying EQ or setting levels on a channel of a mixing console I will probably do a sighted evaluation to determine which setting I prefer. The audible difference is relatively large.

Interestingly enough, the lore of people who operate mixing consoles often includes those situations where one works hard and obtains the proper setting for a control that is later discovered to have had no possible effect. Been there, done that! This speaks to the fact that nobody seems to be well trained enough, familiar enough, practiced enough, or has hearing acute enough to totally avoid false positives, even when dealing with relatively large technical differences.

The problem with sighted evaluations is that they generate a larger and larger percentage of false positives the less skilled the listener, and the more subtle the actual difference.  This is the problem that blind testing (of many kinds - ABX is only part of the menu of blind testing schemes) was invented to address.

The problem with sighted evaluations is very visible in consumer high-end audio, where all sorts of very poorly trained listeners claim that they have heard differences that, in technical terms, are impossibly small or non-existent.

The corresponding problem is that blind tests deal with this problem of false positives very effectively, but can easily produce false negatives. 

There are effective, systematic ways to avoid false negatives. The methodology I favor involves creating a series of musical samples that have technical flaws of the kind we are trying to detect, but starting with the flaw at such a level that almost everybody will hear it. Then there is a sequence of musical samples containing the flaw at decreasing levels, right down to the real-world levels characteristic of the situation at hand.

Then if you have a listener who can't hear the problem when it is large, you don't waste time trying to get him to hear it reliably when it is small. Exactly what constitutes large and small may not be known initially, but will come to light as a natural by-product of the testing procedure and the analysis of data when it is applied to a number of listeners.
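As an illustration only, here is a rough Python sketch of that staged methodology; the injected flaw is plain white noise, a stand-in for whatever artifact is actually under test, and the levels are arbitrary examples:

```python
import numpy as np

def make_training_series(clean, start_db=-20.0, step_db=-10.0, stages=5):
    """Return copies of `clean` (a mono float array) with a deliberate
    flaw (white noise as a stand-in) injected first at an obvious level,
    then at successively lower, closer-to-real-world levels."""
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(clean))
    noise /= np.sqrt(np.mean(noise ** 2))      # normalize to unit RMS
    signal_rms = np.sqrt(np.mean(clean ** 2))
    series = []
    for i in range(stages):
        level_db = start_db + i * step_db      # -20, -30, -40, ... dB
        flawed = clean + noise * signal_rms * 10 ** (level_db / 20)
        series.append((level_db, flawed))
    return series
```

A listener who cannot hear the -20 dB sample need never be run on the -60 dB one.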


Reply #7
Quote
The complementary situation, false positives, is often systematically impossible with most valid means of conducting a listening test.

Yet it happens regularly ... sometimes even deliberately.
The best test can have flaws, hence the reproducibility requirement in science.

Quote
The most common technique for performing listening tests is the time-hallowed (in some circles) sighted, non-level-matched open evaluation. This methodology is appropriate when audible differences are relatively large. For example, if I am applying EQ or setting levels on a channel of a mixing console I will probably do a sighted evaluation to determine which setting I prefer. The audible difference is relatively large.

Yeah, but even then the guys in the pro field will make use of bypass or A/B functionality to get a clearer idea of what really changed.
That way you will also notice if you have been tuning an EQ that was on bypass to begin with. I expect everyone else has had that experience at least once.
"I hear it when I see it."


Reply #8
At the risk of stating the bleeding obvious, just be aware that ABXing is - likely - best suited to the question "is there a difference? Y/N".
As opposed to the scenario where A and B are patently distinct, and the test subject is told to compare them head-to-head and choose X := the best.


Reply #9
Quote
Yes. But what I have noticed is that - let's say comparing two processors - I heard the difference in real-world listening, then in some ABX tests I initially failed and could not hear the difference...

Quote
You thought you heard a difference. You may or may not have heard an actual difference. The ABX test provides no support for a belief that you did hear a difference. It does not prove, by itself, that you didn't hear a difference.

But if the measured differences are such that one would not expect a human being to be able to hear it, why would you continue to believe that you did, given the evidence that you can't?


Yes indeed - I thought I heard a difference.

I should clarify - the type of tests that come up regularly in pro audio are comparisons between analogue processors and digital 'analogue emulations'. There has been a massive rise in the usage of plugins, and the marketing is strong. I am not talking about tests of 44.1 vs 96k - where I would not generally expect people to hear any difference in a blind test.


Reply #10
Well, depending on the effect, a digital implementation may be virtually impossible to distinguish. Analog may introduce distortion and noise; digital non-linear effects may cause aliasing.

Problems with ABX tests could be matching (time, volume) of the digital version with the D/A-processor-A/D recorded version, to prevent false positives.
It could make sense to also send the digital version through D/A/D.
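As a sketch of what that matching could look like in practice (assuming both versions are already mono float arrays at the same sample rate; the function is illustrative, not any particular tool's API):

```python
import numpy as np

def align_and_match(reference, recorded):
    """Time-align `recorded` to `reference` by cross-correlation, then
    gain-match it by RMS, so the ABX cannot degenerate into a trivial
    louder/later detection (a classic false-positive trap)."""
    # Delay estimate: peak of the full cross-correlation.
    corr = np.correlate(recorded, reference, mode="full")
    delay = int(np.argmax(corr)) - (len(reference) - 1)
    recorded = recorded[delay:] if delay > 0 else np.pad(recorded, (-delay, 0))
    n = min(len(reference), len(recorded))
    reference, recorded = reference[:n], recorded[:n]
    # Level match: scale the recorded version to the reference RMS.
    gain = np.sqrt(np.mean(reference ** 2) / np.mean(recorded ** 2))
    return reference, recorded * gain
```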

Also, you should use a mix of different types of music/signals for the test. But I guess I'm not telling you anything new here.
"I hear it when I see it."


Reply #11
Quote
The largest and most obvious pitfall of any scientifically valid testing scheme (not necessarily limited to blind testing) is false negatives



Thanks for the response. Essentially what I'm asking is: what are the scenarios that might lead to false negatives?



Reply #12
Quote
The largest and most obvious pitfall of any scientifically valid testing scheme (not necessarily limited to blind testing) is false negatives

Thanks for the response. Essentially what I'm asking is: what are the scenarios that might lead to false negatives?


Well, Arnie says "any scientifically valid testing scheme" and that's pretty clear to me. All scientific tests may return false positives and false negatives. When dealing with statistical samples we reduce the probabilities of both by enlarging the sample. However, this is problematic from a practical point of view, since we must, for example, increase our sample four times to cut the error rate in half.

If we are using measurements then we need to improve the instruments we use to measure.

But whatever we do we cannot ever avoid the possibility of false results.  We can only reduce their probability.
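The factor of four follows from the standard error of a proportion shrinking as 1/sqrt(n); a quick sketch:

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of an observed proportion p over n trials."""
    return sqrt(p * (1 - p) / n)

# Quadrupling the trials halves the uncertainty:
print(standard_error(0.5, 16))  # 0.125
print(standard_error(0.5, 64))  # 0.0625
```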


Ed Seedhouse
VA7SDH


Reply #13
Quote
I should clarify - the type of tests that come up regularly in pro audio are comparisons between analogue processors and digital 'analogue emulations'. There has been a massive rise in the usage of plugins, and the marketing is strong. I am not talking about tests of 44.1 vs 96k - where I would not generally expect people to hear any difference in a blind test.
I would imagine it's pretty difficult to get exactly the same settings with an analog (or hardware) effect and the simulated software effect. If the effect is level-dependent (like with a compressor), the input level would have to be matched in software to the electrical signal level, and that could be difficult.

With some analog/hardware effect boxes it would be difficult to get exactly the same setting twice, or to set up two boxes identically.

There are companies that make hardware reverbs and software simulations of them, and I'd assume the manufacturer can make both sound identical. But I'm not so sure you can do as well "at home".


Reply #14
Quote
The complementary situation, false positives, is often systematically impossible with most valid means of conducting a listening test.

Quote
Yet it happens regularly ... sometimes even deliberately.
The best test can have flaws, hence the reproducibility requirement in science.


By all means. Reproducibility means to me that a recipe for a listening test with an exceptional result is made known, and other people take it off into their "kitchens" and report that they used it to obtain compatible if not identical results.

A critical part of the scheme is that the test obtains sufficient respect in the relevant community that others feel that it is worth the effort to reproduce. Another part is documentation that is complete enough to make the duplication effort both attractive and possible.

Quote
The most common technique for performing listening tests is the time-hallowed (in some circles) sighted, non-level-matched open evaluation. This methodology is appropriate when audible differences are relatively large. For example, if I am applying EQ or setting levels on a channel of a mixing console I will probably do a sighted evaluation to determine which setting I prefer. The audible difference is relatively large.

Quote
Yeah, but even then the guys in the pro field will make use of bypass or A/B functionality to get a clearer idea of what really changed.


Exactly. But even with that, mistakes get made.

I've had any amount of $#!+ thrown at me because I mix recordings and do live sound without any involvement with blind testing. My defense is that the essence of those efforts is dealing with unambiguously audible differences, and the tools at hand are designed to do exactly that.

With the advent of digital consoles the console op can see the results of what he changes in tenths of a dB and Hz with pretty good accuracy and repeatability. In over a decade of that kind of work I found that all of the changes I implemented were a half dB or more, generally much more. I'd go so far as to say that the difference between a good-sounding setting of a fader and a worse-sounding setting is a dB or more, but generally less than 3 dB.

Mixing is a bit of a feat of critical hearing, because doing it right can mean setting gains that span over 100 dB to within as little as half a dB.

Quote
That way you will also notice if you have been tuning an EQ that was on bypass to begin with. I expect everyone else has had that experience at least once.



By whatever means!


Reply #15
Quote
The largest and most obvious pitfall of any scientifically valid testing scheme (not necessarily limited to blind testing) is false negatives

Thanks for the response. Essentially what I'm asking is: what are the scenarios that might lead to false negatives?



False negatives usually come from poorly trained listeners and poorly designed tests.

I already covered listener training, I think. Need more clarification?

Poor test design can be anything. Many common sources of poor sensitivity in listening tests are covered by ITU Recommendation BS.1116-2, which is online (google).

One problem with ABX is that it is, as others have pointed out, what I think of as a knife-edge test. The knife edge between a positive and a negative result is very sharp, and the test does not do such a good job of delineating the difference between very close and a little further off.

Other DBT methodologies like ABC/HR and MUSHRA (google) are better for that.

DBTs are widely used for food testing, and that may be closer to the kind of testing you may want to do.  Google Triangle Test.



Reply #16
Quote
I should clarify - the type of tests that come up regularly in pro audio are comparisons between analogue processors and digital 'analogue emulations'. There has been a massive rise in the usage of plugins, and the marketing is strong. I am not talking about tests of 44.1 vs 96k - where I would not generally expect people to hear any difference in a blind test.

Quote
I would imagine it's pretty difficult to get exactly the same settings with an analog (or hardware) effect and the simulated software effect. If the effect is level-dependent (like with a compressor), the input level would have to be matched in software to the electrical signal level, and that could be difficult.

With some analog/hardware effect boxes it would be difficult to get exactly the same setting twice, or to set up two boxes identically.

There are companies that make hardware reverbs and software simulations of them, and I'd assume the manufacturer can make both sound identical. But I'm not so sure you can do as well "at home".


Yes, getting the settings right is very important. Often 1:1 settings alone will not suffice; careful tuning by ear is necessary - especially levels. However, often there are differences between emulations and the real thing even when closely matched. The topic of analogue emulation is a hotly debated one...

Quote
I already covered listener training, I think. Need more clarification?


Ah, I didn't quite get that bit... I suppose you could make a test comparing processors by having them less accurately set, so they were more obviously different? But that's not quite the same as testing for artifacts...


Reply #17
Quote
I already covered listener training, I think. Need more clarification?

Quote
Ah, I didn't quite get that bit... I suppose you could make a test comparing processors by having them less accurately set, so they were more obviously different? But that's not quite the same as testing for artifacts...


I think the point is that, for the purposes of training people, we don't have to use precise simulations of the final test conditions. We might want to, but this is the real world! ;-)

For example, we train race car drivers to drive small, low-powered vehicles (e.g. passenger cars, go-karts and then junior race cars) before we put them into full-tilt NASCAR or F1 racers.

Listener training can similarly be a staged affair where we start people out in situations that make the general kind of artifact supremely obvious, and then back the artifacts off in stages until we are at the real-world test.

One perhaps non-obvious way to come up with artifacts at various levels is to record the output of the processing step and then feed that back into the processor, again and again, for each successively more obvious test sample.
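A minimal sketch of that idea, where `process` is a hypothetical stand-in for whatever plugin or re-recording chain is under test:

```python
def exaggerate(audio, process, passes=4):
    """Run the device-under-test over its own output repeatedly so that
    its characteristic artifact accumulates. More passes give the more
    obvious training samples; a single pass is the real-world case."""
    out = audio
    for _ in range(passes):
        out = process(out)
    return out

# A training series from most obvious down to real-world:
# samples = [exaggerate(x, fx, n) for n in (8, 4, 2, 1)]
```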

If you are dealing with people off the street, or people used to just their own home stereo or stereo-shop demos, one often finds a few who are remarkably insensitive listeners. Then a guy who doesn't even seem to care about sound quality knocks your socks off and aces everything in sight. It doesn't matter how they talk; it's whether they can actually walk the walk. With a sighted evaluation it is far harder to tell. IME people who already have some audio production experience often hit the ground running, but again: assume nothing, measure what you can.


Reply #18
Two possible problems which need to be addressed in blind testing are test subject (test listener) stress and motivation. A test subject can cry foul if, for example, their "shoes were tied too tightly", they weren't ready to start the test, or they just felt "off" that day. Similarly, if they just don't really care one way or another they might not try their best to really listen as carefully as possible. They need motivation to try their best.

Stress

When I conducted (proctored) my blind test [nearly double-blind, except that I remained in the room, standing behind the subject, and announced the beginning of each trial, for example: "We are ready to begin trial three. You may commence music playback when you are ready."] I took several precautions to ensure stress wouldn't be a problem. The best way to do this is to let the test subject select, design, and decide EVERYTHING possible about the test (while still keeping the test fair and unbiased, of course).

He picked the test's who, what, where, when, and why. He picked the electronics, wiring, music, room, song(s), tracks, passages, volume level, speakers, placement, chair, snacks [he chose only a glass of water], clothing, shoe tightness [no joke, he adjusted this prior to testing], venue [a high-end music salon room with professional-grade room treatments, after closing time so it was dead quiet], hardware switching methodology [he oddly selected hardwired switching and declined the expensive, high-end Adcom switcher I had procured just for the test - his mistake if you ask me, but the important thing is it was what HE wanted], date, time, pace, trial start times, breaks, HVAC thermostat settings, etc.

I only insisted that the music be from commercially released CDs or SACDs [him being a part-time recording engineer, I disallowed any of the unreleased, private material he had recorded himself; he was fully aware of this limitation days before the test and freely accepted the condition without hesitation] and that I got to set the level matching, using a voltmeter I had brought for the test and a calibration tone on a CD.
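The level-match arithmetic itself is simple: with a steady calibration tone, the offset between the two amps in dB is 20*log10 of the voltage ratio at the speaker terminals. A sketch, assuming RMS voltmeter readings (the example voltages are made up):

```python
from math import log10

def level_offset_db(v_a, v_b):
    """Level difference in dB between two amps playing the same
    calibration tone, from RMS voltages at the speaker terminals."""
    return 20 * log10(v_a / v_b)

# e.g. 2.83 V vs 2.75 V is only ~0.25 dB, far below audibility,
# whereas an unmatched 1 dB would invite a "louder sounds better" bias.
print(level_offset_db(2.83, 2.75))  # ~0.25
```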

He was allowed to listen openly, pre-test, as much as he wanted. I then got him to say, on camera, that all conditions were sufficient such that he should be able to hear the difference between his selected amps, that all test conditions were acceptable to him, and that everything was to his liking. We used a Mark Levinson-grade power amp [Proceed, actually, a subdivision of ML; ML later announced this particular unit would be rebranded as Mark Levinson when Proceed was dissolved, although that never did happen, but I saved the link to that press release] vs. a less powerful Yamaha integrated amp at one seventh the price. No clipping was allowed, of course. In this way I protected myself from him later claiming "things weren't to my liking" when he lost, since I had him on camera saying everything was just fine!

Motivation

We made a very small bet, and I decided to give him 2:1 odds in his favor. This way, besides just attempting to prove his point, he also had a small financial incentive to hear differences as best he could, YET the amount he would lose if he couldn't show statistical significance (at a level he agreed upon on camera pre-test, as well as the number of trials, 16) was trivial and not a hardship - or else he'd cry "Stress!".

I won.
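For the record, the corresponding critical value is easy to compute: with 16 trials and the conventional 0.05 threshold (assumed here; the post doesn't say what level was actually agreed), the listener needs 12 or more correct. A sketch:

```python
from math import comb

def correct_needed(trials, alpha=0.05):
    """Smallest score out of `trials` whose one-sided binomial p-value
    under guessing (p = 0.5) is at or below `alpha`."""
    for correct in range(trials + 1):
        p = sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials
        if p <= alpha:
            return correct
    return trials + 1  # not attainable for this alpha

print(correct_needed(16))  # 12 (p ~= 0.038; 11/16 only reaches p ~= 0.105)
```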


Reply #19
Quote
Two possible problems which need to be addressed in blind testing are test subject (test listener) stress and motivation. A test subject can cry foul if, for example, their "shoes were tied too tightly", they weren't ready to start the test, or they just felt "off" that day. Similarly, if they just don't really care one way or another they might not try their best to really listen as carefully as possible. They need motivation to try their best.

Stress

When I conducted (proctored) my blind test [nearly double-blind, except that I remained in the room, standing behind the subject, and announced the beginning of each trial, for example: "We are ready to begin trial three. You may commence music playback when you are ready."] I took several precautions to ensure stress wouldn't be a problem. The best way to do this is to let the test subject select, design, and decide EVERYTHING possible about the test (while still keeping the test fair and unbiased, of course).


That's the approach we took right after we invented ABX, and we pursued it for a decade or more. Then we became far more sophisticated about listener training, and that was like turning a corner.

Providing a context where people start out obtaining easy positive results, because the samples exhibit the audible problem being investigated very strongly, gives them confidence and dramatically reduces their strain. As succeeding samples become harder there is going to be more stress, and in the end they may not obtain positive results at all, but at least they know what they are listening for and they know for sure that they have mastered the basic mechanics of the test.



Reply #20
Quote
... Providing a context where people start out obtaining easy positive results, because the samples exhibit the audible problem being investigated very strongly, gives them confidence and dramatically reduces their strain. As succeeding samples become harder there is going to be more stress, and in the end they may not obtain positive results at all, but at least they know what they are listening for and they know for sure that they have mastered the basic mechanics of the test.


The Philips "Golden Ear Challenge" is a good example of this approach.
Regards,
   Don Hills
"People hear what they see." - Doris Day


Reply #21
Quote
Providing a context where people start out obtaining easy positive results, because the samples exhibit the audible problem being investigated very strongly, gives them confidence and dramatically reduces their strain. As succeeding samples become harder there is going to be more stress, and in the end they may not obtain positive results at all, but at least they know what they are listening for and they know for sure that they have mastered the basic mechanics of the test.


I find it interesting that you and mzil use the term 'stress' here. The story by mzil shows that the goal of all those arrangements was not the actual reduction of stress, but the reduction of opportunities for the listener to use the stress argument in the event of his failure to hear a difference. It is all about tactics in the battle between objectivists and subjectivists.

In reality, I believe that stress has little to do with hearing ability. The biggest stress seems to come from the listener's realization that he may be failing. Nobody seems to want to objectively measure the actual effects of any stress factor. The subjectivists seem convinced that stress is detrimental to their hearing ability, but nobody actually verifies whether that's true. I think it is far from obvious, and there are arguments for the opposite.

Some subjectivists are wont to use arguments from evolution when discussing hearing abilities. I frequently encounter arguments that link our hearing ability to our fitness for survival in the face of predators somewhere nearby in the jungle. If this crude argument is to be taken seriously, it surely would also indicate that our hearing abilities in times of stress ought to be at least as good as in relaxed times, wouldn't it? In such situations stress literally ought to sharpen our senses, for a very direct benefit in fitness for survival!

I think it is about time someone investigated what effect stress actually has on our hearing. I suspect the outcome may (once again) not be what most subjectivists accept as true. Or did I miss something that has already been done?

At any rate, I expect that maximum sensitivity in listening tests will not result from focusing on stress reduction. I read your experience as supporting that view. The bigger factor seems to be training, focused on the exact effect being investigated.


Reply #22
Quote
several precautions to ensure stress wouldn't be a problem.
let the test subject select, design, and decide EVERYTHING possible
He picked the test's who, what, where, when, and why [...] date, time, pace, trial start times, breaks, HVAC thermostat settings, etc.
He was allowed to listen openly, pre-test, as much as he wanted. I then got him to say, on camera,
Mark Levinson
2:1 odds in his favor


Raising the stakes to pokerbluff him - STRESS!


Reply #23
Quote
The Philips "Golden Ear Challenge" is a good example of this approach.

It is well done, but the people who (think they) hear the finest of things, like generations of bit rot, are already overstressed even by this Philips test.
At least that's how I remember reading it at Computer Audiophile.
http://www.computeraudiophile.com/f8-gener...hallenge-19381/
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!


Reply #24
Quote
several precautions to ensure stress wouldn't be a problem. let the test subject select, design, and decide EVERYTHING possible He picked the test's who, what, where, when, and why [...] date, time, pace, trial start times, breaks, HVAC thermostat settings, etc. He was allowed to listen openly, pre-test, as much as he wanted. I then got him to say, on camera, Mark Levinson 2:1 odds in his favor

Quote
Raising the stakes to pokerbluff him - STRESS!

Raising the stakes in HIS favor, meaning he would win double the amount that he might lose, can only reduce his stress, no? [Or maybe I'm misunderstanding your meaning of "pokerbluff"?]