Topic: What's the problem with double-blind testing? (Read 248950 times)

What's the problem with double-blind testing?

Reply #200
The question double-blind testing usually answers around here is: "Can people hear a difference?" You're demanding a pretty broad definition of "hear". I'd venture to say that your definition is almost meaninglessly broad, especially if you're extending it to mean "shows different brain scan results". Just because there is a neurological response to a physical stimulus does not imply that someone is necessarily consciously aware of the stimulus nor does it imply that a person "heard" anything.

Conscious awareness of a thing is chosen simply because it's most easily testable. There is no reason why blind-testing need be limited to conscious awareness, it's just a whole lot harder to test based on brain-scans. Anything short of conscious awareness testing cannot answer the question "Can people hear a difference?" in any meaningful way. When you start to alter the definition of "hear" beyond the intuitive awareness of what it means that all non-deaf people share, all the assumptions and axioms we start with go right out the window.

As for your FLAC analogy: I can ABX 320kbps MP3 vs. FLAC on some samples. Yet I use V4 on my portable. I can hear a difference if I try and the conditions are right. That's missing the point though. The point is that I can fit more music on my player at V4, and that I won't notice if I'm not consciously focusing on trying to differentiate the two. I can sure "hear" the difference, but unless I'm really bothered, I don't hear the difference.

Reply #201
You are assuming that the only way to tell that X is different from Y, in the context of (1), is by the subject's own conscious discrimination. There is no necessity to make that assumption.
Which other ways would there be (unless the subject is told)?


Please see post #195 for a suggestion.


And also #199.  There are ways we tell that blindsight exists, but it is (almost by definition) not via straightforward conscious discrimination of stimuli as in the conventional ABX tests under discussion.

Reply #202
This is getting really tiring.
Actually, that might well be the biggest problem with double-blind testing.
Let me try to help Mark a bit with an example. I have a colleague, a recording/mixing engineer, who is convinced that 24/96 sounds better than 24/44.1 (in "his" setup). One of his reasons is that he is less tired after a day of mixing at 96k. Needless to say, I was curious to investigate further, but since his mix setups are usually rather complex, it is not at all easy to switch sampling rates quickly, so an ABX test does not seem possible. Besides that, it's probably very difficult to separate auditory and non-auditory variables in the test (e.g. hours of sleep, disturbing phone calls during work, duration of work, monitoring levels etc.).
What would be the best options for double-blind testing when A and B are significantly separated in time and/or likely to be different?

Reply #203
He might have made a valid observation. At 24/96 the computer spends more time on processing, so he has slightly longer pauses between interactions with his machine.  Also, filtering is much less mentally challenging without Nyquist at your doorstep.

Reply #204
As was said repeatedly(!), ABX has no time limit.  Since your friend's experience is not double-blind, you cannot rule out placebo as a legitimate reason he believes he hears differences, can you?

Reply #205
Mark do you hang a lot in the mental masturbation section of the philosophy forums? 

Clearly the burden of proof comment above is equally valid for people who say something exists, as for people to say that something might exist (i.e. it's plausible). You're being very careful (yeah, we're onto you) not to go all the way and say it exists, but you should provide a reason as to why you should think it might exist. So far I don't think you have.

About the brain scan thing. That's another blind test for you. See if the subject can differentiate when different parts of his brain light up. You're saying that he probably could. That's a more interesting test in my opinion, anyway.

Reply #206
The question double-blind testing usually answers around here is: "Can people hear a difference?" You're demanding a pretty broad definition of "hear". I'd venture to say that your definition is almost meaninglessly broad, especially if you're extending it to mean "shows different brain scan results". Just because there is a neurological response to a physical stimulus does not imply that someone is necessarily consciously aware of the stimulus nor does it imply that a person "heard" anything.

Conscious awareness of a thing is chosen simply because it's most easily testable. There is no reason why blind-testing need be limited to conscious awareness, it's just a whole lot harder to test based on brain-scans. Anything short of conscious awareness testing cannot answer the question "Can people hear a difference?" in any meaningful way. When you start to alter the definition of "hear" beyond the intuitive awareness of what it means that all non-deaf people share, all the assumptions and axioms we start with go right out the window.

As for your FLAC analogy: I can ABX 320kbps MP3 vs. FLAC on some samples. Yet I use V4 on my portable. I can hear a difference if I try and the conditions are right. That's missing the point though. The point is that I can fit more music on my player at V4, and that I won't notice if I'm not consciously focusing on trying to differentiate the two. I can sure "hear" the difference, but unless I'm really bothered, I don't hear the difference.


Thanks for your response.  The thing is, I don't think I have to have a definition of "hear" in order to have a (not pointless) worry about whether compression (e.g.) makes some difference to what I hear.  I think that hearing is whatever, at the "end of science," cognitive theories would eventually say that it is.  And I don't know that compression fails to make a difference to that.  (Since I am the one doing the hearing, even though I don't have a scientific definition of it, I can still worry about it.)  Moreover, I have an open-ended attitude about what sorts of things, what experiences or cognitive effects, I ought to care about.  Suppose having certain sorts of information causes certain differences in cognitive processing or representation in me, at some level, even though I can't discriminate the stimuli.  When met with the claim, "You should not care about the difference, because you can't discriminate the two," I just see no necessity in that.  How do I know in advance that it would be silly or pointless for me to have an interest in having one sort of cognitive process go on in me, rather than the other, where the difference is not manifested in conscious discrimination?  We know more and more that what is consciously available to us is just the tip of the iceberg, so I don't know that such an interest would necessarily be pointless.

What's more, there may be effects that are conscious but context dependent, as with the "fatigue" hypothesis.  This is a cognitively meaningful hypothesis (despite the naysayers) but hard to test by ABX, since quick-switch gets rid of the context and, if we use rpp3po's method on longer stimuli, the test is valid only if the subject has a word for the effect in her language and we know to use it in explaining the protocol to her.

Reply #207
And as a side note, on the "civility" argument. One very often hears this from people on the other side of science, I wonder why.

Reply #208
...yet he offers no brain scans, so we're left in Mark's fantasy land where we possibly live amongst aliens, invisible garaged dragons and giant pink bunnies that orbit Uranus.

Even if brain scans show differences (I doubt they will) in response to different stimuli that appear indistinguishable in ABX testing [which he will not specify (flac vs. mp3, 16-bit vs. 24-bit?)], it's a giant leap to conclude that one is "better" for us than the other.  Just like my ability to kill someone in a distant galaxy by snapping my fingers, it could be that the mp3 created from 16 bits will help prevent some types of cancer while the 24-bit flac will not help prevent any types of cancer.

Canar, Axon and others laid the point out pretty plainly.

Reply #209
valid only if the subject has a word for the effect in her language and we know to use it in explaining the protocol to her.

Complete nonsense.  Words are not necessary to choose X as being A or B and no one ever said such a choice must be limited to what one "hears".

Reply #210
Quote
The thing is, I don't think I have to have a definition of "hear" in order to have a (not pointless) worry about whether compression (e.g.) makes some difference to what I hear.
If you don't know what comprises "hearing" and are unwilling to define it, you cannot even say you hear anything at all. Worrying about compression making a difference to what you hear is stupid. The evidence is clear: 320kbps is insufficient. Furthermore, there is no lossy codec that exists that does not have samples for which it fails. If the lossiness of lossy compression is an issue at all, in any way, shape or form, use lossless.

When moving past the pragmatic into the realm of the theoretical, anything is possible. You're focused on the anything. You've completely lost sight of the "why". Why do we blind-test? Because it allows us to show that we hear a difference. Why do we worry about hearing a difference? Because we're trying to improve psychoacoustic audio codecs.

You talk about hearing, yet you refuse to put your finger on what hearing is, instead erecting some silly argument about "end of science" definitions. What if, at the "end of science", hearing is defined simply as what we can consciously perceive? Your whole argument falls apart. I see no reason why such a simple definition is not possible or even likely.

Instead, you insist on public mental masturbation, bringing up edge cases which are not even known to exist yet, but may exist. Okay, sure, let's suppose you're right. Now what? Oh, you've proven that you like mentally masturbating in public on topics so completely detached from reality that there is no pragmatic basis on which you base your arguments.

Reply #211
As was said repeatedly(!), ABX has no time limit.
I'm just not convinced(!). In my experience, when the time between two stimuli A and B becomes large (e.g. days, weeks), many non-auditory variables are introduced (e.g. memory). Example: several years ago the seats in the Concertgebouw had to be replaced. The requirement was that the famous acoustics had to stay identical. After installation of the carefully designed seats, many listeners had the impression that the acoustics had changed. How would you ABX this?
Since your friend's experience is not double-blind, you cannot rule out placebo as a legitimate reason he believes he hears differences, can you?
Would your term "placebo" have the same meaning as my "non-auditory variables"?

Reply #212
Placebo was just going to the broader point that once you open the door to far-fetched possibilities, as this discussion has, we should make sure to include not-so-far-fetched possibilities, such as that your friend is imagining the differences (assuming his equipment performs to specification and is not broken).  It's quite possible that your friend feels more fatigued when working with CDDA because he has a preconceived notion that it is more fatiguing.  After all, the placebo effect isn't exactly controversial, and it isn't as if your friend is unaware of the bit depth and sample rate he's using, now is it?

Regarding the lack of control do you think the same things will happen only on the day that A is being presented and not B?  However, just because your friend doesn't have the luxury to dedicate his days at work to a more scientifically controlled test, doesn't mean that it can't be done.  Until it can be done, his suggestion that he can tell the difference based on fatigue holds little (if any) scientific merit if we're only supposed to go by his word.
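For what it's worth, here is a rough sketch of how such a long-duration blind comparison could be organized: a third party draws a balanced random schedule of A and B days, keeps it hidden, and the fatigue ratings are only split by condition once all the days are done. The function names, the one-rating-per-day idea and the rating scale are assumptions made for illustration, not a protocol anyone in this thread has proposed.

```python
import random
from statistics import mean
from typing import Optional


def blind_schedule(days: int, seed: Optional[int] = None) -> list:
    """Return a balanced, shuffled list of 'A'/'B' labels, one per working day.

    Hypothetical helper: whoever sets up the session follows this list and
    keeps it hidden from the engineer until the trial is over.
    """
    if days <= 0 or days % 2:
        raise ValueError("use a positive, even number of days so A and B get equal exposure")
    schedule = ["A", "B"] * (days // 2)
    random.Random(seed).shuffle(schedule)
    return schedule


def mean_rating_by_condition(schedule, ratings):
    """Average the subject's daily fatigue ratings separately for A and B."""
    if len(schedule) != len(ratings):
        raise ValueError("need exactly one rating per scheduled day")
    return {
        label: mean([r for s, r in zip(schedule, ratings) if s == label])
        for label in ("A", "B")
    }
```

Because the assignment is random, things like bad sleep or disruptive phone calls should fall on A days and B days about equally often over enough days, which is the point of the question above. Whether any difference in the averages is larger than chance would still need a proper statistical test.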

Your room example is not exactly valid.  There is no defined A and defined B which we can go back and forth between like there is with hi-res vs. CDDA, so ABX testing clearly does not apply.  Though I suppose you could swap all the seats out but then you need to make sure the testing is still blind.

Reply #213
Mark do you hang a lot in the mental masturbation section of the philosophy forums? 

Clearly the burden of proof comment above is equally valid for people who say something exists, as for people to say that something might exist (i.e. it's plausible). You're being very careful (yeah, we're onto you) not to go all the way and say it exists, but you should provide a reason as to why you should think it might exist. So far I don't think you have.


For a start, see http://www.arts.uwaterloo.ca/~pmerikle/pap...tudies.1998.pdf .

If, on the other hand, you think that the possibilities I'm asking about are so implausible as not to be worth considering, on what scientific basis do you make that, or any, estimate of their probability?

About the brain scan thing. That's another blind test for you. See if the subject can differentiate when different parts of his brain light up. You're saying that he probably could.


No, actually, that's the opposite of what I'm saying, because if he could do that then he probably can differentiate the stimuli to begin with and there would be no reason to do the brain scans in the first place.

That's a more interesting test in my opinion, anyway.


Reply #214
Quote
The thing is, I don't think I have to have a definition of "hear" in order to have a (not pointless) worry about whether compression (e.g.) makes some difference to what I hear.
If you don't know what comprises "hearing" and are unwilling to define it, you cannot even say you hear anything at all. Worrying about compression making a difference to what you hear is stupid.


Please don't take this the wrong way, but that's a naive view of meaning.  The notion that in order to say things we have to have definitions in our heads, or a capacity to state necessary and sufficient conditions, was effectively demolished by Wittgenstein.  Such a capacity may be necessary for certain purposes, perhaps in scientific theory building, but it is not a general requirement for using language meaningfully.  So when you say that the worry or skeptical doubt I have expressed is "stupid," you have a false picture of meaning.  Hearing, in the sense of conscious auditory experience, is a natural phenomenon that we can refer to; science will help us discover what it is; but to insist that, unless we have an operational definition such as "shows different brain scan results" in mind (as some logical positivists might have insisted), we are not saying anything at all, is a mistake.



 

Reply #215
Entering into serious discourse without defining what it is you're talking about is asinine. Worrying about compression is what I was labelling as "stupid". Worrying about compression (which is a very small thing in the bigger picture) is not something I'd recommend to anyone. I suggest you change your tone if you wish to continue on in these forums. Language can do all kinds of internally-contradictory nonsense. Just because a construct can exist within the bounds of language does not imply that a construct can exist within the bounds of logic or science. Just because I can talk about walking on water personally does not mean that I can walk on water.

You're still missing my main point: What's the point of any of your argument? That there are phenomena that cannot be tested? That tests are specific to the test being taken? It doesn't invalidate a blind test. If we assume that there are untestable subconscious phenomena behind lossy compression, the only difference in outcome is that we have a valid reason to switch to lossless. However, there are already valid reasons to use lossless. If we assume there are untestable subconscious phenomena behind lossy compression, then blind-tests designed to test conscious phenomena are still valid in the context of conscious phenomena.

Reply #216
Clearly the burden of proof comment above is equally valid for people who say something exists, as for people to say that something might exist (i.e. it's plausible). You're being very careful (yeah, we're onto you) not to go all the way and say it exists, but you should provide a reason as to why you should think it might exist. So far I don't think you have.


For a start, see http://www.arts.uwaterloo.ca/~pmerikle/pap...tudies.1998.pdf .

OK before I go on and try to read the whole paper, is there any part that I should be focusing on? The abstract doesn't indicate anything that may support your position. On the contrary, for instance:
Quote from the abstract:
In addition, recent studies of patients undergoing general anaesthesia have shown that the effects of stimuli perceived unconsciously during surgery can last for approximately 24 hours.

How do you think they tested if there were any effects of unconscious perception? We're not saying that there can't be unconscious perception, only that its effects if any can be tested, and clearly the authors of that paper think the same.

I am sorry to say that it's a common practice of pseudoscience sympathizers to quote irrelevant-but-relevant-sounding papers in the hope that (or sincerely misguidedly) people won't have the time to read them just for an internet discussion. So tell me again, which part of the paper should I be focusing on, and what do you think it states?

Quote from: Mark DeB
If, on the other hand, you think that the possibilities I'm asking about are so implausible as not to be worth considering, on what scientific basis do you make that, or any, estimate of their probability?

On the basis that you haven't made a worthwhile case to be considered. You're running on assumptions that are unwarranted and contradictory (one can "unconsciously perceive" something as different than something else, yet it is not detectable?). And, you're trying to explain a phenomenon that is easily and better explained by other known mechanisms (placebo and all that).

Reply #217
Entering into serious discourse without defining what it is you're talking about is asinine.


Right, including the topic of meaning (or serious discourse) itself.  The relevant concept, in connection with what you were saying in your previous post, is the "division of linguistic labor," which is explained by, among other writers, Hilary Putnam ("The Meaning of 'Meaning'," Philosophical Papers Vol. 2, and probably also in academic databases such as JSTOR).  This explains how "serious discourse" can go on in one part of a linguistic community thanks to activity in another part of the community, such as the scientific sector, that determines the meaning and reference of the relevant terms (that's why "definitions" don't have to be in the heads of the former).  This is directly relevant to the view you expressed.

Anyway, there's no need to be abusive.  Is this really how you were taught to talk to other people?

Worrying about compression is what I was labelling as "stupid". Worrying about compression (which is a very small thing in the bigger picture) is not something I'd recommend to anyone. I suggest you change your tone if you wish to continue on in these forums. Language can do all kinds of internally-contradictory nonsense. Just because a construct can exist within the bounds of language does not imply that a construct can exist within the bounds of logic or science. Just because I can talk about walking on water personally does not mean that I can walk on water.

You're still missing my main point: What's the point of any of your argument? That there are phenomena that cannot be tested? That tests are specific to the test being taken? It doesn't invalidate a blind test. If we assume that there are untestable subconscious phenomena behind lossy compression, the only difference in outcome is that we have a valid reason to switch to lossless. However, there are already valid reasons to use lossless. If we assume there are untestable subconscious phenomena behind lossy compression, then blind-tests designed to test conscious phenomena are still valid in the context of conscious phenomena.


As far as your main point is concerned, I don't see anything I disagree with in it, although I'm not sure I understand the last sentence (it seems kind of tautological).

In any event, my question was answered:

Consider the following hypothesis.  One of the signals has a fatiguing effect that occurs with longer stimuli, causing the way the music sounds to the person to be subtly different from the way it sounds via the other signal.

(1) Does the best currently available psychoacoustic theory say that this can't happen?


No.


I appreciate the information and thank you all for the discussion.

Reply #218
Clearly the burden of proof comment above is equally valid for people who say something exists, as for people to say that something might exist (i.e. it's plausible). You're being very careful (yeah, we're onto you) not to go all the way and say it exists, but you should provide a reason as to why you should think it might exist. So far I don't think you have.


For a start, see http://www.arts.uwaterloo.ca/~pmerikle/pap...tudies.1998.pdf .

OK before I go on and try to read the whole paper, is there any part that I should be focusing on? The abstract doesn't indicate anything that may support your position. On the contrary, for instance:
Quote from the abstract:
In addition, recent studies of patients undergoing general anaesthesia have shown that the effects of stimuli perceived unconsciously during surgery can last for approximately 24 hours.

How do you think they tested if there were any effects of unconscious perception? We're not saying that there can't be unconscious perception, only that its effects if any can be tested, and clearly the authors of that paper think the same.

I am sorry to say that it's a common practice of pseudoscience sympathizers to quote irrelevant-but-relevant-sounding papers in the hope that (or sincerely misguidedly) people won't have the time to read them just for an internet discussion. So tell me again, which part of the paper should I be focusing on, and what do you think it states?

Quote from: Mark DeB
If, on the other hand, you think that the possibilities I'm asking about are so implausible as not to be worth considering, on what scientific basis do you make that, or any, estimate of their probability?

On the basis that you haven't made a worthwhile case to be considered. You're running on assumptions that are unwarranted and contradictory (one can "unconsciously perceive" something as different than something else, yet it is not detectable?). And, you're trying to explain a phenomenon that is easily and better explained by other known mechanisms (placebo and all that).


Sorry, but statements about me are not a scientific basis for anything relevant.  And the sorts of phenomena described in the article clearly do raise the plausibility level of "might" well beyond the "flat earth" or "dragon in garage" level at which you caricatured them as being; against the background of the knowledge that such phenomena exist, it is not at all pointless to ask whether the sorts of phenomena I have asked about exist. 

Yes, you are right that the authors of the article think that unconscious perception can be detected in some ways (otherwise, how would they be able to write the article?), but the point, or one point, is that those ways of detecting it need not be limited to, and might have to be more subtle and indirect than, the sorts of listening tests that are supposed to be adequate for relevant purposes here.  Also, the authors talk about the effect of unconscious perception on emotional states, which might very well be detected not by the subject herself but using behavioral or physiological criteria.  That may give an idea of why I think that article is relevant, and I hope you find it useful.  Your questions in this post were good ones, and, in principle, I would be interested to discuss these issues further, but you have been sufficiently abusive to leave me disinclined to do this; I hope you will understand.  As I noted in my previous post, my question was answered.  Thank you for the discussion.

Reply #219
There is not one single instance in that paper that suggests the issues you've raised about lossy compression or different resolution are to be perceived differently depending on one's level of consciousness.

Sorry Mark, but you'll have to try a little harder.

Reply #220
Mark, you continue to miss the point. You can cite all the references you like. That does not make them true. This is especially true in philosophical contexts. Some philosophers argue that we can know nothing. So does that mean we do know nothing?

It is immensely ironic to me that you cite papers delineating semantic externalism while making such skeptical arguments, especially when semantic externalism is a philosophical position that stands in opposition to skepticism in many/most cases. You cannot have it both ways.  Your argument is muddled and internally-contradictory, and that's why you are under such heavy criticism. I am sorry you cannot see it that way, but I suggest that instead of writing us off, that you consider what we've written here.

We do like to argue here, but we get irritable when arguments get circular and debaters ignore important points, such as defining the topics of discussion. All the philosophical hand-waving in the world won't make it sensible to ignore a request of other people to define what it is you're talking about when you talk about a thing. Philosophers will ask you to do it. Scientists will ask you to do it. Programmers will demand you do it.

Reply #221
I think it's not said often enough that ABX testing, and controlled testing in general, is of primary importance only to reduce cost: either by substituting a less expensive product for a more expensive one, or by quantifying the risk of some damaging event (i.e. that some treatment does not in fact work and cannot be distinguished from placebo). Speaking truly off the cuff, I don't think it is a coincidence that blind testing and truly large-scale manufacturing both came into existence at around the same time. That controlled testing also happens to be the best way of evaluating sensory thresholds is a nice coincidence - but I don't think that's necessarily the fundamental point to be making here.

If there did exist such a large gap in covered sensory experience, between "mainstream" blind testing and the experiences of everyday life, one would expect these attempts at cost reduction to be remarkably futile. The value of the supposedly ignored sensations would rise as ignoring them allowed further and further abuse through successive rounds of cost reduction; this "lost" value would detract from the value of the product. If this lost value is negligible, that suggests the effect is probably negligible, too. A lot of money flows around the idea that people like listening to music at 48kbps. That blind testing for cost reduction works - that we don't get ill after going to McDonald's, that they are growing, that our ears don't fall off after listening to XM/Sirius, and that people tend not to have any problem listening to lossy encoded music on FM stations - seems to me to be a fairly powerful observation that any sensory gap with blind testing is so small as to be meaningless, and that the "human" factors are far better explanations for preferences which cannot be explained than the results of such controlled tests.

What I'm getting at here is that, while I tend to agree with the sentiment shifting to a distinctly logical positivist angle here, I'm not even sure I'd need to take that hard of a line. Assertions of nonexistence are not falsifiable, but the possibility of existence can be reduced to a level of unimportance that makes it meaningless to everybody. While Mark's overall question, as I understand it, seems somewhat reasonable - the question of how blind listening results are converted into statements of universal applicability - the corner he's painting himself into, which is to assert the hypothetical existence of an effect which does not show up on a blind test and yet still matters, is ad hoc.

Reply #222
Fine.  But then it remains an open question whether the signals differ in properties relevant to perception, since you haven't suggested any way to resolve it.



Humans can easily 'perceive' two signals that are *irrefutably* objectively  identical -- i.e., the same signal presented at two different times  -- to be 'different'.  But your main concern is that DBT is somehow missing a real difference?


Quote
Your request for documentation of an observed effect misses the point.  I never claimed to have observed any such effect (and, if you have been following things, you know I don't assume that the effects in question can be reliably reported).  The question is whether there are any effects of the kind described in the hypothesis, and whether tests such as you describe would be relevant to finding this out.  Evidently, they would not be.


Unless and until actual responses by actual people cannot be adequately explained by current models, why do we need to 'find this out'?

What phenomenon are you imagining here that is 1) real and 2) not explained either by significant measurable difference in output of the gear, or placebo-like effects in the subject?


Quote
It may be that a hypothesis that posits things that are undetectable in principle would not be worth considering, but I am not suggesting such a hypothesis.  There might well be ways of detecting whatever properties and effects might make the hypothesis true, although those ways would probably vary with the properties/effects.  So there is no reason why some fixed method of determination should be hard wired into the hypothesis.  However, it is not hard to imagine an alternative to ABX listening tests of the kind you suggest: brain scans.  We might well have excellent theoretical reasons to think that, when a part of the brain lights up in a certain way in response to one stimulus but not another, that this corresponds to a difference in the experiences.


Oy vey, I hear Oohashi coming in 5...4...3...


A 'brain scan' will also show a difference due to *belief* in a difference, rather than a real difference. E.g., feelings of pleasure (and their brain scan correlates) can be greater when the subject BELIEVES he is drinking expensive wine vs plonk, even if, in fact, he's been served plonk with an expensive label.


Quote
Look, here is another way of putting the point: Why should I bother with lossy formats for listening when I can use FLAC?


Storage space/bandwidth concerns.  Or a player that does not decode FLAC.

Quote
I don't want to lose any relevant information.  What is the criterion for relevance?  Many people seem to assume that conscious discrimination (in certain conditions) is an adequate criterion, but I see no real argument for this.  It's just a dogma.


Or maybe rationality?

Reply #223
People are mixing up, again and again and again, 'quick switching' with 'short auditioning'. Quick switching means the interval between the end of test signal A and the start of test signal B is short (preferably 'zero'). Use of 'short audition' means that the audition periods of A and B are THEMSELVES short -- e.g. measured in seconds rather than minutes/hours. Both quick switching and use of short audition/short test signals tend to ENHANCE discrimination of difference, by keeping the difference 'fresh in memory'. That's why they're recommended.

'Long term ABX' typically means that the switching remains QUICK, even though the audition periods themselves are long.

Anyone who insists that it takes hours of 'sighted' listening to A and B in order to reliably tell them apart, is free to go ahead and do that listening until they're sure they perceive a difference.  THEN try a quick-switch ABX test.


As was said repeatedly(!), ABX has no time limit.
I'm just not convinced(!). In my experience, when the time between two stimuli A and B becomes large (e.g. days, weeks), many non-auditory variables are introduced (e.g. memory).


Yes, that would make it a 'slow switch' ABX -- where the interval between the end of A and the start of B is long. This makes the test LESS SENSITIVE to difference.

Quote
Example: several years ago the seats in the Concertgebouw had to be replaced. Requirement was that the famous acoustics had to stay identical. After installation of the carefully designed seats many listeners had the impression that the acoustics had changed. How would you ABX this ?


Well, they probably DID change. Or were the seats designed to be replicas of what they replaced -- down to the wear-in on the old ones? But ABXing it directly would be impossible, since there is no way to repeatedly (or quickly) switch between the 'old seats' and the 'new seats'. (Conceivably one could make a carefully set up recording while the old seats were in place, and another using the same setup with the new seats in place, and play those for listeners in an ABX test. More easily, one could just measure the acoustics of old and new from the same point and see if anything 'looks like' it will cause a significant audible difference.)

This is not equivalent to your friend's problem though.  It IS possible to quickly switch between 24/96 and 16/44, using whatever test signal length your friend desires.

Reply #224
For a start, see http://www.arts.uwaterloo.ca/~pmerikle/pap...tudies.1998.pdf .

If, on the other hand, you think that the possibilities I'm asking about are so implausible as not to be worth considering, on what scientific basis do you make that, or any, estimate of their probability?


But no one is saying unconscious influences have no effect on perception. Indeed the primary reason for BLIND TESTING is so that biases, often unconscious, do not contaminate the result (e.g. induce an unacceptably high rate of Type I errors).
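
To put a rough number on what a Type I error looks like in an ABX run, here is a minimal sketch. It assumes each trial is an independent 50/50 guess when the listener cannot actually hear a difference; the function name `abx_p_value` and the 16-trial example are mine, purely for illustration, and not anyone's official scoring method.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX run.

    Returns the probability of getting at least `correct` answers right
    out of `trials` by pure guessing (assumed chance level: 0.5 per trial).
    """
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Count all outcomes with `correct` or more hits, out of 2**trials
    # equally likely guess sequences.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Illustrative scores only, not data from any real test.
    for correct in (8, 12, 14):
        p = abx_p_value(correct, 16)
        print(f"{correct}/16 correct: p = {p:.4f}")
```

The smaller that probability, the less plausible it is that the score came from guessing. Note that it says nothing about WHY the listener scored as they did; it only bounds the chance of a false positive under the guessing assumption.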

The real question is, is a 'perception' of difference a reliable indicator of objective difference?  Whatever the actual answer is, it surely is not 'YES, ALWAYS'.  Even a measured difference *in physiological response of the perceiver* is not a reliable indicator that the stimulus itself has changed (though it may indicate that *another* stimulus is at work).