Hydrogenaudio Forums

Hydrogenaudio Forum => Listening Tests => Topic started by: Arnold B. Krueger on 2015-03-18 19:33:20

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-18 19:33:20
I'm wondering how people listen when doing ABX tests, as related to what order they prefer to listen to the alternatives.

IOW:

ABX and done (next trial)  (classic 1950 Munson and Gardner ABX testing)

ABX and then repeat as desired

XABXABX... repeat as desired

any other preferred ordering?

No preference, random as the spirit leads..

Why have you chosen the method that you use?
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-18 21:01:21
I tried to do an ABX test once but after a couple of trials the "cognitive load" was just so overwhelming my head started to spin, I became nauseous, and my vision became blurred, so I walked over to my fridge to relax with a cold soda. I opened it and saw both Coke and Pepsi, so my head exploded. Not a pretty scene.

OK seriously. If  A and B are readily, discernibly distinguishable prior to the test:

X, I vote [since I already know what A and B are from the list order in the previous, pre-test screen], next trial...

X, I vote, next trial,

X, vote, next trial...repeat to end




If the difference is pretty easy, but not dead obvious:

X, A, "Did I hear any change at all?" I ask myself, vote, next trial...repeat to end




If the difference is subtle:

A, B, X, A "Did that last transition cause a change or not?", either vote or keep trying X, A, X, A, X, B, X, B etc, vote, next trial...repeat to end.

The concept of "juggling three things cognitively" never even occurs for me. I'm focusing instead on which pair of options has an audible change when I transition, and which doesn't. And if you think about it, the real question I'm asking my brain is "Was that last transition audible?" So from my perspective I'm only juggling ONE concept.
Title: How do you listen to an ABX test?
Post by: castleofargh on 2015-03-18 21:02:47
I don't give up when I can't tell which is which, and I will maybe alternate longer listening and rapid ababababababababab. whatever feels like I have a chance.  that until I can actually hear something changing, or until my brain decides that it's enough and fools me into thinking I heard something different(he does it a lot, I must be boring him to death). then I listen to X as a kind of way to check if I can identify what I thought I had. so it can be another drama with multiple switches again before the actual vote. 
most of the times relatively rapid ABABA are what gives me the best results. not crazy fast, but no more than 2 or 3sec each on a part where I hope to find something. at least that's what I noticed when going at it with different mp3 levels.
and listening to half a song then switch is total bullshhhhh, I never got anything right that way.


anyway doing 20 trials does take me a good deal of time, mostly because I try to win!!!!!
Title: How do you listen to an ABX test?
Post by: Cubist Castle on 2015-03-18 21:11:16
In the first few trials I normally switch between A B and X in whatever order I feel like at the time - or even all 4 if the mood takes me.

Once I have the differences down I tend to listen only to X and A, or only X and B (if there is an artifact in one of A or B, then I generally listen to the other to compare X to).

If I'm not confident about the difference I guess I use the starting method throughout the trial.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-03-18 22:51:57
I usually don't bother with X until I think I've identified the difference between A and B. I then try to find that difference between A and X or B and X.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-19 13:08:48
It may need to be pointed out that the ABX test as implemented by the various ABX Comparators is a vastly different test from the ca. 1950 ABX test. The major difference is interactivity. In the 1950 version of the test, even as practiced today, the three alternatives A, B, and X are very brief, often only hundreds of milliseconds long, and are presented once in that order, after which a choice is demanded. All test parameters are chosen in advance and are not under the control of the listener. This may be very appropriate for the purpose at hand, which usually relates to the audibility of short sounds such as vowel and consonant sounds, and which is vastly different from listening to music for enjoyment.

All modern ABX boxes that the author is aware of put the sonic alternative being listened to (which is a musical selection) under the full control of the listener. He may listen to the alternatives as desired in any order and number that the listener desires. Furthermore, most modern ABX Comparators allow the listener to choose the sound segment he bases his decisions on from a longer selection of music.

It should be pointed out that while A & B have  always been instantly available to the listener, they are not always referred to by the listener.  A listener who is familiar with the test being run can "Run the X's", referring to only the randomized selections, and still obtain perfect or at least statistically significant accuracy.

In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X.  However X is not a different alternative than A or B. It is always one or the other so there are only at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A or B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

It does not really matter which known sound (A or B) is chosen as the reference to compare X to because choosing either A or B as the reference can produce the same reliable outcome if the listener can reliably hear the difference between A and B.

Using either known sound as a reference is equally easy and convenient. Since the sounds are accessed via an instantaneous random access technique and even the sound itself is under the control of the listener, either known sound can be juxtaposed as closely to the unknown sound as is needed or desired. Thus any claims that ABX requires the memorization of three different sounds are completely false. At most just one sound may need to be memorized.

A representative explanation of ABX testing takes the form of a video that can be viewed online via this URL: http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/ . This test is representative of the author's own experiences and the experiences of many others going back over 30 years to the original ABX Comparator that the author built in the late 1970s.

The listener starts out by determining the sonic nature of the difference between A and B: he forms a hypothesis about the technical nature of the difference, listens to a musical selection, and uses a sample editing feature common to ABX Comparators to isolate a note of music that he feels illustrates that difference. He initially compares the unknown sample at hand to either A or B and records his choices, but then shifts to just listening to unknown samples and recording his results. He seems pleased with the accuracy and reliability of his results.

It is arguable that, especially over the later trials, the listener is not basing his conclusions on comparisons of memorized musical selections but rather relying on a single qualitative judgment about the spectral balance of each unknown sample. It is comforting to observe that the technical difference that was hypothesized can be confirmed by detailed technical measurements. In the author's experience this sort of thing is a very common practice among experienced ABX testers.

Therefore claiming that ABX testing necessarily involves memorization of musical selections may be false.

Modern theories about short term memory suggest that approximately 7 such items can be remembered for up to 10 seconds or more. Since the workload in this ABX test example is just one item, and the duration of memory required can be very short, even less than a second, it should be easy enough.

The author suspects that ABX's reputation for being tough is the natural result of being compared to sighted evaluations, which are not actually tests at all. A real test is always more work than a sham.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-03-19 14:10:19
I do it similar to mzil.

X is enough for easy tests, else X-A switching. If the difference between A and B is hard to detect I usually also do some additional X-B switching or even some switching between A and B to refresh my memory what to listen for.

Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-19 14:43:43
In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X.  However X is not a different alternative than A or B. It is always one or the other so there are only at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A or B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

I'm not sure to which scientific literature you are referring, but in standard psychophysics/signal detection theory terminology, ABX is a 2AFC match-to-sample test. The answers the subject gives are chosen from 2 alternatives (A or B) and are forced ("I don't know", "I can't decide" and "it sounds like a third" are NOT allowed), thus 2AFC. Because they match X to one of the two samples, A or B, it is "match-to-sample".
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.

Of course, even forced-choice tests can be terminated at any time by the subject! ;-) It's not *that* forced.
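
For readers who prefer to see the 2AFC match-to-sample structure spelled out, here is a minimal simulation sketch. It is not taken from any ABX tool. The listener model is an assumption made purely for illustration: on each trial the listener truly hears the difference with probability `p_hear` (a hypothetical parameter) and otherwise still has to answer "A" or "B" by guessing, because the choice is forced.

```python
import random


def run_abx_trial(p_hear: float, rng: random.Random) -> bool:
    """Simulate one forced-choice trial; return True if the answer matches X."""
    x_is = rng.choice("AB")            # X is secretly either A or B
    if rng.random() < p_hear:          # assumed model: the difference is heard
        answer = x_is
    else:                              # forced choice: "don't know" is not allowed
        answer = rng.choice("AB")
    return answer == x_is


def run_abx_session(p_hear: float, trials: int, seed: int = 0) -> int:
    """Return the number of correct answers over a whole session."""
    rng = random.Random(seed)
    return sum(run_abx_trial(p_hear, rng) for _ in range(trials))


def expected_proportion_correct(p_hear: float) -> float:
    """Under this model, guesses are right half the time."""
    return p_hear + (1.0 - p_hear) / 2.0


if __name__ == "__main__":
    for p in (0.0, 0.3, 1.0):
        score = run_abx_session(p, trials=20)
        print(f"p_hear={p}: {score}/20 correct, "
              f"expected proportion {expected_proportion_correct(p):.2f}")
```

Under this assumed model a listener who hears nothing still averages about 50% correct, which is why the score has to be judged against chance rather than against zero.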
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-19 16:23:33
I usually don't bother with X until I think I've identified the difference between A and B. I then try to find that difference between A and X or B and X.



This is what I do too. With fooABX I also sometimes check out 'Y' too, after homing in on a difference between A and B.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-19 16:38:59
In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X.  However X is not a different alternative than A or B. It is always one or the other so there are only at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A or B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

I'm not sure to which scientific literature you are referring, but in standard psychophysics/signal detection theory terminology, ABX is a 2AFC match-to-sample test. The answers the subject gives are chosen from 2 alternatives (A or B) and are forced ("I don't know", "I can't decide" and "it sounds like a third" are NOT allowed), thus 2AFC. Because they match X to one of the two samples, A or B, it is "match-to-sample".


As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and ABX tests are not infrequently completed successfully without listening to either.

Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.


Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...

I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question.  The remaining options were put in to make things easier for the listener.

Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW people seemed to need to be refreshed as to what same and different sounded like.

Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?" so that all the listener had to do is listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening because that was what he wanted to do and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)

Quote
Of course, even forced-choice tests can be terminated at any time by the subject! ;-) It's not *that* forced.


The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

I have reviewed a goodly number of papers and other documents on the topic, and almost all do not provide any rules or even suggestions for organizing the sound samples or actually taking the test.

In 1982 when Clark's paper was published quite a bit of what is now known or suspected about long and short term memory was either not well known or not known at all. Therefore we agreed among ourselves to not constrain people with our possibly mistaken ideas about how to listen in an ABX environment. It is several decades later and more is known.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-19 17:58:54
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW people seemed to need to be refreshed as to what same and different sounded like. Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time.
 

I disagree. If you want maximum sensitivity to tiny differences you should completely eliminate ANY need for the test subject, the listener, to either invoke their memory circuits or perform some task, even if you deem it to be mundane, like "These are called A, B, X, (and Y). Your task is to listen to all of them and then map out for me which should correctly be paired with each other." Yikes, that is making it way more complex than it needs to be and sounds like the convoluted arguments given by the scaremongers, like Stuart, to terrify people that they will need to juggle many concepts at once, in this FORCED task. (Forced being a scary word used to intimidate people that they shouldn't download FB2K/ABX because they will be forced into something. IT WORKS. Only a handful of people actually did so and posted results in the "AIX records test" AVSForum thread, for example.)

I challenge anyone to find me any web reference which describes, for example, a true/false test as being a "2AFC test", even though it actually is one. Nobody wants to be forced into anything and THAT's why they, Stuart et al, have invoked the terminology, if you ask me.

Quote
That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other.
But they aren't mutually exclusive. I listen to A for a while. I listen to B for a while. I listen to a rapid fire transition from A to B, and maybe B to A, and maybe even A to A and/or B to B, just to be sure I have a good feel for what zero difference will sound like when I transition for the perception testing phase of the trial. AND THEN, I'm ready for the actual perception test when I rapid fire switch between A and X, and ask myself one very simple question:"Is there a difference?" Once I've made up my mind after applying my focused listening attention and 100% concentration, only then do I have to switch gears and do some actual thinking, not just perceiving, and process the concept "OK, I heard no difference on that last transition I repeated over and over again, so that means I'm supposed to select this box to vote". Easy, no memory involved, no concept juggling, no cognitive load from being forced to map anything to anything else during the perception stage, at all, and examples A, B, A to A, A to B, and B to B are all readily available both before and after the perception stage, should I choose to invoke them.

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-19 18:39:41
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW people seemed to need to be refreshed as to what same and different sounded like. Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time.
 

I disagree. If you want maximum sensitivity to tiny differences you should completely eliminate ANY need for the test subject, the listener, to either invoke their memory circuits or perform some task, even if you deem it to be mundane, like "These are called A, B, X, (and Y). Your task is to listen to all of them and then map out for me which should correctly be paired with each other."


I showed how as much as is possible of the above is not necessary with an ABX Comparator. There is no need to listen to A, B, X, and possibly Y. All anybody who can actually reliably hear the difference at hand needs to listen to is either A or B (but not necessarily both) and compare the one they pick, if only at random, to each X.

The process of doing an A/B test demands some means of evaluating the sound, but that means does not have to be based on comparing memorized sounds.

The golden ears can easily go on and on about obvious differences in sound quality. They obviously believe that they have cracked the code and know exactly what the audible differences are - they write pages of prose about just that. The tiny little problem they have is that even if they can read all that wonderful prose while they are listening, it does nothing to improve their actual accuracy as listeners. They are some of the most likely to be random guessers!

Furthermore, if a listener uses ABX and memory, he only needs to memorize one sound, probably one of the known sounds. When he listens he compares the sound of the unknown he is listening to, to the sound he memorized, but he does the comparison in real time so no memory of the unknown sound is required.

If you can reliably discern differences in sound quality, for example the difference between more bass and less bass (which the golden ears tell us in massive missives exists), then one should be able to use that ability to discern whether each X is either one of the knowns. It either has the same amount of bass as the known or it has a different amount of bass.

One of the tricks to ABX if there are any tricks at all is to realize that you can hear differences without any memory of the music.  If you use memory, you at worst need to memorize one musical sample, one of the two references.  The perceptual load is thus minimized either way. 

All the other stuff was put into ABX to facilitate learning what the audible difference is, and learning what the audible difference is, is an irreducible part of the problem.  The only reason why it isn't such an apparent problem with sighted evaluations is the fact that sighted evaluations aren't really tests. No test, no bother!

There are two stages to hearing differences - one is knowing what the difference is, and the other is applying that knowledge to the music at hand. If step two can't be executed to obtain a statistically significant score, then the results of step one are in doubt.
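
As a side note on what a "statistically significant score" means in practice, here is a small sketch of the usual chance calculation. It assumes the standard null hypothesis that a listener who hears nothing is right on each trial with probability 0.5, and it reports the one-sided probability of doing at least that well by guessing. The function name is made up for this example and is not part of any ABX software.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials equally likely ones.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    for correct in (10, 14, 15, 16, 20):
        print(f"{correct}/20 correct: p = {abx_p_value(correct, 20):.4f}")
```

With 20 trials, 15 correct gives a probability of roughly 0.02 of being produced by guessing, while 14 correct gives roughly 0.06. Which threshold counts as "significant" is a convention chosen by the tester, not something the arithmetic decides.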
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-19 19:12:56
As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and ABX tests are not infrequently completed successfully without listening to either.
Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.

Quote
Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...
I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question.  The remaining options were put in to make things easier for the listener

Well.... then you've changed the discrimination task to a detection task. Then it's a yes/no detection test, and it's inappropriate to call it ABX. ...perhaps AX, yes/no detection. This would (should) be used when different goals, analysis and type of results are planned.
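
To make the "different analysis" point concrete: a yes/no detection test is commonly summarized in signal detection theory by the sensitivity index d', computed from the hit rate and the false-alarm rate as z(hit rate) minus z(false-alarm rate). The sketch below is only an illustration of that standard formula. The function name, the count-based interface and the log-linear correction (adding 0.5 to each count so that rates of exactly 0 or 1 do not produce infinite values) are choices made for this example, not something described in the thread.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' for a yes/no detection test, from raw response counts."""
    # Log-linear correction (an assumption of this sketch): keeps both rates
    # strictly between 0 and 1 so the inverse normal CDF stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # "Yes" answers on signal trials are hits; "yes" answers on no-signal trials are false alarms.
    print(f"guessing listener:  d' = {d_prime(10, 10, 10, 10):.2f}")
    print(f"sensitive listener: d' = {d_prime(18, 2, 2, 18):.2f}")
```

A d' near zero means the "yes" answers do not depend on whether the signal was present. This is a different summary from the count of correct matches that an ABX run produces.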
Quote
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW people seemed to need to be refreshed as to what same and different sounded like.
Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?" so that all the listener had to do is listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening because that was what he wanted to do and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)

It sounds as though you explored many things... but it also sounds as though some weren't really designed to give a meaningful result... you were just exploring. If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?

Quote
The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

Criticizing ABX as a tool is like criticising a hammer; if used well, no reason for criticism, if used poorly, criticize the design, not the tool. I don't know to which ABX tests you're referring, but if the results are accepted by a journal, it is worth evaluating critically, if not, who cares?
Quote
In 1982 when Clark's paper was published quite a bit of what is now known or suspected about long and short term memory was either not well known or not known at all. Therefore we agreed among ourselves to not constrain people with our possibly mistaken ideas about how to listen in an ABX environment. It is several decades later and more is known.

Well, yes, Massaro's work was in the mid-70's, but that and Cowan's work from the 80's show why the ITU standards recommend that an experienced researcher should do the experimental design, unless you're just messing around... in which case you can do what you want. :-)

EDIT: Sorry, I'm delayed. Have horrible internet on my cell phone... I'll get back in the loop in about 12 hours.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-19 19:45:37
As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and ABX tests are not infrequently completed successfully without listening to either.
Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.

Quote
Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...
I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question.  The remaining options were put in to make things easier for the listener

Well.... then you've changed the discrimination task to a detection task. Then it's a yes/no detection test, and it's inappropriate to call it ABX. ...perhaps AX, yes/no detection. This would (should) be used when different goals, analysis and type of results are planned.


The name ABX had nothing to do with the details of the listening task because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.
Also remember that an ABX Comparator can and should be used as either a training tool or a testing tool.

I think it's funny when people demand precise orthodoxy in the development of something almost 40 years after it was developed. In this case more orthodoxy was possible, as there were papers published in 1975 that were miles ahead of us in terms of orthodoxy, but appear in retrospect to have been headed down the primrose path.

Quote
Quote
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard as compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW people seemed to need to be refreshed as to what same and different sounded like.
Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?" so that all the listener had to do is listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening because that was what he wanted to do and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)


It sounds as though you explored many things... but it also sounds as though some weren't really designed to give a meaningful result... you were just exploring.


Actually, we were giving the Golden Ears much more credibility than they were found to deserve. We didn't know for sure that so many of their ideas couldn't possibly give a meaningful result.  In those days masking was not nearly as well understood as it is today. Many in those days thought that the Fletcher and Munson thresholds of hearing set the limits to audibility, but they imply much lower thresholds than we observed and we were concerned about that.

Quote
If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?


How do you know that without a DBT?  People say the darndest things, especially in a stereo shop. If I had a nickel for every time someone told me that A and B are so blatantly different that you don't even need to listen to both, and everybody ended up randomly guessing in a DBT...  ;-)  Ever hear the story about Steve Zipser and Tom Nousaine?

Quote
Quote
The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

Criticizing ABX as a tool is like criticising a hammer; if used well, no reason for criticism, if used poorly, criticize the design, not the tool. I don't know to which ABX tests you're referring, but if the results are accepted by a journal, it is worth evaluating critically, if not, who cares?


IME people may care more about many things that never get accepted to a journal than many things that do get accepted. ;-)

Title: How do you listen to an ABX test?
Post by: eric.w on 2015-03-19 19:52:43
As a newbie who's only tried a couple ABX tests, the general pattern I follow is:

I have "Keep playback position when changing tracks" unchecked in foo_abx, so whenever I click the A/B/X/Y buttons, playback jumps to the cue point and continues playing. At the start, I alternately click "Play A" and "B" on 1-2 second intervals, so I get a short section of the track looping and switching between A and B on each loop. Once I think I hear a difference, I might do X/A and X/B comparisons, Y/A, Y/B, etc.

Also, I found taking 5 second pauses is important - If A and B sound the same, stopping for 5 seconds seems to help sometimes.

In my limited experience, it doesn't feel like there's any memory involved when doing the test this way, you're just getting a stream of sound and listening for either a change or no change. I found it really important that pressing the A/B/X/Y buttons also jumps the playback back to the cue point. Otherwise, when you switch from A to B, you're hearing B but on a part of the clip you might not have heard for a while, so it feels like memory plays a bigger role.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-20 09:17:45
The name ABX had nothing to do with the details of the listening task because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.
Also remember that an ABX Comparator can and should be used as either a training tool or a testing tool.

I think it's funny when people demand precise orthodoxy in the development of something almost 40 years after it was developed. In this case more orthodoxy was possible, as there were papers published in 1975 that were miles ahead of us in terms of orthodoxy, but appear in retrospect to have been headed down the primrose path.
Wow! So in your usage, "ABX test" is any test done with an "ABX box". That's not a usage with which I'm familiar. If I measure current with an old VOM, is that measurement a "Volt-Ohm" or an "ampere" measurement? Who named the box "ABX" and why, if not to do an "ABX" test (2AFC match-to-sample)?
Everywhere else I've seen "ABX" used (but I haven't read everything! ;-), it means a 2AFC match-to-sample test.
I don't know what happened 40 years ago that you refer to, but Psychophysics was created over 150 years ago by G.T. Fechner, and "forced-choice" (particularly 2AFC) and even "ABX" as an auditory implementation began over 60 years ago. Since Blackwell first described them in 1952, these methods have been refined and applied to all sensory systems. What happened 40 years ago?

I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.
Quote
Quote
If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?


How do you know that without a DBT?
I can easily distinguish two blatantly different sounds without a DBT, can't you? If the sounds are so dissimilar that an ABX test (the common usage, not yours) isn't needed (only an AX, as you describe), I doubt I need a DBT. Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?
Quote
Ever hear the story about Steve Zipser and Tom Nousaine?
No, what happened?

Quote
IME people may care more about many things that never get accepted to a journal than many things that do get accepted. ;-)
You are right: scientific validity and popularity are not the same. I misunderstood your goals... sorry :-)
Title: How do you listen to an ABX test?
Post by: xnor on 2015-03-20 10:38:52
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?

In some cases there may be big differences in sound, but in others there will not be real audible differences.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-20 12:18:59
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?

Sorry, I have to admit I'm rather confused about who is doing a DBT or ABX and why... my fault.
"a person claims" is exquisitely vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-20 12:43:48
So in your usage, "ABX test" is any test done with an "ABX box".


Not at all. For example you can do a sighted evaluation with an ABX box (ignore the X's), and that is obviously not an ABX test.

Quote
Everywhere else I've seen "ABX" used (but I haven't read everything! ;-), it means a 2AFC match-to-sample test.


That's the ca. 1950s test.

Wikipedia is wrong when it says: "A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference) followed by one unknown sample X that is randomly selected from either A or B. The subject is then required to identify X as either A or B. "  That is the 1950 test.

The test I invented in ca. 1975 works like this:

A subject is presented with the opportunity to listen to two known samples (sample A, the first reference, and sample B, the second reference) and an unknown sample X (which is either A or B) at will. To complete a trial, the subject must identify X as being either A or B by listening to all 3 samples in full or in part or not at all, as he wishes. Repeating and editing samples is allowed and even encouraged as long as the edits are precisely applied equally to all 3 samples. Trials are repeated as was initially planned. Standard statistical tests are used to determine whether the listener was successfully detecting a difference or just guessing.
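As a rough illustration of the "standard statistical tests" step, here is a minimal sketch of the usual one-sided binomial calculation for a run of ABX trials. It assumes each trial is an independent 50/50 guess under the null hypothesis; the function name `abx_p_value` and the 12-of-16 figures are made up for the example and do not come from any particular ABX tool or test.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    Assumes every trial is an independent coin flip (p = 0.5), which is the
    null hypothesis "the listener hears no difference".
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Sum the upper tail of the binomial distribution: k = correct .. trials.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session: 12 correct answers in 16 planned trials.
    p = abx_p_value(12, 16)
    print(f"12/16 correct, chance of doing this well by guessing: {p:.3f}")
```

With these hypothetical numbers the result is about 0.038, which is below the commonly used 0.05 level. The number of trials has to be fixed beforehand, as in "repeated as was initially planned"; stopping whenever the score happens to look good makes the p-value meaningless.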

Quote
I don't know what happened 40 years ago that you refer to, but Psychophysics was created over 150 years ago by G.T. Fechner, and "forced-choice" (particularly 2AFC) and even "ABX" as an auditory implementation began over 60 years ago. Since Blackwell first described them in 1952, these methods have been refined and applied to all sensory systems. What happened 40 years ago?


Some rebellious children of the 1960s took audio testing into their own hands because the audio establishment (The AES and IEEE, not the ASA) had failed, and somewhat independently created a reasonably disciplined and scientific test that gave as many advantages as is reasonably possible to the listener.

Quote
I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.


I have heard the opinion granted that you can do Science without wearing a strait jacket. ;-)


Quote
I can easily distinguish two blatantly different sounds without a DBT, can't you?


All the golden ears say that, and then they can't back their claims up as soon as we make them actually do a listening test instead of a sham (sighted evaluation).


Quote
Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?


That's what the golden ears say.

I say that claims of audibility are most easily judged for audible effects that have readily measurable relevant parameters, which among other things excludes lossy encoders. But it does include power amplifiers and cables.  For these things, the thresholds of hearing for easily measured artifacts are known or knowable. We've run a lot of these things through DBTs and we know what is clearly audible and what is clearly inaudible to a useful degree. Use measurements to judge them, because it is so fast and easy.

For everything else, in those cases where you have doubts that the audible effect is well-described by measurements, do DBTs.

Quote
Quote
Ever hear the story about Steve Zipser and Tom Nousaine?
No, what happened?


Steve Zipser was the owner of an audio store in a house on US 1 on the south side of Miami named Sunshine Stereo  (ironic name because there were a lot of shady deals reported) who posted on the Usenet rec.audio.opinion forum back in the 1990s when it was a vibrant and relevant place.  Every few months Steve would come up with some new wunder amplifier designed by Nelson Pass or someone like that, that sounded as he would say "Mind Blowingly Better".  He would hoot and holler about it online every night for weeks, he would sell some, and he would get the new purchasers to post what a great amp and what a great dealer Steve was.

Nousaine was writing for a number of audio publications and did a certain amount of debunking both online and in print publications. He challenged Steve to a DBT which was talked about for weeks and eventually happened. Tom (who lived in the Detroit area) went down to Miami with a friend and an ABX box and set up an ABX test of Steve's current darling amplifier in Steve's favorite system in Steve's house with Steve's favorite recordings.  Steve of course did a brilliant job of guessing randomly.  Steve was pretty honest about the test and its results for about a week, but then he started making excuses and tried to spin the story.

Tom's life continued on until his untimely death, probably related to his diabetes, just lately. He kept writing, getting published, and doing fun things. Steve's life went downhill. A number of months, maybe a year or more, after the DBTs, Steve was found dead in his home, or found alive but very ill by his wife, who called the EMS, and Steve was DOA. Stories vary, but that was the end of Steve, and shortly after that was the end of Sunshine Stereo.  This can all be confirmed from posts in Google Groups.

The point is that the claim "This difference is so great that a DBT is unnecessary" was first made a few months after we started doing DBTs in the mid 1970s, and it has ended up biting a lot of people who believed it. It has also led to the conversion of some of them (like Tom Nousaine) from Golden Earism to Science.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-20 14:19:09
Quote
Quote

Quote

Ever hear the story about Steve Zipser and Tom Nousaine?

No, what happened?


Steve Zipser was the owner of an audio store in a house on US 1 on the south side of Miami named Sunshine Stereo  (ironic name because there were a lot of shady deals reported) who posted on the Usenet rec.audio.opinion forum back in the 1990s when it was a vibrant and relevant place.  Every few months Steve would come up with some new wunder amplifier designed by Nelson Pass or someone like that, that sounded as he would say "Mind Blowingly Better".  He would hoot and holler about it online every night for weeks, he would sell some, and he would get the new purchasers to post what a great amp and what a great dealer Steve was.


To correct some factual errors:

Sunshine Stereo was at  9535 BISCAYNE BLVD MIAMI SHORES FL 33138 which is on the North side of Miami.

The ABX test with Nousaine was on August 25th 1997, and Zipser passed on Dec 31, 2000 so there were several years in between.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-03-20 14:45:24
"a person claims" is exquisitely vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?

Everyone interested in audio, who wants to make informed buying decisions, will eventually stumble upon such claims in any audio forum. Even here on HA where there are strict rules such claims will appear occasionally. In many other audio forums you will see this as the rule rather than the exception - and asking for evidence can even get you banned or your thread moved into a marginalized section. Or look into some audio magazines, not just the ridiculous cable ads but the articles themselves contain such claims.

Those are the sources where the average audio-interested Joe will get his "information" from. And while repeated assertion (http://en.wikipedia.org/wiki/Proof_by_assertion) does not make something true, people will still tend to believe (http://en.wikipedia.org/wiki/Illusory_truth_effect) it.

Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-20 15:58:30
The name ABX had nothing to do with the details of the listening task because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.

So in your usage, "ABX test" is any test done with an "ABX box".


Not at all. For example you can do a sighted evaluation with an ABX box (ignore the X's), and that is obviously not an ABX test.

Well, now I’m confused as to how you use “ABX”! Is it the type of test or the box? I see that ABX without X is not ABX. Do you see that ABX without B is not ABX?

Some rebellious children of the 1960s took audio testing into their own hands because the audio establishment (The AES and IEEE, not the ASA) had failed, and somewhat independently created a reasonably disciplined and scientific test that gave as many advantages as is reasonably possible to the listener.

Quote
I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.


I have heard the opinion granted that you can do Science without wearing a strait jacket. ;-)

I *don’t* want to argue about the definition of “do Science”, but I can tell you haven’t published anything. That is not meant as a slight; I just believe we use some words differently. Let’s step way back and ask “why would you do an ABX or DBT anyway?” I assume you want to convince someone of something. Maybe it’s you, before a purchase, wanting to be sure of your decision. You would not need to be concerned with scientific validity, any more than needed to convince yourself. Very similar would be your friends or people who trust you. Do enough to convince them given the assumptions they grant you. But if you want to convince strangers, skeptics or opponents, the power of scientific rigour, meaning using sound experimental design, is quite the opposite of a “strait jacket”; it “frees” you to draw conclusions that will be convincing to the skeptic. But that may not be your goal, which is fine. You and I live in different worlds.

All the golden ears say that, …

Quote
Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?


That's what the golden ears say.

HEY! My ears are made of flesh and bone and have an appropriate, non-metallic color. So I don’t care what golden ears say. My comments stand on their own. If you are sure you know the result of a test without doing it, why do it? There is a world of room between no test and a scientifically valid one. I would choose a useful balance of convenience and rigour, depending on my goal. That may include sighted, or “sort of” blind, or definitely blind but only 5 trials… on and on, depending on the goal.

I say that claims of audibility are most easily judged for audible effects that have readily measurable relevant parameters, which among other things excludes lossy encoders. But it does include power amplifiers and cables.  For these things, the thresholds of hearing for easily measured artifacts are known or knowable. We've run a lot of these things through DBTs and we know what is clearly audible and what is clearly inaudible to a useful degree. Use measurements to judge them, because it is so fast and easy.

For everything else, in those cases where you have doubts that the audible effect is well-described by measurements, do DBTs.

I question your expertise in understanding auditory perception, but I’m open to being convinced! :-) By the way, you say “we” a lot; is that a royal “we” or who are your collaborators?

Quote
(snip)
It has also led to the conversion of some of them (like Tom Nousaine) from Golden Earism to Science.
That’s a nice ABX story. I like it. *That* is a good use of ABX. Do you have any stories with *your* ABX tests?
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-20 16:12:25
"a person claims" is exquisitely vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?

Everyone interested in audio, who wants to make informed buying decisions, will eventually stumble upon such claims in any audio forum. Even here on HA where there are strict rules such claims will appear occasionally. In many other audio forums you will see this as the rule rather than the exception - and asking for evidence can even get you banned or your thread moved into a marginalized section. Or look into some audio magazines, not just the ridiculous cable ads but the articles themselves contain such claims.

Those are the sources where the average audio-interested Joe will get his "information" from. And while repeated assertion (http://en.wikipedia.org/wiki/Proof_by_assertion) does not make something true, people will still tend to believe (http://en.wikipedia.org/wiki/Illusory_truth_effect) it.

Of course. That makes sense. Here (HA) you need to provide proof of a claim. Elsewhere, I would, and would recommend, ignoring implausible, unverified claims. And to protect those who can't ignore it, we should criticize false claims. But when we do so, we must not make similar mistakes. Our counter-claims should not be equally flimsy. Doing an ABX, where you "know" the result before doing it, introduces its own biases. Publishing a null result when you "know" there is no audible difference is suspect, easily and correctly challenged, and only confuses the person we're trying to help.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-03-20 16:25:31
It's hard to reason a person out of a position that he/she did not reason into. That's one reason why I've mostly given up on making counter arguments.

The burden of proof is on the one making the claim. Ask for evidence, over and over and over again, until either the person weasels out, manages to censor you, admits to failing to provide it or actually provides it.
Only in the last case we can investigate further to see if the claim really is true.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-20 16:29:22
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?


"A person claims" is exquisitely vague though.


Only true if you haven't seen it happen in many places at many times. Many of us have been there and done that. I am surprised that you are surprised... ;-)

Start here: Neil Young Hates MP3s for fun and profit (http://www.wired.com/2012/02/why-neil-young-hates-mp3-and-what-you-can-do-about-it/)

The highest profile example of this sort of thing we've seen lately is probably the story of Neil Young, the Kickstarter web site, and  the Pono digital player which you are invited to Google.


Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-23 13:25:45
Only true if you haven't seen it happen in many places at many times. Many of us have been there and done that. I am surprised that you are surprised... ;-)

Start here: Neil Young Hates MP3s for fun and profit (http://www.wired.com/2012/02/why-neil-young-hates-mp3-and-what-you-can-do-about-it/)

The highest profile example of this sort of thing we've seen lately is probably the story of Neil Young, the Kickstarter web site, and  the Pono digital player which you are invited to Google.

Hey, thanks for the invitation to use Google... ;-)
Your being surprised at my being surprised made me try to answer my questions above on my own, so I took you up on your invite and read a bunch of stuff. I was already familiar with most of it, but I read some new stuff, including the Wired article (new to me). The flaws in that article are so numerous that listing *all* of them and explaining why each is a flaw would result in a post longer than the article. I'm not opposed to writing, but that task would be boring... you know, fish in a barrel..

But I'm still left with multiple questions, the biggest being: why do *YOU* do ABX tests? By *YOU*, I mean the 7 who have answered on this thread (mzil, castleofargh, Cubist Castle, pelmazo, xnor, krabapple, eric.w) and the OP (Arnold B. Krueger). I can think of 3 answers for me personally: 1-I want to buy something (perhaps expensive) and I want to be sure; 2-I want to make a claim of an audible difference on a hobbyist website (like HA) or in a non-scientific magazine (like Wired); 3-I want to publish in a journal. For me, practicality would be a big thing. To validly compare 2 files would be easy to describe and easy to do. To compare hardware would be easy to describe, but a big hassle to do. So comparing formats (AAC vs. RBCD vs. HiRes) for cases 1 (buy from iTunes Store or CD or HDTracks?) and 2 (write about it informally) would be clear. But case 3 and *all* hardware comparisons would require a cost-benefit decision that I'm trying to imagine the 8 people who have posted would have done. Why have *YOU* done it? Telling me about salespeople's claims or bad articles doesn't explain why you would. This is not meant as a challenge; I'm curious.

Also about the Wired link: when I buy a "mixed" book of puzzles, I skip the "easy" and go for challenging (to me, of course). The Wired article is a waste of anybody's time, IMO. It's not clear why you linked it. What article or claim (about Pono or HiRes or whatever you find relevant) do you find most challenging to answer? And what is your answer? (I don't expect you to repeat a large effort... I'd guess you could just link a challenging article and link a response you've already made.... if you have time.) TIA
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-23 18:37:16
[Your being surprised at my being surprised made me try to
But I'm still left with multiple questions, the biggest being: why do *YOU* do ABX tests? By *YOU*,
I mean the 7 who have answered on this thread (mzil, castleofargh, Cubist Castle, pelmazo, xnor, krabapple, eric.w) and the OP (Arnold B. Krueger).
I can think of 3 answers for me personally:
1-I want to buy something (perhaps expensive) and I want to be sure;
2-I want to make a claim of an audible difference on a hobbyist website (like HA) or in a non-scientific magazine (like Wired);
3-I want to publish in a journal. For me, practicality would be a big thing.
To validly compare 2 files would be easy to describe and easy to do.
To compare hardware would be easy to describe, but a big hassle to do.
So comparing formats (AAC vs. RBCD vs. HiRes) for cases 1 (buy from iTunes Store or CD or HDTracks?) and 2 (write about it informally) would be clear.
But case 3 and *all* hardware comparisons would require a cost-benefit decision that I'm trying to imagine the 8 people who have posted would have done.
Why have *YOU* done it? Telling me about salespeople's claims or bad articles doesn't explain why you would. This is not meant as a challenge; I'm curious.


(1) Lately one of the big drivers for doing DBTs has been the fact that the thresholds for the audibility of jitter are complex and not nailed down tightly enough to please me.
(2) the general driver for me doing DBTs is that I don't always trust my more-casual perceptions and need periodic reality checks.

Quote
Also about the Wired link: when I buy a "mixed" book of puzzles, I skip the "easy" and go for challenging (to me, of course).
The Wired article is a waste of anybody's time, IMO. It's not clear why you linked it.
What article or claim (about Pono or HiRes or whatever you find relevant) do you find most challenging to answer?


To answer them for a reasonable, well informed person: None of them are challenging.

However the world is full of poorly informed people who may also have poor critical reasoning skills.

Quote
And what is your answer? (I don't expect you to repeat a large effort...
I'd guess you could just link a challenging article and link a response you've already made.... if you have time.) TIA


The last such thing that I have put much effort into is this article:

AES Conference paper about alleged problems with digital players and high resolution audio (https://secure.aes.org/forum/pubs/conventions/?ID=416)


Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-23 18:48:28
One of the reasons why I conduct ABX tests on myself is to document to others that I can easily replicate, with strong statistical significance, the ability to distinguish between two files that are part of a posted challenge of Hi-res audio vs. standard CD quality versions (16/44), the former being claimed by some of the "golden-eared"/"trained" con artists and snake oil peddlers which frequent the audio forums as being "better", so as to discredit their posted test results and expose to all that there was simply some tiny difference with, for example, minor level differences and/or time alignment between the two files. I did this in the AVS forum last year with their AIX records challenge.
Title: How do you listen to an ABX test?
Post by: castleofargh on 2015-03-23 20:22:05
Only true if you haven't seen it happen in many places at many times. Many of us have been there and done that. I am surprised that you are surprised... ;-)

Start here: Neil Young Hates MP3s for fun and profit (http://www.wired.com/2012/02/why-neil-young-hates-mp3-and-what-you-can-do-about-it/)

The highest profile example of this sort of thing we've seen lately is probably the story of Neil Young, the Kickstarter web site, and  the Pono digital player which you are invited to Google.

Hey, thanks for the invitation to use Google... ;-)
Your being surprised at my being surprised made me try to answer my questions above on my own, so I took you up on your invite and read a bunch of stuff. I was already familiar with most of it, but I read some new stuff, including the Wired article (new to me). The flaws in that article are so numerous that listing *all* of them and explaining why each is a flaw would result in a post longer than the article. I'm not opposed to writing, but that task would be boring... you know, fish in a barrel..

But I'm still left with multiple questions, the biggest being: why do *YOU* do ABX tests? By *YOU*, I mean the 7 who have answered on this thread (mzil, castleofargh, Cubist Castle, pelmazo, xnor, krabapple, eric.w) and the OP (Arnold B. Krueger). I can think of 3 answers for me personally: 1-I want to buy something (perhaps expensive) and I want to be sure; 2-I want to make a claim of an audible difference on a hobbyist website (like HA) or in a non-scientific magazine (like Wired); 3-I want to publish in a journal. For me, practicality would be a big thing. To validly compare 2 files would be easy to describe and easy to do. To compare hardware would be easy to describe, but a big hassle to do. So comparing formats (AAC vs. RBCD vs. HiRes) for cases 1 (buy from iTunes Store or CD or HDTracks?) and 2 (write about it informally) would be clear. But case 3 and *all* hardware comparisons would require a cost-benefit decision that I'm trying to imagine the 8 people who have posted would have done. Why have *YOU* done it? Telling me about salespeople's claims or bad articles doesn't explain why you would. This is not meant as a challenge; I'm curious.

Also about the Wired link: when I buy a "mixed" book of puzzles, I skip the "easy" and go for challenging (to me, of course). The Wired article is a waste of anybody's time, IMO. It's not clear why you linked it. What article or claim (about Pono or HiRes or whatever you find relevant) do you find most challenging to answer? And what is your answer? (I don't expect you to repeat a large effort... I'd guess you could just link a challenging article and link a response you've already made.... if you have time.) TIA


I do some abx tests when in doubt. I did a series of abx over the years to pick my file format on portable gears.  some decide they were born knowing, some ask on the web and trust answers from all over the place. I decided in such occasions that as it was for my ears, I should be the one doing the test. I knew from a few delusional failures how sighted tests are more often than not in audio, just useless crap! so I go for abx when it can answer my question, and something else when it cannot or my answer could be satisfied by measurements.

I also use ABX when I'm curious about my own limits, like making tracks with added noise or music over another music at different loudness to find out where noise really matters to me in practice. Arny posted a few cool files over the years that my curiosity just couldn't resist(last one I tried was with jitter I guess). those kind of stuff. I'm ignorant and curious so opportunities are all around me .

and I guess I'm the opposite of mzil as I never ever do an abx to prove something to anybody. in fact I have never published one result on a forum. at best I would mention that I failed or passed an abx. but that's it.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-24 14:07:52
Thanks Arnold B. Krueger! (can I call you Arny? Everyone else seems to.) I'll read the Stuart paper and the 2 threads about it on HA that I found.
Thanks mzil, I'll read the AIX thread on AVS. Are you m.zillch on AVS?
Wow that'll be a lot of reading... but I enjoy it.
Thanks castleofargh. Your usage seems like the closest to what I would do. I wouldn't put sighted testing (my own) at the bottom of the list, although I know the high risk of biases. Salespeople, listening claims from people with unknown or suspect motives, and zero information fall below it for me. But when I use the word blatant, I know how I mean it, and some things would be blatant to me (iPhone speaker vs. Audioengine 2 = no brainer, blatant). Because I know I'm prone to bias, I too will do some blind ABXing of audio file formats. Did you mention in the forums your ABX pass/fail results for your portable file format choice?
Any suggestions on how ABXphile people test headphones? I can't really do that blind. Or can I?

EDIT: mzil, did you mean "AVS/AIX High-Resolution Audio Test: Take 2" or are your ABX comments in the first part?
Title: How do you listen to an ABX test?
Post by: pdq on 2015-03-24 14:13:34
I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other. Testing of headphones would fall into the category of personal preference, and you are certainly entitled to your personal preference.

Of course, sighted evaluation of headphones is subject to bias, but this may be a case where it is OK to be influenced by factors other than how they sound.
Title: How do you listen to an ABX test?
Post by: castleofargh on 2015-03-24 15:30:25
Thanks Arnold B. Krueger! (can I call you Arny? Everyone else seems to.) I'll read the Stuart paper and the 2 threads about it on HA that I found.
Thanks mzil, I'll read the AIX thread on AVS. Are you m.zillch on AVS?
Wow that'll be a lot of reading... but I enjoy it.
Thanks castleofargh. Your usage seems like the closest to what I would do. I wouldn't put sighted testing (my own) at the bottom of the list, although I know the high risk of biases. Salespeople, listening claims from people with unknown or suspect motives, and zero information fall below it for me. But when I use the word blatant, I know how I mean it, and some things would be blatant to me (iPhone speaker vs. Audioengine 2 = no brainer, blatant). Because I know I'm prone to bias, I too will do some blind ABXing of audio file formats. Did you mention in the forums your ABX pass/fail results for your portable file format choice?
Any suggestions on how ABXphile people test headphones? I can't really do that blind. Or can I?

EDIT: mzil, did you mean "AVS/AIX High-Resolution Audio Test: Take 2" or are your ABX comments in the first part?

I'm very suspicious of myself because of my own cognitive biases and the ideas I tend to develop without any control over them. It's easy enough to form an opinion on something when I get it and try it without controls. Most of the time, unless there is something really wrong, I will just confirm that I was right about my preconceptions (that's what a brain does all day long). Then I'll put together some manner of controlled test like ABX when I can, and poof! Half of it blows up in my face. So from my own experience, not using a controlled test is taking a big chance of being wrong and making a fool of myself. Something that very obviously doesn't bother many audiophiles...
But I guess it's not ignorance about audio, but more ignorance about how humans really are. And a big ego will always get in the way of self-evaluation.
To me, most serious studies in sciences are done with blind tests for a reason. The same reason there are marketing schools.

I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other. Testing of headphones would fall into the category of personal preference, and you are certainly entitled to your personal preference.

Of course, sighted evaluation of headphones is subject to bias, but this may be a case where it is OK to be influenced by factors other than how they sound.

you can't abx headphones. even if they were to feel the same on my head, and I closed my eyes, the delay to switch would be too much for that purpose.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-24 16:53:36
Thanks mzil, I'll read the AIX thread on AVS. Are you m.zillch on AVS? Wow that'll be a lot of reading... but I enjoy it.
Yes. This post cuts to the chase:

http://www.avsforum.com/forum/91-audio-the...ml#post28355562 (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1598417-avs-aix-high-resolution-audio-test-take-2-a-4.html#post28355562)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-24 19:43:08
Thanks mzil, I'll read the AIX thread on AVS. Are you m.zillch on AVS? Wow that'll be a lot of reading... but I enjoy it.
Yes. This post cuts to the chase:

http://www.avsforum.com/forum/91-audio-the...ml#post28355562 (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1598417-avs-aix-high-resolution-audio-test-take-2-a-4.html#post28355562)


Does this post reference the defective files that Amir and Fremer listened to and then claimed that they obtained world changing positive results?

BTW I have disqualified the test files that I provided that were mentioned in post #150 same thread, on the grounds that their downsampling involved digital filters with unrealistically narrow transition bands.

The transition band that I used for that particular downsampling job was about 100 Hz wide, while a typical real world 44.1 kHz DAC has a transition band that might be several kHz wide. The lowest quality that CEP 2.1 provides is pretty close to delivering a 1.5 kHz transition band @ 44.1 kHz. The highest quality setting is what I used.
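
To put those transition-band figures in perspective, here is a rough sketch that estimates how long a linear-phase FIR low-pass filter has to be for a given transition width. It uses fred harris's well-known rule of thumb, N ≈ (fs / Δf) × (A / 22). The 100 dB stopband attenuation is purely an assumption for illustration, and nothing here claims to describe the filters CEP 2.1 actually uses; the function name is hypothetical.

```python
def estimated_fir_taps(sample_rate_hz: float,
                       transition_width_hz: float,
                       stopband_atten_db: float = 100.0) -> float:
    """Rule-of-thumb FIR length: N ~ (fs / transition width) * (attenuation_dB / 22).

    The 100 dB default attenuation is an assumed value, not taken from the thread.
    """
    if transition_width_hz <= 0:
        raise ValueError("transition width must be positive")
    return (sample_rate_hz / transition_width_hz) * (stopband_atten_db / 22.0)


if __name__ == "__main__":
    fs = 44_100.0
    for width_hz in (100.0, 1_500.0):
        taps = estimated_fir_taps(fs, width_hz)
        duration_ms = taps / fs * 1000.0  # length of the impulse response in time
        print(f"{width_hz:7.0f} Hz transition band: ~{taps:.0f} taps, ~{duration_ms:.1f} ms long")
```

Under these assumptions the 100 Hz case needs an impulse response about 15 times longer than the 1.5 kHz case, which is one way to see why such a narrow transition band is unrepresentative of typical hardware.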

Graphics here: Link to Transition band graphics in uploads forum (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=893391)
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-24 22:34:40
Thanks mzil, I'll read the AIX thread on AVS. Are you m.zillch on AVS? Wow that'll be a lot of reading... but I enjoy it.
Yes. This post cuts to the chase:  http://www.avsforum.com/forum/91-audio-the...ml#post28355562 (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1598417-avs-aix-high-resolution-audio-test-take-2-a-4.html#post28355562)
  Does this post reference the defective files that Amir and Fremer listened to and then claimed that they obtained world changing positive results?
The two files are the exact same time code, about 1m52s into the AIX records' song, provided for the AVS forum challenge by Dr. Waldrup, called "Mosaic". These are the newer A2 and B2 versions which are said to have corrected a small level mismatch found in the original released versions, hence the number "2" in both of their designations.

I stopped reading any material from Fremer in the 1980/90s and Amir some time last year, so I can't comment on their propaganda. Krabapple may know more about what they claim.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-25 21:22:35
But I'm still left with multiple questions, the biggest being: why do *YOU* do ABX tests? By *YOU*, I mean the 7 who have answered on this thread (mzil, castleofargh, Cubist Castle, pelmazo, xnor, krabapple, eric.w) and the OP (Arnold B. Krueger).


To find out if something really sounds different to me, typically when there is debate about whether it sounds different to *anyone*.

Sometimes it does (p<0.05) sometimes not.
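
For readers unfamiliar with the p<0.05 shorthand: it is usually taken to mean the one-sided binomial probability of scoring at least that many correct trials by pure guessing. A minimal sketch, assuming each trial is an independent 50/50 guess (the function name and the 16-trial session length are only illustrative):

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by guessing (chance = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the upper tail of the binomial distribution with p = 0.5.
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    for correct in (9, 12, 14):
        print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

With 16 trials, 12 correct gives p of roughly 0.038, so 12/16 is the lowest score that clears the 0.05 threshold under this assumption.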

If you hang around audio forums long enough, plenty of such debates arise.

Perhaps you are a bit new to this ? 
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-25 22:04:53
People don't talk about it very much but fb2k ABX is a fantastic sighted listening aid as much as it is a double blind testing tool [just click A and B and never even examine X]. It lets you pick whatever files you want, synchronizes their playback [assuming they were made properly], applies DSP or Replaygain optionally, switches at any point you want, loops a favorite section, and most importantly switches nearly instantaneously between A and B at the listener's discretion. Echoic memory is fleeting and being able to flip between two options so quickly and easily greatly improves one's sensitivity.

Putting this switching control in the listener's hand also is important. When I read about a/b testing where the switchovers are not in the control of the test listener but instead are done at predetermined time marks or at the control of the test administrator, I always think to myself how much better the listeners would have done had they been the ones pressing the button. Bob Stuart's recent paper's trashing of CD quality sound in order to promote Hi-re$ examination of digital filters would be an example of that.
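
As a toy illustration of listener-controlled ABX (not how foobar2000's ABX component is actually implemented), the sketch below hides a random assignment of X on every trial and hands the listener a `play` function that can be called in any order, as often as desired, before voting. All names here are hypothetical, and the "stimuli" are just placeholder strings standing in for audio.

```python
import random


def run_abx_session(trials, listener, seed=None):
    """Run `trials` ABX trials and return how many times the listener identified X."""
    rng = random.Random(seed)
    stimuli = {"A": "stimulus_a", "B": "stimulus_b"}  # placeholders for real audio
    correct = 0
    for _ in range(trials):
        hidden = rng.choice(["A", "B"])             # X is secretly A or B this trial
        sources = {**stimuli, "X": stimuli[hidden]}
        play = sources.__getitem__                  # listener may call play("A"/"B"/"X") freely
        vote = listener(play)                       # listener returns "A" or "B"
        correct += int(vote == hidden)
    return correct


def perfect_listener(play):
    """Always hears the difference: compares X directly against A."""
    return "A" if play("X") == play("A") else "B"


def guessing_listener(play):
    """Hears no difference and votes at random."""
    return random.choice(["A", "B"])


if __name__ == "__main__":
    print("perfect :", run_abx_session(16, perfect_listener, seed=1), "/ 16")
    print("guessing:", run_abx_session(16, guessing_listener, seed=1), "/ 16")
```

The point of the design is that the test harness only fixes the hidden assignment; the order and number of comparisons stay entirely in the listener's hands.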
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-26 10:33:51
… most serious studies in sciences are done with blind tests for a reason.
Well, I know what you mean - human studies where knowledge of test parameters could influence results - but *most* studies don’t fall in this category. You don’t need to do blind testing with protons, yeast or fruit flies… you just need to control the variables. ;-)
I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other. Testing of headphones would fall into the category of personal preference, and you are certainly entitled to your personal preference.

Of course, sighted evaluation of headphones is subject to bias, but this may be a case where it is OK to be influenced by factors other than how they sound.
you can't abx headphones. even if they were to feel the same on my head, and I closed my eyes, the delay to switch would be too much for that purpose.
Sorry about picking nits, but I *could* ABX headphones if I wanted to publish the results, but it would just be too much trouble for a personal purchase. That was why I am asking some basic questions, trying to understand how others do it, and whether there are shortcuts (hardware/software) to minimise the hassle. I find the responses here informative. Thanks, all.
Echoic memory is fleeting and being able to flip between two options so quickly and easily greatly improves one's sensitivity.
It seems a lot of people are very focussed on echoic memory. It would seem that this is crucially important for *some* tests, but inappropriate, perhaps counterproductive for others. If I want to compare two pure tones with a:slightly differing amplitudes, constant pitch, or b:slightly differing pitch, constant amplitude. I would certainly be concerned with echoic memory. But if I wanted you to identify which instrument is played for an E plucked on a lute or mandolin, you need a long enough sample to identify it and could probably remember your answer for hours or days. My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesised auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long. For headphones or speakers, am I looking for subtle differences that will disappear from memory quickly or are the differences “abstractable” so I can remember them? I suspect the latter may be important to me. I agree with pdq that non-sonic factors will play an important subjective role, e.g. comfort and (for me) cost.
If you hang around audio forums long enough, plenty of such debates arise.

Perhaps you are a bit new to this ?
LOL, yes, I’m new to audio forums (since Dec.) and only recently started posting. There is a low SNR on most forums, so I thought I’d ask some questions of my own. I have to admit that there is some short-lived entertainment value (ala Jerry Springer Show) in the “punch-outs”. But they tend to distract when seeking info. Thanks for responding about your use.
People don't talk about it very much but fb2k ABX is a fantastic sighted listening aid as much as it is a double blind testing tool [just click A and B and never even examine X]. It lets you pick whatever files you want, synchronizes their playback [assuming they were made properly], applies DSP or Replaygain optionally, switches at any point you want, loops a favorite section, and most importantly switches nearly instantaneously between A and B at the listener's discretion.
That’s a useful tip. Thanks. I use Macs and downloaded a program called ABXTester (not *nearly* as nice as fb2k that you describe). I also have Parallels (running Windows 7) on one of my Macs. Does anyone know if that works well?

… I always think to myself how much better the listeners would have done had they been the ones pressing the button. Bob Stuart's recent paper's trashing of CD quality sound in order to promote Hi-re$ examination of digital filters would be an example of that.
Are you saying you believe their results would have been better with listener-switching? I’ll read the paper soon, and the HA threads about it, to further my education.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 12:13:30
I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other.


AFAIK there is no controversy over whether or not headphones sound different from each other. The measured differences in important areas are well above known and even highly conservative thresholds of audibility. Furthermore headphones feel different on the head so there are irreducible non-audible factors in the evaluation.

ABX was not designed for headphone or speaker testing. It was designed for those situations where there is a serious question as to whether an audible difference even exists.

Quote
Testing of headphones would fall into the category of personal preference, and you are certainly entitled to your personal preference.


I think that some attributes of headphones rise above mere personal preference such as comfort particularly long term comfort, dynamic range, nonlinear distortion, isolation, and smoothness of response. 

Frequency response at the ear drum is subject to natural variations based on how the headphones and particularly earphones interface with the ear.

Quote
Of course, sighted evaluation of headphones is subject to bias, but this may be a case where it is OK to be influenced by factors other than how they sound.


IME people's preferences for headphones are strongly affected by frequency response which is easy to manage with equalization provided the headphones are reasonably smooth, extended, and have good dynamic range.  Unlike speakers the frequency response of headphones is well described by a single value per frequency for the acoustic signal, which is air pressure in the hearing canal.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 12:32:24
Are you saying you believe their results would have been better with listener-switching?


One of the founding principles of ABX testing as most people here know it is that the most sensitive results are obtained when the listening test allows the listener to interact with the process.

Experience shows that the selection of the segment of music used in the comparison is a very important parameter.

To establish context, I'm referring to this article:

The Audibility of Typical Digital Audio Filters in a High-Fidelity Playback System (https://secure.aes.org/forum/pubs/conventions/?ID=416)

The conference paper is based on a straw man argument against ABX. The ABX test that they criticize is the 1950 version which was not interactive. The ABX test that has been widely used for audio component testing during the past approximately 38 years is highly interactive.

I find it curious that in his recent 3/16/2015 response, John Stuart continues to make this rather grotesque error, even though he has been publicly corrected for it since 2/23/2015. Old dogs, new tricks or tacit admission that without the error, his criticism of ABX simply has no basis?

If you look at Stuart's 3/16/2015 response, one might find a number of criticisms of the listening test that he used in his conference paper. The music segments used in the conference paper appear to have been highly arbitrarily chosen and listened to non-interactively.

Quote
I’ll read the paper soon, and the HA threads about it, to further my education.


You may wish to review these posts about modern ABX testing of audio components:

Forum post explaining modern ABX testing #1 (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&view=findpost&p=892938)

Forum post explaining modern ABX testing #2 (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&view=findpost&p=892991)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 12:53:45
Echoic memory is fleeting and being able to flip between two options so quickly and easily greatly improves one's sensitivity.

It seems a lot of people are very focused on echoic memory.


As is a lot of the scientific literature.

Quote
It would seem that this is crucially important for *some* tests, but inappropriate, perhaps counterproductive for others. If I want to compare two pure tones with a:slightly differing amplitudes, constant pitch, or b:slightly differing pitch, constant amplitude. I would certainly be concerned with echoic memory.


I disagree. Those are all IME simple attributes that can be dealt with by abstract memory. I don't think that people have to memorize pure tones at every possible frequency to know what both pure and impure tones sound like.  I'm under the impression that when listening to tones I discern abstract properties such as steadiness of pitch and loudness in something that is pretty close to real time and I don't have to have a memory of what every different frequency sounds like to do this. Along these lines, I have learned different rules for what a pure tone sounds like at vastly different frequencies, but over wide ranges it's the rules, not any specific memory, that dictate my judgement.

Quote
But if I wanted you to identify which instrument is played for an E plucked on a lute or mandolin, you need a long enough sample to identify it and could probably remember your answer for hours or days.


I again disagree. A lot hinges on what properties of the lute note are changing between the samples. If the changing property is one that is familiar to me such as pitch, loudness or timbre, then things work more or less as is suggested above. But if the property that is changing between the samples is unfamiliar to me then at least initially echoic memory probably has a lot to do with it. As I listen to the comparison more often I often find that the property that is changing is added to my internal multidimensional list of properties of lute notes, and then I can detect those new differences based on synthesized auditory memories.
 

Quote
My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesized auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long.


Agreed. That is a workable list. It may not be complete.

Quote
For headphones or speakers, am I looking for subtle differences that will disappear from memory quickly or are the differences “abstractable” so I can remember them? I suspect the latter may be important to me. I agree with pdq that non-sonic factors will play an important subjective role, e.g. comfort and (for me) cost.


I think that things like purchase decisions would ideally be based on abstractable differences. I think that most subjective reviews pretend to be based on abstractable differences but due to the low quality listening tests that those reviews are based on, many of the purported abstractable differences are either wrong, poorly expressed, or purely based on imagination.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-26 13:37:05
It seems a lot of people are very focused on echoic memory.

As is a lot of the scientific literature.

Hmmm. References? I just double-checked: auditory memory researchers do not focus so heavily on echoic memory. It is one link in the chain. Check out Nelson Cowan's work. He's the best known (and most cited) researcher in this area.
Quote
It would seem that this is crucially important for *some* tests, but inappropriate, perhaps counterproductive for others. If I want to compare two pure tones with a:slightly differing amplitudes, constant pitch, or b:slightly differing pitch, constant amplitude. I would certainly be concerned with echoic memory.

I disagree. Those are all IME simple attributes that can be dealt with by abstract memory. I don't think that people have to memorize pure tones at every possible frequency to know what both pure and impure tones sound like.  I'm under the impression that when listening to tones I discern abstract properties such as steadiness of pitch and loudness in something that is pretty close to real time and I don't have to have a memory of what every different frequency sounds like to do this. Along these lines, I have learned different rules for what a pure tone sounds like at vastly different frequencies, but over wide ranges it's the rules, not any specific memory, that dictate my judgement.

In order to do a differential threshold test, as I describe, you must use echoic memory. You can't abstract a loudness. Sorry, it won't work. If you allow even a few seconds between samples, the threshold value will be incorrectly too high.

Quote
But if I wanted you to identify which instrument is played for an E plucked on a lute or mandolin, you need a long enough sample to identify it and could probably remember your answer for hours or days.

I again disagree. A lot hinges on what properties of the lute note are changing between the samples. If the changing property is one that is familiar to me such as pitch, loudness or timbre, then things work more or less as is suggested above. But if the property that is changing between the samples is unfamiliar to me then at least initially echoic memory probably has a lot to do with it. As I listen to the comparison more often I often find that the property that is changing is added to my internal multidimensional list of properties of lute notes, and then I can detect those new differences based on synthesized auditory memories.

Your identification of the lute, as a lute, relies on abstract memory, and your remembering your identification has nothing to do with echoic memory - pure abstract memory.

Quote
My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesized auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long.

Agreed. That is a workable list. It may not be complete.

Nelson Cowan calls it complete. How would you complete it?

Quote
For headphones or speakers, am I looking for subtle differences that will disappear from memory quickly or are the differences “abstractable” so I can remember them? I suspect the latter may be important to me. I agree with pdq that non-sonic factors will play an important subjective role, e.g. comfort and (for me) cost.

I think that things like purchase decisions would ideally be based on abstractable differences. I think that most subjective reviews pretend to be based on abstractable differences but due to the low quality listening tests that those reviews are based on, many of the purported abstractable differences are either wrong, poorly expressed, or purely based on imagination.
I think I mostly agree, but what do you mean with "due to the low quality listening tests that those reviews are based on"? For me, many of the pretty words used in audio reviews don't have meaning. But some things like "soundstage", I assume to mean an ability to localise the source of sounds in 3D (2D?). I have done this with live recordings of a small number of instruments. I think the term so often used in the VR literature is "presence". If I can abstract some ideas that relate to a feeling of immersion or presence, I might be able to compare headphones. But as has been extensively discussed in the VR literature, these factors are *heavily* influenced by other factors (biases). So then I come full circle: using very bias-able methods to decide. Oh well, that's life.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 15:29:31
In order to do a differential threshold test, as I describe, you must use echoic memory. You can't abstract a loudness.

If you allow even a few seconds between samples, the threshold value will be incorrectly too high.


That hinges on what is considered "Too high".  For a few tenths of a dB I do need to hear the samples very close together. For several dB, I can walk into a room stone cold and guess the SPL value with a reasonable tolerance.  Most important differences in quality usually involve a fair number of dB.

Quote
Your identification of the lute, as a lute, relies on abstract memory, and your remembering your identification has nothing to do with echoic memory - pure abstract memory.


I was not talking about the identification of a lute sound as sounding like a lute, but rather I was talking about some more subtle difference in a particular lute sound that differs from the usual lute sound.  Note that a lute sounds like a lute over a fairly wide range of SPLs, timbres  and fundamental frequencies even though you may have never heard that timbre, fundamental frequency or SPL before.

Real world example. An amplifier or an MP3 encoder makes an unfamiliar kind of audible error when processing a certain kind of lute note. It is a new kind of error, of a class that I've never heard before, which MP3 encoders in particular are prone to make.  It sounds like a lute but it sounds wrong in a new and different way.

What I'm describing is shifting reliance on auditory memory to learning an abstraction and then to relying on the abstraction.

Quote
Quote
My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesized auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long.

Agreed. That is a workable list. It may not be complete.

Nelson Cowan calls it complete. How would you complete it?


What about working memory?  Cowan seems to believe in that.

Quote
I think I mostly agree, but what do you mean with "due to the low quality listening tests that those reviews are based on"?


Listening tests that are not based on close, matched comparisons but need to be.
Listening tests done by people who aren't really that familiar with listening tests or the item being compared.
Listening tests that actually involve small differences and need to be done blind to be valid, but aren't.

Quote
But some things like "soundstage", I assume to mean an ability to localise the source of sounds in 3D (2D?). I have done this with live recordings of a small number of instruments. I think the term so often used in the VR literature is "presence".


Agreed.  IME soundstaging is frequently abused. It can be a catch-all.

Quote
If I can abstract some ideas that relate to a feeling of immersion or presence, I might be able to compare headphones.


You might, but first you might want to answer the question - if you hold everything else reasonably constant, do most headphones even soundstage differently?

Why should headphones of a general kind ( say closed and sealed to the head) even soundstage differently?

How much of the perception of different soundstaging might be due to unmatched frequency response or just unmatched levels?

Quote
But as has been extensively discussed in the VR literature, these factors are *heavily* influenced by other factors (biases). So then I come full circle: using very bias-able methods to decide. Oh well, that's life.


In reality - get things right within a few dB and as the listener listens, his FR biases and preferences change, and things start sounding more familiar and therefore right to him.

Some call that "Equipment break-in". ;-)
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-26 16:24:09
In order to do a differential threshold test, as I describe, you must use echoic memory. You can't abstract a loudness.

If you allow even a few seconds between samples, the threshold value will be incorrectly too high.

That hinges on what is considered "Too high".  For a few tenths of a dB I do need to hear the samples very close together. For several dB, I can walk into a room stone cold and guess the SPL value with a reasonable tolerance.  Most important differences in quality usually involve a fair number of dB.

"Too high" means wrong. For testing with human subjects, you can't measure a differential threshold if the person can't directly "compare", which requires echoic memory. You may have an unusual ability to estimate and therefore abstract the SPL, but most people can't. I'm not talking about "most important differences in quality", I named a specific test that would require attention to echoic memory limits. But not for you ;-)

Quote
Your identification of the lute, as a lute, relies on abstract memory, and your remembering your identification has nothing to do with echoic memory - pure abstract memory.

I was not talking about the identification of a lute sound as sounding like a lute,

But I was, as an example (lute vs. mandolin) where echoic memory can be ignored.

Quote
Quote
My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesized auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long.

Agreed. That is a workable list. It may not be complete.

Nelson Cowan calls it complete. How would you complete it?


What about working memory?  Cowan seems to believe in that.

He doesn't "believe" in it, so much as that's the current understanding of general memory. Working memory is not specific to the auditory system, and an auditory experience must have been processed to "generated abstract memory" before you can "place" it in working memory or long-term memory.

Quote
If I can abstract some ideas that relate to a feeling of immersion or presence, I might be able to compare headphones.

You might, but first you might want to answer the question - if you hold everything else reasonably constant, do most headphones even soundstage differently?
...
How much of the perception of different soundstaging might be due to unmatched frequency response or just unmatched levels?

Last question first: I wouldn't be surprised if most or all of soundstaging relates to FR, and therefore to the first question, since FRs are very different for all headphones, I expect they would soundstage differently.

So the normal differential threshold is nonlinear, but constant above about 70 dB, where it is between 0.3 and 0.5 dB. Can you distinguish that difference after, say, 10 sec? If so, I'd be truly impressed. But some subjects are outside the normal range. We could use you instead of an SPL meter. ;-)
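
For scale, here is a small sketch of what a 0.3 to 0.5 dB level difference means as a linear ratio, using the standard decibel definitions (20·log10 for amplitude, 10·log10 for power). The function names are only illustrative.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a sound-pressure (amplitude) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power (intensity) ratio."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    for db in (0.3, 0.5, 3.0):
        amp = (db_to_amplitude_ratio(db) - 1) * 100
        pwr = (db_to_power_ratio(db) - 1) * 100
        print(f"{db:.1f} dB = +{amp:.1f}% amplitude, +{pwr:.1f}% power")
```

In other words, the thresholds being discussed correspond to amplitude changes of only a few percent.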
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-26 22:34:07
I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other.
  AFAIK there is no controversy over whether or not headphones sound different from each other. ...  ABX was not designed for headphone or speaker testing. It was designed for those situations where there is a serious question as to whether an audible difference even exists.


Just because B might be easily distinguishable from A doesn't mean we no longer need to worry if some form of bias might be influencing listeners in their decision making regarding sound quality evaluations. Although I agree ABX testing itself, "Is there any difference or not?", may be a waste of time in both headphone and speaker testing (in most circumstances), don't take my agreement as any form of endorsement that double blind testing itself isn't still very necessary with headphone/speaker testing [not that I'm claiming it is easy or even possible for most of us to pull off]. Double blind protocols are still VERY important and it is why researchers (like S. Olive) use them in both speaker and headphone quality/preference testing [or at least attempt to as much as possible].
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 23:28:50
I doubt that most people would bother ABX testing of headphones because headphones do sound very different from each other.
  AFAIK there is no controversy over whether or not headphones sound different from each other. ...  ABX was not designed for headphone or speaker testing. It was designed for those situations where there is a serious question as to whether an audible difference even exists.


Just because B might be easily distinguishable from A doesn't mean we no longer need to worry if some form of bias might be influencing listeners in their decision making regarding sound quality evaluations.


I totally agree with that. When a difference is known to exist, preference testing makes sense, but good preference testing takes other forms than ABX.

Quote
Although I agree ABX testing itself, "Is there any difference or not?", may be a waste of time in both headphone and speaker testing (in most circumstances), don't take my agreement as any form of endorsement that double blind testing itself isn't still very necessary with headphone/speaker testing [not that I'm claiming it is easy or even possible for most of us to pull off]. Double blind protocols are still VERY important and it is why researchers (like S. Olive) use them in both speaker and headphone quality/preference testing [or at least attempt to as much as possible].


While strictly speaking they are not preference tests, I wonder how ABC/HR and MUSHRA would work out for situations where it is correctly assumed that audible differences exist.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-26 23:36:57
"Too high" means wrong.


This isn't about right or wrong.

Quote
For testing with human subjects, you can't measure a differential threshold if the person can't directly "compare", which requires echoic memory.


Looks like proof by assertion to me. Can you do better?

Quote
You may have an unusual ability to estimate and therefore abstract the SPL, but most people can't.


Decades of training were required. Not everybody wants to do that, and not everybody has the opportunity.


Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 09:52:22
Just because B might be easily distinguishable from A doesn't mean we no longer need to worry if some form of bias might be influencing listeners in their decision making regarding sound quality evaluations. Although I agree ABX testing itself, "Is there any difference or not?", may be a waste of time in both headphone and speaker testing (in most circumstances), don't take my agreement as any form of endorsement that double blind testing itself isn't still very necessary with headphone/speaker testing [not that I'm claiming it is easy or even possible for most of us to pull off]. Double blind protocols are still VERY important and it is why researchers (like S. Olive) use them in both speaker and headphone quality/preference testing [or at least attempt to as much as possible].

Certainly someone doing research as their job (e.g. Olive) would need to put in the time/expense/effort, which his job would give him. And Olive has done lots of interesting and IMO important work. Are you suggesting blind protocols are needed for personal headphone decisions? I'm familiar with their "virtual headphone" method, where they inverse filter Senn HD 518s, and play models of other headphones through them. Do you know of other methods they may have used?

OT question: why is double blind always stated, when for example, fb2k isn't double? When I use "blind" at work, it's always assumed that no cues from any source (including people) are provided, other than the controlled stimulus, be it just the subject alone (technically single, I guess), or additionally the experimenter (then double) and sometimes the person doing analysis (triple)? Just curious.

Also, do you know anything about fb2k on a VM on a mac, or a similar program for a mac?
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 10:02:29
While strictly speaking they are not preference tests, I wonder how ABC/HR and MUSHRA would work out for situations where it is correctly assumed that audible differences exist.

ABC/HR and MUSHRA were intended to make qualitative assessments of impairments to an "original" or reference, as you know. But as you point out the qualitative scoring system seems a plausible method for 2 devices, without a specific reference. A reread of the ITU docs should reveal if different analysis is required (do you have no reference, or do you arbitrarily assign one device to be the reference...).
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 10:45:59
"Too high" means wrong.

This isn't about right or wrong.

Well, you are very experienced with ABX testing. If I were to tell you I did a test with the old AIX AVS files (with unmatched levels), and I said "I know the levels are different, but I took that into account before responding". Would you tell me I did it "wrong" or would you be more gentle. Would you say "best practice would indicate..." or "I'd suggest a better way" or would you blast me? I've read many of your posts, and I can't use the word "gentle" to describe them. You brought up the significance of memory in posts 6 and 10 above. mzil mentions echoic memory in post 37. I respond that echoic memory is important for certain types of tests, but not all, and question the strong focus by so many on it. In post 41, you say the scientific literature focusses on it.
.... and now you seem to argue, it can be ignored (if you are the subject). I'm confused about your position.

Quote
For testing with human subjects, you can't measure a differential threshold if the person can't directly "compare", which requires echoic memory.

Looks like proof by assertion to me. Can you do better?

Of course. Guilty as charged. I'm glad you point out "proof by assertion" and I hope whenever I do it, I'm called on it. I apologize that I can't provide a list of references today, but I'm happy to do so on the weekend. Of course, you have also done "proof by assertion" several times above, and when I request references or clarification, you ignore me. That's okay, it's not your job to help me... just kinda thought it'd be nice. I suspect that some of what I'll provide would have been in your response to my request (post 42) for references about echoic memory.
Quote
You may have an unusual ability to estimate and therefore abstract the SPL, but most people can't.

Decades of training were required. Not everybody wants to do that, and not everybody has the opportunity.

Sounds like you have "golden ears". No problem with TOS #8 though, I'd guess, because you make no claim of quality. I'm just happy that you would agree that Peter Aczel's lie #10 is no lie. He says: "The Golden Ears want you to believe that their hearing is so keen, so exquisite, that they can hear tiny nuances of reproduced sound too elusive for the rest of us." I don't know about "Golden Ears"(capitalized) , but normal variability, plus as you point out, training, do make some people more sensitive than the rest of us.
;-) Don't get angry. I'm playing with you a little. I know you use  "Golden Ears" as a derogatory term, and I just want to underscore your example of yourself, as someone who hears certain characteristics better than most. :-)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 12:39:17
"Too high" means wrong.

This isn't about right or wrong.

Well, you are very experienced with ABX testing. If I were to tell you I did a test with the old AIX AVS files (with unmatched levels), and I said "I know the levels are different, but I took that into account before responding". Would you tell me I did it "wrong" or would you be more gentle.


You've changed the context of the discussion pretty dramatically for no apparent reason.

Quote
You may have an unusual ability to estimate and therefore abstract the SPL, but most people can't.

Decades of training were required. Not everybody wants to do that, and not everybody has the opportunity.

Sounds like you have "golden ears". No problem with TOS #8 though, I'd guess, because you make no claim of quality. I'm just happy that you would agree that Peter Aczel's lie #10 is no lie. He says: "The Golden Ears want you to believe that their hearing is so keen, so exquisite, that they can hear tiny nuances of reproduced sound too elusive for the rest of us." I don't know about "Golden Ears"(capitalized) , but normal variability, plus as you point out, training, do make some people more sensitive than the rest of us.
;-) Don't get angry. I'm playing with you a little. I know you use  "Golden Ears" as a derogatory term, and I just want to underscore your example of yourself, as someone who hears certain characteristics better than most. :-)


Let's look at what I actually wrote:

For a few tenths of a dB I do need to hear the samples very close together. For several dB, I can walk into a room stone cold and guess the SPL value with a reasonable tolerance.


I guess that you are so completely unfamiliar with audio that you don't know that claiming the audibility of differences on the order of "several dB" is consistent with current scientific knowledge about hearing and therefore is outside the area of concern of TOS #8.

Letsee, "several dB" could be 6 dB. Here is an ABX test log for two files whose only difference is that their  level is different by 6 dB:

foo_abx 2.0 beta 4 report
foobar2000 v1.3.5
2015-03-27 08:53:02

File A: tuttabella_org- 6 dB.flac
SHA1: 737de5dccbd485ca54653eb5183b4270cd496048
File B: tuttabella_org.flac
SHA1: b1875cd08a96b24c300f7a2e4907d995b857ff99

Output:
DS : Primary Sound Driver

08:53:02 : Test started.
08:53:22 : 01/01
08:53:31 : 02/02
08:53:44 : 03/03
08:53:50 : 04/04
08:53:55 : 05/05
08:54:02 : 06/06
08:54:28 : 07/07
08:54:36 : 08/08
08:54:45 : 09/09
08:54:54 : 10/10
08:54:58 : 11/11
08:55:04 : 12/12
08:55:12 : 13/13
08:55:19 : 14/14
08:55:23 : 15/15
08:55:54 : 16/16
08:55:54 : Test finished.

----------
Total: 16/16
Probability that you were guessing: 0.0%

-- signature --
7fdf7be356ef6720a807ad04a13708b2f1ab579b

Compliance with TOS 8 completed, FWIW.

BTW I performed this test by simply running the X's. I never compared any X to A or B, and only listened to A and B once at the beginning to confirm that they had the expected level difference.
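
For anyone wondering where a "probability that you were guessing" figure can come from, here is a minimal sketch, assuming it is the usual one-sided binomial tail with a 50% chance per trial. I have not checked how foo_abx computes its number, and the function name is made up. Under that assumption, 16/16 is a 1 in 65536 chance, about 0.0015%, which would display as 0.0% after rounding.

```python
from math import comb


def p_guessing(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by coin-flipping."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


# The log above: 16 correct out of 16 trials.
print(f"16/16: {p_guessing(16, 16):.6%}")
# A less clear-cut result for comparison.
print(f"12/16: {p_guessing(12, 16):.2%}")
```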
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 12:58:34
"Too high" means wrong.

This isn't about right or wrong.

Well, you are very experienced with ABX testing. If I were to tell you I did a test with the old AIX AVS files (with unmatched levels), and I said "I know the levels are different, but I took that into account before responding". Would you tell me I did it "wrong" or would you be more gentle.


You've changed the context pretty dramatically for no apparent reason than debating trade points. That in my book is trolling. Have a nice day!

I'd guess that you're not going to respond to me, but I'll say this: I have no intention of being a troll. I'm quite accustomed to challenging what people say AND being challenged, ala Socratic learning. Sorry if I have offended you... not the intent. Challenge yes; offend no.
You have a nice day, too! :-)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 13:04:34
I'd guess that you're not going to respond to me, but I'll say this: I have no intention of being a troll.



I did respond by demonstrating a DBT that came as close to supporting my claims as I could think of.

One of the symptoms of trolling is denying the validity or even the existence of support for a point no matter how well supported, whether with logic or actual empirical evidence.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 13:39:34
I'd guess that you're not going to respond to me, but I'll say this: I have no intention of being a troll.



I did respond by demonstrating a DBT that came as close to supporting my claims as I could think of.

One of the symptoms of trolling is denying the validity or even the existence of support for a point no matter how well supported, whether with logic or actual empirical evidence.

What point did I deny validity? If you mean my stating that testing auditory differential thresholds requires a short (<1s) interstimulus gap... I stand by that and if you want I'll provide refs this weekend. If I misunderstood or had an offensive tone or I wasn't clear... sorry, all that happens more often than I'd like.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 15:34:29
I'd guess that you're not going to respond to me, but I'll say this: I have no intention of being a troll.



I did respond by demonstrating a DBT that came as close to supporting my claims as I could think of.

One of the symptoms of trolling is denying the validity or even the existence of support for a point no matter how well supported, whether with logic or actual empirical evidence.

What point did I deny validity?


I guess you don't read your own posts. You called it non responsive.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-27 15:41:45
Certainly someone doing research as their job (e.g. Olive) would need to put in the time/expense/effort, which his job would give him. And Olive has done lots of interesting and IMO important work. Are you suggesting blind protocols are needed for personal headphone decisions? I'm familiar with their "virtual headphone" method, where they inverse filter Senn HD 518s, and play models of other headphones through them. Do you know of other methods they may have used?



It's kind of  a badly-formed question.  What is 'needed' depends on what level of knowledge/certainty you seek, and what claims you hope to make.  Most people don't really think this through.  So they audition two headphone sets hanging on the wall at their local Best Buy, decide 'this one sounds better' , think 'therefore it is better', and that's it. 

And *unless* they come on a forum like this , and claim, I tried X and Y and X is the better headphone, no one is going to care or challenge their 'method'.


Quote
OT question: why is double blind always stated, when for example, fb2k isn't double? When I use "blind" at work, it's always assumed that no cues from any source (including people) are provided, other than the controlled stimulus, be it just the subject alone (technically single, I guess), or additionally the experimenter (then double) and sometimes the person doing analysis (triple)? Just curious.



ABX is 'effectively' double blind since the app administering the test isn't human  ;>
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 15:47:54
I guess you don't read your own posts. You called it non responsive.

OK, I'm lost. I have read my posts. I searched the thread for "non responsive", even "responsive". I don't understand you. We're talking past each... oh well...
Have a nice day! :-)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 15:52:13
OT question: why is double blind always stated, when for example, fb2k isn't double?


Say what?

How is FB2K ABX not a DBT?

Quote
When I use "blind" at work, it's always assumed that no cues from any source (including people) are provided, other than the controlled stimulus, be it just the subject alone (technically single, I guess), or additionally the experimenter (then double) and sometimes the person doing analysis (triple)? Just curious.


What non-audible cues do you get about the unknowns that are presented by FB2K?

Quote
Also, do you know anything about fb2k on a VM on a mac, or a similar program for a mac?


Ummm, google it?

I have no Mac and no experience with these but they claim to be DBT test coordinators like ABX on FB2K.

ABX Tester for the Mac (https://itunes.apple.com/us/app/abxtester/id427554135?mt=12)

Lacinato ABX/Shootout-er blind testing audio software for cross platforms including the Mac (http://lacinato.com/cm/software/othersoft/abx)

Past HA discussions about cross-platform DBT software (http://www.hydrogenaud.io/forums/index.php?showtopic=95308&hl=)
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 15:56:58
ABX is 'effectively' double blind since the app administering the test isn't human  ;>

I would argue that exactly because the app is not human, "double" doesn't apply. I'd say "blind", but that's just me, and I'm burned out on arguing (especially from my iPad), so...
DBT yeah!
;-) Have a good weekend.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 16:16:00
OT question: why is double blind always stated, when for example, fb2k isn't double?


Say what?

How is FB2K ABX not a DBT?

Quote
When I use "blind" at work, it's always assumed that no cues from any source (including people) are provided, other than the controlled stimulus, be it just the subject alone (technically single, I guess), or additionally the experimenter (then double) and sometimes the person doing analysis (triple)? Just curious.


What non-audible cues do you get about the unknowns that are presented by FB2K?

"Double" in DBT refers to both the subject and experimenter not knowing what X is. The experimenter can involuntarily give non-audible cues to the subject. If I'm alone with fb2k, there's no both to doubly blind. The way my colleagues and I use "blind" and "double blind" at work, fb2k is not DBT. Semantics. Not worth pursuing.
Thanks for the mac tip. I know how to google, but wanted recommendations, not a list. In post 38(?), I mentioned I have ABXTester for mac; it's not so full featured as fb2k. I got a very useful PM for fb2k on mac. Thanks though.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-27 16:58:17
ABX is 'effectively' double blind since the app administering the test isn't human  ;>

I would argue that exactly because the app is not human, "double" doesn't apply.


I think you misunderstand what blinding means.  Take a look at the wiki:

http://en.wikipedia.org/wiki/Blind_experim...le-blind_trials (http://en.wikipedia.org/wiki/Blind_experiment#Double-blind_trials)

This is a double blinded test because it is shielded from bias by both parties in the test. 

Semantics. Not worth pursuing.


This is not a semantic argument, and it is worth pursuing.  Specifically, understanding why double blinded tests like ABX are used is important.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 17:17:36
ABX is 'effectively' double blind since the app administering the test isn't human  ;>

I would argue that exactly because the app is not human, "double" doesn't apply.


I think you misunderstand what blinding means.  Take a look at the wiki:

http://en.wikipedia.org/wiki/Blind_experim...le-blind_trials (http://en.wikipedia.org/wiki/Blind_experiment#Double-blind_trials)

This is a double blinded test because it is shielded from bias by both parties in the test. 

Semantics. Not worth pursuing.


This is not a semantic argument, and it is worth pursuing.  Specifically, understanding why double blinded tests like ABX are used is important.

Thank you. Thank you. Thank you.
That is exactly how I would define "blind" "to blind" "single" "double" and "triple"!!!! Where I bold your quote, who are the 2(both) parties? Keep in mind, when it's you alone with the computer, there is no researcher.
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.
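
To make the split described in that passage concrete, here is a rough sketch of an ABX run in which one piece of code holds the key and another only hands stimuli to the listener and collects answers. Everything in it is hypothetical (the names `AbxKey`, `run_abx`, `level_listener`, and the use of a plain number as a stand-in for audio); it is not how fb2k or any real ABX tool is written.

```python
import random
from typing import Callable, List

Sound = float  # stand-in for audio; here just a playback level in dB


class AbxKey:
    """The 'third party': holds the hidden A/B assignment of X for every trial."""

    def __init__(self, trials: int, rng: random.Random) -> None:
        self._key: List[str] = [rng.choice("AB") for _ in range(trials)]

    def pick(self, trial: int, a: Sound, b: Sound) -> Sound:
        # Returns only the stimulus to play, never the label.
        return a if self._key[trial] == "A" else b

    def score(self, answers: List[str]) -> int:
        # The key is compared against the answers only after the run.
        return sum(ans == k for ans, k in zip(answers, self._key))


def run_abx(a: Sound, b: Sound,
            listener: Callable[[Sound, Sound, Sound], str],
            trials: int = 16, seed=None) -> int:
    """The 'blinded researcher': presents A, B and X, records votes, learns nothing."""
    key = AbxKey(trials, random.Random(seed))
    answers = [listener(a, b, key.pick(t, a, b)) for t in range(trials)]
    return key.score(answers)


def level_listener(a: Sound, b: Sound, x: Sound) -> str:
    """A made-up listener who votes for whichever reference X is closer to in level."""
    return "A" if abs(x - a) < abs(x - b) else "B"


print(run_abx(-6.0, 0.0, level_listener), "correct out of 16")
```

The listener function only ever receives the three stimuli, so any cue it gets has to come from the stimuli themselves or from side effects of presenting them.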
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 17:19:24
"Double" in DBT refers to both the subject and experimenter not knowing what X is. The experimenter can involuntarily give non-audible cues to the subject. If I'm alone with fb2k, there's no both to doubly blind.



Now I get our communication problem. What double talk!  The word experimenter was unnecessarily added and it  creates a problem. If I wanted to spend the rest of my life with useless arguments about semantics...

Many common definitions don't make this trivial mistake. For example: Here is what Wikipedia has to say:

"A blind or blinded experiment is an experiment in which information about the test that might lead to bias in the results is concealed from the tester, the subject, or both until after the test."

Another way to look at it is that if you are alone with FB2K, the experimenter's role is filled by the software. The fact that it is not a living breathing homo sapiens is irrelevant to most people. Another case where the implementation details are irrelevant and performance is all-important.

The problem of hardware and software stand-ins for the experimenter that give reliable cues to the identity of the unknown of their own making has been around for quite some time.  My original ABX box of 1977 was pretty noisy when it changed state, but the noise followed no discernable pattern. It did not compromise the blindness of the test. QSC's ca. 1990s ABX Comparator could be aced with no other equipment attached. That's a serious non-semantic problem.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-27 17:22:15
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.


I'm not even sure how to respond to such a bizarre post.  Are you trying to argue that ABX is not double blind?  If so, I'd suggest that you are REALLY confused on what blinding means.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 17:29:05
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.


I'm not even sure how to respond to such a bizarre post.  Are you trying to argue that ABX is not double blind?  If so, I'd suggest that you are REALLY confused on what blinding means.



It's called misdirection. Whether the test is computer-controlled or not is given great importance by mentioning it first, but in fact the implementation of the test is irrelevant. How the test works is the most important thing.

The problem of cuing the test subject with the identity of the unknown has been around at least since Clever Hans, the talking horse. No computers required!
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 17:37:23
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.


I'm not even sure how to respond to such a bizarre post.  Are you trying to argue that ABX is not double blind?  If so, I'd suggest that you are REALLY confused on what blinding means.

Wow, this is exhausting! Read the article. Read what I say in my posts. I never argue all ABXs aren't DBTs.
ABX is a type of experiment. A and B can be files, cables, amps, etc. If you are using files, you can use fb2k!!! If you have a human experimenter in the room conducting the experiment, you can do a DBT. If you alone use fb2k as your method of doing an ABX without an experimenter, YES I'M SAYING IT'S NOT A DBT. THAT DOES NOT MEAN ALL ABX's aren't, just fb2k used alone. And the Wikipedia article agrees!!
All dogs are animals; not all animals are dogs, but some are. Not all DBTs are ABXs but some are and not all ABXs are DBTs but some are.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 17:59:30
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.


I'm not even sure how to respond to such a bizarre post.  Are you trying to argue that ABX is not double blind?  If so, I'd suggest that you are REALLY confused on what blinding means.

Wow, this is exhausting! Read the article. Read what I say in my posts. I never argue all ABXs aren't DBTs.


You don't have to argue that all ABX's aren't DBTs.  You can create that impression with sentences like this one:

"Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments"

I just explained to you how that is commonly interpreted as saying what you now say you don't want to say, and you didn't respond to the post so I take it that you dismissed it.  This post also looks like you are dismissing the other post.

Giving you the benefit of the doubt, I think you may have wanted to say:

Poorly implemented experiments are sometimes  erroneously referred to as double-blind experiments.

That is not too interesting because it is a truism.

Quote
ABX is a type of experiment. A and B can be files, cables, amps, etc. If you are using files,


or pudding, potato chips, soda pop, or beer. They can be pornographic pictures.  They can be anything that can be perceived and practically manageable.

Quote
you can use fb2k!!!


Thanks for that!

Quote
If you have a human experimenter in the room conducting the experiment, you can do a DBT.


Just put him out of sight and keep him quiet.

Quote
If you alone use fb2k as your method of doing an ABX without an experimenter, YES I'M SAYING IT'S NOT A DBT.


That seems to be incorrect to the point of being bizarre.

Quote
THAT DOES NOT MEAN ALL ABX's aren't, just fb2k used alone. And the Wikipedia article agrees!!


Which Wikipedia article?

Please explain.

Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-27 17:59:45
FB2k ABX is double blind. The true identities of the UUTs are not conveyed directly to the test subject, the listener, through any of their own senses nor indirectly (and possibly inadvertently) through the robotic test administrator.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-27 18:04:44
FB2k ABX is double blind. The true identity of the UUTs are not conveyed directly to the test subject, the listener, through their own senses nor indirectly (and possibly inadvertently) through the robotic test administrator.


However a FB2K or similar test can unblind itself.

Let's say that you are comparing two files whose CPU load for decoding varies quite a bit from each other, and one turns on the CPU fan or makes it run louder.

This is a potential problem with laptops, many of which need to run their fans all the time and throttle the CPU and fan pretty closely to save energy and minimize size and weight.

Failing weird stuff like that, the general rule is that a FB2K test is utterly double blind.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-27 18:08:03
From the article:
Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.


I'm not even sure how to respond to such a bizarre post.  Are you trying to argue that ABX is not double blind?  If so, I'd suggest that you are REALLY confused on what blinding means.

Wow, this is exhausting! Read the article. Read what I say in my posts. I never argue all ABXs aren't DBTs.
ABX is a type of experiment. A and B can be files, cables, amps, etc. If you are using files, you can use fb2k!!! If you have a human experimenter in the room conducting the experiment, you can do a DBT. If you alone use fb2k as your method of doing an ABX without an experimenter, YES I'M SAYING IT'S NOT A DBT. THAT DOES NOT MEAN ALL ABX's aren't, just fb2k used alone. And the Wikipedia article agrees!!


You have absolutely no idea what you are talking about and should immediately stop arguing, read very carefully how an ABX test works, and then come back here once you understand what you are attempting to talk about. 
Title: How do you listen to an ABX test?
Post by: Wombat on 2015-03-27 18:23:45
Very entertaining read. Let's see if some pinhead like me now learns what Socratic learning is about.
Title: How do you listen to an ABX test?
Post by: pdq on 2015-03-27 18:28:03
If FB2K ABX is single blind, then show me the experimenter who knows the correct answer and is inadvertently passing that information to the subject through subtle means.
Title: How do you listen to an ABX test?
Post by: eric.w on 2015-03-27 18:40:39
Regarding ABX testing on Mac OS X, I've done mine with foobar2k and foo_abx 2.0 running under wine (http://winebottler.kronenberg.org) with no issues.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-27 18:41:10
However a FB2K or similar test can unblind itself.


It would be interesting if some clever person discovered they could use their harddrive activity LED's flash pattern to ID X and Y! Since foobar seems to load both A and B into its own, private memory area, prior to testing, this would seem rather unlikely to me though. [Still, a patch of black tape over that light might be in order for formal testing, just to be dead sure.]
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-27 19:18:06
My head is spinning. I fell down a rabbit hole and the chess pieces have arisen to tell me where to go. There's a hookah smoking caterpillar, the white knight is talking backwards and logic and proportion have fallen sloppy dead....
I've got to get out and here's my plan.
I contend that there exist 3 groups.
One group has been reading audio forums for a long time and has seen "DBT" perhaps hundreds of times. Whether each member is working with the correct definition, I can't say. But whatever their definition is, they are confident they have been reading and writing it correctly. If a wild-eyed-weirdo, whose opinion they don't respect, tells them their definition is wrong, obviously they reject that. But the cool part is, that their confirmation bias influences their ability to read a Wikipedia article without smashing the round peg words into their square hole existing definition. No chance for change here.
The second group are our non-scientist friends, family, coworkers, who have never read an audio forum, and have never seen "DBT", except maybe in the newspaper article. I contend that if you ask them to read the Wikipedia article, or the section on blind testing in any scientific methods book, and somehow bribe them to read all my posts in this thread, that they will find no errors, deceptions, misdirections or misleading arguments in what I have said about DBTs. I would really be interested in the result, if this actually happened.
The third group is practicing, publishing scientists. If they do blind testing, they won't need to read anything. If they don't regularly do blind tests, they may need to brush up with an experimental design text. And then the same as group 2: I contend that they will find no errors, deceptions, misdirections or misleading arguments in what I have said about DBTs.
I say groups, because I contend a consensus, not one individual will help me stay out of the rabbit hole. Some of the last few posts have been surreal.
I'm not ignoring any posts, I have just been travelling all day and will now spend some time with my family. Mr Krueger, I'll respond tomorrow. Sorry.
Saratoga, your lecturing me in post 70 above has me ROTFLMAO.
FYI, the Wikipedia article is linked in Saratoga's post number 61 above.
Title: How do you listen to an ABX test?
Post by: pdq on 2015-03-27 19:32:57
From that same Wikipedia article:

"In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party."

In other words, it is perfectly valid for software to provide DBT testing, as long as the part that plays the sound is not influenced by the part that selects X at random.
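
To make that separation concrete, here is a minimal sketch, not foobar2000's actual implementation. The function names (`make_key`, `prompt`, `score`) are hypothetical and invented for illustration. The point is only that the code talking to the subject never takes the key as an input, and the key is consulted only when scoring.

```python
import secrets

def make_key(n_trials):
    """'Third party': draw X for each trial; nothing shown to the subject depends on it."""
    return [secrets.choice("AB") for _ in range(n_trials)]

def prompt(trial_number):
    """'Blinded researcher': the text shown to the subject does not take the key as input."""
    return f"Trial {trial_number}: is X the same as A or B?"

def score(key, answers):
    """The key is compared with the answers only here, after they are collected."""
    return sum(k == a for k, a in zip(key, answers))

key = make_key(16)
prompts = [prompt(i + 1) for i in range(len(key))]
```

In a real tool the playback path would also have to behave identically whichever sample X is (same latency, same noises), which this sketch does not attempt to model.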

Edit: added quotes
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-27 19:36:10
Whether each member is working with the correct definition, I can't say.


Really?  You seem to have been pretty confident when you said otherwise above.  Actually, you seem more than capable of saying, just not understanding.

The problem of course is that you made up your mind before you understood what was being discussed.  This is a terrible idea in general (it will cause you to do foolish things that you would otherwise have avoided), but it's an especially terrible idea when talking to people with a deeper understanding than yourself.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-27 21:08:30
My head is spinning. I fell down a rabbit hole and the chess pieces have arisen to tell me where to go. There's a hookah smoking caterpillar, the white knight is talking backwards and logic and proportion have fallen sloppy dead....
I've got to get out and here's my plan.
I contend that there exist 3 groups.
One group has been reading audio forums for a long time and has seen "DBT" perhaps hundreds of times. Whether each member is working with the correct definition, I can't say. But whatever their definition is, they are confident they have been reading and writing it correctly. If a wild-eyed-weirdo, whose opinion they don't respect, tells them their definition is wrong, obviously they reject that. But the cool part is, that their confirmation bias influences their ability to read a Wikipedia article without smashing the round peg words into their square hole existing definition. No chance for change here.
The second group are our non-scientist friends, family, coworkers, who have never read an audio forum, and have never seen "DBT", except maybe in the newspaper article. I contend that if you ask them to read the Wikipedia article, or the section on blind testing in any scientific methods book, and somehow bribe them to read all my posts in this thread, that they will find no errors, deceptions, misdirections or misleading arguments in what I have said about DBTs. I would really be interested in the result, if this actually happened.
The third group is practicing, publishing scientists. If they do blind testing, they won't need to read anything. If they don't regularly do blind tests, they may need to brush up with an experimental design text. And then the same as group 2: I contend that they will find no errors, deceptions, misdirections or misleading arguments in what I have said about DBTs.
I say groups, because I contend a consensus, not one individual will help me stay out of the rabbit hole. Some of the last few posts have been surreal.
I'm not ignoring any posts, I have just been travelling all day and will now spend some time with my family. Mr Krueger, I'll respond tomorrow. Sorry.
Saratoga, your lecturing me in post 70 above has me ROTFLMAO.
FYI, the Wikipedia article is linked in Saratoga's post number 61 above.


I contend that you are making distinctions without a difference.

Sean Olive's setup at Harman uses an automated system to present loudspeakers in random order, behind a curtain.  The subject controls when the 'switches' take place, and his answers are recorded and tallied by software.

Are you suggesting because it's done with no human tester intervention, it's not *effectively* double blind?

The wiki article saratoga cited, btw, is badly written and incoherent (as often happens in wikipedia, due to chaotic editing).  Let's parse this mess:

Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject.
- This poorly written sentence means to say: "While some contend that computer-controlled experiments do not allow the researcher to influence the subject, that contention is erroneous."  Well, yes, *that* contention *would* be erroneous, because it ignores the possibility of bad 'computer-controlled' design.  But just because computer-controlled experiments *can* allow researcher bias doesn't mean they CAN NEVER BE free of researcher bias.

Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system.

- This sentence supplies examples of how the researcher may bias the subject even though the 'test' itself is administered mechanically, electronically, or digitally.  However, in the case of surveys, the same bias could be built into purely human-administered 'double-blind' surveys as well.  Does that mean that human-controlled experiments are 'erroneously' referred to as double blind?

In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party.

- True

An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B. 

- True.  And  one *could* develop ABX software that 'biases' the answers.  But that doesn't mean that software ABX *necessarily* does.  Proper software ABX is as *effectively* double blind as a properly designed human-administered ABX would be.
Title: How do you listen to an ABX test?
Post by: castleofargh on 2015-03-27 21:44:00
so if my whisky is distilled by a machine, is it single malt or double blinded?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-28 08:01:57
If FB2K ABX is single blind, then show me the experimenter who knows the correct answer and is inadvertently passing that information to the subject through subtle means.


I think the argument is that machine/software driven ABX is single blind because there is just one person to blind.

There is a bit of a problem here because the original 1950 Munson and Gardiner ABX test was also machine-driven and could be performed by just one person. It was programmed via a teletypewriter machine's paper tape. As long as the participant did not peek at the paper tape, he was blinded to the identities of the audio samples.  The literature of science is pretty unambiguous about Munson and Gardiner's ABX being a DBT.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-28 08:24:59
An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.

- True.


I'm still trying to educate people to understand that there are two different ABX tests. There is the ABX test that was described by Munson and Gardiner in a landmark JASA paper in 1950, and the one described by Clark in a JAES paper in 1982. They primarily differ in terms of interactivity.

However, it appears that both tests are misunderstood as to the function of presenting known samples A and B and their impact on the identification phase of the test. I don't know the thought patterns of Munson and Gardiner, but I do know what mine were in 1977 when I did my first ABX test. I conceived of ABX as a same/different test and it can and is performed that way today.

ABX is a two-phase test. Phase one is learning to hear the difference between A and B, and phase two is correctly identifying X. Once phase one has been completed satisfactorily it can be dispensed with for the rest of the trials. During an ABX test a well-trained listener need only listen to one X at a time during each trial.  The sound of that test is presumably processed by that area of the brain known as Working Memory; see a Nelson Cowan paper about Working Memory (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2864034/).

A and B are presented not as part of the skill-testing question, but to help the listener train himself to hear differences. The discretionary presentation of A and B was added to create the ABX test for exactly that purpose. A and B are not both required; only one suffices for the actual test.  Presenting either A or B (not necessarily both) always suffices for identifying X by means of the obvious same/different comparison. As I have demonstrated with an ABX test on this thread, if the listener is well-trained to reliably detect the difference between A and B, he can discern it with perfect reliability by only listening to the X's.

Quote
And  one *could* develop ABX software that 'biases' the answers.


ABX Comparator hardware that potentially biases the answers was produced by QSC.  If I sit in a quiet room with a QSC ABX Comparator I can and have discerned its allegedly hidden sequence of X's by simply listening to the box change state.  BTW, I inherited Tom Nousaine's QSC ABX Comparator courtesy of his estate's executor, so now I have 4 hardware ABX Comparators: the 1977 ABX Comparator that I built, a prototype ABX Comparator as described by Clark's 1982 JAES paper, the QSC ABX Comparator, and one other that I also inherited from Tom that I am still trying to identify.

Just for grins I should try to line up an ASR 33 Teletype Machine and try to rebuild Munson and Gardiner's ABX Comparator. ;-)

Quote
But that doesn't mean that software ABX *necessarily* does.  Proper software ABX is as *effectively* double blind as a properly designed human-administered ABX would be.


Agreed. Just because a defective test of a kind has ever been done doesn't mean that all tests of that kind are defective.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-28 10:50:03
Good morning! It’s nice and sunny now, but it’s supposed to rain this afternoon, so I want to get out with my son to get some fresh air. Before that I’ll clear up a few points about the “DBT” discussion from my side.

From my point of view, we are having a semantic debate. We have not discussed the best way to do “good science”, or whether and when fb2k and/or DBTs fit in. All that I have written about DBTs in this thread reflects my questioning the use of “double”. After rereading many of the posts, I realize that many here see that as a derogatory challenge to the quality of fb2k as a tool, or the quality of an experiment using fb2k while listening alone in a room. It is not. From what I’ve read, fb2k looks like an outstanding tool, when used properly (hence my asking for tips on using it on, or finding a work-alike for, Macs - thanks for your tip eric.w). And of course, in the right context, a DBT is the sine qua non. I seem to sense that for many here “DBT” is equivalent to “good science” or “good methods”, that it is a necessary and sufficient condition. It seems like a seal of approval or a certificate of quality. This entire paragraph describes my impression of some posts; no one has used these words!

From my perspective, fb2k, ABX and DBTs are powerful tools that can be used well or poorly. It is also possible to do “good science”, “valid science”, without any of them. It depends on the experimental design, the question(s) to be answered, the other methods needed, and the analysis that is planned. Science is larger than listening-to-audio-stuff-tests. By asking the question “what is good science?”, I know I’m opening the door to a whole new debate, but I want to mention one simple, but incomplete, definition: acceptance by experts in the field. This can take on 2 important forms: acceptance by a peer-reviewed journal, and citations by experts in other peer-reviewed articles. I hope this doesn’t devolve into a new debate, but I know it’s a risk.

I have authored and co-authored articles in high-impact, peer-reviewed journals and they have been cited in other high-impact, peer-reviewed journals. The experiments have included non-blind, simple blind and double blind tests. I don’t believe we have ever used the term “blind”. What we did is obvious from the “Methods” section. What does a non-blind, but valid test look like? Physiology: I tell the subject exactly what I’ll do, what stimuli they’ll get, what I’m measuring and it’s repeated in the “Informed Consent” form, and then I do it. I measure some physiological response. DBT makes no sense here and would probably not get past the ethics committee. Although the reviewer had plenty of comments, not one mention of “blindness” popped up. Since I remain anonymous, you could easily challenge this paragraph as made up. Whether you believe the “I” above doesn’t matter. You should believe that this is how it works in one sub-field of science (neuroscience).

In discussions with colleagues, we must have the same definitions in order to communicate. We use “blind”, as in the Wikipedia article, to mean blocking all cues from any sense organ about what is happening, other than the intended stimulus, of course. But just as in Poker, people have “tells”, so if someone in the room *could* give the subject a cue, any cue about the stimulus, that must be avoided. Anyone with knowledge of the current stimulus must be outside the room (simple blind), or be “blind” themselves if in the room. I’m not sure I’ve used “single blind” in discussions before, but I would use it to mean a test that should be DBT, but a person with knowledge *must* be in the room (e.g. a mother, when the subject is a baby/toddler). Since you can’t “blind” a machine (without AI, machines have no “knowledge”), they don’t count. Of course, Arnold B. Krueger is right, a machine can give cues to the subject, or otherwise pollute the test. The 2 solutions are: avoid it or mask it. This is always a problem in my experiments. The machine gives off unavoidable, stimulus-dependent sound, so we always have the subject wear headphones, usually with a masking sound (white noise or a waterfall, etc.).

So I have no criticism of fb2k, ABX, or DBTs as tools when used appropriately. I object to “double” being used as a word when there are not 2 people (or more or animals) that need to be blinded. If because of your familiarity with labelling things DBT, even if there is a single person, you want to continue, I consider the semantic argument not worth much more effort. I’d be very interested in any continuation of good experimental design.

My son is starting to play Minecraft. Better get outside before it rains. I may want to address some further points from other posts when I get back.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-28 14:58:57
Quote
Quote
I'd guess that you're not going to respond to me…
I did respond by demonstrating a DBT that came as close to supporting my claims that I could think of.

One of the symptoms of trolling is denying the validity or even the existence of support for a point no matter how well supported, whether with logic or actual empirical evidence.
What point did I deny validity?
I guess you don't read your own posts. You called it non responsive.

OK, misunderstanding. Before you edited your post, you called me a troll and said “Have a nice day”. I thought you would ignore any subsequent posts I wrote, so I apologized/clarified. You thought I expected you to not respond to a previous post. Water under the bridge now. Thanks for continuing to respond AND for posting your results from foobar.

Another way to look at it is that if you are alone with FB2K, the experimenter's role is filled by the software. The fact that it is not a living breathing homo sapiens is irrelevant to most people. Another case where the implementation details are irrelevant and performance is all-important.

It's called misdirection. Whether the test is computer-controlled or not is given great importance by mentioning it first, but in fact the implementation of the test is irrelevant. How the test works is the most important thing.

If the experimenter’s role is filled by the computer, great, this is not a flaw in the design per se, but it does obviate the need to blind the experimenter. Again, this is not a flaw, and I don’t criticize this method, I criticize the use of the word “double”, when you have correctly only blinded the subject. The “experimenter” doesn’t need it.
When you say (twice) “implementation details are irrelevant”, I can be certain you’ve never published in a journal.

I never argue all ABXs aren't DBTs.
You don't have to argue that all ABX's aren't DBTs.  You can create that impression with sentences like this one:
"Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments"
I just explained to you how that is commonly interpreted as saying what you now say you don't want to say, and you didn't respond to the post so I take it that you dismissed it.  This post also looks like you are dismissing the other post.
Giving you the benefit of the doubt, I think you may have wanted to say:
Poorly implemented experiments are sometimes  erroneously referred to as double-blind experiments.

1. Saying something is not a DBT is not derogatory. 2. Computer-controlled experiments are erroneously referred to as double-blind experiments when only one person needs blinding. If the experimenter IS in the room and unaware of the answer, then it IS double blind. Both he/she and the subject are correctly blinded.
However a FB2K or similar test can unblind itself.
You can’t blind fb2k and it can’t unblind itself. All your good examples must be eliminated as part of careful experimental design. Removing those design flaws is not called blinding; it’s called good practice.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-28 15:01:25
FB2k ABX is double blind. The true identities of the UUTs are not conveyed directly to the test subject, the listener, through any of their own senses nor indirectly (and possibly inadvertently) through the robotic test administrator.
Why? Because you say so? Mr Krueger correctly called me on “proof by assertion” above. The “robotic test administrator” doesn’t need to be blinded. This is a good thing, not a flaw/criticism. What is “double” about it? Technically, if the robot could be blinded, it hasn’t been because it “knows” the answer. But as you point out, there is no risk of his giving it away, so like an experimenter who's outside of the room, there is no need to blind it. So, not double, but just blind, and therefore that aspect of the design is taken care of perfectly. I think this whole line of argument about the “robotic test administrator” is superfluous. But you brought it up.

I think the argument is that machine/software driven ABX is single blind because there is just one person to blind.

Exactly. But I would say blind, not single blind. See my long post above.
Quote
There is a bit of a problem here because the original 1950 Munson and Gardiner ABX test was also machine-driven and could be performed by just one person. It was programmed via a teletypewriter machine's paper tape. As long as the participant did not peek at the paper tape, he was blinded to the identities of the audio samples.  The literature of science is pretty unambiguous about Munson and Gardiner's ABX being a DBT.
If the researcher or an associate is in the room AND doesn’t peek at the paper tape, it IS DBT.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-28 15:21:02
I contend that you are making distinctions without a difference.

Sean Olive's setup at Harman uses an automated system to present loudspeakers in random order, behind a curtain.  The subject controls when the 'switches' take place, and his answers are recorded and tallied by software.

Are you suggesting because it's done with no human tester intervention, it's not *effectively* double blind?

I'm saying that if Dr. Olive or an associate was in the room AND unaware of what's playing, it IS double blind. If during the whole test, the subject is alone, there is no second person to blind. The level of automation, or lack thereof, is irrelevant to the terminology regarding blinding. If the subject doesn't get cues from the researcher OR the setup, the naming of the level of blindness is also not important; that aspect is well designed.
Quote
The wiki article saratoga cited, btw is badly written and  incoherent (as often happens in wikipedia, due to chaotic editing).
OK, rather than parse a mess, let's stick with some clear parts: Double-blind describes an especially stringent way of conducting an experiment which attempts to eliminate subjective, unrecognized biases carried by an experiment's subjects (usually human) and conductors.
...
In these double-blind experiments, neither the participants nor the researchers know ...
Double-blind methods can be applied to any experimental situation in which there is a possibility that the results will be affected by conscious/unconscious bias on the part of researchers, participants, or both. For example, in animal studies both the carer of the animals and the assessor of the results have to be blinded; otherwise the carer might treat control subjects differently and alter the results.

So the subject (participant) can be a human or an animal. The conductors (researchers) are human.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-28 17:17:34
FB2k ABX is double blind. The true identities of the UUTs are not conveyed directly to the test subject, the listener, through any of their own senses nor indirectly (and possibly inadvertently) through the robotic test administrator.
Why? Because you say so? 


Me and every other person I have ever encountered previously.

"Blinded" should not be taken literally. For one thing, it is not just about sight, ALL the senses need to be obscured from the test subject, the listener, any of which might inadvertently bias their decision making by revealing either the true IDs of the UUTs, or clues about them, other than what is being actually tested, their perception of the sound. If you do this not just to the test subject but also the test administrator(s) they interact with during the testing* then that is two levels of protection from bias corrupting the results you have guarded against, hence the word "double".

I'm not interested in further discussion on "robotic test administrators aren't human and therefore don't count as a second category needing blinding, or that should be counted as having been blinded", or whatever it is you are on about.  I'm gone.

P.S. When Bateson coined the term "double blinded", even though there were tests that used the protocol going back to at least the 1800's, robotic test administrators which could carry out the entire test, start to finish, weren't commonplace.


* In fact the test conductors should additionally have their hearing blocked so should they grimace in reaction to a test signal sound, for example, the test subjects don't pick up on that.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-28 18:18:01
"Blinded" should not be taken literally. For one thing, it is not just about sight, ALL the senses need to be obscured from the test subject, the listener, any of which might inadvertently bias their decision making by revealing either the true IDs of the UUTs, or clues about them, other than what is being actually tested, their perception of the sound.

I know. Sounds like what I said.
We use “blind”, as in the Wikipedia article, to mean blocking all cues from any sense organ about what is happening, other than the intended stimulus, of course.

If you do this not just to the test subject but also the test administrator(s) they interact with during the testing* then that is two levels of protection from bias corrupting the results you have guarded against, hence the word "double".

I'm not interested in further discussion on "robotic test administrators aren't human and therefore don't count as a second category needing blinding, or that should be counted as having been blinded", or whatever it is you are on about.  I'm gone.

P.S. When Bateson coined the term "double blinded", even though there were tests that used the protocol going back to at least the 1800's, robotic test administrators which could carry out the entire test, start to finish, weren't commonplace.

* In fact the test conductors should additionally have their hearing blocked so should they grimace in reaction to a test signal sound, for example, the test subjects don't pick up on that.

If we agree to stop talking about the "robotic test administrators” (yes, let’s), then there is no test administrator in the room if fb2k is used alone in a room (the specific example I’ve always used). I have never said it is always true for all ABXs. So there is no need for “two levels of protection from bias corrupting the results you have guarded against”. There is no second, so no double. To repeat myself yet again: that is not a flaw but a sound method. The method is good; the word “double” is not valid and doesn't imply a better method.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-28 20:38:31
Another way to look at it is that if you are alone with FB2K, the experimenter's role is filled by the software. The fact that it is not a living breathing homo sapiens is irrelevant to most people. Another case where the implementation details are irrelevant and performance is all-important.

It's called misdirection. Whether the test is computer-controlled or not is given great importance by mentioning it first, but in fact the implementation of the test is irrelevant. How the test works is the most important thing.

If the experimenter’s role is filled by the computer, great, this is not a flaw in the design per se, but it does obviate the need to blind the experimenter. Again, this is not a flaw, and I don’t criticize this method, I criticize the use of the word “double”, when you have correctly only blinded the subject. The “experimenter” doesn’t need it.


A well-designed and executed ABX machine-experimenter doesn't need additional blinding, because it was blinded by design.

There's your double blind!

Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-28 20:57:32
"Blinded" should not be taken literally. For one thing, it is not just about sight, ALL the senses need to be obscured from the test subject, the listener, any of which might inadvertently bias their decision making by revealing either the true IDs of the UUTs, or clues about them, other than what is being actually tested, their perception of the sound.


I know. Sounds like what I said.
We use “blind”, as in the Wikipedia article, to mean blocking all cues from any sense organ about what is happening, other than the intended stimulus, of course.


This is still wrong.  You are taking blinding to be a literal blocking of senses.  In the vast majority of blind trials, no actual sensory information is blocked.  It's actually pretty rare that this is done at all.  Instead, blinding refers to a blocking of information about the status of a stimulus/subject/etc in such a way that bias is prevented.  For example, in a double blind trial of cold medicine, neither experimenters nor subjects wear blindfolds.  Instead, a computer is typically used to randomize the drug given to each patient by the experimenter.  All involved have full sensory information, but are still "blinded".
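
As a rough illustration of that kind of computer randomization, here is a sketch under assumptions. The function `allocate`, the `kit-NNN` label scheme, and the two-arm balanced design are all hypothetical and not taken from any particular trial. Staff and patients see only the neutral kit labels, while the label-to-arm key is held separately until the trial ends.

```python
import random

def allocate(patient_ids, seed=None):
    """Return (labels, key): labels are what staff and patients see; key stays sealed."""
    rng = random.Random(seed)
    n = len(patient_ids)
    # Balanced two-arm list, shuffled so arm order is unpredictable.
    arms = (["drug", "placebo"] * ((n + 1) // 2))[:n]
    rng.shuffle(arms)
    # Neutral labels carry no information about the arm.
    labels = {pid: f"kit-{i:03d}" for i, pid in enumerate(patient_ids, start=1)}
    key = {labels[pid]: arm for pid, arm in zip(patient_ids, arms)}
    return labels, key

labels, key = allocate(["p1", "p2", "p3", "p4"], seed=1)
```

Nobody's senses are blocked here; what is withheld is the mapping in `key`.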

The actual definition of a trial as single blind, double blind, or even triple blind is functional, not semantic.  Can the subject's  information bias the outcome?  If no, then it is at least single blinded.  Can the experimenter's information bias the outcome?  If no, then it is double blind.  Note that no actual blocking of senses is necessarily involved, and no assumptions about the number of people or machines involved is made.  There can be one or a thousand.  It makes no difference because the "single" and "double" refer to the ability of information to cause statistical bias, not the number of parties involved. 
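
The two questions above can be written down as a tiny decision function. This is only a sketch of the functional definition as stated in this post. The function name and the boolean parameters are made up for illustration, and nothing in it counts how many people or machines are involved.

```python
def blinding_level(subject_info_can_bias, experimenter_info_can_bias):
    """Classify a test by which information could bias the outcome."""
    if subject_info_can_bias:
        return "not blind"
    if experimenter_info_can_bias:
        return "single blind"
    return "double blind"

# Listener alone with ABX software that leaks nothing about X:
alone_with_software = blinding_level(False, False)
# Listener is blinded, but an operator who knows X is visible in the room:
operator_knows_x = blinding_level(False, True)
```

Under this definition the first case comes out as double blind even though only one person is present.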

If we agree to stop talking about the "robotic test administrators” (yes, let’s), then there is no test administrator in the room if fb2k is used alone in a room (the specific example I’ve always used). I have never said it is always true for all ABXs. So there is no need for “two levels of protection from bias corrupting the results you have guarded against”. There is no second, so no double. To repeat myself yet again: that is not a flaw but a sound method. The method is good; the word “double” is not valid and doesn't imply a better method.


Everything you just said is completely and indefensibly wrong.  The double and single do not refer to a specific number of entities involved, and being double or single blind does imply being better or worse.  A double blind test is necessarily non-inferior to its single blind equivalent.

This is why I am strongly encouraging you to take a few minutes to read about these concepts rather than just making things up as you go along.  They are not difficult to understand, but they will take a small amount of effort on your part.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 03:27:11
Everything you just said is completely and indefensibly wrong.
Hmm. Hyperbole much? …and wrong since I just defended it.
This is why I am strongly encouraging you to take a few minutes to read about these concepts rather than just making things up as you go along.  They are not difficult to understand, but they will take a small amount of effort on your part.
Saratoga, dude, I learned the definitions in the 70’s and they have served me well for the first few decades of my scientific career. I expect they’ll serve me well (in science, not HA) until I retire (if I retire). I have no plan to change them at your behest.

BTW, full credit to S&M for perseverance in the face of adversity
Thanks. I prefer SAM, but at this point the M in the S&M joke seems to apply.
But despite your encouragement, I am reminded of an argument I had about 6 months ago. We’re collaborating with a group from the psych dept. on a set of experiments. I used the word “habituation” with the specific definition used in neuroscience textbooks and literature, and the way I have always used it. The prof “corrected” me, with the definition as used in the psych world. We argued, but agreed to disagree (different worlds). ..lasted 10 min. For me the solution was simple: of the 3 or 4 papers we’ll write for the first 2 experiments, in the 1 or 2 I write, I’ll either leave the word out or use it my way (I’ll submit to neuro journals). His grad student will write 2 for psych journals and she’ll use his way.

So, I accept the 2 different worlds of scientific journals and HA. I won’t convince anyone here, and so far I’m not convinced by anyone here on this issue. But I will stop challenging "double" on HA (a world that can decide its own definitions). I’m happy to admit when I’m shown wrong (it always means I learned something). I wonder if others here feel the same.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-29 03:54:41
I see my posts have been deleted but never mind

I understand your questioning of the use of "double" blind in relation to Foobar ABX testing as being a misuse of terminology in scientific circles but I also understand the defence of this as being a purely semantic differentiation. As you say there are different criteria that need to be addressed, depending on what audience you are addressing.

But remember also that words have their own powers to influence & "double blind" is so much more convincing a term than just "blind" - it's like double-wrapped or doubly sure!
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-29 04:49:58
Everything you just said is completely and indefensibly wrong.
Hmm. Hyperbole much? …and wrong since I just defended it.


I don't think the "Hmm." counts as a defense.

This is why I am strongly encouraging you to take a few minutes to read about these concepts rather than just making things up as you go along.  They are not difficult to understand, but they will take a small amount of effort on your part.
Saratoga, dude, I learned the definitions in the 70’s and they have served me well for the first few decades of my scientific career.


An appeal to your own authority works best if you haven't just discredited yourself.

If you don't wish to understand these things, then you are welcome to remain ignorant.  But you also don't need to tell us that.  You can simply continue to not understand and others will continue to passively ignore you.

So, I accept the 2 different worlds of scientific journals and HA. I won’t convince anyone here, and so far I’m not convinced by anyone here on this issue. But I will stop challenging "double" on HA (a world that can decide its own definitions). I’m happy to admit when I’m shown wrong (it always means I learned something). I wonder if others here feel the same.


If you think that the design of scientific studies is of interest to HA but not scientific journals, you are going to have a rough time when you finally submit those papers.  Actually, if these are real studies involving people, you are likely to have trouble when you meet your first IRB and have to pass the statistical review.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-29 04:56:43
If we agree to stop talking about the "robotic test administrators" (yes, let's), then there is no test administrator in the room if fb2k is used alone in a room (the specific example I've always used).
  Would you prefer instead for me to call one of them a "comparator"? It is not a human, just an automated machine, yet clearly it is the second entity which is said to be "blinded", besides just the human listener of an ABX test, hence the terminology of this JAES paper's very title, by use of the word "double":

High-Resolution Subjective Testing Using a Double-Blind Comparator (http://www.aes.org/e-lib/browse.cfm?elib=3839)
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 09:00:21
If we agree to stop talking about the "robotic test administrators" (yes, let's), then there is no test administrator in the room if fb2k is used alone in a room (the specific example I've always used).
  Would you prefer instead for me to call one of them a "comparator"? It is not a human, just an automated machine, yet clearly it is the second entity which is said to be "blinded", besides just the human listener of an ABX test, hence the terminology of this JAES paper's very title, by use of the word "double":

High-Resolution Subjective Testing Using a Double-Blind Comparator (http://www.aes.org/e-lib/browse.cfm?elib=3839)

You know I'm wrong!!
I know you're wrong!!!  ...and the world keeps spinning.
Agree to disagree? I think the semantics have grown boring and unproductive. Don't you? (I win though cuz I have 3 !'s... ;-)

It is interesting to me how many of us have read the Wiki article and still disagree. Reminds me of the white/gold - blue/black dress thing. My son got mad at me because my wife and I see white/gold and he sees blue/black and he thinks I was pulling his leg. We look at the exact same thing and have different interpretations.

I did find the discussion of using ABX in one's private life earlier in the thread quite useful. I will get fb2k running on one of our macs, and like castleofargh, I'll test file formats with it on myself and my son (if he cooperates). We'll decide if we want to rerip all our CDs and what we'll buy in the future. I don't think I'll do anything fancy in choosing headphones though. I will try to figure out Sean Olive's best performers. He won't say, but several people have speculated. He uses the Senn HD 518, I think, for his virtual headphone setup. That might be worth a listen.

If you think that the design of scientific studies is of interest to HA but not scientific journals, you are going to have a rough time when you finally submit those papers.  Actually, if these are real studies involving people, you are likely to have trouble when you meet your first IRB and have to pass the statistical review.
Either English isn't your native tongue or, more likely, you have not read all my posts. Understandable, since there are too many and some are too long.
I'm either a middle-aged scientist with a long list of already published articles on my CV, or I'm a pimply-faced teen pounding on the keyboard hoping to score HA points instead of trying to get laid... You don't know. You won't know.
IRB? Internal Review Board? I don't know if we have one, but if so, they are not involved in submitting papers. I'm certain that ignoring your "helpful tips" won't have any impact on the acceptance of future papers.
FYI, I don't have any pimples. Have a nice day! :-)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-29 09:06:54
Everything you just said is completely and indefensibly wrong.
Hmm. Hyperbole much? …and wrong since I just defended it.


I would call it damning with faint praise, not defending.

I would call it defending past indefensible misleading statements.

This is why I am strongly encouraging you to take a few minutes to read about these concepts rather than just making things up as you go along.  They are not difficult to understand, but they will take a small amount of effort on your part.
Saratoga, dude, I learned the definitions in the 70’s and they have served me well for the first few decades of my scientific career. I expect they’ll serve me well (in science, not HA) until I retire (if I retire). I have no plan to change them at your behest.


This may be  the core of the problem. It would appear that some peculiar applications of definitions control everything. A common problem with academics who write papers but don't waste time solving real world problems. The alleged scientific career seems to have lacked practical application.

It would appear that I'm dealing with a troll who posts under a nym and therefore can invent any resume he wishes and call it his own.

I suffer from being a real person who picks up the natural limits of being real. There is something about being real and practical that just suckers me in. ;-)

Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 09:32:11
This may be  the core of the problem. It would appear that some peculiar applications of definitions control everything. A common problem with academics who write papers but don't waste time solving real world problems. The alleged scientific career seems to have lacked practical application.

It would appear that I'm dealing with a troll who posts under a nym and therefore can invent any resume he wishes and call it his own.

I suffer from being a real person who picks up the natural limits of being real. There is something about being real and practical that just suckers me in. ;-)

LOL
Mr. Krueger, I live in the same real world as you. In fact, the results of my current project will be built into future products of one company. That is what pays my salary at the moment. Very real world.... but!!...
as you point out, I could be making it all up. My original intention was to never reveal that much about myself and let my arguments stand on their own. I wanted to avoid the type of ad hominem attack in which you are now engaging. But I was getting too much "helpful" advice treating me like a kid, and I lost it and gave in. I was too weak.

But please, please, feel free to always ignore details of my life and answer what I write. Treat me as a nobody, a villager saying "the emperor has no clothes", and show me your clothes. Or in reverse, challenge me to back up anything I say. I tried and failed to convince with the "double" thing, but if I give an opinion, respect it as you'd want me to respect yours. If I claim a "fact", make me back it up. You know, the whole Socratic thing. It should work the same whether I'm a scientist or a kid.
Have a nice day, and thanks for some useful info you provided earlier. :-)

EDIT: a couple of days ago, I received a PM with helpful tips with the title "not really an audio newbie, are you ?". After rereading some of my posts, I see why there is some question. Full disclosure: I've been into electronics, music and audio since I was a teen. In college, I had a job designing and building simple electronics in one of the engineering labs. So, not new to audio or electronics. But I did say that I'm new to audio forums and that's correct. I've decided to upgrade my setup and started reading late last year (Nov.-Dec., I think). Then I joined HA, CA and AVS with this login. I tried to keep my personal info private to avoid it being an issue, but it may seem I was being deceptive. Sorry. My bad.
Title: How do you listen to an ABX test?
Post by: Case on 2015-03-29 10:05:56
This is ridiculous. A useless debate thanks to SoundAndMotion misreading a wikipedia article. Same source for ABX test (http://en.wikipedia.org/wiki/ABX_test) shows more clearly its double-blind nature. Here's (http://techland.time.com/2012/03/02/can-you-hear-the-difference-between-lossless-and-lossy-audio/) a reference about ABX software being double-blind from Time. Here's (https://books.google.fi/books?id=HPbpAgAAQBAJ&pg=PA306&lpg=PA306&dq=abx+double+blind+software&source=bl&ots=NPi8ynL1Ji&sig=cFC_Ck-wvMdVRPF8OUxVkH-Iy1A&hl=fi&sa=X&ei=6b0XVdjWCsa4OOHrgPgG&ved=0CGcQ6AEwCTgU#v=onepage&q=abx%20double%20blind%20software&f=false) a snippet from a book. Here (http://www.aes.org/events/115/papers/SessionI.cfm) you have references from AES. Here (http://www.researchgate.net/publication/23638819_Introducing_perceptual_coding_using_a_double-blind_AB_testing_demonstration) is another study. And this (http://self.gutenberg.org/articles/abx_test) is the last I bothered to dig up.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-29 10:41:42
This may be  the core of the problem. It would appear that some peculiar applications of definitions control everything. A common problem with academics who write papers but don't waste time solving real world problems. The alleged scientific career seems to have lacked practical application.

It would appear that I'm dealing with a troll who posts under a nym and therefore can invent any resume he wishes and call it his own.

I suffer from being a real person who picks up the natural limits of being real. There is something about being real and practical that just suckers me in. ;-)

LOL
Mr. Krueger, I live in the same real world as you.


Yes Mr. Anonymous Troll posting under an alias.  BTW I'm really Bill Gates. ;-)

Quote
EDIT: a couple of days ago, I received a PM with helpful tips with the title "not really an audio newbie, are you ?". After rereading some of my posts, I see why there is some question. Full disclosure: I've been into electronics, music and audio since I was a teen. In college, I had a job designing and building simple electronics in one of the engineering labs. So, not new to audio or electronics. But I did say that I'm new to audio forums and that's correct. I've decided to upgrade my setup and started reading late last year (Nov.-Dec., I think). Then I joined HA, CA and AVS with this login. I tried to keep my personal info private to avoid it being an issue, but it may seem I was being deceptive. Sorry. My bad.


Of course the above is nothing like full disclosure, and parading it as such is umm, very informative. ;-)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-29 10:45:23
This is ridiculous. A useless debate thanks to SoundAndMotion misreading a wikipedia article. Same source for ABX test (http://en.wikipedia.org/wiki/ABX_test) shows more clearly its double-blind nature. Here's (http://techland.time.com/2012/03/02/can-you-hear-the-difference-between-lossless-and-lossy-audio/) a reference about ABX software being double-blind from Time. Here's (https://books.google.fi/books?id=HPbpAgAAQBAJ&pg=PA306&lpg=PA306&dq=abx+double+blind+software&source=bl&ots=NPi8ynL1Ji&sig=cFC_Ck-wvMdVRPF8OUxVkH-Iy1A&hl=fi&sa=X&ei=6b0XVdjWCsa4OOHrgPgG&ved=0CGcQ6AEwCTgU#v=onepage&q=abx%20double%20blind%20software&f=false) a snippet from a book. Here (http://www.aes.org/events/115/papers/SessionI.cfm) you have references from AES. Here (http://www.researchgate.net/publication/23638819_Introducing_perceptual_coding_using_a_double-blind_AB_testing_demonstration) is another study. And this (http://self.gutenberg.org/articles/abx_test) is the last I bothered to dig up.


IME one gets this sort of double talk from a certain kind of academic.  Freshly minted PhDs especially, but some old ones never seem to learn.

Professional Journals relating to education are legendary for this sort of argument.

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-29 10:56:26
So, I accept the 2 different worlds of scientific journals and HA.


I've cited two very influential (in the real world of audio that is) scientific journals, and of course it's all dismissed.

Quote
I won’t convince anyone here,


That's because we know better and have the journal and other publication cites to back us up.

Quote
and so far I’m not convinced by anyone here on this issue.


Right, that would involve admitting to less than perfection. ;-)

Quote
But I will stop challenging "double" on HA (a world that can decide its own definitions). I’m happy to admit when I’m shown wrong (it always means I learned something). I wonder if others here feel the same.


There are dozens of people on the web, some posting right now on this thread who can't be convinced of much of anything real, no matter how much academia, science and practice is provided. Nothing new.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 11:27:14
Yes Mr. Anonymous Troll posting under an alias.  BTW I'm really Bill Gates. ;-)

You consider me a troll. It's not my intent, but I can't help you with that. I'm not sure about your definition. I'm not trying to be provocative... just wanting to challenge and be challenged.
As to anonymity, why don't you suggest to the admins that they require all members to give full names? I see lots of aliases here. I'm impressed you give a real name. I'll tell you why I will remain anonymous. Some members on other sites, who used to use real names, reported people contacting their employers. Some people went to another member's employer's web site and posted stuff on the forum from there. And the straw that broke the real-name camel's back was an exchange between John Atkinson and yourself on Usenet about 12-14 years ago. You were both rude and nasty... no big deal for me, but some posters (you called them JA's sock puppets) were really scary, making threats and all. I have a family and I don't want my employer involved. If you convince the admins to change HA's policy, great, I'll be gone. Otherwise, please respect my decision.
Also, do you need to attack me? Can't you just continue attacking what I have said and will say? Or just ignore me... another option.
I won't ignore you. You have useful information, abrasive and caustic though it may be presented sometimes. You are often very helpful with your knowledge to people with a question or problem.
I sometimes put smilies or winkies in sincerely. I guess you are mocking me. That's cool. I'll stop putting them in. :-(
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 11:38:56
Right, that would involve admitting to less than perfection. ;-)

Quote
But I will stop challenging "double" on HA (a world that can decide its own definitions). I’m happy to admit when I’m shown wrong (it always means I learned something). I wonder if others here feel the same.

I guess I wasn't clear with what I said. Not only am I happy to admit I'm wrong, but it happens all the time. I am quite far from perfection and never claimed perfection. You seem to have a chip on your shoulder about academia.
Quote
There are dozens of people on the web, some posting right now on this thread who can't be convinced of much of anything real, no matter how much academia, science and practice is provided.

That's okay. I enjoy your posts anyway. I do think I should stop answering you for now. We aren't contributing anything of substance.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-03-29 12:49:23
But I did say that I'm new to audio forums and that's correct. I've decided to upgrade my setup and started reading late last year (Nov.-Dec., I think). Then I joined HA, CA and AVS with this login.

Hi SAM anonymous, on what basis will the "upgrade" be made?

Btw, if you insist on taking only the high road and refusing to resort to slogging, exactly what fear would you have from someone reporting to your employer?
Fear of audiophiles?  Have you seen these type folks in person? 
My apologies for digression, this thread was about "ABX", so my first question holds.

cheers

A(mmar) J(adusingh)
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 13:21:03
Hi SAM anonymous, on what basis will the "upgrade" be made?

Hi AJ,
Long boring story cut short: All my "good stuff" is in storage in the US. I now live in Germany and we've been making do with the living room computer and Audioengine 2's. It is not in the plans to ship the old stuff soon and I want a better setup, especially for my son to learn to appreciate music. That's the justification for my wife, anyway. I bought most of the old stuff in the 80's (very happy with it) under the influence of sighted listening and salespeople in audiophile salons.... and maybe some chemicals. My answer to the question "how much do you want to spend?" is always "I don't know. Enough to get something good" (useless, I know). So now, I don't know if I want to spend 2000 or 20000. But I do know that before I steal a future PS4 game from my kid and donate it to the salesperson's kid, I want to be convinced it is worth it. For that, I'm in need of squishing out most, or hopefully all, of my biases. Well that's the goal. I have always bought a piece at a time, so I'll start with headphones.

Quote
Btw, if you insist on taking only the high road and refusing to resort to slogging, exactly what fear would you have from someone reporting to your employer?
Fear of audiophiles?  Have you seen these type folks in person? 
A(mmar) J(adusingh)

I've already shown that I can lose my temper and say things I don't want to. One of these days I'm sure I'll call someone a "jerk" or worse. Just as I have buttons that can be pushed, I may push someone's.
Yes, I have seen modern-day, closed-minded(!) audiophiles, but in the zoo, and thank God, behind steel bars. ;-)
I feel it is impolite to use your name without responding with mine... but... well.. for now at least...
Cheers,
SAM
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-29 14:15:41
Yes Mr. Anonymous Troll posting under an alias.  BTW I'm really Bill Gates. ;-)

You consider me a troll. It's not my intent, but I can't help you with that.


I don't think that a lot of people intend to be useless trolls and damage their credibility, but it sometimes just works out that way.

The "I can't help you with that" comment shows a basic lack of desire to accept responsibility for one's actions, and well it seems like a pattern.

Quote
...just wanting to challenge and be challenged.


You were challenged and then came the hair-splitting and ignorance of reliable evidence from reliable sources that went against your comments, all delivered from behind a mask.

Quote
I'll tell you why I will remain anonymous. Some members on other sites, who used to use real names, reported people contacting their employers. Some people went to another member's employer's web site and posted stuff on the forum from there. And the straw that broke the real-name camel's back was an exchange between John Atkinson and yourself on Usenet about 12-14 years ago. You were both rude and nasty.


Nice job of covering up for the golden ears who made death threats, harassing phone calls at all times of day or night, posted 100's of libelous comments based on my son's death, and made aggressive use of child pornography.  Atkinson bought these guys lunch and provided them with other perks. They loved it. Never happened anyplace else.

I was just rude and nasty? I'll take that as a compliment!
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-29 14:34:46
Nice job of covering up for the golden ears who made death threats, harassing phone calls at all times of day or night, posted 100's of libelous comments based on my son's death, and made aggressive use of child pornography.  Atkinson bought these guys lunch and provided them with other perks. They loved it. Never happened anyplace else.

I was just rude and nasty? I'll take that as a compliment!

Well, yeah. You weren't scary like the others. I left out the gory details, but thanks for making my point for me.

So, any recommendations for headphones I should check out? Do you know the "winners" in Sean Olive's tests? Anything, back on topic? I guess I should go start a thread or read one about headphones. This is ABX.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-03-29 17:26:50
If you think that the design of scientific studies is of interest to HA but not scientific journals, you are going to have a rough time when you finally submit those papers.  Actually, if these are real studies involving people, you are likely to have trouble when you meet your first IRB and have to pass the statistical review.
Either English isn't your native tongue or, more likely, you have not read all my posts. Understandable, since there are too many and some are too long.
I'm either a middle-aged scientist with a long list of already published articles on my CV, or I'm a pimply-faced teen pounding on the keyboard hoping to score HA points instead of trying to get laid... You don't know. You won't know.
IRB? Internal Review Board?


Not quite.  See here:

http://en.wikipedia.org/wiki/Institutional_review_board (http://en.wikipedia.org/wiki/Institutional_review_board)

An IRB is the scientific board that, at least in the US and Europe, would have to approve the psychology studies you mentioned doing above.  They provide an a priori assessment of the scientific, procedural and ethical aspects of research, and review things like procedures, statistics, etc.  In the US at least, they also provide standardized definitions of general scientific terms (including blinding) that are used to prevent misunderstandings like the ones in this thread.

I don't know if we have one, but if so, they are not involved in submitting papers.


Yes you have one, although its acronym may be slightly different if translated out of English.  It would be required for you to publish your research in a reputable journal, and most likely by the laws of whichever country this is happening in.  Since it sounds like your collaborator is handling the actual experimental work, you should ask him or her.  Most likely they have handled this for you.  Well that or you're about to be really disappointed and/or investigated when you try to publish.

As for your publications, I get the impression that you mostly let other people figure out how to handle these things.  Hopefully you continue to have good collaborators who can shield you from experimental, statistical and legal requirements so that you can focus on whatever it is they put up with your ego for.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-29 19:04:15
ABX is an outstanding tool (Thanks for your part in bringing it to us Arny!), but I'm sure many are like me in that doing automated ABX testing of actual hardware, say for example external EQ boxes or speaker wires, is fraught with so many problems that us average consumers don't have the means to do it easily and properly by ourselves and without assistants. Of course the main setback is that other than Arny, I don't think any of us OWN any ABX switchers such as the QSC, etc.

Although full blown ABX  switcher machines  internally have the means to do precise level matching and to simultaneously switch both inputs AND outputs, simpler ABX devices which force the experimenter to level match by external means and that do only one A/B switch, say inputs OR outputs, would still be quite useful. It seems to me that many of us already have two of the major components necessary to achieve these goals: a stereo receiver with electronic input selection (and ideally electronically selected A/B speaker outs), plus an outboard computer to do the blind selection, score tallying, X source randomization, etc. The only thing missing is the interface which could allow some computer program to externally control our receiver (or AVR, preamp/amp, whatever) and a big piece of black tape to obscure our receiver's front panel display! [Yes, there is a way to cheat, by peeking under the tape, but this isn't for publication purposes; it's just for fun.]

The computer's external control could be via RS-232 port, ethernet port*, or mini jack remote control communication port (which some brands have) or perhaps more universally through an IR blaster which gets placed under that black tape we put over the front face of our receiver, the other end obviously wired to our computer.

Arny, has anyone tried to rig up such a thing or has it ever been marketed?

*Even some modestly priced receivers (http://www.bhphotovideo.com/bnh/controller/home?O=&sku=935839&gclid=CO2x0NyczsQCFQWUfgodtIoA-w&Q=&is=REG&A=details) sport network control these days. I know I'd buy a receiver based ABX test app for my cellphone in a heartbeat if it were put on the market! Heck, receiver remote control apps are already out there and in many cases free. Case, or any of you other expert software guys, could you do this? Please?
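
The computer side of such a rig (blind selection, X randomization, score tallying) is small. Below is a rough Python sketch under stated assumptions: `select_input` is a hypothetical callback standing in for whatever actually switches the receiver (RS-232, network, IR blaster), and no real receiver protocol is implied. Level matching is left to external means, as described above.

```python
import random

class ABXSession:
    """Minimal ABX bookkeeping; hardware control is delegated to a callback."""

    def __init__(self, select_input, trials=16, rng=None):
        self.select_input = select_input  # hypothetical: switches the receiver to "A" or "B"
        self.rng = rng or random.Random()
        # X is chosen once per trial and stays hidden from the listener.
        self._x = [self.rng.choice("AB") for _ in range(trials)]
        self.trial = 0
        self.correct = 0

    def play(self, which):
        """Listener asks for 'A', 'B' or 'X'; X maps to the hidden choice."""
        source = self._x[self.trial] if which == "X" else which
        self.select_input(source)

    def vote(self, guess):
        """Record the listener's answer ('A' or 'B') and move to the next trial."""
        if guess == self._x[self.trial]:
            self.correct += 1
        self.trial += 1

    def done(self):
        return self.trial >= len(self._x)
```

The listener can call `play` as often as desired within a trial (X, A, X, B, ...) before calling `vote`. The black tape over the front panel does the rest, since the receiver display is the only place the hidden choice would show.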
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-29 19:26:07
A very simple test one could do with this setup would be to test short runs of speaker wires, where to the best of my knowledge, assuming you are using adequate gauges, external level matching isn't usually necessary since the dB loss from one adequate wire to the next (running a short distance) is negligible. Hook your, let's say, $7,250 Pear Anjou speaker wires to your receiver's speaker output A and in parallel to this you simultaneously wire your $20 hardware store bought, thick gauge cord to speaker output B, both terminating at the same speaker's binding posts. Run the test. Show a strong statistical difference?
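
For the "strong statistical difference" part, the usual ABX scoring is a one-sided binomial test against guessing (probability 0.5 per trial). A minimal Python sketch follows; the function name `abx_p_value` and the 12-of-16 example score are made up for illustration, not taken from any test in this thread.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(round(abx_p_value(12, 16), 4))  # 0.0384
```

So 12 correct out of 16 would fall just under the conventional 0.05 threshold (a convention, not a rule stated here), while a score near 8 of 16 is what guessing looks like.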
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-03-29 23:53:57
I bought most of the old stuff in the 80's (very happy with it) under the influence of sighted listening and salespeople in audiophile salons.... and maybe some chemicals.

Well, now that you know what you know (2015), what is your intended method for parsing? You mentioned joining three forums. Are the issues related?

My answer to the question "how much do you want to spend?

Not sure if it's residuals from said chemicals, but I asked no such thing. Perhaps you learned scrying while away?
A budget is certainly useful if you are going to float for advice on forums....but I'm not clear if that is your intent, or how you might perceive such data, given your day job. It most certainly won't be "DBT" type derived advice on most forums.

For that, I'm in need of squishing out most, or hopefully all, of my biases. Well that's the goal. I have always bought a piece at a time, so I'll start with headphones.

If you consider software ABX to be single blind, will that be sufficient? I'm a bit skeptical of blind testing headphones the way I've seen done, even by Olive et al.
Then again I am a skeptic.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-30 02:58:10
Nice job of covering up for the golden ears who made death threats, harassing phone calls at all times of day or night, posted 100's of libelous comments based on my son's death, and made aggressive use of child pornography.  Atkinson bought these guys lunch and provided them with other perks. They loved it. Never happened anyplace else.

I was just rude and nasty? I'll take that as a compliment!


Well, yeah. You weren't scary like the others. I left out the gory details, but thanks for making my point for me.


I don't think any such point was made. What happened on rec.audio.opinion (RAO) in the late 90s and early Y2K seems to have been a very isolated event. I think that there were serious plans at the time to co-opt RAO as a sales tool. Later on magazine-owned conferencing web sites provided an alternative. RAO is still a wasteland.

A friend of mine is spending time in court defending himself from a driving charge when he was miles away at the time. There is only one eyewitness, the alleged harmed party.

The lesson is that anybody can cause you a lot of inconvenience over just about anything if they so desire. While the RAO crowd  couldn't go after me at my work, they did go after me at my church and with my local police department. It was all just words.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-30 03:06:34
ABX is an outstanding tool (Thanks for your part in bringing it to us Arny!), but I'm sure many are like me in that doing automated ABX testing of actual hardware, say for example external EQ boxes or speaker wires, is fraught with so many problems that us average consumers don't have the means to do it easily and properly by ourselves and without assistants. Of course the main setback is that other than Arny, I don't think any of us OWN any ABX switchers such as the QSC, etc.

Although full blown ABX  switcher machines  internally have the means to do precise level matching and to simultaneously switch both inputs AND outputs, simpler ABX devices which force the experimenter to level match by external means and that do only one A/B switch, say inputs OR outputs, would still be quite useful. It seems to me that many of us already have two of the major components necessary to achieve these goals: a stereo receiver with electronic input selection (and ideally electronically selected A/B speaker outs), plus an outboard computer to do the blind selection, score tallying, X source randomization, etc. The only thing missing is the interface which could allow some computer program to externally control our receiver (or AVR, preamp/amp, whatever) and a big piece of black tape to obscure our receiver's front panel display! [Yes, there is a way to cheat, by peeking under the tape, but this isn't for publication purposes; it's just for fun.]

The computer's external control could be via RS-232 port, ethernet port*, or mini jack remote control communication port (which some brands have) or perhaps more universally through an IR blaster which gets placed under that black tape we put over the front face of our receiver, the other end obviously wired to our computer.

Arny, has anyone tried to rig up such a thing or has it ever been marketed?


Seems to me that the easiest way to do a hardware ABX box today is to base something on an Arduino processor driving one of the USB relay boards I've talked about here before.

USB Relay board (http://www.amazon.com/SainSmart-Eight-Channel-Relay-Automation/dp/B0093Y89DE/ref=sr_1_1?ie=UTF8&qid=1427681091&sr=8-1&keywords=usb+relay+board)

(http://ecx.images-amazon.com/images/I/511tS5UM-8L.jpg)

Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-30 06:05:43
I don't think any such point was made. What happened on rec.audio.opinion (RAO) in the late 90s and early Y2K seems to have been a very isolated event. I think that there were serious plans at the time to co-opt RAO as a sales tool. Later on magazine-owned conferencing web sites provided an alternative. RAO is still a wasteland.

A friend of mine is spending time in court defending himself from a driving charge when he was miles away at the time. There is only one eyewitness, the alleged harmed party.

The lesson is that anybody can cause you a lot of inconvenience over just about anything if they so desire. While the RAO crowd  couldn't go after me at my work, they did go after me at my church and with my local police department. It was all just words.

I think the point you made was that you went through much more than "a lot of inconvenience". You went through hell at an especially difficult time (personal tragedy). My sincere sympathies. And why? Because you expressed your viewpoint (the truth in your mind - and I won't disagree) about issues that can inflame extreme passions: audio. There is no excuse for what you went through, but I take a lesson from it. I'm not nearly as high-profile as you are and have been, but when it comes to audio, I'm not willing to suffer even a little inconvenience for having my views. Therefore anonymous. You may find that cowardly and I can see that. But there are many issues for which I would put my convenience, reputation, and even life on the line, but audio is not one of them.
I knew it was bad for you, but I didn't realize how bad. Sorry for bringing up something you'd probably rather forget. I could have defended anonymity without dragging you in.

As for RAO, it's good for when you're in a Jerry Springer mood, but can't find reruns.
Have a good day.
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-30 06:49:20
My answer to the question "how much do you want to spend?

Not sure if it's residuals from said chemicals, but I asked no such thing. Perhaps you learned scrying while away?
A budget is certainly useful if you are going to float for advice on forums....but I'm not clear if that is your intent, or how you might perceive such data, given your day job. It most certainly won't be "DBT" type derived advice on most forums.

LOL I guess my long story wasn't cut short enough. ;-) That quote and my "response" was a verbose way to say: I don't have a set budget. That combined with an unwillingness to waste money was partially, in my mind, a direct answer to your question: "on what basis will the upgrade be made?"
For that, I'm in need of squishing out most, or hopefully all, of my biases. Well that's the goal. I have always bought a piece at a time, so I'll start with headphones.

If you consider software ABX to be single blind, will that be sufficient?

I don't consider software ABX to be blind, single blind or double blind. I consider describing a test using software ABX as some level of "blind" to be dependent on the circumstances. If the experimenter and subject are 2 people (not only a hobbyist at home=blind), then a test using software ABX could be blind (subject alone), single blind (subject and experimenter in room using software that shows the experimenter what is playing), or double blind (both in room, neither knows, e.g. using fb2k). In this case, with all else equal, blind and double blind are equivalent in terms of quality, validity, strength. But other things may not be equal. I wonder why the experimenter is in the room if using fb2k. If the experimenter is fidgety, walking around the room, and generally distracting, the double blind version is worse than just blind. "Double" describes the circumstances not the quality. If both are in the room, then the quality depends on it being double. The experimenter would be in the room sometimes. If someone claims a 100% ability to hear the difference between WAV and AIFF, and Randi offers the prize, you can be sure the subject won't be alone to hack, or copy a "successful" result file to the computer. fb2k would be a perfect tool there.

None of this differs from what I've already said. It is a semantic argument about the meaning of "double blind", not a quality judgement.

I feel as though I joined a photography forum and someone writes "a group of us went to the Sistine Chapel. Michelangelo's ceiling is one of the best photos ever". I jump in with "it's a painting, not a photo" and the forum jumps on me for being ridiculous. Someone offers to hold my hand and explains "a photo is a picture of a scene" and I lose it and let slip I'm a professional artist and I know the difference between paint, silver emulsion and a camera sensor. Then, Mr. Egotistical Anonymous "Artist" Man needs to join the real world and learn to appreciate photography as a valid art form, if he even is an artist!... with all kinds of proof about the beauty and importance of the Sistine Chapel. It was a semantic point about the meaning of a word, not a judgement on quality.

None of this differs from what I've already said. ...*sigh*
Cheers, SAM
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-30 09:42:32
OT question: why is double blind always stated, when for example, fb2k isn't double?


Do you now concede you were mistaken when you wrote that and admit that fb2k ABX IS double blind? Or do you contend it is single blind, "some variety of blind but without a quantity", or "it depends on the exact circumstances of how it is administered"?
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-30 09:55:35
OT question: why is double blind always stated, when for example, fb2k isn't double?


Do you now concede you were mistaken and that fb2k ABX IS double blind? Or do you contend it is single blind?

Oops, yes thanks! I concede that my wording was poor right there and therefore I was wrong in that quote**. Inexcusable sloppiness! And since that was my first mention, now I understand the hoopla. Again, my fault for quick writing. I doubt I'll be believed, but I truly didn't intend to be provocative. But how else would someone interpret it? Yep, my bad! I bet I may have been sloppy elsewhere too. I'll look, and if so, again apologies.

A clearer version of my view is here:
I don't consider software ABX to be blind, single blind or double blind. I consider describing a test using software ABX as some level of "blind" to be dependent on the circumstances.

and then I explain what I mean by circumstances. Obviously, fb2k can do software ABX, and can be used in a DBT, among other things. Didn't you point out it could be used for sighted tests too?
Yes, here:
People don't talk about it very much but fb2k ABX is a fantastic sighted listening aid as much as it is a double blind testing tool [just click A and B and never even examine X]. It lets you pick whatever files you want, synchronizes their playback [assuming they were made properly], applies DSP or Replaygain optionally, switches at any point you want, loops a favorite section, and most importantly switches nearly instantaneously between A and B at the listener's discretion. Echoic memory is fleeting and being able to flip between two options so quickly and easily greatly improves one's sensitivity.

Not a DBT when used like this. It depends on the circumstances.

**Late edit: I should have said "..not always double, depending on the nature of the test."
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-30 10:05:02
Well, now that you know what you know (2015), what is your intended method for parsing? You mentioned joining three forums. Are the issues related?

I’m guessing that what first looked like interest and an offer to help, was really an attempt to bait me into saying something you’d find amusing. If so, success! See my first response to you.

If not, and you are interested (my mistake). My questions were: how can I remove bias from my decisions and what role would ABX play for me at home. After reading all the posts in this thread and a couple of PMs, I realized:
I did find the discussion of using ABX in one's private life earlier in the thread quite useful. I will get fb2k running on one of our macs, and like castleofargh, I'll test file formats with it on myself and my son (if he cooperates). We'll decide if we want to rerip all our CDs and what we'll buy in the future. I don't think I'll do anything fancy in choosing headphones though. I will try to figure out Sean Olive's best performers. He won't say, but several people have speculated. He uses the Senn HD 518, I think, for his virtual headphone setup. That might be worth a listen.

So, personal-life-ABX for file format tests, but although the effort to ABX headphones is not insurmountable, the effort is too great, so: normal sighted headphone listening, along with reading others’ opinions and considerations like cost and comfort. (I thought I said thanks to eric.w somewhere, but thanks again) How it works for future decisions is not yet set.

I'm a bit skeptical of blind testing headphones the way I've seen done, even by Olive et al.
Then again I am a skeptic.

Great, me too! Did he do more than the virtual headphone? I thought he did, but didn’t find it with a quick and dirty search. Why are you skeptical?
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-30 10:18:35
 
Not DBT when used like this
In fact, used the way I described, never even clicking X, it isn't even a test. There is no question to answer, you never get asked one, and the user never inputs any response. It is just a nifty way to listen to A and B, with great flexibility to repeat certain favorite parts, and the identities of A and B are labelled, i.e. told to you before you even click the button to start their playback. [When you select any two songs to A/B compare from your foobar playlist the one on the top of the list is always A, the one below, adjacent or many cuts away, is always B.]
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-03-30 10:30:02
Not DBT when used like this
In fact, used the way I described, never even clicking X, it isn't even a test. There is no question to answer, you never get asked one, and the user never inputs any response. It is just a nifty way to listen to A and B, with great flexibility to repeat certain favorite parts, and the identities of A and B are labelled, i.e. told to you before you even click the button to start their playback. [When you select any two songs to A/B compare from your foobar playlist the one on the top of the list is always A, the one below, adjacent or many cuts away, is always B.]

Well, there is zero chance I'll get into the definition of "test" (e.g. sighted listening test). You call it an "aid" and that sounds good. I'll go with that.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-03-30 15:07:14
I don't consider software ABX to be blind, single blind or double blind.....It was a semantic point about the meaning of a word, not a judgement on quality.

Have a cookie, you'll be right as rain. Unsure if said residuals creating paranoia, but I'm actually largely in agreement. You can read my posts on this very forum where I was highly skeptical of (unsupervised) software ABX results of some highly unscrupulous known shysters of the industry.

I’m guessing that what first looked like interest and an offer to help, was really an attempt to bait me into saying something you’d find amusing. If so, success! See my first response to you.

Well, can't help you with that paranoia thing and if those PMs happen to be from certain individuals with strong pecuniary interests in peddling $50k amps and $2k magic DACs designed by Biologists...well, can't help you there either. Up to you to figure out why they may want to pound the drums of uncertainty about blind tests under the guise of scientific rigor...except for their own jewelry.
You did get one thing right. I do allow audiophiles to supply all their own rope for my use, much to my amusement. 

If not, and you are interested (my mistake). My questions were: how can I remove bias from my decisions and what role would ABX play for me at home.

Unless you plan on multiple purchases with returnable policies, I'd be curious how you do that too. Plus the setup of such a test, including switching apparatus.

Great, me too! Did he do more than the virtual headphone? I thought he did, but didn’t find it with a quick and dirty search. Why are you skeptical?

Don't recall all the details, not a headphone guy, so cursory interest only. But I don't think anything was done to remove the physical differences between the things being placed on head, feel, weight, comfort etc. IOW, identifying characteristics beyond sound.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-30 15:39:30
I contend that you are making distinctions without a difference.

Sean Olive's setup at Harman uses an automated system to present loudspeakers in random order, behind a curtain.  The subject controls when the 'switches' take place, and his answers are recorded and tallied by software.

Are you suggesting because it's done with no human tester intervention, it's not *effectively* double blind?

I'm saying that if Dr. Olive or an associate was in the room AND unaware of what's playing, it IS double blind.



You are indeed spending an incredible amount of verbiage on the semantics of 'double'.

And yes, some of us already understand what single and double blind tests are.

So stop.

Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-30 19:09:28
ABX is an outstanding tool (Thanks for your part in bringing it to us Arny!), but I'm sure many are like me in that doing automated ABX testing of actual hardware, say for example external EQ boxes or speaker wires, is fraught with so many problems that us average consumers don't have the means to do it easily and properly by ourselves and without assistants. Of course the main setback is that other than Arny, I don't think any of us OWN any ABX switchers such as the QSC, etc.  Although full blown ABX  switcher machines  internally have the means to do precise level matching and to simultaneously switch both inputs AND outputs, simpler ABX devices which force the experimenter to level match by external means and that do only one A/B switch, say inputs OR outputs, would still be quite useful. It seems to me that many of us already have two of the major components necessary to achieve these goals: a stereo receiver with electronic input selection (and ideally electronically selected A/B speaker outs), plus an outboard computer to do the blind selection, score tallying, X source randomization, etc. The only thing missing is the interface which could allow some computer program to externally control our receiver (or AVR, preamp/amp, whatever) and a big piece of black tape to obscure our receiver's front panel display! [Yes, there is a way to cheat, by peeking under the tape, but this isn't for publication purposes; it's just for fun.]  The computer's external control could be via RS-232 port, ethernet port*, or mini jack remote control communication port (which some brands have) or perhaps more universally through an IR blaster which gets placed under that black tape we put over the front face of our receiver, the other end obviously wired to our computer.  Arny, has anyone tried to rig up such a thing or has it ever been marketed?
  Seems to me that the easiest way to do a hardware ABX box today is to base something on an Arduino processor driving one of the USB relay boards I've talked about here before.  USB Relay board (http://www.amazon.com/SainSmart-Eight-Channel-Relay-Automation/dp/B0093Y89DE/ref=sr_1_1?ie=UTF8&qid=1427681091&sr=8-1&keywords=usb+relay+board)
  Easiest? For those of us who own or are willing to buy a network addressable receiver, which costs under $300 brand new (I've seen open-box and refurbished units for as little as $199), simply plugging an Ethernet cord into the back and running an app on our smart phones, tablets, or computers seems markedly easier to me; there's nothing to solder, build, or assemble. The existing control apps are usually free and exist for Denon, Yamaha, Onkyo, Marantz, Sony, and Pioneer; they just lack an ABX test interface to randomly select between two user assigned inputs for the X, and a way to tally votes. [Although any of us could do that tallying with pen and paper. It is the lack of an automated, randomized selection of X by a robotic test administrator which stymies us enthusiasts when we try to run hardware ABX tests on our own, for fun.]

As one example, I bought a Yamaha receiver a step down from this one (http://www.accessories4less.com/make-a-store/item/yamrxv477bl/yamaha-rx-v477-5.1-channel-network-av-receiver/1.html?gclid=CJSeqLnQ0MQCFcRffgodKEoA5g) last year and I'm VERY pleased with it (and I'm quite picky). Its feature set rivals receivers which were more than double this price just a few years ago! [For the benefit of anyone reading this who is unfamiliar with these existing control apps, here's one in action (https://www.youtube.com/watch?v=BOUd6x3Xtcc).]
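
The missing piece described above (a "robotic test administrator" that picks X at random and tallies votes) is small in software. Here is a minimal sketch under stated assumptions: `select_input` is a hypothetical callback standing in for whatever network, RS-232, or IR command a given receiver accepts, and `AbxSession` is an invented name. None of this is the API of any existing control app.

```python
import random
from typing import Callable, List, Optional


class AbxSession:
    """Hides which input X is on each trial and tallies the listener's votes."""

    def __init__(self, input_a: str, input_b: str,
                 select_input: Callable[[str], None],
                 trials: int = 16,
                 rng: Optional[random.Random] = None) -> None:
        self.inputs = {"A": input_a, "B": input_b}
        self.select_input = select_input  # hypothetical: sends the switch command
        rng = rng or random.SystemRandom()
        # Decide X for every trial up front; this list is never shown to the listener.
        self._x_is: List[str] = [rng.choice("AB") for _ in range(trials)]
        self.trial = 0
        self.correct = 0

    @property
    def finished(self) -> bool:
        return self.trial >= len(self._x_is)

    def play(self, which: str) -> None:
        """Switch to 'A', 'B' or 'X'; may be called as often as the listener likes."""
        if self.finished:
            raise RuntimeError("all trials are done")
        label = self._x_is[self.trial] if which == "X" else which
        self.select_input(self.inputs[label])

    def vote(self, guess: str) -> None:
        """Record the answer ('A' or 'B') for the current trial and move on."""
        if self.finished:
            raise RuntimeError("all trials are done")
        if guess not in self.inputs:
            raise ValueError("guess must be 'A' or 'B'")
        if guess == self._x_is[self.trial]:
            self.correct += 1
        self.trial += 1
```

In use, a front end would call `play("A")`, `play("B")`, and `play("X")` from three buttons and `vote(...)` from two more, then report `correct` out of the trial count at the end. The black tape over the display still matters: the software only hides X if the receiver gives no other cue, such as a visible input name, about which input is live.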
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-31 12:51:55
Easiest? For those of us who own or are willing to buy a network addressable receiver, which costs under $300 brand new (I've seen open-box and refurbished units for as little as $199), simply plugging an Ethernet cord into the back and running an app on our smart phones, tablets, or computers seems markedly easier to me; there's nothing to solder, build, or assemble. The existing control apps are usually free and exist for Denon, Yamaha, Onkyo, Marantz, Sony, and Pioneer; they just lack an ABX test interface to randomly select between two user assigned inputs for the X, and a way to tally votes. [Although any of us could do that tallying with pen and paper. It is the lack of an automated, randomized selection of X by a robotic test administrator which stymies us enthusiasts when we try to run hardware ABX tests on our own, for fun.]


I see your point. It seems limited as to what it can be used to ABX: it is pretty much limited to comparing sources and media. You are locked into the decoders, converters and power amps in the AVR, which are usually OK for rational people. However, one of the benefits of ABX is its ability to expose irrationality. The good news is that it seems to be a practical way to do listening tests involving multichannel media and such comparisons as can be encapsulated that way.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 13:28:25
I see the recent exchanges have been deleted (34 posts now found in the recycle bin (http://www.hydrogenaud.io/forums/index.php?showforum=41)) - no harm, really.

Can I ask a question?
Based on people's descriptions given here, it seems that there is more reliance on non-echoic memory being used in ABX tests than I have seen admitted to elsewhere. The two common reasons given for using ABX testing that I've seen promulgated on audio forums are that a) memory is unreliable & therefore short-term echoic memory is the only reliable way to do A/B comparisons & b) removal of knowledge is needed to eliminate a major biasing factor.

From earlier in the thread this demarcation/categorisation of memory was given "My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesised auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long."

Do people agree with this categorisation & can I ask what are the reliability quota for the various forms of memory used in ABX & any studies that back up these quota?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-31 15:22:13
I see the recent exchanges have been deleted (34 posts now found in the recycle bin (http://www.hydrogenaud.io/forums/index.php?showforum=41)) - no harm, really.

Can I ask a question?
Based on people's descriptions given here, it seems that there is more reliance on non-echoic memory being used in ABX tests than I have seen admitted to elsewhere.


"Admitted to?"  Makes it seem like something to be ashamed of. Not so.

Quote
The two common reasons given for using ABX testing that I've seen promulgated on audio forums are that a) memory is unreliable & therefore short-term echoic memory is the only reliable way to do A/B comparisons.


Simply not true. A false claim! Who said that?

A true statement would be that memory and everything else humans do is inherently unreliable. From that mud of perceptions we attempt, sometimes with great success, to pull some truth.

Using short term echoic memory is one of the things that people may use to identify sounds until they adequately learn how to identify sounds all by themselves.

People normally don't need someone they know to tell them a word to properly identify it when someone else they don't know says it.  They may have only read it not ever heard it said by anybody! They have learned what that word sounds like when pronounced by a large number of people, even people with a wide variety of accents. The word may sound vastly different when various people say it, it may even have added or missing vowel and consonant sounds, and it may still be reliably understood.  It would appear that echoic memory as such has nothing to do with it.

The same can be true of non-verbal sounds. I don't need a memory of a 1 kHz tone being played at 90 dB to correctly identify it as being about 1 kHz being played at about 90 dB. If I had perfect pitch, there might not be anything that approximate about my perception of the sound being at 1 kHz. The perception could be very precise.

This has to do with the fact that hearing is a survival tool. If I needed to have a precise memory of a tiger or enemy sneaking up behind me in the grass, my demise could be hastened. ;-)  Hopefully I can recognize the sound of a tiger sneaking up on me the first time it happens.  There  might not be a second time if the first time goes badly!

Quote
& b) removal of knowledge is needed to eliminate a major biasing factor.


That would be another false claim. Who said that? In listening tests removal of knowledge of non-audible cues to the identity of the unknown sound being listened to is of the essence. No need to remove other knowledge or learning. This removes a great many potentially biasing factors. Whether the listener does it by means of sonic memory or learning may be interesting but usually is of far less importance. The point is that he does it by means of just listening and, more significantly, not by means of seeing and activating memories of reviews, etc.

A video at this site: Link to video of listener transitioning from memory to learned identification during an ABX test (http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/) shows an example of this. (It is a testimonial to the popularity (based on usefulness) and wide familiarity with ABX that this video was made by someone I never knew using an ABX Comparator I never knew existed.)

Quote
From earlier in the thread this demarcation/categorisation of memory was given "My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesised auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long."

Do people agree with this categorisation & can I ask what are the reliability quota for the various forms of memory used in ABX & any studies that back up these quota?


Yet another question based on false claims. The forms of memory and learning used in ABX are the same as humans use for almost any other kind of hearing.  They are hardly unique to ABX. They are the ones that sighted listeners often claim to use but aren't really using because of the wealth of non-audible cues in their listening environment.  The main reason that people stumble over ABX is that they have confused sighted evaluations with hearing. Actually relying on just listening can be a shock, especially the first time you have to do it for real.

So there are no such things as any special kinds of memory used in ABX.  Furthermore, as has been shown ABX tests can be passed and even aced without using any memories of specific sounds. General knowledge of what so-and-so or such-and-such sounds like can and often does suffice.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-03-31 18:35:16
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.

It's a pattern I've noticed.

jkeny, you keep asking folks to just give your boutique DACs a listen.  How about we ask you do to some proctored DBTs?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-03-31 20:44:55
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.

It's a pattern I've noticed.


Yes. It's kinda shocking when people with such great credentials characterize ABX as being a 3AFC test, which is essentially what they did.

Their alleged superior alternative was non-interactive and based on seemingly arbitrarily chosen samples that were so long that it is questionable that human working memory can do well with them.

But the ABX bashing no doubt pleased the High End publication journalists.

Then they created a model of a Franken-DAC that was like none ever seen in actual audio gear, but titled their article so that one might think it was representative of the current market.


The comments on the AES web site have been updated by this recent response:

Most recent comment on Meridian Conference Paper (http://www.aes.org/forum/?ID=416&c=2977)

"

Comment posted March 27, 2015 @ 19:56:12 UTC (Comment permalink)

Thank you, Mr. Stuart, for the clarifications. Let me add one of my own before going into the details: I didn't intend to say that the conclusions of the paper go beyond those in the abstract. They are simply different, and if anything, I would tend to say the opposite, namely that the conclusions in the abstract go beyond those in the paper. My main criticism, however, is that they are not adequately supported by the research presented. I am looking forward to seeing your further research that you say will close this gap.

Apart from this, there are two main topics which I would like to address in turn. The first is the criticism aimed at the ABX test method, and the second is your choice of filter characteristics.

I was under the misapprehension, that you were criticising the ABX test method as used in more recent times. The answer by Mr. Krueger, and your choice of references that you provided with your answer, makes it very plausible that you are actually criticising a form of ABX test where the A, B and X stimuli are presented once in this order, and the listener, who has no influence on the test, is then asked whether X was A or X was B. In this case, it is understandable why you are concerned about the strain on the listener. Here, it is indeed necessary for the listener to remember the sounds in order to compare them.

I was not aware that this primitive form of ABX testing was still being used widely, particularly when trying to identify subtle differences. Improved ABX testing procedures and corresponding hardware support have been known and used for decades, which allow the listener to switch at will between A, B and X any number of times and at any point in time, and indeed your own test method allowed for the same, except of course for the lack of a stimulus B. It was your discussion of the Meyer/Moran experiment in particular, which led me to believe that you were actually criticising their way of doing ABX. Not so, as I realize by now.

It does, however, beg the question why you didn't simply resort to a more modern form of ABX, which doesn't have the problems you suspect, instead of dismissing it entirely. In any case, the question whether "modern" ABX is inferior to other approaches, such as yours, remains unanswered, whilst the criticism you have aimed at ABX has been addressed a long time ago by introducing ABX switching hardware operated by the listener.

Regarding the second topic, namely the choice of filter characteristics, I have to support Mr. Krueger. I tried to find A/D converter chips amongst my collection of data sheets, which offered a transition band as narrow as the one you used for your experiment. Apart from a chip by ESS which had a freely programmable decimator, I only encountered wider transition bands, even when the chips offered several choices. It was only a cursory look, perhaps a more thorough search would have uncovered some more examples, but I don't understand how you come to your opinion that your choice represents a typical situation encountered in the field. My own perception of the market has been for quite some time now, that the transition bands have become wider, sometimes beyond the point where I would find the risk of aliasing effects to be justifiable. So my fear is that the market is more likely to err on the side of too wide a transition band.

Your research is of course valuable in showing that too narrow a transition band may have a negative effect, too. If this leads to a realization of what transition band is "right" for which given sampling rate, it can only advance the state of the art. My own feeling, however, is that the existing converter chips are in their majority already quite close to this best choice. Yet I would find it most welcome to investigate the root cause of the differences that your experiment found audible. You offer some hypotheses that would need substantiation.

I still believe that you are making way too much of your findings. It is far from clear that your result can be seen as pointing towards a deficiency of the CD format. If you weren't implying to judge the format as such anyway, as you say, your wording of the abstract and of the introduction was certainly unhelpful. This is also evidenced by the public reaction it has attracted. I hope that this can and will be put right in the upcoming episodes.

Kind regards

Stefan Heinzmann
"
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-31 21:19:14
Easiest? For those of us who own or are willing to buy a network addressable receiver, which brand new cost under $300 and open box and refurbs  I've seen for as little as $199, simply plugging an Ethernet cord into the back and running an app on our smart phones, tablets, or computers seems markedly easier to me; there's nothing to solder, build, or assemble. The existing control apps are usually free and exist for Denon, Yamaha, Onkyo, Marantz, Sony, and Pioneer, they just lack an ABX test interface to randomly select between two user assigned inputs for the X, and a way to tally votes. [Although any of us could do that tallying with pen and paper. It is the automated, randomized selection of X by a robotic test administrator which stymies us enthusiasts from running hardware ABX tests on our own, for fun.]
  I see your point.  It seems limited as to what it can be used to ABX - it is pretty much limited to comparing sources and media. You are locked into the decoders, converters and power amps in the AVR which are usually OK for rational people.
  Not necessarily: you can completely bypass the receiver's preamp, processor, and power amp, if one deems them not up to par, and here's how. The receiver's input selector (controlled by my proposed ABX app) does the switching of any analog RCA carrying source of choice, but you don't use your receiver's speaker outs and instead you use the receiver's "record out" jacks (tape monitor outs they used to call them) as your ABX comparator's output  to then feed your outboard, audiophile approved system of choice. 

You are on your own for level matching, if need be [although some receivers actually have rudimentary input level trim for their sources; even my $169 Yamaha does, albeit in 1 dB increments, which I realize is too crude for good ABX level matching], but this is true with your suggested USB controlled relay box gizmo too.
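To put a number on why 1 dB trim steps are crude, here is a rough sketch in Python. It assumes you have measured the RMS output voltage of each source on the same test tone; the voltages and function names are made up for illustration. With 1 dB steps the leftover mismatch can be as large as half a step, i.e. 0.5 dB.

```python
import math

def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of source A relative to source B in dB, from RMS voltages."""
    return 20.0 * math.log10(v_a / v_b)

def residual_after_trim(mismatch_db: float, step_db: float = 1.0) -> float:
    """Mismatch left over after applying the nearest available trim step."""
    trim = round(mismatch_db / step_db) * step_db
    return mismatch_db - trim

# Hypothetical measurement: A reads 2.10 V RMS, B reads 2.00 V RMS on the same tone
mismatch = level_difference_db(2.10, 2.00)
print(f"mismatch: {mismatch:.2f} dB")
print(f"left over with 1 dB trim steps: {residual_after_trim(mismatch):+.2f} dB")
```

In this made-up case the mismatch is smaller than half a step, so the receiver's trim cannot improve on it at all and the correction would have to come from an outboard attenuator.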

Think of how many things this setup could ABX test (assuming you can synchronize and level match them on your own): the analog outs of two battling CD players, standalone outboard DACs, preamps [probably limited to not much more than let's say 2V output settings, or so, so as to avoid overloading the receiver's line level input stage], DVDs, SACD players, CD vs SACD [you could even attempt to replicate the Meyer Moran test, for example, if you had an on the fly A/D to D/A loop like their HHB CD recorder], RCA wire interconnects, LP vs CD [pretending they were from the same master and that pops, clicks, etc. wouldn't be a tell  ]...

If your receiver has electronically controlled speaker outs, A and B [or "zone B", for example this one (http://www.amazon.com/Yamaha-7-2-Channel-Receiver-Discontinued-Manufacturer/dp/B00B981F4M/ref=sr_1_1?s=electronics&ie=UTF8&qid=1427834552&sr=1-1&keywords=rx-v575)], you could additionally run ABX tests on speaker wires, or heck, speakers! [Level matching could be achieved by simultaneously switching between inputs that had had their levels tweaked, by outboard means, and I'd suggest a "speaker shuffler" of course, if one can swing that, he-he.]

About the only things that I'd want to ABX, but which wouldn't be possible this way, are power amps and, I guess, headphones.
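The missing piece described above (a "robotic test administrator" that randomly assigns X and tallies votes) is small. A minimal sketch in Python of just that bookkeeping follows; the class and method names are my own invention, and the step that actually tells a receiver to change inputs is deliberately left out because it is brand specific and I don't want to guess at any vendor's control protocol.

```python
import random
from typing import Optional

class AbxSession:
    """Hides the identity of X for each trial and tallies the listener's votes."""

    def __init__(self, rng: Optional[random.Random] = None):
        # SystemRandom draws from the OS entropy source; pass a seeded
        # random.Random only when you need repeatable runs for debugging.
        self.rng = rng or random.SystemRandom()
        self.correct = 0
        self.done = 0
        self._x = None  # set by next_trial()

    def next_trial(self) -> None:
        """Secretly assign X to input A or input B for the coming trial."""
        self._x = self.rng.choice("AB")

    def input_for(self, button: str) -> str:
        """Physical input to select when the listener presses A, B or X."""
        return self._x if button == "X" else button

    def vote(self, guess: str) -> None:
        """Record the listener's answer ('A' or 'B') for the current trial."""
        self.done += 1
        if guess == self._x:
            self.correct += 1

# Demo: a "listener" who just guesses, eight trials
session = AbxSession()
for _ in range(8):
    session.next_trial()
    session.vote(random.choice("AB"))
print(f"Total: {session.correct}/{session.done}")
```

A real app would call `input_for()` each time a button is pressed and send the result to the receiver as an input-select command, never showing the listener which input X maps to.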
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 23:05:37
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.

It's a pattern I've noticed.
Must try harder, must try harder, must try harder!!

I'm giving you what I think the widespread opinion of ABX is, as I've experienced it on audio forums - a) it relies on echoic memory b) it deprives the listener of visual knowledge of which device/track is being used. According to the typical proponent of DBTs both of these pillars ensure that DBTs are the "gold standard" of audio tests.

I'm not saying that this is correct - I'm just reporting what I have gathered over many years of being on audio forums. As you are suggesting that this is very wide of the mark, I would suggest that you really need to educate the people who argue for DBTs/ABX on audio forums as they are doing ABX a disservice with their mis-information & obviously using ABX incorrectly as a result

Quote
jkeny, you keep asking folks to just give your boutique DACs a listen.  How about we ask you to do some proctored DBTs?
Where did I ask people to give my DACs a listen? As Clark Gable said, "Frankly, my dear, I don't give a damn"
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-31 23:18:07
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.  It's a pattern I've noticed.


Quite often (although not always) it is because they have no familiarity with FB2K ABX since they've never conducted even one test on themselves, and proved so by posting it. Take for example: jkeny.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 23:21:30
Quote
The two common reasons given for using ABX testing that I've seen promulgated on audio forums are that a) memory is unreliable & therefore short-term echoic memory is the only reliable way to do A/B comparisons.


Simply not true. A false claim! Who said that?

A true statement would be that memory and everything else humans do is inherently unreliable. From that mud of perceptions we attempt, sometimes with great success, to pull some truth.

Using short term echoic memory is one of the things that people may use to identify sounds until they adequately learn how to identify sounds all by themselves.
OK, so in your opinion short-term echoic memory is mostly used for identification training & not essential to A/B listening?

Quote
People normally don't need someone they know to tell them a word to properly identify it when someone else they don't know says it.  They may have only read it not ever heard it said by anybody! They have learned what that word sounds like when pronounced by a large number of people, even people with a wide variety of accents. The word may sound vastly different when various people say it, it may even have added or missing vowel and consonant sounds, and it may still be reliably understood.  It would appear that echoic memory as such has nothing to do with it.

The same can be true of non-verbal sounds. I don't need a memory of a 1 kHz tone being played at 90 dB to correctly identify it as being about 1 kHz being played at about 90 dB. If I had perfect pitch, there might not be anything that approximate about my perception of the sound being at 1 kHz. The perception could be very precise.
But the "about" in the above is surely the point - what is being tested in A/B listening are small differences between A & B so "about" is not nearly specific enough as a criterion for differentiation. To use your "word" example above - we are not trying to identify the "word", we are trying to identify any difference in intonation between the same two words.

Quote
This has to do with the fact that hearing is a survival tool. If I needed to have a precise memory of a tiger or enemy sneaking up behind me in the grass, my demise could be hastened. ;-)  Hopefully I can recognize the sound of a tiger sneaking up on me the first time it happens.  There  might not be a second time if the first time goes badly!
Sure, evolution has everything to do with how our senses operate - if they are not giving us the necessary signals from the physical world that will help us survive then......

Quote
& b) removal of knowledge is needed to eliminate a major biasing factor.

Quote
That would be another false claim. Who said that?  In listening tests removal of knowledge of non-audible cues to the identity of the unknown sound being listened to is of the essence. No need to remove other knowledge or learning.  This removes a great many potentially biasing factors.  Whether the listener does it by means of sonic memory or learning may be interesting but usually is of far less importance. The point is that he does it by means of just listening and, more significantly, not by means of seeing and activating memories of reviews, etc.
"knowledge of non-audible cues" would also include a listener's pre-conditioning - so if digital cables were being tested, many would be pre-biased towards not hearing any difference & the test would need to deal with this bias.

Quote
A video at this site: Link to video of listener transitioning from memory to learned identification during an ABX test (http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/)  shows an example of this. (It is a testimonial to the popularity (based on usefulness) and wide familiarity with ABX that this video was made by someone I never knew using an ABX Comparator I never knew existed.)
Thanks, I'll look at that later

Quote
From earlier in the thread this demarcation/categorisation of memory was given "My understanding is that auditory memory goes through 3 stages: perceptual auditory storage (aka echoic memory), which lasts up to 300 ms; synthesised auditory memory, lasting 1 to 30 sec; and generated abstract memory, which can last very long."

Do people agree with this categorisation & can I ask what are the reliability quota for the various forms of memory used in ABX & any studies that back up these quota?

Quote
Yet another question based on false claims. The forms of memory and learning used in ABX are the same as humans use for almost any other kind of hearing.  They are hardly unique to ABX. They are the ones that sighted listeners often claim to use but aren't really using because of the wealth of non-audible cues in their listening environment.  The main reason that people stumble over ABX is that they have confused sighted evaluations with hearing. Actually relying on just listening can be a shock, especially the first time you have to do it for real.
In my experience the reason people "stumble over ABX" is mainly because it entails a different type of listening - it's no longer the normal, relaxed listening they are used to. This can probably be addressed by the test design but I seldom see it done.

Quote
So there are no such things as any special kinds of memory used in ABX.  Furthermore, as has been shown ABX tests can be passed and even aced without using any memories of specific sounds. General knowledge of what so-and-so or such-and-such sounds like can and often does suffice.

OK, I can relate to what you say here - one can ace an ABX test simply from "General knowledge of what so-and-so or such-and-such sounds like". In other words, we have learned & stored abstract models of how the physical objects in the world sound & have quite detailed & complex models that we use for comparison. As AJ quoted me - this is the "organic sound" that I referenced in the posts that were recently deleted.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 23:24:06
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.  It's a pattern I've noticed.


Quite often (although not always) it is because they have no familiarity with FB2K ABX since they've never even taken one test, and proved so by posting it. Take for example: jkeny.

False accusation based on zero evidence as usual - I've posted at least one FB2K ABX set of results
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-31 23:26:21
Link with results please.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 23:30:35
Link with results please.

I can't help you with your bad memory - it was on an AVS thread that you participated on & probably even commented on my results - go figure
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-31 23:32:38
No need to knock yourself out finding exactly where you posted it. Simply taking a blank test and guessing each time to get fast results, like I've just done a moment ago, will prove to us you do at least know HOW to do it. Example:

foo_abx 2.0 report
foobar2000 v1.3.3
2015-03-31 15:28:47

File A: naim-test-2-flac-24-192000-44.1andback.flac
SHA1: 58336a1216fe1c3cd53d9f2d928f67f137e97599
File B: naim-test-2-flac-24-192000.flac
SHA1: 54fd277d2f70d4960fe73db58799ecfdb424293d

Output:
DS : Primary Sound Driver
Crossfading: NO

15:28:47 : Test started.
15:28:55 : 00/01
15:28:58 : 00/02
15:29:01 : 01/03
15:29:04 : 01/04
15:29:06 : 02/05
15:29:08 : 03/06
15:29:11 : 03/07
15:29:14 : 03/08
15:29:14 : Test finished.

----------
Total: 3/8
Probability that you were guessing: 85.5%

-- signature --
1418b6ad63a8684b26c53403e8f80ad74094399b
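
For anyone wondering where the "Probability that you were guessing" figure comes from: the numbers are consistent with a one-sided binomial tail, i.e. the chance of scoring at least that many correct by coin-flipping. I'm assuming that is what foo_abx computes; the sketch below (Python, function name is mine) reproduces the 85.5% for 3/8 above, and also gives the figures for 5/8 and 8/8.

```python
from math import comb

def p_guessing(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

for correct, trials in [(3, 8), (5, 8), (8, 8)]:
    print(f"{correct}/{trials}: {p_guessing(correct, trials):.1%}")
# prints:
# 3/8: 85.5%
# 5/8: 36.3%
# 8/8: 0.4%
```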
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-03-31 23:35:44
No need to knock yourself out finding exactly where you posted it. Simply taking a blank test and guessing each time to get fast results, like I've just done a moment ago, will prove to us you do at least know how to do it.

foo_abx 2.0 report
foobar2000 v1.3.3
2015-03-31 15:28:47

File A: naim-test-2-flac-24-192000-44.1andback.flac
SHA1: 58336a1216fe1c3cd53d9f2d928f67f137e97599
File B: naim-test-2-flac-24-192000.flac
SHA1: 54fd277d2f70d4960fe73db58799ecfdb424293d

Output:
DS : Primary Sound Driver
Crossfading: NO

15:28:47 : Test started.
15:28:55 : 00/01
15:28:58 : 00/02
15:29:01 : 01/03
15:29:04 : 01/04
15:29:06 : 02/05
15:29:08 : 03/06
15:29:11 : 03/07
15:29:14 : 03/08
15:29:14 : Test finished.

----------
Total: 3/8
Probability that you were guessing: 85.5%

-- signature --
1418b6ad63a8684b26c53403e8f80ad74094399b

Sure, from some recent posts that seems to be the norm for your ABX tests - random guessing - might as well just send a monkey to do such tests

My positive ABX test results showed 100% that I was not guessing when comparing high-res with RB jangling keys files
Title: How do you listen to an ABX test?
Post by: mzil on 2015-03-31 23:41:40
A quick search I just did at AVS for, in quotes, "foo_abx", which appears at the top of all test results, at least the current ones, finds only one post by forum member "jkeny", in "results by post" mode, and it is simply a quote of some test that ARNY took, and a gripe about how quickly he did the test, so it is of course from Arny's results, not jkeny's:

"Hmmm, it's rather telling that you take from 1 to 4 secs for almost all of your trials.
Rather than conclude that you are not really bothering to listen & just guessing (as the results show) can you say why such short timings?"

My link may have trouble but it is post #39 of 105 in the thread titled [not in bold face though]: MP£ vs FLAC..!
http://www.avsforum.com/forum/91-audio-the...ml#post30565777 (http://www.avsforum.com/forum/91-audio-the...ml#post30565777)

Psst, Here's yet another test I just took, also just moments ago, in under half a minute, simply by guessing, showing his inability to take a test and post it right now is probably because I would presume he doesn't know how or perhaps has never even downloaded and installed the test:

foo_abx 2.0 report
foobar2000 v1.3.3
2015-03-31 15:42:08

File A: naim-test-2-flac-24-192000-44.1andback.flac
SHA1: 58336a1216fe1c3cd53d9f2d928f67f137e97599
File B: naim-test-2-flac-24-192000.flac
SHA1: 54fd277d2f70d4960fe73db58799ecfdb424293d

Output:
DS : Primary Sound Driver
Crossfading: NO

15:42:08 : Test started.
15:42:13 : 01/01
15:42:16 : 01/02
15:42:17 : 01/03
15:42:19 : 02/04
15:42:20 : 02/05
15:42:22 : 03/06
15:42:24 : 04/07
15:42:27 : 05/08
15:42:27 : Test finished.

----------
Total: 5/8
Probability that you were guessing: 36.3%

-- signature --
172dff24d09c67f8a3b5607471d90db835a455b2
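
Since time per trial has come up as a way of judging whether someone was really listening, here is a small Python sketch that pulls the per-trial timings out of a pasted log. The only assumption is the line layout seen in the reports in this thread ("HH:MM:SS : NN/MM", optionally followed by a percentage in older versions); the function name is mine, and it does not handle a test that runs past midnight.

```python
import re
from datetime import datetime
from typing import List

# The trial lines from the guessing run posted above
LOG = """\
15:42:08 : Test started.
15:42:13 : 01/01
15:42:16 : 01/02
15:42:17 : 01/03
15:42:19 : 02/04
15:42:20 : 02/05
15:42:22 : 03/06
15:42:24 : 04/07
15:42:27 : 05/08
15:42:27 : Test finished.
"""

def seconds_per_trial(log: str) -> List[int]:
    """Seconds spent on each trial, measured from the previous log entry."""
    # Match the start line and every score line; "Test finished." is skipped.
    pattern = r"^(\d\d:\d\d:\d\d) : (?:Test started\.|\d+/\d+)"
    stamps = [datetime.strptime(m.group(1), "%H:%M:%S")
              for m in re.finditer(pattern, log, re.MULTILINE)]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

print(seconds_per_trial(LOG))  # [5, 3, 1, 2, 1, 2, 2, 3]
```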



P.S. I bet right NOW he is frantically downloading it and attempting to post results though, yet he NEVER has before!
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 00:04:41
A quick search I just did at AVS for, in quotes, "foo_abx", which appears at the top of all test results, at least the current ones, finds only one post by forum member "jkeny" and it is simply a quote of some test that Arny took, and a gripe about how quick he took the test, so it is of course Arny's results, not jkeny's:.............

P.S. I bet right NOW he is frantically downloading it and attempting to post results, yet he NEVER has before!

You must try harder, must try harder!!
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 00:17:55
OH, I think I found it. Sorry, my bad. They appear a bit differently from back then, and  there's no signature file to verify but they didn't exist back then I guess:
http://www.avsforum.com/forum/91-audio-the...ml#post25917873 (http://www.avsforum.com/forum/91-audio-the...ml#post25917873)

If that link's bad it's 07-21-2014, 03:20 PM in the thread Debate Thread: Scott's Hi-res Audio Test
SORRY! My mistake.


Arny also now says he goofed on those files though so results one way or another don't mean anything, but it does indeed look like you took at least one ABX test so I was wrong.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 00:31:59
OH, I think I found it. Sorry, my bad. They appear a bit differently from back then, and  there's no signature file to verify but they didn't exist back then I guess:
http://www.avsforum.com/forum/91-audio-the...ml#post25917873 (http://www.avsforum.com/forum/91-audio-the...ml#post25917873)

If that link's bad it's 07-21-2014, 03:20 PM in the thread Debate Thread: Scott's Hi-res Audio Test
SORRY! My mistake.


Arny also now says he goofed on those files though so results one way or another don't mean anything, but it does indeed look like you took at least one ABX test so I was wrong.

OK, apology accepted

So let's see your ABX test results for those flawed files of Arnyk's - the Jangling keys files that I produced positive results for. You know something that isn't random guessing but actually requires you to listen & try to differentiate
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 00:37:55
Why?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 00:47:32
Why?

It's trivial to post ABX random results - it shows no capability that couldn't be achieved by a monkey - I don't consider your posted ABX results here credible in any way

So you accuse me of never using Foobar ABX & yet I have published not just trivial results (as you) but positive ABX results that show I can differentiate small impairments between two files.

If you want to retain any credibility you will show that you have a capacity for a similar differentiation & not just results that are random guesses
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 00:53:21
If you want to retain any credibility you will show that you have a capacity for a similar differentiation & not just results that are random guesses

I don't recall making any claims about my abilities in this thread, hence a need to defend them, but here you go:

Quote
Here are my log files using Arny's old files  [the only version of his files yet posted in this thread, up to now] where I heard no IM distortion 4/5 kHz tones/noise after the training tone, at all,  even at blaring levels, just a couple of identical clicks common to both files:

foo_abx 1.3.4 report
foobar2000 v1.3.3
2014/07/26 18:30:29
File A: C:\Users\Me\Documents\Keys jangling folder\keys jangling full band 2496 test tones 1644.wav
File B: C:\Users\Me\Documents\Keys jangling folder\keys jangling full band 2496 test tones.wav

18:30:29 : Test started.
19:03:56 : 01/01 50.0%
19:05:38 : 02/02 25.0%
19:08:15 : 03/03 12.5%
19:10:27 : 04/04 6.3%
19:12:03 : 05/05 3.1%
19:16:13 : 06/06 1.6%
19:21:46 : 07/07 0.8%
19:23:08 : 08/08 0.4%
19:41:54 : 09/09 0.2%
19:45:00 : 10/10 0.1%
19:51:02 : 11/11 0.0
19:52:12 : 12/12 0.0%
19:53:44 : 13/13 0.0%
19:55:33 : 14/14 0.0%
19:57:20 : 15/15 0.0%
20:02:51 : 16/16 0.0%
20:03:33 : Test finished.
----------
Total: 16/16 (0.0%)

Today, using his new files, I unfortunately hear a faint IM problem so I can't do that test, however I did want to point out that the data I provide above was accomplished by my keying on a secret, audible "tell", I don't think I should disclose, calling into question anybody else's published data prior to mine, using the same files, even if they truly had no IM problems in their system, just like I didn't.

No dogs, no bats, no children with >22kHz hearing, no analyzer, and no text editor used, nor was I comparing the click noises themselves; it was just me and my headphones listening intently for over an hour in suboptimal conditions [refrigerator compressor noise, distant train whistles, etc.]

http://www.avsforum.com/forum/91-audio-the...ml#post26078698 (http://www.avsforum.com/forum/91-audio-the...ml#post26078698)

07-27-2014, 05:15 PM, post #2463 same thread (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1532092-debate-thread-scott-s-hi-res-audio-test-83.html#post26078698) 

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 03:44:59
Why?

It's trivial to post ABX random results - it shows no capability that couldn't be achieved by a monkey - I don't consider your posted ABX results here credible in any way

So you accuse me of never using Foobar ABX & yet I have published not just trivial results (as you) but positive ABX results that show I can differentiate small impairments between two files.

If you want to retain any credibility you will show that you have a capacity for a similar differentiation & not just results that are random guesses


Looks to me like double talk for: "I can't/won't walk the walk" combined with baseless accusations against those who AT LEAST try.

For the brave among us:

Latest keys jangling files uploaded here:

3-31-2015 latest keys jangling 16/44 - 24/96 comparison files (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177)
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 04:49:42
I don't hear any difference in the sound of the keys, however...

Code: [Select]

foo_abx 2.0 report
foobar2000 v1.3.3
2015-03-31 20:37:38

File A: keys_jangling_full_band_2496_test_tones_f4.flac
SHA1: 5c0d71159fd3702d0515372876e699f1ca8de1d0
File B: keys_jangling_full_band_2496_1644E2Q150_2496_test_tones_f4.flac
SHA1: 6bbc99e2f0ca8f3083096b51fa0221792c6b5b85

Output:
DS : Primary Sound Driver
Crossfading: NO

20:37:38 : Test started.
20:43:52 : 01/01
20:44:06 : 02/02
20:44:17 : 03/03
20:44:30 : 04/04
20:44:42 : 05/05
20:44:48 : 06/06
20:44:54 : 07/07
20:45:03 : 08/08
20:45:03 : Test finished.

 ----------
Total: 8/8
Probability that you were guessing: 0.4%

 -- signature --
b7bfa66b241cce34263e3b92d1a08f5f2abc7b11


http://www.foobar2000.org/abx/signaturecheck (http://www.foobar2000.org/abx/signaturecheck)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 05:01:46
I don't hear any difference in the sound of the keys , however...

Code: [Select]
foo_abx 2.0 report
foobar2000 v1.3.3
2015-03-31 20:37:38

File A: keys_jangling_full_band_2496_test_tones_f4.flac
SHA1: 5c0d71159fd3702d0515372876e699f1ca8de1d0
File B: keys_jangling_full_band_2496_1644E2Q150_2496_test_tones_f4.flac
SHA1: 6bbc99e2f0ca8f3083096b51fa0221792c6b5b85

Output:
DS : Primary Sound Driver
Crossfading: NO

20:37:38 : Test started.
20:43:52 : 01/01
20:44:06 : 02/02
20:44:17 : 03/03
20:44:30 : 04/04
20:44:42 : 05/05
20:44:48 : 06/06
20:44:54 : 07/07
20:45:03 : 08/08
20:45:03 : Test finished.

 ----------
Total: 8/8
Probability that you were guessing: 0.4%

 -- signature --
b7bfa66b241cce34263e3b92d1a08f5f2abc7b11

http://www.foobar2000.org/abx/signaturecheck (http://www.foobar2000.org/abx/signaturecheck)

Please PM me the Tell.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 05:22:15
AT LEAST try.

Don't be surprised when you discover that there's a prepared answer in store for this.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 05:24:21
AT LEAST try.

Don't be surprised when you discover that there's a prepared answer in store for this.


That would not be a surprise! It would be an expectation.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 05:26:05
just like mzil passing your bullet-proof test?  meh.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-01 06:24:10
When a DBT/ABX antagonist tries to define DBT/ABX, they tend to get it wrong.

It's a pattern I've noticed.
Must try harder, must try harder, must try harder!!

I'm giving you what I think the widespread opinion of ABX is, as I've experienced it on audio forums - a) it relies on echoic memory b) it deprives the listener of visual knowledge of which device/track is being used. According to the typical proponent of DBTs both of these pillars ensure that DBTs are the "gold standard" of audio tests.

I'm not saying that this is correct - I'm just reporting what I have gathered over many years of being on audio forums. As you are suggesting that this is very wide of the mark, I would suggest that you really need to educate the people who argue for DBTs/ABX on audio forums as they are doing ABX a disservice with their mis-information & obviously using ABX incorrectly as a result

Quote
jkeny, you keep asking folks to just give your boutique DACs a listen.  How about we ask you do to some proctored DBTs?
Where did I ask people to give my DAcs a listen? As Clark Gable said ""Frankly, my dear, I don't give a damn"

And not a damn is given by me about your  *unproctored* ABX results.  But ISTR you asking AJinFla if he'd heard your product.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 10:33:36
just like mzil passing your bullet-proof test?  meh.


Mzil was rattling my cage. ;-)

According to his PM, the positive results were obtained while listening to the test tones at the end (t > 12.5 seconds), not during  the keys jangling experiment.

That would seem to disqualify his monitoring system.  However, he says that atypically high listening levels were required for reliable detection of audible artifacts in his monitoring system, so all might be well with his monitoring system for the purpose at hand.

His true results for the actual keys jangling listening test segment would then be random guessing. He said that a few posts back. 

The listening test uses linear phase, minimal-ringing, minimal aliasing downsampling, and perceptually shaped dither.

So my bullet-proof test might appear to still be bullet proof (up to this  minute!), with the added features of protecting itself from substandard monitoring systems and unrealistically high playback levels.

Of course we can't do anything about people who don't play the game, which is a truism that grants the power to dismiss what we will based on our perceptions of the person reporting the test results.

It's all good, so far!
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 14:40:17
I'm giving you what I think the widespread opinion of ABX is, as I've experienced it on audio forums -

a) it relies on echoic memory


Can be true, or not.

Quote
b) it deprives the listener of visual knowledge of which device/track is being used.


Right and that is where most claims of reducing bias hang their hat.

Quote
According to the typical proponent of DBTs both of these pillars ensure that DBTs are the "gold standard" of audio tests.


Doesn't make sense on point (a).  Why would relying on echoic memory be a good thing? It's an obvious debating trade ploy - make your opponent's position as narrow as possible for easier demolition.

Contrary to the Golden Ear's closely held religious beliefs, the ABX developers did not invent quick switching. It was alive and well long before that.  The actual genesis of the original ABX box was that it was a controller for a sighted  quick-switch relay box that had existed for years before.

Quote
I'm not saying that this is correct - I'm just reporting what I have gathered over many years of being on audio forums.


If the futility of looking for truth on your typical audio forum needs underscoring, go over to AVS and have a good laugh at a guy who is trying to convince the world that the percentage right in a listening test is absolutely meaningless.

To quote: "Let me repeat, you need to completely ignore percentage of right answers. I explain this in detail in this article I wrote recently: http://www.madronadigital.com/Librar...20Testing.html (http://www.madronadigital.com/Librar...20Testing.html). " 

Looks to me like he's still trying to right the overturned boat of the Meridian AES Conference paper.

Quote
As you are suggesting that this is very wide of the mark, I would suggest that you really need to educate the people who argue for DBTs/ABX on audio forums as they are doing ABX a disservice with their mis-information & obviously using ABX incorrectly as a result


I have never seen anybody but you argue that ABX relies only on echoic memory.  Ever.

If you bothered to look at the ABX video that I've been linking, you'd see that someone independent beat us to it.

I did quite a bit of searching on the web to see what people are saying about how to switch between sound samples during an ABX test, and I found very little actually being said. That probably means it's a personal thing, and that's fine with me.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 16:36:54
just like mzil passing your bullet-proof test?  meh.
  Mzil was rattling my cage. ;-) 
  More precisely I posted my results for the exact same reasons I posted my similar results in the AVS thread both on your keys and the AIX records test cut, "Mosaic" [A2 vs B2]. It was to document that when other people post score sheets showing they can achieve good statistical significance in differentiating A from B it doesn't necessarily mean it is because they hear a difference in the music, they may just be hearing a difference in the files, which is exactly what I did.

Lucky for you I'm on your side of the big debate so rather than toot my own horn about my excellent hearing [I'm not quite as old as you, but no spring chicken either], my excellent system [my DAC  cost all of $29, however it is  extremely, um, organic], or my extensive training*, like the shysters often do, I instead exposed to all how easy it was to show such results if you know what vulnerabilities to look for and how they will be manifested. The whole point was to negate their findings, yet at least one of them had the gall to then run off to their other forums and brag, [paraphrased] "Not only could I hear a difference but even some of the long time skeptics (meaning me) could too!"

*I actually started to take Olive's thing at one point but then realized part way through that it was NOT increasing my sensitivity to subtleties, as I had hoped it would, but rather was simply grooming me to be a more consistent reporter, so I quit.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 16:49:51
According to his PM, the positive results were obtained while listening to the test tones at the end (t > 12.5 seconds), not during  the keys jangling experiment.


You can't post an ABX challenge and expect people to only listen to the subsection you tell them to, rather than choose whatever part of the file they want (especially considering our present company). You need to break it down into two files: a "Test your system for IM" file and then the actual challenge file, "Jangling Keys".

My system exhibits no tone, even at elevated levels, similar to the target test tone (I assume 4 kHz).  There are differences in the sound which are only evident at elevated levels however I don't need to ride the gain, nor clip/distort my system to hear it, so in my book I'm playing the game.

Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 17:47:39
To quote: "Let me repeat, you need to completely ignore percentage of right answers. I explain this in detail in this article I wrote recently:

Your link doesn't work, at least for me, but then again nor do my own in recent posts, which is why I've been including a backup way of finding the source. I think there is something wrong when the link gets truncated to the abbreviated form with the "..." in the middle. What does seem to still work is to highlight a particular word or phrase and turn it into a hyperlink. (http://www.madronadigital.com/Librar...20Testing.html)
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-01 18:03:49
In my experience the reason people "stumble over ABX" is mainly because it entails a different type of listening - it's no longer the normal, relaxed listening they are used to.

Yes, the audiophile variety of "listening" with peeking eyes, a priori knowledge, status symbol worship, street cred worries etc.
IOW, very little to do with actual ear>brain listening, which is what, by definition, must occur during a blind audio test. Listening and judging only by "listening with one's ears", not the convoluted audiophile version of "listening".

This can probably be addressed by the test design but I seldom see it done.

Pray tell what is stopping you folks from doing this?

As AJ quoted me - this is the "organic sound" that I referenced in the posts that were recently deleted.

Right. Yet you keep asking others, myself and greynol included, to "listen" for this, with no evidence that we would be able to "hear" it. You can. Just like your pal Amir insists that capable trained listeners are a must for any valid test. You meet that criterion, by your own admission. So, let's see your single blind test results of your DAC vs a 1/5th price one. Take your sweet time between switching. A week, a month, however long you feel adaptation takes in "long term listening". Just make sure there's no peeking, level mismatches, etc.
So that you are indeed "trusting your ears" and judging the sound by listening and listening only.
We await your results.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 19:34:30
So what is being demonstrated here is the complete uselessness of ABX testing - if positive results are produced there are many ways of explaining it away. If a negative result is returned there are similarly obvious ways of explaining it away.

Why would anyone use such a test except for personal use (when honesty is not usually an issue)?
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 19:42:15
Quote
So what is being demonstrated here is the complete uselessness of ABX testing

Wrong. Any A/B test or comparison, blind or sighted, doesn't matter, would be compromised by the issue I'm keying on.

Arny can easily fix his files so my attack at a particular vulnerability is thwarted, however, by simply making two files as I previously mentioned.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 19:46:26
Quote
So what is being demonstrated here is the complete uselessness of ABX testing

Wrong. Any A/B test or comparison, blind or sighted, doesn't matter, would be compromised by the issue I'm keying on.

Arny can easily fix his files so my attack at a particular vulnerability is thwarted, however, by simply making two files as I previously mentioned.

The same honesty issue that disqualifies positive ABX results also disqualifies null results - as amply demonstrated here
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-01 19:54:26
blabla

More derailing/demonstrating ignorance (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108127&view=findpost&p=888929)?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 19:55:27
Somehow we're pretending the word proctored wasn't uttered earlier?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 19:57:08
We await your results.

cheers,

AJ

Why would I be interested in jumping through hoops to produce test results (whether positive or negative) using a test that is so shot full of holes?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 19:58:24
Certainly not by someone who's pointing a pretend gun at a real target.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:00:15
Somehow we're pretending the word proctored wasn't uttered earlier?

Not ignored, just useless - what criteria do you use to judge the honesty of the proctor?
I do an ABX test & my mate proctors it - you trust the results, right?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:03:06
Certainly not by someone who's pointing a pretend gun at a real target.

Sorry, your logic escapes me!
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 20:04:54
Especially if your mate happens to be your barking dog wearing a t-shirt showing his favorite brand of organic DAC.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 20:05:58
Sorry, your logic escapes me!

Not surprised.

But if you uttereth, you must be right.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-04-01 20:06:17
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-01 20:11:02
Somehow we're pretending the word proctored wasn't uttered earlier?

Not ignored, just useless - what criteria do you use to judge the honesty of the proctor?
I do an ABX test & my mate proctors it - you trust the results, right?

As a means to bridge the division between audiophiles and objectivists, the ABX test is useless. In this sense you are right, and you go on to demonstrate the issue yourself.

However, even when there is no agreement on how to interpret the outcome of an ABX test, or whether to trust it, it is quite interesting to watch the arguments, even entertaining at times. That may not change the position of the hardliners, but it certainly benefits the more open-minded bystanders.

Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-01 20:12:53
Why would I be interested in jumping through hoops to produce test results

To peddle your wares on the basis of repeatable, verifiable reality, rather than organic self delusion daydreams.
No prob if you are frightened by such prospects. Most of us understand that deep down inside, golden ears have no trust of their ears whatsoever, despite all the braying.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:13:49
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:17:23
blabla

More derailing/demonstrating ignorance (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108127&view=findpost&p=888929)?

You deny that the question of honesty/proctoring equally applies to ALL ABX results both positive & negative
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-01 20:18:07
And it's only real science if the guys wear lab coats, obviously. Haha, c'mon jkeny, you can troll better than that.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:19:34
Why would I be interested in jumping through hoops to produce test results

To peddle your wares on the basis of repeatable, verifiable reality, rather than organic self delusion daydreams.
No prob if you are frightened by such prospects. Most of us understand that deep down inside, golden ears have no trust of their ears whatsoever, despite all the braying.

cheers,

AJ

Selective quoting - the full quote is "Why would I be interested in jumping through hoops to produce test results (whether positive or negative) using a test that is so shot full of holes?"

I don't need to peddle any ware
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:23:59
And it's only real science if the guys wear lab coats, obviously. Haha, c'mon jkeny, you can troll better than that.

You certainly demonstrate a lack of expertise in the field being examined but if you want to pretend you are doing science, go ahead while I & others chuckle at your naivete/ignorance
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 20:37:10
Haha, c'mon jkeny, you can troll better than that.

I'm afraid he can't.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-01 20:37:36
You certainly demonstrate a lack of expertise in the field being examined but if you want to pretend you are doing science, go ahead while I & others chuckle at your naivete/ignorance

Oh come on, now you're just projecting. You've clearly demonstrated your own lack of expertise in the linked thread. And you apparently did not look up the tests done on this forum, which I suggested to you 2 freaking months ago, and instead decided to remain ignorant.

It's pretty clear that all you come here to do is derail/troll and pridefully demonstrate ignorance.
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-04-01 20:38:56
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?

Sure they are.  Science is a methodology for determining aspects of reality; it's not a property of a laboratory or even expertise.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:42:21
You certainly demonstrate a lack of expertise in the field being examined but if you want to pretend you are doing science, go ahead while I & others chuckle at your naivete/ignorance

Oh come on, now you're just projecting. You've clearly demonstrated your own lack of expertise in the linked thread. And you apparently did not look up the tests done on this forum, which I suggested to you 2 freaking months ago, and instead decided to remain ignorant.

It's pretty clear that all you come here to do is derail/troll and pridefully demonstrate ignorance.

@mods: I suggest to again put those lasts postings into the bin.

Don't fret - they'll all be removed shortly
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 20:46:30
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?

Science is a state of mind, not a state of employment.

Let's take Dr. Earl Geddes. He does science in his dining room and basement. Did  leaving  the commercial research lab take the Science out of him? Did he have to check his PhD at the door on the way out?

As far as expertise by professionals goes, sometimes it is great, sometimes not so much. One of the worst cases I've seen lately of highly credentialed audio professionals screwing the Science pooch can be found here:

Link to example of highly credentialed professionals screwing the Science pooch (https://secure.aes.org/forum/pubs/conventions/?ID=416#2977)

I suspect that your apparent belief that science has to be conducted in Sanctified Places by Ordained Individuals explains a lot of the phoney limits that you have placed on yourself.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-01 20:46:55
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?

Sure they are.  Science is a methodology for determining aspects of reality; it's not a property of a laboratory or even expertise.
Conducting scientifically valid tests requires expertise
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-01 20:55:41
So what is being demonstrated here is the complete uselessness of ABX testing - if positive results are produced there are many ways of explaining it away. If a negative result is returned there are similarly obvious ways of explaining it away.


Of course. Who told you that Science was totally immune to the usual failures and successes of human endeavor?

Quote
Why would anyone use such a test except for personal use (when honesty is not usually an issue)?


A man who claims that he can't lie to himself is probably doing it incessantly.

Of course we can lie to ourselves. Of course our senses and reasoning can fail us.

When we aren't doing it to ourselves there is a world of other people who are happy to do it to us, some because they don't know better, some because they do it for fun and profit. 
Title: How do you listen to an ABX test?
Post by: saratoga on 2015-04-01 21:01:01
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?

Sure they are.  Science is a methodology for determining aspects of reality; it's not a property of a laboratory or even expertise.
Conducting scientifically valid tests requires expertise

Not necessarily.  Plenty of people have simply gotten lucky, or made their own expertise as they went.  Much like fancy labs, expertise helps a lot, but it's not necessary.  A reasonably intelligent person with no expertise in audio could certainly devise scientific tests of equipment.  After all, it's not like man was created with knowledge of ABX.  People came to these methodologies through reason and effort, not revelation.

Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 21:01:55
there is a world of other people who are happy [lying] to us, some because they don't know better, some because they do it for fun and profit.

and sometimes both

Have we exhausted the original subject material to the point that we're now speculating about human nature in general?
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-01 21:43:35
Conducting scientifically valid tests requires expertise

So does understanding/constructing DACs, but that's never stopped you. 
So now blind tests are valid, just when conducted with expertise? Which is it?
Do your "organic" daydreams posited as factual reality, require expertise also?

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-01 21:50:18
I don't need to peddle any ware

But you do peddle wares. On the basis that it "sounds" organic during your daydreams.
Stuff that's going to be "hidden" by blind (ABX type) tests, regardless of length.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-01 23:05:03
I was thinking. Maybe an ABX test proctored via a Facetime telephone call could work?
Aim the camera at the test subject's display screen, launch foobar, play the files to prove they are what they say, launch ABX, take test, and conclude test with signature verification all being filmed, er, streamed to the proctor so they can see details like time codes and signature verification codes as they appear on screen?

Well, I can already think of ways to cheat, but they would be pretty darn extreme. Thought I'd throw this out there though in case it sparks other ideas that don't require expensive, time-consuming journeys.

[Although I've challenged audiophile buddies in the past to do blind tests as a bet, and won,  that was in my youth and I knew them. This is not the sort of thing I'd volunteer to do today, just so everyone knows.]
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-01 23:08:32
On the basis that it "sounds" organic during your daydreams.

...or the daydreams of a prospective customer, perhaps; though non-placebo-induced reality is what he should be interested in.  He, being a customer who prefers not to exist in a delusional state when it comes to his purchases.

You know, just the thing that snake-oil peddlers will go to great lengths to suppress.

We better find some way to discriminate between daydreams and reality though, since ABX tests are simply not sensitive enough to detect what are otherwise night-and-day differences.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 10:20:41
Have we exhausted the original subject material to the point that we're now speculating about human nature in general?


Is there a basic difference in the human nature of us people with a scientific orientation and the dedicated placebophiles, such as the one before us?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 10:23:28
I was thinking. Maybe an ABX test proctored via a Facetime telephone call could work?
Aim the camera at the test subject's display screen, launch foobar, play the files to prove they are what they say, launch ABX, take test, and conclude test with signature verification all being filmed, er, streamed to the proctor so they can see details like time codes and signature verification codes as they appear on screen?

Well, I can already think of ways to cheat, but they would be pretty darn extreme. Thought I'd throw this out there though in case it sparks other ideas that don't require expensive, time-consuming journeys.


We have some recent experience with a fairly sophisticated liar and cheater. The challenge is to come up with some way to rein in people like him.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 11:53:06
We have some recent experience with a fairly sophisticated liar and cheater. The challenge is to come up with some way to rein in people like him.

We also have recent experience of people doing tests who don't actually listen, they just guess randomly. This should also be a challenge to rein in.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 13:01:57
We have some recent experience with a fairly sophisticated liar and cheater. The challenge is to come up with some way to rein in people like him.

We also have recent experience of people doing tests who don't actually listen, they just guess randomly. This should also be a challenge to rein in.

What do you want to get at? That nobody can be convinced who doesn't want to be convinced? That's trivial: Ignorance always wins. You demonstrate that quite forcefully.

The people who do want to be convinced will want to look at the credibility of a test or an argument. If someone gets caught cheating, he will not be believed even when he occasionally is right. I find that quite natural.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 13:28:35
We have some recent experience with a fairly sophisticated liar and cheater. The challenge is to come up with some way to rein in people like him.

We also have recent experience of people doing tests who don't actually listen, they just guess randomly. This should also be a challenge to rein in.

What do you want to get at? That nobody can be convinced who doesn't want to be convinced? That's trivial: Ignorance always wins. You demonstrate that quite forcefully.

The people who do want to be convinced will want to look at the credibility of a test or an argument. If someone gets caught cheating, he will not be believed even when he occasionally is right. I find that quite natural.

The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring. I consider randomly guessing also a form of cheating as it's not actually doing the test. We have no way to determine this has occurred & hence the need to include this possibility in any auditing procedures. Yes, I'm suggesting that the credibility of the test needs to be addressed - it's not just a case of examining positive ABX results for credibility - it also applies to null results.
Title: How do you listen to an ABX test?
Post by: pdq on 2015-04-02 13:38:43
This is how it normally works at HA:

Member A thinks he can hear a difference between two versions. If he passes a DBT then he submits his results along with samples and a description of what he hears.

Member B attempts to verify the difference. If verified then the difference is accepted, otherwise it is up to member C, etc.

As long as it has not been verified then the issue remains open, EXCEPT if user A has a proven track record of being able to discern effects that are difficult for others (what some may call "golden ears"), in which case the original results are usually accepted.

This system works well for us, and I am sorry if you do not find it acceptable.
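
One way to see why independent verification carries weight: if every member were only guessing, the chance that all of them pass is the product of their individual guessing probabilities. A rough Python sketch, assuming the runs are independent; the function names and the example scores are hypothetical.

```python
from math import comb, prod


def guess_tail(correct: int, trials: int) -> float:
    """Chance of at least `correct` hits in `trials` fair coin flips."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def chance_all_pass_by_guessing(score_sheets) -> float:
    """Chance that every (correct, trials) sheet was reached by guessing alone,
    assuming the listeners and runs are independent of each other."""
    return prod(guess_tail(correct, trials) for correct, trials in score_sheets)


if __name__ == "__main__":
    sheets = [(12, 16), (13, 16)]  # hypothetical scores for member A and member B
    print(f"both by luck: {chance_all_pass_by_guessing(sheets):.6f}")
```

This only addresses luck. It says nothing about a shared flaw in the samples, which would let every verifier pass for the same wrong reason.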
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 14:05:59
This is how it normally works at HA:

Member A thinks he can hear a difference between two versions. If he passes a DBT then he submits his results along with samples and a description of what he hears.

Member B attempts to verify the difference. If verified then the difference is accepted, otherwise it is up to member C, etc.

As long as it has not been verified then the issue remains open, EXCEPT if user A has a proven track record of being able to discern effects that are difficult for others (what some may call "golden ears"), in which case the original results are usually accepted.

This system works well for us, and I am sorry if you do not find it acceptable.

If you want to disregard an obvious flaw & only focus on disproving positive ABX results, I am also sorry if you do not find it acceptable.
The scientific approach to such a test would be to focus equally on the validity of the positive & negative results - if truth is what you actually want to reveal in such a test.
If truth is not the objective of such a test then continue as you were.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 14:19:25
We have some recent experience with a fairly sophisticated liar and cheater. The challenge is to come up with some way to rein in people like him.

We also have recent experience of people doing tests who don't actually listen, they just guess randomly. This should also be a challenge to rein in.


We can get all the examples of random guessing we can possibly stomach by reading the general run of audio forums (WBF, AVS, etc.) and publications such as Stereophile, Absolute Sound, etc.  Two words: Sighted evaluations. These are actually worse than a fair sighted evaluation because the outcome is usually directed by some pseudo-scientific explanation.

The usual hypercritical posturing by the people who defend and even seek to profit by this activity is easy to dismiss.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 14:22:09
This is how it normally works at HA:

Member A thinks he can hear a difference between two versions. If he passes a DBT then he submits his results along with samples and a description of what he hears.

Member B attempts to verify the difference. If verified then the difference is accepted, otherwise it is up to member C, etc.

As long as it has not been verified then the issue remains open, EXCEPT if user A has a proven track record of being able to discern effects that are difficult for others (what some may call "golden ears"), in which case the original results are usually accepted.

This system works well for us, and I am sorry if you do not find it acceptable.


If you want to disregard an obvious flaw & only focus on disproving positive ABX results,


Straw man argument.

Quote
The scientific approach to such a test would be to focus equally on the validity of the positive & negative results


That's what happens.

Compare that to the very many audiophile forums and publications where negative results are banned and people who talk about them are banned.


Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 14:28:30
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?


No one's pretending sighted evaluations of DACs return trustworthy information about their sound, are they?
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 14:35:17
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.

Now, tell us again why yours, or anyone's , sighted evaluations of DAC sound are trustworthy?



Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 14:36:42
So what is being demonstrated here is the complete uselessness of ABX testing the scientific method

Indeed science is not easy, which is I think your actual complaint, but it is still very useful if you are willing to put in the work.
Nobody is really pretending that ABX tests conducted outside of a laboratory with expertise in psychoacoustic/neuroscience is actually science, are they?


No one's pretending sighted evaluations of DACs return accurate information about their sound, are they?


What about a guy who may have bet his IRA on a wholesale quantity of Euro-Trash high end DACs to resell over here? ;-)
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 14:46:41
It's always interesting to see how people avoid the real issue & deflect with all sorts of the usual techniques learned on forums.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 14:51:36
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.



The usual rule is that any exceptional result must be repeated, and if repeated by the originator, repeated with independent supervision or a proctor.

So we have this guy over on WBF who claims to have done a single unproctored ABX test that is: "Conclusive "Proof" that higher resolution audio sounds different"

Link to Post (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=276574&viewfull=1#post276574)

Now if certain people were true to their claims over here, might they have confronted that sideshow over there?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 14:53:49
It's always interesting to see how people avoid the real issue & deflect with all sorts of the usual techniques learned on forums.



Yes, we've had a master's class in that sort of behavior over on AVS...
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 15:04:05
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.

Now, tell us again why yours, or anyone's , sighted evaluations of DAC sound are trustworthy?

And you would not trust any proctor I used.
Glad we could clear that up too
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 15:39:02
If you want to disregard an obvious flaw & only focus on disproving positive ABX results, I am also sorry if you do not find it acceptable.

How can one "disprove" a negative ABX result anyhow? This stuff is by its very nature not symmetrical. A negative ABX test result can be due to any number of reasons, which the test won't help you distinguish from each other. A cheating tester is only one of those reasons.

For this reason the negative result proves nothing. It means that under the circumstances of the test, with the given listeners, the differences couldn't be demonstrated as audible. It doesn't mean that there were no differences, it doesn't mean that nobody can hear the differences, it doesn't even mean that the same people would be unable to hear them in another test.

While a negative result is pretty easy to ignore, a positive isn't. If there was no flaw in the test procedure, it means that there really were audible differences. It doesn't, however, tell you what they were and what caused them.

This is a very grave asymmetry to start with. This should actually be in favor of the audiophiles, because in a symmetrical situation, a larger number of negative results would beat a smaller number of positive results. You couldn't achieve much by succeeding at an ABX test then, because you would have to outnumber the negative results to tip the balance. I am absolutely sure that you wouldn't accept that. You would consider one successful ABX test to be authoritative already.

So the situation is clearly not symmetrical, hence your whining about not giving negative results the same scrutiny is pointless at best, and dishonest at worst. Why is it necessary to give a failed ABX test such a level of scrutiny when it can and will be disregarded anyway?

You may say that we are treating negative results as being significant, contrary to what I just wrote. After all, we see the lack of valid positive results as support for our position. That is true to some extent, but it is not due to some negative results that we manufactured ourselves, it is the failures of those who claimed audibility before, which matters here. If you think that they could have cheated, or otherwise compromised the test, they would have shot themselves in the foot. I don't see why I should be reconsidering my position because of that remote possibility.
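
The asymmetry described above can be put in numbers. A hedged Python sketch, assuming a 16-trial run, a 5% significance criterion, and a listener who answers each trial correctly with a fixed probability; the function names are invented for this illustration.

```python
from math import comb


def pass_threshold(trials: int, alpha: float = 0.05) -> int:
    """Smallest score whose guessing-only tail probability is <= alpha."""
    for k in range(trials + 1):
        tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2 ** trials
        if tail <= alpha:
            return k
    return trials + 1  # no achievable score meets the criterion


def miss_rate(p_correct: float, trials: int, alpha: float = 0.05) -> float:
    """Probability that a listener with per-trial accuracy `p_correct`
    ends up below the pass threshold, i.e. produces a negative result."""
    k_min = pass_threshold(trials, alpha)
    return sum(
        comb(trials, j) * p_correct ** j * (1 - p_correct) ** (trials - j)
        for j in range(k_min)
    )


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.75, 0.9):
        print(f"per-trial accuracy {p:.2f}: negative result with probability {miss_rate(p, 16):.3f}")
```

Under these assumptions a listener who really does hear something, but is right only 60% of the time, fails a 16-trial run more often than not, while a pure guesser passes only rarely. A pass is hard to get by luck and a fail is easy to get for many reasons, which is why the two outcomes do not carry the same weight.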
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 16:26:56
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.

Now, tell us again why yours, or anyone's, sighted evaluations of DAC sound are trustworthy?

And you would not trust any proctor I used.
Glad we could clear that up too



Let's see if we can summarize this. You are not going to do any ABX tests because, were there any positive outcomes, we'd dismiss them.

This is what is known as a logic-tight box, because the above rhetoric is illogical, anti-scientific and self-defeating.

(1) You've already precluded the possibility that you would obtain negative results and learn something from that experience.

(2) It seems to show that what's most important to you is adulation from others, not personal knowledge of the truth.

There's another possibility: You might get positive results and then some fun would begin.

Anyway, some relevant test files can be found here: High Resolution audio files for ABX-ing (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177).

If you were right about the facts of this matter, you'd download those files, obtain positive results, and have the last laugh. It would be a great recommendation for your wunder-DACs as well.  But, you've already carted them off to the Salvation Army Store and taken the tax loss, right?
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-02 16:39:26
pelmazo, don't waste your breath. We've already explained exactly that to jkeny 2 months ago. It's just bad trolling.
I have even explained to him why his suggestion to identify false negatives is flawed and easily exploited by dishonest people. If you publicly demonstrate your willful ignorance or dishonesty (like we've seen certain individuals doing in the past months) then you should expect people to reject your test results.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 16:59:52
If you want to disregard an obvious flaw & only focus on disproving positive ABX results, I am also sorry if you do not find it acceptable.

How can one "disprove" a negative ABX result anyhow? This stuff is by its very nature not symmetrical. A negative ABX test result can be due to any number of reasons, which the test won't help you distinguish from each other. A cheating tester is only one of those reasons.

For this reason the negative result proves nothing. It means that under the circumstances of the test, with the given listeners, the differences couldn't be demonstrated as audible. It doesn't mean that there were no differences, it doesn't mean that nobody can hear the differences, it doesn't even mean that the same people would be unable to hear them in another test.

While a negative result is pretty easy to ignore, a positive isn't. If there was no flaw in the test procedure, it means that there really were audible differences. It doesn't, however, tell you what they were and what caused them.

This is a very grave asymmetry to start with. This should actually be in favor of the audiophiles, because in a symmetrical situation, a larger number of negative results would beat a smaller number of positive results. You couldn't achieve much by succeeding at an ABX test then, because you would have to outnumber the negative results to tip the balance. I am absolutely sure that you wouldn't accept that. You would consider one successful ABX test to be authoritative already.

So the situation is clearly not symmetrical, hence your whining about not giving negative results the same scrutiny is pointless at best, and dishonest at worst. Why is it necessary to give a failed ABX test such a level of scrutiny when it can and will be disregarded anyway?

You may say that we are treating negative results as being significant, contrary to what I just wrote. After all, we see the lack of valid positive results as support for our position. That is true to some extent, but it is not due to some negative results that we manufactured ourselves, it is the failures of those who claimed audibility before, which matters here. If you think that they could have cheated, or otherwise compromised the test, they would have shot themselves in the foot. I don't see why I should be reconsidering my position because of that remote possibility.

OK, glad to see that you recognise that the accumulation of negative results is often used to support the claim that certain things are not audible - they are not a neutral factor nor ignored. Yes an individual null result proves nothing but an accumulation of null results is a strong indicator of inaudibility of the device/etc under test. This is not often admitted to in such discussions so it's refreshing to encounter it.

So, my position is that we don't know how "valid" these null results are - I gave the example of someone knowingly "cheating" by randomly guessing without listening. Ample evidence is already given in this thread & other such "cheating" posted by ArnyK on AVS. But there are many other situations/circumstances where null results can arise without the listener knowingly "cheating" - situations that will skew the test towards a null result. Tiredness; loss of focus; unsuitability of the environment or playback equipment to reveal small, audible differences; unsuitability of the person's hearing; disinterest/no pre-training to identify audible differences prior to test & many, many more possible reasons why a null result may be returned other than a valid null result from a genuine, valid test   

The question really is - are you interested in evaluating how many of these null results are actually valid or are you happy with the existing situation? I would love to see a genuine interest in the validity of ALL results coming from such tests & not just the positive ones being examined.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 17:03:44
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.

Now, tell us again why yours, or anyone's, sighted evaluations of DAC sound are trustworthy?

And you would not trust any proctor I used.
Glad we could clear that up too



Let's see if we can summarize this. You are not going to do any ABX tests because, were there any positive outcomes, we'd dismiss them.

This is what is known as a logic-tight box, because the above rhetoric is illogical, anti-scientific and self-defeating.

(1) You've already precluded the possibility that you would obtain negative results and learn something from that experience.

(2) It seems to show that what's most important to you is adulation from others, not personal knowledge of the truth.

There's another possibility: You might get positive results and then some fun would begin.

Anyway, some relevant test files can be found here: High Resolution audio files for ABX-ing (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177).

If you were right about the facts of this matter, you'd download those files, obtain positive results, and have the last laugh. It would be a great recommendation for your wunder-DACs as well.  But, you've already carted them off to the Salvation Army Store and taken the tax loss, right?


I've asked over & over again what proctor would be accepted but got no answer. Care to answer this?
In the absence of this why would anyone waste their time doing such a test? Your phrase "You might get positive results and then some fun would begin." reveals exactly how much you are actually interested in finding truth
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 17:11:58
pelmazo, don't waste your breath. We've already explained exactly that to jkeny 2 months ago. It's just bad trolling.
I have even explained to him why his suggestion to identify false negatives is flawed and easily exploited by dishonest people.
Huh? Got a link to your explanation?
Quote
If you publicly demonstrate your willful ignorance or dishonesty (like we've seen certain individuals doing in the past months) then you should expect people to reject your test results.

Who are you referring to here, exactly?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 17:12:16
If you want to disregard an obvious flaw & only focus on disproving positive ABX results, I am also sorry if you do not find it acceptable.

How can one "disprove" a negative ABX result anyhow?


Child's play: Merely reliably and cleanly obtain a number of positive results. Preferably have a number of independent experimenters do it.

Just imagine all of the golden ears in the world ABXing away and ultimately proving the meter readers wrong. 

Quote
This stuff is by its very nature not symmetrical.


Absence of proof is not proof of absence.

Quote
A negative ABX test result can be due to any number of reasons, which the test won't help you distinguish from each other. A cheating tester is only one of those reasons.


You'd think that some people think they are the only honest people around. Some of them would like us to believe that ABX Comparators are like hen's teeth, and that only the blessed few can use them.

Quote
For this reason the negative result proves nothing. It means that under the circumstances of the test, with the given listeners, the differences couldn't be demonstrated as audible. It doesn't mean that there were no differences, it doesn't mean that nobody can hear the differences, it doesn't even mean that the same people would be unable to hear them in another test.


Exactly.

Quote
While a negative result is pretty easy to ignore, a positive isn't. If there was no flaw in the test procedure, it means that there really were audible differences. It doesn't, however, tell you what they were and what caused them.


Let's review the moving goal posts in this discussion. Once upon a time the golden ears told us that the differences were "Mind Blowing".  In those exact words.  Strangely enough they became elusive once the usual list of obvious non-audible cues were removed. Common sense suggests that since the "Mind Blowing" audible differences disappeared when the sighted cues were removed, maybe the "Mind Blowing" audible differences were due to the sighted cues. Of course correlation isn't always proof of causality, but often it is or at least it gets you looking in the right places.

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-02 17:17:18
I've asked over & over again what proctor would be accepted but got no answer. Care to answer this?


Sure:

JJ

Me

John Vanderkooy

Stan Lipshitz

and 100s of others who are probably more conveniently physically sited

Besides, aren't you in this for the sake of knowledge and truth?  Wouldn't just knowing be enough?  It has been for me on many occasions. For example, I didn't spill the beans about the first ABX test ever done until someone did the second-dozen or more ones. Didn't want to bias them.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 17:22:45
......
Let's review the moving goal posts in this discussion. Once upon a time the golden ears told us that the differences were "Mind Blowing".  In those exact words.  Strangely enough they became elusive once the usual list of obvious non-audible cues were removed. Common sense suggests that since the "Mind Blowing" audible differences disappeared when the sighted cues were removed, maybe the "Mind Blowing" audible differences were due to the sighted cues. Of course correlation isn't always proof of causality, but often it is or at least it gets you looking in the right places.

And here we have an example of a null result being used quite contrary to the neutral pretence, "doesn't prove anything" excuse used in these forums. It is very much used in this politically disingenuous way
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 17:29:00
I've asked over & over again what proctor would be accepted but got no answer. Care to answer this?


Sure:

JJ

Me

John Vanderkooy

Stan Lipshitz

and 100s of others who are probably more conveniently physically sited
Pretty much as I suspected - you need to get a grip on reality if you think any of those named people would be remotely interested in proctoring. It makes any positive ABX results impossible - just as the whole silly issue of proctoring was designed to do.

You guys have backed yourself up into a corner of reality that is untenable. All ABX tests need to be proctored, right?

Quote
Besides, aren't you in this for the sake of knowledge and truth?  Wouldn't just knowing be enough?  It has been for me on many occasions. For example, I didn't spill the beans about the first ABX test ever done until someone did the second-dozen or more ones. Didn't want to bias them.

Do you not think I haven't done my own personal blind tests & am satisfied with my personal conclusions? That is truth for me
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 18:10:48
The point is that the test itself doesn't have the necessary controls to eliminate cheating & hence the need for proctoring.


I'd require proctoring for *you* not because ABX without positive/negative controls means 'cheating', but because I don't trust *you*.

Glad we could clear that up.

Now, tell us again why yours, or anyone's, sighted evaluations of DAC sound are trustworthy?

And you would not trust any proctor I used.
Glad we could clear that up too



Correct.  I would get to choose or approve the proctor.  Because I don't trust you.

Now, please answer the question posed in the rest of the post:

Tell us again why yours, or anyone's, sighted evaluations of DAC sound are trustworthy?
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 18:18:26
OK, glad to see that you recognise that the accumulation of negative results is often used to support the claim that certain things are not audible - they are not a neutral factor nor ignored.



Neither are they ignored in scientific work.  Does that make science not 'neutral' either?



Quote
Yes an individual null result proves nothing but an accumulation of null results is a strong indicator of inaudibility of the device/etc under test. This is not often admitted to in such discussions so it's refreshing to encounter it.


You have no idea what you're talking about.  It is routinely 'admitted'. 



Look, let's cut to the chase around all these attempts of yours to generate FUD and reinvent the wheel:

You make and sell DACs.

You claim your DACs sound a certain way. 

How would you prove it?  Describe your method.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 18:33:37
Pretty much as I suspected - you need to get a grip on reality if you think any of those named people would be remotely interested in proctoring. It makes any positive ABX results impossible - just as the whole silly issue of proctoring was designed to do.


Get over yourself.  It doesn't make 'any' positive ABX results impossible  -- indeed, if you had a clue, you'd know that right here on HA, there have been some highly unlikely but nevertheless 'accepted' ABX results over the years (e.g., 320 kbps mp3). 

It merely makes any highly unlikely positive ABX results *you* post, unlikely to be believed.  By me.  Others can speak for themselves.

Quote
You guys have backed yourself up into a corner of reality that is untenable. All ABX tests need to be proctored, right?


Nope.  In science one assumes good faith all the time.  But you're the one who claims it's easy to cheat, right?

Quote
Do you not think I haven't done my own personal blind tests & am satisfied with my personal conclusions? That is truth for me


But not just 'for you', eh?  You claim your DACs will sound different/better...presumably not just to you.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 18:44:33
The question really is - are you interested in evaluating how many of these null results are actually valid or are you happy with the existing situation? I would love to see a genuine interest in the validity of ALL results coming from such tests & not just the positive ones being examined.

You only seem to read half of what I wrote. No, I'm not interested in evaluating that, because there is no such thing as a null result being valid or invalid to start with. The mere concept is bunk. Snap out of it!

For example, say I complete an ABX test with a null result, and my friend completes the same test with a success. Does that make my test invalid or not? On what grounds do you decide that? Assume I didn't cheat or play games with the test. Isn't it perfectly normal for such a test that some people fail and others succeed, even when nothing is wrong with the test? Could be due to differences in hearing ability, in training, in form, in patience, in luck ... You are listing them yourself, yet you don't draw the obvious conclusion!
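
To put a rough number on this, here is a minimal sketch of how often a listener with a real but imperfect ability would still end up with a null result. The 16-trial length, the 12-correct pass mark and the 70% per-trial hit rate are illustrative assumptions, not figures from any test discussed in this thread.

```python
from math import comb


def pass_probability(hit_rate: float, trials: int = 16, pass_mark: int = 12) -> float:
    """Chance of scoring at least pass_mark correct when every trial is
    answered correctly with probability hit_rate, independently.

    trials=16 and pass_mark=12 are assumed values for illustration only.
    """
    return sum(
        comb(trials, k) * hit_rate**k * (1 - hit_rate) ** (trials - k)
        for k in range(pass_mark, trials + 1)
    )


if __name__ == "__main__":
    # 0.5 is pure guessing; 0.7 and 0.9 are hypothetical listeners.
    for rate in (0.5, 0.7, 0.9):
        print(f"hit rate {rate:.0%}: passes with probability {pass_probability(rate):.3f}")
```

Under these assumptions a listener who is right 70% of the time misses the pass mark slightly more often than they reach it, so two honest listeners taking the same test can easily land on opposite sides of it.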

The only thing you can say is that the test succeeded or didn't succeed. If it didn't, you won't know why. You'd have to design another test to find that out if you wanted to know more. Only if it succeeded, the question arises whether it succeeded because of a condition that made the test invalid.

Quote
And here we have an example of a null result being used quite contrary to the neutral pretence, "doesn't prove anything" excuse used in these forums. It is very much used in this politically disingenuous way

If a null result doesn't prove anything, the default position until proven otherwise has to be to reject the claim. That a series of null results bolsters this position may look disingenuous to you, but in fact it changes nothing. A claim that has to be rejected anyway can - strictly speaking - not be rejected even more, nor does it need to be rejected more. The increased confidence in the rejection is purely psychological, yet it is perfectly reasonable. If it doesn't please you, you are free to ignore it. The rejection remains anyway.

Quote from: xnor
It's just bad trolling.

Looks like it.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 18:46:22
OK, glad to see that you recognise that the accumulation of negative results is often used to support the claim that certain things are not audible - they are not a neutral factor nor ignored.



Neither are they ignored in scientific work.  Does that make science not 'neutral' either?



Quote
Yes an individual null result proves nothing but an accumulation of null results is a strong indicator of inaudibility of the device/etc under test. This is not often admitted to in such discussions so it's refreshing to encounter it.


You have no idea what you're talking about.  It is routinely 'admitted'. 
OK, good, my statement is wrong & it is admitted that null results are of importance. Therefore the validity of these null results must be of importance - after all you don't want to be strongly swayed by invalid results, do you?

So then proctoring is needed, right?


Quote
Look, let's cut to the chase around all these attempts of yours to generate FUD and reinvent the wheel:

You make and sell DACs.

You claim your DACs sound a certain way. 

How would you prove it?  Describe your method.


It's not about my DACs - it's about how ABX tests are done
I will never be able to "prove" that my DACs sound the way I describe - it's up to individual customers to decide this in whatever way they see fit - sighted/blind listening. They have 30 days return with no restocking fees, so it's their personal tests that count, not a case of "proving" or "persuading" anybody of anything.

When it comes to tests that are not personal tests & used to "prove" or "strongly indicate" something, then a whole raft of other considerations need addressing
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 19:01:24
Quote
Pretty much as I suspected - you need to get a grip on reality if you think any of those named people would be remotely interested in proctoring. It makes any positive ABX results impossible - just as the whole silly issue of proctoring was designed to do.


Get over yourself.  It doesn't make 'any' positive ABX results impossible  -- indeed, if you had a clue, you'd know that right here on HA, there have been some highly unlikely but nevertheless 'accepted' ABX results over the years (e.g., 320 kbps mp3). 

It merely makes any highly unlikely positive ABX results *you* post, unlikely to be believed.  By me.  Others can speak for themselves.
So, as I said the test is useless as a test other than a personal test - it leads nowhere - anyone can decide that they don't trust you & demand proctoring - it's complete rubbish. Logically, if you insist on proctoring then it has to be done for all ABX tests!!

Quote
Quote
You guys have backed yourself up into a corner of reality that is untenable. All ABX tests need to be proctored, right?


Nope.  In science one assumes good faith all the time.  But you're the one who claims it's easy to cheat, right?
Nope, I'm not the one that raised the requirement for proctoring - it was first raised as the final "excuse" to deny Amir's results. Cheating was then demonstrated by ArnyK in his "false" null ABX results. It was demonstrated here by Mzil posting a number of ABX null results with his random guessing.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 19:14:42
The question really is - are you interested in evaluating how many of these null results are actually valid or are you happy with the existing situation? I would love to see a genuine interest in the validity of ALL results coming from such tests & not just the positive ones being examined.

You only seem to read half of what I wrote. No, I'm not interested in evaluating that, because there is no such thing as a null result being valid or invalid to start with. The mere concept is bunk. Snap out of it!

For example, say I complete an ABX test with a null result, and my friend completes the same test with a success. Does that make my test invalid or not? On what grounds do you decide that? Assume I didn't cheat or play games with the test. Isn't it perfectly normal for such a test that some people fail and others succeed, even when nothing is wrong with the test? Could be due to differences in hearing ability, in training, in form, in patience, in luck ... You are listing them yourself, yet you don't draw the obvious conclusion!

The only thing you can say is that the test succeeded or didn't succeed. If it didn't, you won't know why. You'd have to design another test to find that out if you wanted to know more. Only if it succeeded, the question arises whether it succeeded because of a condition that made the test invalid.
That would be fine if all null results were disregarded & considered of no significance but this isn't the case as you & krabapple have admitted - they form part of the body of "evidence".  No, "Results" presuppose that a real "test" had actually taken place. In the ABX null results posted here by mzil, where he just randomly guessed - do you consider that he "took a test"? Do you consider his "results" have any meaning or should they be eliminated from the "accumulated body of evidence"?

If I sat a monkey down in front of the keyboard would I most likely get the same null results? Would you count the monkey's results among the null results?
Would you consider the results produced by a deaf person as a valid test? What about someone who demonstrated a hearing impairment in the audible area being tested? What about someone who has demonstrated that they are pre-biased to not hearing any differences? What about someone who is so tired that they aren't focussed? What about playback equipment that is unsuitable for revealing differences? Do I need to go on?

In all properly designed scientific tests, controls are used to eliminate conditions that disqualify the results from being counted among the valid results.

Quote
Quote
And here we have an example of a null result being used quite contrary to the neutral pretence, "doesn't prove anything" excuse used in these forums. It is very much used in this politically disingenuous way

If a null result doesn't prove anything, the default position until proven otherwise has to be to reject the claim. That a series of null results bolsters this position may look disingenuous to you, but in fact it changes nothing. A claim that has to be rejected anyway can - strictly speaking - not be rejected even more, nor does it need to be rejected more. The increased confidence in the rejection is purely psychological, yet it is perfectly reasonable. If it doesn't please you, you are free to ignore it. The rejection remains anyway.
Yes, & the important word here is "result" - you have to decide what is a result & what should be eliminated from "results". In your statements you are including everything into results even the deaf monkey's "results"
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 19:27:53
What we have here is a great example of experimenter's bias (http://en.wikipedia.org/wiki/Experimenter%27s_bias) or research bias where the test is skewed towards a particular result & any attempt at pointing out how it might be improved are rejected.

You guys cite biasing so much that I'm sure you know about this particular bias - so why the rabid rejection of examining the ABX test?
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 19:46:03
No, results presuppose that a "test" was actually taken. In the ABX null results posted here by mzil, where he just randomly guessed - do you consider that he "took a test" & delivered "results"?

It doesn't matter. No one except him could tell the difference. That's why your whole idea of an invalid null result is bunk.

And let me add this: The fact that it doesn't matter is an important quality of such a test. It speaks for the ABX test method, and is a major factor in its usefulness.

Quote
If I sat a monkey down in front of the keyboard would I most likely get the same null results? Would you count the monkey's results among the null results?

I am not supposed to judge the monkey. If he produces a null result, I count it as a null result. If he produces a result that deviates significantly from chance, I count it accordingly. If I did anything else I'd be rigging the test.
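
As a rough sketch of what "deviates significantly from chance" can mean in practice, the snippet below computes the one-sided binomial p-value for an ABX score and labels it. The 5% cut-off is a common convention assumed here for illustration; no threshold is named in this post. Note that the function only ever sees the score, never who or what produced it.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right in `trials`
    ABX trials by pure guessing (each trial treated as a fair coin flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials


def classify(correct: int, trials: int, alpha: float = 0.05) -> str:
    """Label a score from the numbers alone.

    alpha=0.05 is an assumed convention, not something fixed by ABX itself.
    """
    return "positive" if abx_p_value(correct, trials) < alpha else "null"


if __name__ == "__main__":
    for score in (8, 12, 14):
        print(f"{score}/16: p = {abx_p_value(score, 16):.4f} -> {classify(score, 16)}")
```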

Quote
Would you consider the results produced by a deaf person as a valid test? What about someone who demonstrated a hearing impairment in the audible area being tested? What about someone who has demonstrated that they are pre-biased to not hearing any differences? What about someone who is so tired that they aren't focussed? What about playback equipment that is unsuitable for revealing differences? Do I need to go on?

No, you needn't go on. I would count all of them as valid results. Doing anything else would put my own judgement above their results. I would effectively override their test results, thereby making the test invalid. Any test is invalid if the test administrator is allowed to override the test results of selected participants. Isn't that abundantly clear?

Now, it is true that a hearing test conducted with monkeys or deaf people might be regarded as pointless. That is an unfortunate consequence of a poor test design. If you aren't interested in the hearing abilities of monkeys or deaf people, you should exclude them from the test before the start. Once they are in, they are in - this doesn't make the test invalid. It may merely make it useless, depending on what the question was. This is so by design, I have to emphasize it again! It is not a fault or deficiency of ABX, quite the opposite, it is a major factor of its usefulness.

Quote
Yes, & the important word here is "result" - you have to decide what is a result & what should be eliminated from "results". In your statements you are including everything into results even the deaf monkey's "results"

Yes, that is quite deliberately so. You are trying to ridicule my position, but you miss that it isn't ridiculous at all. It is the opposite: Your position puts the designer/administrator of the test into a position where he can manipulate the criteria for result acceptance to his liking, potentially even after the fact. That's what I would call invalid!

And you wonder why other people don't trust you as a test designer/administrator? Amusing!
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-02 19:49:29
All ABX tests need to be proctored, right?

Can you cite some AES ones that aren't?
Yes, generally speaking they should be, especially when done by known shysters and those with strong pecuniary interests in the audio fashion jewelry business.

Do you not think I haven't done my own personal blind tests & am satisfied with my personal conclusions? That is truth for me

Yes, as has your pal (http://www.avsforum.com/forum/86-ultra-hi-end-ht-gear-20-000/1136745-establishing-differences-10-volume-method-14.html). We don't care about those (outside of the tremendous entertainment factor  ), or your own sub 9 second 100m dashes in the back yard, or whether you've bent spoons in front of your wife and friends.
We need more than photoshopped unicorn pics around here. We also understand that there are many who don't and will accept your daydreams as concurrent reality with theirs. But this isn't their hangout.

I will never be able to "prove" that my DACs sound the way I describe

We know.
Blind tests are great for revealing real audio differences. They are worthless for apparitions. So the apparition believers reject them and will seek to discredit them.
Especially those with strong pecuniary interests.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-02 19:54:21
I think by his [jkeny] logic, if some student were to fill out a true/false test in school without reading the questions and simply checking all the "true" boxes, this would prove that true/false tests "don't work" and that the entire test methodology should be scrapped. 

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 19:57:51
................
Especially those with strong pecuniary interests.

cheers,

AJ

You talk about my pecuniary interests. So you think that my posting on here has something to do with my €500 DACs (my most expensive product).
I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 20:03:29
I think by his [jkeny] logic, if some student were to fill out a true/false test in school without reading the questions and simply checking all the "true" boxes, this would prove that true/false tests "don't work".

You obviously know nothing about how tests are designed & administered to answer a specific question.

In the student test, it is designed to ascertain "true knowledge" from random guessing & if he ticks all the "true" boxes he will fail - similarly if he ticks all the "false" boxes, similarly if he randomly ticks true & false boxes.

Get your test design right to answer the question under examination - this concept is patently missing from your thinking about ABX testing
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-02 20:29:29
Guessing "true" on every question in a true/false test, without even reading the question, is the same as guessing B on every trial of an ABX test, without really listening to X. In both instances it is the fastest way to complete a test when the only concern is simply to determine how fast a test can mechanically be completed due to the delay in being forced to give a response and then clicking a "move on to the next trial" button.  I've completed such tests in this forum for exactly that purpose, not caring what the score was since my only purpose, at the time for THAT test, was to check for how quickly a test can be completed due to the physical demands of moving one's cursor from place to place and clicking when appropriate.

I've also shown the fastest I can do where I actually listened to X and had to think and process the sound to make a decision and I demonstrated that I wasn't just randomly guessing, in THAT test, by getting a perfect score. That's not as fast though as just guessing.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-02 20:30:28
You talk about my pecuniary interests.

Yep, you're "In the biz". The biz that ABX is very bad for.

So you think that my posting on here has something to do with my €500 DACs (my most expensive product).

Yes and the "organic" way they "sound"....free of ABX. You said so yourself.

I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers

Actually I just sold a $20k set...without resorting to nonsensical claims about magic power supplies, interleaving, gold alchemy and smell of flowers, etc.
Or pounding the drums of doubt/rejecting blind/ABX testing.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 20:37:11
Guessing "true" on every question in a true/false test, without even reading the question, is the same as guessing B on every trial of an ABX test, without really listening to X. In both instances it is the fastest way to complete a test when the only concern is simply to determine how fast a test can mechanically be completed due to the delay in being forced to give a response and then clicking a "move on to the next trial" button.  I've completed such tests in this forum for exactly that purpose, not caring what the score was since my only purpose was to test for how quickly a test can be completed due to the physical demands of moving one's cursor from place to place and clicking when appropriate.
Yes & without your admission there is no way to know that you randomly guessed. ArnyK didn't admit this initially when he posted his null ABX test results - he only admitted to this eventually when questioned on his timing (btw, he beat you for fastest random key hitter)

The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-02 20:44:05
. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.
Nope. You don't get it. Not hearing a difference, on that day, with that person, with that system, with that song, doesn't prove anything. It's not really evidence one way or the other. It could have been guessing, a poor song selection, a poor listener, almost anything. You can't prove a negative. It's only when a person gets a good score, where a strong statistical difference is shown, that you start to say you have a suggestion of having found some pay dirt. But I'm not going to waste my time trying to explain this concept to you. Bye.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 20:51:31
The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.

Yes, exactly. And that is how it should be. The null result is the default assumption anyway, so it doesn't matter how many results are being piled in there. You see, it all makes perfect sense! 

There is no alternative anyway. Or do you have a way to look into people's heads to determine whether they were deliberately cheating or otherwise unfit?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 20:52:22
You talk about my pecuniary interests.

Yep, you're "In the biz". The biz that ABX is very bad for.
I have no problem with anybody doing personal blind tests - it's truth for them. When it comes to presenting such "evidence" for public dissemination as "evidence" for a particular stance then more stringent criteria & examination are required. 

Quote
So you think that my posting on here has something to do with my €500 DACs (my most expensive product).

Yes and the "organic" way they "sound"....free of ABX. You said so yourself.
I never posted that here - you pulled it off my website & posted it here. Much like your statement "Our products reflect the philosophy that loudspeakers should strive to sound like the real thing." Who are you trying to fool? Maybe yourself? We all know that audio playback is not about trying to sound like the "real thing"- that's as silly as it gets. Nobody was ever fooled that listening to playback is like a live event. Audio playback is about creating an illusion - an illusion that appeals to our auditory perception as somewhat realistic. There is no system that comes anywhere near being mistaken for a live musical event. 

Quote
I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers

Actually I just sold a $20k set...without resorting to nonsensical claims about magic power supplies, interleaving, gold alchemy and smell of flowers, etc.
Or pounding the drums of doubt/rejecting blind/ABX testing.

cheers,

AJ

Well done! I'm sure we would all be interested in the blind test results of your $20,000 speakers Vs your $1,600 speakers - I don't see any posted on your website. So what does the extra $18,400 deliver?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 21:18:22
Quote
Quote

. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.
Nope. You don't get it. Not hearing a difference, on that day, with that person, with that system, with that song, doesn't prove anything. You can't prove a negative. But I'm not going to waste my time trying to explain this concept to you. Bye.
Yep, I get it. Treating all null results as valid & piling them into a block of evidence designed to create an edifice of damning evidence - this is what your tactic is all about.

The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.

Yes, exactly. And that is how it should be. The null result is the default assumption anyway, so it doesn't matter how many results are being piled in there. You see, it all makes perfect sense! 
No, it doesn't make sense unless you ignore all null results & not use them as a body of evidence. Yes, it all makes perfect sense if you have a particular position you want to advance & want to use this test which is obviously rife with experimenter's bias

Quote
There is no alternative anyway. Or do you have a way to look into people's heads to determine whether they were deliberately cheating or otherwise unfit?

Well designed tests use pre-screening, pre-training & internal controls to eliminate as many issues as possible - this will eliminate some listeners/playback systems from the test. Internal controls are used in well-designed tests to catch problems within the test. For instance, let's say this ABX test was testing high-res Vs RB - so we have two files that are A & B - randomly, in some trials, the software could introduce a difference of 0.5dB or whatever (is considered an agreed audible difference) in X. The listener, if he is doing the test correctly should be able to identify this difference. This would verify that the listener is actually listening on every trial & isn't just randomly guessing or isn't too tired & lost focus. The software can report these trials as controls separately to the other trials. The expected result is known for these controls & if the listener doesn't get correct results for these controls then his other results should be discarded
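The control idea above can be sketched in a few lines. This is only a minimal illustration, not how any existing ABX software works: the `Trial` record, the `score_session` helper and the `min_control_rate` threshold are all hypothetical names, and the threshold would have to be agreed before the test starts.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    is_control: bool  # True if the software injected the known, agreed-audible difference
    correct: bool     # True if the listener identified X correctly

def score_session(trials, min_control_rate=1.0):
    """Score control trials separately from the real trials.

    Returns (real_correct, real_total, controls_passed). The real-trial
    score is only meant to be reported when controls_passed is True.
    min_control_rate is an assumed, pre-agreed pass threshold.
    """
    controls = [t for t in trials if t.is_control]
    real = [t for t in trials if not t.is_control]
    # A session with no controls at all cannot be verified, so it does not pass.
    control_rate = sum(t.correct for t in controls) / len(controls) if controls else 0.0
    controls_passed = bool(controls) and control_rate >= min_control_rate
    real_correct = sum(t.correct for t in real)
    return real_correct, len(real), controls_passed

# Hypothetical session: two real trials, two hidden controls, one control missed.
session = [Trial(False, True), Trial(True, True), Trial(False, False), Trial(True, False)]
print(score_session(session))
```

In this made-up session one of the two controls was missed, so under the assumed 100% threshold the real-trial score would be set aside rather than reported.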
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-02 21:35:30
I have no problem with anybody doing personal blind tests - it's truth for them.

We do when it's a complete farce. A facade due to knowing the derisive laughter elicited by admitting they were sighted. "Adaptation" goes beyond the audible variety. 
Now all sorts of audiophiles are doing "personal" "blind tests" like the one I linked earlier and of course, your claimed ones. While Stereophile et al rejects them.

When it comes to presenting such "evidence" for public dissemination as "evidence" for a particular stance then more stringent criteria & examination are required.

Yup. Unless dealing with those who still hear Santa Claus and get enraged by those who doubt them.

I never posted that here - you pulled it off my website & posted it here.

You said it, you own it. 

Much like your statement "Our products reflect the philosophy that loudspeakers should strive to sound like the real thing." Who are you trying to fool? Maybe yourself? We all know that audio playback is not about trying to sound like the "real thing"- that's as silly as it gets. Nobody was ever fooled that listening to playback is like a live event. Audio playback is about creating an illusion - an illusion that appeals to our auditory perception as somewhat realistic. There is no system that comes anywhere near being mistaken for a live musical event.

Quite a bit of blathering to say nothing. Must be an "In the biz" thing. 

Well done! I'm sure we would all be interested in the blind test results of your $20,000 speakers Vs your $1,600 speakers

To verify what claim vs said $1600 speakers? What would be ABX'd? Now blind tests are valid?

I don't see any posted on your website.

Right. No claims about organic and all that crap.

So what does the extra $18,400 deliver?

Some pretty large, pretty loud, pretty speakers. 

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-02 21:44:26
So what does the extra $18,400 deliver?

Oh yes, deeper bass, higher output, more adjustability, fully active (10ch amplification), larger soundstage (indirect drivers), 6 cabinets, in home setup by me.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 21:44:37
So what does the extra $18,400 deliver?

Some pretty large, pretty loud, pretty speakers. 

cheers,

AJ

OK, got it - $18,400 for extra loudness (which I can get by turning up the volume) & some hi-fi jewellery which I would get better bang for my buck (literally) if I spent this on my wife's jewellery.
Title: How do you listen to an ABX test?
Post by: pdq on 2015-04-02 21:45:23
@jkeny: Let me make it easier for you to understand with a hypothetical;

You claim that your DAC sounds better than a much less expensive DAC. I know of no reason why it should so my assumption is that it does not.

You take an ABX test and fail - no change, I still assume that they sound the same.

Ten million more people try to ABX a difference, and they also fail - no change, I still assume that they sound the same.

It turns out that all of them were guessing randomly, or that they were all monkeys - still no change, I still assume that they sound the same.

One person passes an ABX test convincingly - finally a change - my original assumption was incorrect, and there is an audible difference (although this does not necessarily prove that yours sounds "better").
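To put a rough number on "convincingly": a minimal sketch, assuming every trial is an independent 50/50 guess under the null hypothesis. The function name `abx_p_value` is just for illustration.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials`
    by pure guessing (each trial a fair coin flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

for score in (8, 12, 16):
    print(score, "of 16:", abx_p_value(score, 16))
```

The asymmetry falls out of the arithmetic: a low p value is hard to reach by luck, while a chance-level score is what guessing, monkeys, and a genuinely inaudible difference all produce alike.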
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 21:47:12
So what does the extra $18,400 deliver?

Oh yes, deeper bass, higher output, more adjustability, fully active (10ch amplification), larger soundstage (indirect drivers), 6 cabinets, in home setup by me.

Where are the measurements that prove your claim of "larger soundstage"?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 21:54:42
@jkeny: Let me make it easier for you to understand with a hypothetical;

You claim that your DAC sounds better than a much less expensive DAC. I know of no reason why it should so my assumption is that it does not.

You take an ABX test and fail - no change, I still assume that they sound the same.

Ten million more people try to ABX a difference, and they also fail - no change, I still assume that they sound the same.

It turns out that all of them were guessing randomly, or that they were all monkeys - still no change, I still assume that they sound the same.

One person passes an ABX test convincingly - finally a change - my original assumption was incorrect, and there is an audible difference (although this does not necessarily prove that yours sounds "better").

If you don't know what to listen for, either because it has been pointed out to you in sighted listening or because you can successfully identify it in sighted listening, why would you go into blind testing? Are you expecting something to jump out at you in blind testing that you haven't identified already in a sighted test?

I would consider this very optimistic. This is not the way to enter into a blind test.

So the accumulation of null results is really just senseless & only serves one purpose - to build a body of "evidence" to support a particular stance. If you don't hear a difference between my DAC & another then just state that as the case & choose the cheaper one.

This pretence of the ABX test bringing something "extra" to this is just bunkum. What will happen in the ABX test is what has already been posted here in a closed thread - knowing that they haven't heard any difference in sighted listening they really don't bother to listen & just hit random keys because "life's too short" was the excuse given by one
Title: How do you listen to an ABX test?
Post by: pdq on 2015-04-02 22:02:15
So you would make the assumption that all of those failed ABX tests were because the subjects didn't know what to listen for - fair enough. I have no problem with that. But that still has zero effect on my initial assumption that there is no audible difference.

The difference is that I have no ax to grind - I would be just as happy to have my assumption proven wrong as not. It is, however, in your best interest to prove a difference, so you are the one that will be pursuing a positive ABX, not me.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-02 22:07:24
Jkeny, roughly how many foobar ABX tests have you taken? Just that one you posted on AVS?




Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 22:13:44
So you would make the assumption that all of those failed ABX tests were because the subjects didn't know what to listen for - fair enough.
No, I wouldn't state that & never did. Lack of training is just one of the many reasons why a "false" null result could be returned.
Quote
I have no problem with that. But that still has zero effect on my initial assumption that there is no audible difference.
Yes, but you haven't tried to design the test so that this null result (no audible difference) is more likely to be the result of there being no ACTUAL AUDIBLE difference to be heard rather than there is no difference because it is masked by bad test design

Quote
The difference is that I have no ax to grind - I would be just as happy to have my assumption proven wrong as not. It is, however, in your best interest to prove a difference, so you are the one that will be pursuing a positive ABX, not me.

As I said, I have no problem with anyone doing their own personal blind test on my DACs as I have done & this is fine for me & for most people who want to make buying decisions. The rest of this is just people playing science & aping what they think the grown-ups do in their laboratories but not really understanding much of it
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 22:15:21
Jkeny, roughly how many foobar ABX tests have you taken? Just that one you posted on AVS?

Attempting another diversionary tactic.
Title: How do you listen to an ABX test?
Post by: mzil on 2015-04-02 22:17:41
As I thought it would seem. Just one.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 22:28:48
So as is done on this thread, let's summarise:

1) - it's claimed that a null result "proves" nothing

2) - it is admitted that an accumulation of null results is a strong indication of there being no ACTUAL audible difference

3) - thus the number of null results has a direct bearing on how strong this indication is perceived to be - the higher the number the stronger the indication

4) - treating all null results as valid & piling them all into the valid null results pile is knowingly skewing the overall number of null results towards (2) & (3) above
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 22:31:00
As I thought it would seem. Just one.

Ah, the old logic fallacy game, eh? When did you stop beating your wife, then?

You have a habit of making claims that you then ask "the accused" to disprove.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 22:38:32
I think by his [jkeny] logic, if some student were to fill out a true/false test in school without reading the questions and simply checking all the "true" boxes, this would prove that true/false tests "don't work" and that the entire test methodology should be scrapped.


Pretty much.

jkeny seems to think a lot of people would 'randomly guess' for nefarious purposes -- to somehow game the cumulative results toward negative.  That seems far-fetched to me, given the ABX trials I've seen time and again here on HA (including many positives, even for something like 320 kbps mp3 vs source).  And surely the eager audiophiles who find themselves at a loss in some of the more famous audio DBTs, could not have been purposely gaming the results towards *negative*.  When John Atkinson failed his own amp DBT, was it some brilliant ploy to discredit DBTs? Don't think so.

So the 'nefarious guessing' complaint seems rather silly... but when even 'award winning' audio DBT results from Meridian et al. yielded only a very, very small difference, under highly specified conditions, what's a DAC salesman to do? 

So let's leave jkeny's desperate argument for a moment, and take a look at one of HA's own 'best practices' guides (http://www.hydrogenaud.io/forums/index.php?showtopic=16295).

When subjects find themselves 'guessing' on ABX tests, I aver that it's typically without nefarious intent.  I'm sure many who have ever taken one and found they can't with 100% confidence say X is A or B during a particular trial, have resorted to their 'gut' or 'best guess'.  I know I have.   

But our own author of HA's sticky post about ABX tests, written in 2003 (or so), would not approve  -- though not for jkeny's reason:

Quote
3. The p values given in the table linked above are valid only if the two following conditions are fulfilled :

-The listener must not know his results before the end of the test, except if the number of trials is decided before the test.
...otherwise, the listener would just have to look at his score after every answer, and decide to stop the test when, by chance, the p value goes low enough for him.

-The test is run for the first time. And if it is not the case, all previous results must be summed up in order to get the result.
Otherwise, one would just have to repeat the serial of trials as much times as needed for getting, by chance, a p value small enough.
Corollary : only give answers of which you are absolutely certain ! If you have the slightest doubt, don't answer anything. Take your time. Make pauses. You can stop the test and go on another day, but never try to guess by "intuition". If you make some mistakes, you will never have the occasion to do the test again, because anyone will be able to accuse you of making numbers tell what you want, by "starting again until it works".


(bold black emphasis mine)
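The first condition in that quote can be checked with a quick simulation. This is a rough sketch under stated assumptions: the listener is purely guessing, the session is capped at 16 trials, and "low enough" means p < 0.05. The function names and defaults are mine, not anything from the sticky post.

```python
import random
from math import comb

def p_value(correct, trials):
    """One-sided chance of at least `correct` hits in `trials` fair coin flips."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def false_positive_rate(max_trials=16, alpha=0.05, peek=True, sessions=20000, seed=1):
    """Fraction of pure-guessing sessions that end with p < alpha.

    peek=True : the guesser checks the running score after every answer and
                stops as soon as p < alpha.
    peek=False: the number of trials is fixed beforehand and p is checked once.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(sessions):
        correct = 0
        passed = False
        for n in range(1, max_trials + 1):
            correct += rng.random() < 0.5  # one guessed trial
            if peek and p_value(correct, n) < alpha:
                passed = True
                break
        if not peek:
            passed = p_value(correct, max_trials) < alpha
        hits += passed
    return hits / sessions

print("fixed length :", false_positive_rate(peek=False))
print("stop when low:", false_positive_rate(peek=True))
```

With a fixed 16 trials a guesser needs 12 or more correct, which happens about 3.8% of the time. Allowing the guesser to stop at the first lucky streak should push the rate well above that, which is the point of the rule.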

As best I can tell, English is/was not Pio's first language, and I think the term 'absolutely certain' there is very unfortunate.  I do get the point that if you find you are *only* guessing from the get-go, with utterly no feeling that there might be a difference, and your confidence does not increase during the test, you might consider the test pointless and should just stop -- you can't hear the difference.  (Though one might ask, what if you can *unconsciously* 'sense' a difference (vide Oohashi, et al)?    You won't 'know' unless you complete the test!)  I also get the point that if you have become fatigued, you should stop and resume again when you feel sharp.  That's all to maximize your chance of hearing a real difference.  But if you're feeling aurally alert *yet* you find yourself perhaps less than 'absolutely certain' that X is A (or X is B) at some point, don't stop, just finish the test as best you can.  I would bet that has happened to all of us.


Pio's reasoning is laid out further:


Quote
Of course you can train yourself as many times as you wish, provided that you firmly decide beforehand that it will be a training session. If you get 50/50 during a training and then can't reproduce this result, too bad for you. The results of the training sessions must be thrown away whatever they are, and the results of the real test must be kept whatever they are.

Once again, if you take all the time needed, be it one week of efforts for only one answer, in order to get a positive result at the first attempt, your success will be mathematically unquestionable ! Only your hifi setup, or your blind test conditions may be disputed. If, on the other hand, you run again a test that once failed, because since then, your hifi setup was improved, or there was too much noise the first time, you can be sure that there will be someone, relying on statistic laws, to come and question your result. You will have done all this work in vain.
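The "starting again until it works" worry is easy to quantify. A minimal sketch, assuming each attempt is an independent session by a pure guesser with a per-session pass threshold of p < 0.05 (the function name and the default threshold are illustrative assumptions):

```python
def chance_of_lucky_pass(attempts, alpha=0.05):
    """Probability that at least one of `attempts` independent guessing
    sessions comes out below the significance threshold `alpha`."""
    return 1 - (1 - alpha) ** attempts

for k in (1, 5, 14):
    print(k, "attempts:", chance_of_lucky_pass(k))
```

At this threshold it takes 14 discarded-and-repeated sessions before a pure guesser has better than even odds of showing one "pass", which is why earlier attempts have to be summed in rather than thrown away.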



His points about training and  about picking/choosing results, and earlier (not shown), about  deciding beforehand on trial number,  are valid, but I don't think the logic extends to banning any response that involves the 'intuitive'.


I would also expand on this:

Quote
4. The test must be reproducible.
Anyone can post fake results. For example if someone sells thingies that improve the sound, like oil for CD jewel cases or cable sheath, he can very well pretend to have passed a double blind ABX test with p < 0.00001, so as to make people talk about his products.
If someone passes the test, others must check if this is possible, by passing the test in their turn.


Reproducing the positive result with *other* subjects is one sort of verification; having the original subject replicate the result, under monitored conditions, would be another.  Though of course, a 'difference' that was only EVER demonstrated with *one* subject (n=1), would not be terribly significant for the sorts of claims routinely made in audio-land.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 22:42:38
No, results presuppose that a "test" was actually taken. In the ABX null results posted here by mzil, where he just randomly guessed - do you consider that he "took a test" & delivered "results"?

It doesn't matter. No one except him could tell the difference. That's why your whole idea of an invalid null result is bunk.

And let me add this: The fact that it doesn't matter is an important quality of such a test. It speaks for the ABX test method, and is a major factor in its usefulness.

Quote
If I sat a monkey down in front of the keyboard would I most likely get the same null results? Would you count the monkey's results among the null results?

I am not supposed to judge the monkey. If he produces a null result, I count it as a null result. If he produces a result that deviates significantly from chance, I count it accordingly. If I did anything else I'd be rigging the test.

Quote
Would you consider the results produced by a deaf person as a valid test? What about someone who demonstrated a hearing impairment in the audible area being tested? What about someone who has demonstrated that they are pre-biased to not hearing any differences? What about someone who is so tired that they aren't focussed? What about playback equipment that is unsuitable for revealing differences? Do I need to go on?

No, you needn't go on. I would count all of them as valid results. Doing anything else would put my own judgement above their results. I would effectively override their test results, thereby making the test invalid. Any test is invalid if the test administrator is allowed to override the test results of selected participants. Isn't that abundantly clear?

Now, it is true that a hearing test conducted with monkeys or deaf people might be regarded as pointless. That is an unfortunate consequence of a poor test design. If you aren't interested in the hearing abilities of monkeys or deaf people, you should exclude them from the test before the start. Once they are in, they are in - this doesn't make the test invalid. It may merely make it useless, depending on what the question was. This is so by design, I have to emphasize it again! It is not a fault or deficiency of ABX, quite the opposite, it is a major factor of its usefulness.

Quote
Yes, & the important word here is "result" - you have to decide what is a result & what should be eliminated from "results". In your statements you are including everything into results even the deaf monkeys "results"

Yes, that is quite deliberately so. You are trying to ridicule my position, but you miss that it isn't ridiculous at all. It is the opposite: Your position puts the designer/administrator of the test into a position where he can manipulate the criteria for result acceptance to his liking, potentially even after the fact. That's what I would call invalid!

And you wonder why other people don't trust you as a test designer/administrator? Amusing!

Wow, I'm glad that you laid all this out in black & white for all to see - it really proves the experimenter bias that underlies the thinking by many DBT supporters on this forum.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-02 22:53:23
Yep, I get it. Treating all null results as valid & piling them into a block of evidence designed to create an edifice of damning evidence - this is what your tactic is all about.

The null results are predominantly provided by the audiophiles themselves. There's very little need for a tactic here. It suffices to give them enough rope to hang themselves. They reliably will.

Quote
No, it doesn't make sense unless you ignore all null results & not use them as a body of evidence. Yes, it all makes perfect sense if you have a particular position you want to advance & want to use this test which is obviously rife with experimenter's bias

I don't use the null results as a body of evidence. I've said that before, but you don't appear to take it seriously. I treat the null result as the expected result anyway, hence I don't need extra confirmation by more null results. This position is inherent in the test method, and needs to be.

The most damning evidence that confirms my position isn't the null results. It is the way audiophiles react to the null results. In other words how they completely fail to get the point of the tests, and how they come to misrepresent the test fundamentals, in a futile attempt to change the interpretation in their favor.

Quote
Well designed tests use pre-screening, pre-training & internal controls to eliminate as many issues as possible - this will eliminate some listeners/playback systems from the test. Internal controls are used in well-designed tests to catch problems within the test. For instance, let's say this ABX test was testing high-res Vs RB - so we have two files that are A & B - randomly, in some trials, the software could introduce a difference of 0.5dB or whatever (is considered an agreed audible difference) in X. The listener, if he is doing the test correctly should be able to identify this difference. This would verify that the listener is actually listening on every trial & isn't just randomly guessing or isn't too tired & lost focus. The software can report these trials as controls separately to the other trials. The expected result is known for these controls & if the listener doesn't get correct results for these controls then his other results should be discarded

No problem. If you think you want to do a test like that, go ahead and do it. If it helps avoiding wasted test effort, by avoiding null results, I'm all for it. It is definitely ok to do pre-test screening and remove inadequate testers. If you remove testers based on their performance in the actual test, I want to know exactly how that's determined in an impartial way, in order to be sure this isn't being used to skew the test.

However, be aware that such measures only have the effect you desire if the extra control trials measure the property you are after in the real test. For example, if you are trying to find the minimum detectable level difference, your control trials will introduce a known level difference. If you are looking for something entirely different, or even an unknown kind of difference, the control trials may not have any benefit at all.

However, don't expect too much of such a test. It won't magically yield the result you crave for. Don't be surprised if it confirms the objectivist position. Be prepared to produce another null result which goes onto the pile.

Quote
Wow, I'm glad that you laid all this out in black & white for all to see - it really proves the experimenter bias that underlies the thinking by many DBT supporters on this forum.

You're welcome. Except it doesn't prove any bias. It just describes how ABX testing works. There is not and cannot be any symmetry between positive and negative results. Your expectations are completely off, and your insistence just shows your ignorance.

Now, please do me a favor and disseminate my statement as widely as possible as "proof" that the objectivists are biased. You show how tempting it is for you. It would only reinforce my conviction, and my amusement.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 22:55:58
Wow, I'm glad that you laid all this out in black & white for all to see - it really proves the experimenter bias that underlies the thinking by many DBT supporters on this forum.



or that you can't/don't read.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 23:01:05
jkeny seems to think a lot of people would 'randomly guess' for nefarious purposes -- to somehow game the cumulative results toward negative.

Nope, wrong again - there are many reasons why people would just guess, not just for nefarious reasons. But does it really matter what the motivation/intent behind guessing is? The point is to have a robust test design that can sense random guessing. I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.

So far, no reply to this suggestion. Oh I see pelmazo replied to it while I was posting this - yes the usual reply given "Go ahead & do it"
No interest whatsoever! That's expected.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 23:07:30
You're welcome. Except it doesn't prove any bias. It just describes how ABX testing works. There is not and cannot be any symmetry between positive and negative results. Your expectations are completely off, and your insistence just shows your ignorance.

Now, please do me a favor and disseminate my statement as widely as possible as "proof" that the objectivists are biased. You show how tempting it is for you. It would only reinforce my conviction, and my amusement.

Yep, I've bookmarked it, thanks - now don't go changing it

That seems like a good note to end on!!
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-02 23:14:09
The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.


You *stipulate* that 'no serious attempt at listening' was involved.  But that's not been my experience.

I can stipulate that there are tests where 'positive' difference might have been due to some other factor.    They get lumped into the positive 'evidence' file.  Depending on who the lumper is.



In fact, much of the formal/professional audio DBT literature consists of ever-more-contrived protocols to account for possible sources of false positives/negatives.

And to date, have the large claims of the high-end folks -- the ones you're courting as customers -- been borne out by these tests?

Nope.  No veils lifted, no 'night and day' differences.

But by all means, if you feel you've identified yet another hole in the dike that needs to be plugged, do so, and publish your own protocols and multisubject results.


Just  don't expect anyone with a clue (or any journal)  to take your (or anyone's)  sighted claims of DAC audio quality as good evidence.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-02 23:14:38
As I said, I have no problem with anyone doing their own personal blind test on my DACs as I have done
You don't mention anything like this in the interview and quite frankly, your personal blind tests are a fabrication (good biz practice apparently) without any evidence of such.

this is fine for me & for most people who want to make buying decisions.
Not a single person buying your biochemically engineered "organic" DAC would ever consider a blind test. That simply isn't your type customer...and you know it.
If you had any real blind test data showing you identified/preferred your DAC, it would be pasted on every forum and your website. That's the reason why it isn't.

The rest of this is just people jkeny playing science & aping what he thinks the grown-ups do in their laboratories but not really understanding much of it
Bingo. You have nothing to gain by blind/ABXing your orgasmic DAC...and nothing to lose either, given the appeal of such "engineering".

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-02 23:19:51
But by all means continue to rabbit on without me!!
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 01:18:09
I've asked over & over again what proctor would be accepted but got no answer. Care to answer this?


Sure:

JJ

Me

John Vanderkooy

Stan Lipshitz

and 100's others who are probably more conveniently physically sited
Pretty much as I suspected - you need to get a grip on reality if you think any of those named people would be remotely interested in proctoring.


Prove it.

Quote
It makes any positive ABX results impossible - just as the whole silly issue of proctoring was designed to do.


False and illogical claim.  If you were able to follow instructions, your results would be the same, proctor or not.

Quote
You guys have backed yourself up into a corner of reality that is untenable.


It is you who are cornered, Mr. Kenny.

Quote
All ABX tests need to be proctored, right?


Not at all.


Quote
Quote
Besides, aren't you in this for the sake of knowledge and truth?  Wouldn't just knowing be enough?  It has been for me on many occasions. For example, I didn't spill the beans about the first ABX test ever done until someone did the second-dozen or more ones. Didn't want to bias them.


Do you not think I haven't done my own personal blind tests & am satisfied with my personal conclusions? That is truth for me


Do tell about those alleged blind tests.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 01:36:08
So as is done on this thread, let's summarise:

1) - it's claimed that a null result "proves" nothing


That is how it works.  That is not a claim, it is simple logic. Absence of proof is not proof of absence, or said otherwise: A negative hypothesis is difficult or impossible to prove. That's one reason why we try to prove the positive outcome with every test.

Quote
2) - it is admitted that an accumulation of null results is a strong indication of there being no ACTUAL audible difference


That is just common sense.

One other thing. Your co-conspirator is plastering AVS with claims that the percentage right answers means nothing. I quoted and linked that earlier so it is a proven fact that he is saying this.

The percentage of right can be used to discern a number of worthwhile things:

(1) A percentage right that is significantly below random guessing means that the test was compromised. The usual error at this point is communication among the listeners.

(2) Statistically significant results that have a low percentage of right answers means that the audible effect is subtle.

Point (2) is interesting because virtually every person who has actually done a significant number of DBTs finds this to be intuitively clear, and it is a result that can be duplicated by varying the strength of the audible difference over a range around the threshold of reliable detection.  Yet we have someone who claims to have done a number of DBTs, but keeps spewing this preposterous nonsense.  Doesn't compute. One explanation is that the majority of his alleged DBTs were shams and the ABX logs that were shown were the results of trickery.

Quote
3) - thus the number of null results has a direct bearing on how strong this indication is perceived to be - the higher the number the stronger the indication


Other things matter too, such as the quality and rigor of the test procedures.

Quote
4) - treating all null results as valid & piling them all into the valid null results pile is knowingly skewing the overall number of null results towards (2) & (3) above


However, if one can come up with some worthwhile refinement on the test that would make positive results more probable, the game starts all over with a fresh score card.
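
For what it's worth, the arithmetic behind points (1) and (2) above can be sketched in a few lines. This assumes a plain one-sided binomial test with a 50% guessing rate; the function names and the trial counts are illustrative only and are not taken from any particular ABX tool or test.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more correct answers out of n when each
    trial is answered correctly with probability p (0.5 = guessing)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def p_at_most(k, n, p=0.5):
    """Probability of k or fewer correct answers out of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

# Point (2): a low percentage right can still be statistically significant
# when there are enough trials (hypothetical score: 60 of 100).
print("60/100, chance of doing this well by guessing:", p_at_least(60, 100))

# Point (1): a score far *below* chance is also improbable under guessing,
# which hints that something in the test was compromised
# (hypothetical score: 2 of 16).
print("2/16, chance of doing this badly by guessing:", p_at_most(2, 16))
```

The same tail probability is what the usual "p" figure in an ABX log refers to, under the assumptions stated above.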
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-04-03 10:46:50
Hi jkeny,

I think you’ve shown more “perseverance in the face of adversity” than I did. ;-)
Although, many people are open-minded, and therefore neither hard-line subjectivists nor hard-line objectivists, “preaching to the choir” within either group doesn’t have much power to persuade.
It is the power to convince a skeptic that is most interesting in this thread.

Many here have said they are skeptical of certain claims but would be convinced that a difference that is heard is real if: a proctored* ABX is performed with p<##, where *proctor and ## need to be agreed upon. There is no chance of getting 100% being convinced. krabapple wants a proctor, Mr. Krueger doesn’t. Some want p<0.05, some p<0.01 (from other threads/forums).

But you are an ABX skeptic, or at least you believe many are not performed well, and I would agree. But showing examples of bad ABX tests does not mean all are bad. Under what conditions would you accept the result and allow it to convince you of something (e.g. no audible difference)? You have mentioned including controls. Would you accept a result with controls that showed no audible difference between two DACs? If you chose the listeners and the listening conditions and there were controls, would you accept the result? If not, where are you going with this? Before you request analysis of the probability of Type II errors, let’s state up front that that requires a measure, or at least sensible estimate, of the effect size, that is, the probability of noticing a difference in a trial for the population being tested (e.g. “critical listeners”, “experienced, trained listeners”, “humans”). But to do that you must have data from previous work to calculate the probability. You have to do the experiment before you do the experiment = impossible. The first step would be to produce one positive test and then the statistical calculations can begin. If you can’t describe what would convince you, then I wonder why you continue to pursue it. Why do you have such “perseverance in the face of adversity”?

Cheers.
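
To make the Type II point concrete, here is a minimal sketch of the power calculation, assuming a one-sided binomial ABX test and assuming that a listener who notices the difference in a fraction d of trials answers those correctly and guesses the rest. The value of d is exactly the effect size that has to come from previous data; every number and function name below is a made-up illustration.

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of k or more correct out of n at per-trial rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_count(n, alpha=0.05):
    """Smallest score whose probability under pure guessing is <= alpha."""
    for k in range(n + 1):
        if p_at_least(k, n, 0.5) <= alpha:
            return k
    return n + 1  # no score reaches significance with this few trials

def power(n, detect_prob, alpha=0.05):
    """Chance of a significant result if the difference is noticed in a
    fraction detect_prob of trials and guessed at otherwise (assumption)."""
    p_correct = detect_prob + (1 - detect_prob) * 0.5
    return p_at_least(critical_count(n, alpha), n, p_correct)

# Hypothetical effect sizes; 1 - power is the Type II error rate.
for d in (0.1, 0.3, 0.6):
    print(f"16 trials, difference noticed in {d:.0%} of trials: "
          f"power = {power(16, d):.3f}")
```

Without a defensible value for detect_prob the calculation cannot start, which is the circularity described above.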
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 11:54:58
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.
That would confuse a listener who was listening - if A or B suddenly became a previously unheard C!

However, the concept is well used in other ways that don't confuse, and called a hidden reference. It's not a new idea.


In the strictest sense, a blind test can't prove that something is inaudible. There are pages written about this. However, where someone claims to hear a difference, continues to believe they hear the difference during the blind test, but their answers show they could not detect the difference, that's about as close to proof that that person was imagining the difference as you can get.

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-03 12:02:55
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.

I already told you 2 months ago (in the thread about amir's demonstration that he has no clue about statistics) that dishonest people can still distort the results to their liking. But you aren't here to discuss or learn anything, just make noise. So /ignore.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-03 12:11:27
I'm not an audiophile bi$man but I play one on TV, so I'll address some of this just in case John is just too busy elsewhere. Clearly you are new at this also.

Under what conditions would you accept the result and allow it to convince you of something (e.g. no audible difference)?

Specifically regarding audiophile beliefs, like about DACs, positive results only. Negatives (nulls) are unacceptable, since we "know" that there are audible differences between mass market and organically grown DACs. How do we know? Long term completely uncontrolled sighted "listening", where we are relaxed, get invaluable and completely impartial, detached input from the Mrs, etc.
IOW not waterboarded being unable to know the namebrand and what I read on audiophile sites, etc.

You have mentioned including controls.

Yes, as part of the adaptation, you must learn the terms so you can throw them out there, even though you have no clue what they mean. Leading to some very amusing exchanges with JJ on AVS.

If you chose the listeners and the listening conditions and there were controls, would you accept the result?

Momentarily (http://www.avsforum.com/forum/86-ultra-hi-end-ht-gear-20-000/941184-observations-controlled-cable-test-2.html#post12255000). However, without continued therapy for the condition, there will be the inevitable relapse (http://www.whatsbestforum.com/showthread.php?15480-Transparent-Magnum-Opus&p=282038&viewfull=1#post282038).

If not, where are you going with this?

Bank of Ireland?

Why do you have such “perseverance in the face of adversity”?

They banned Scientologists in Germany didn't they. Lucky you!

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: SoundAndMotion on 2015-04-03 12:26:39
Hi AJ,
Just as one bad ABX doesn't spoil the whole bunch...
One bad Subjectivist (ML) doesn't spoil the whole bunch.
I'm curious what JK's conditions to be persuaded would be. He has been posting concerns here.
Cheers, SAM
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 12:45:20
It is the power to convince a skeptic that is most interesting in this thread.

Many here have said they are skeptical of certain claims but would be convinced that a difference that is heard is real if: a proctored* ABX is performed with p<##, where *proctor and ## need to be agreed upon. There is no chance of getting 100% being convinced. krabapple wants a proctor, Mr. Krueger doesn’t. Some want p<0.05, some p<0.01 (from other threads/forums).

Given that everybody is skeptical about something, I think it is not that hard to understand why people show different amounts of resistance to getting convinced of something they find implausible.

Changing the topic sometimes helps to illustrate the problem. Say you are dealing with claims by numerous people that they can run the 100m distance in 7 seconds or better. They say that they have done it routinely, and some even provide time measurements of runs they claim to have done. You are skeptical that anyone can complete this distance in 7 seconds, given your current knowledge, so the claims don't convince you. You don't need a degree in human biology for that, some common sense is enough. The question now is, what will convince you? Do you think a proctor helps? Any proctor? Do you need to be present yourself?

Even if there is an impartial proctor, does he have access to all relevant aspects of the test? For example, if the role of the proctor is only to check that the time measurement was correct, you may suspect that the run didn't start at 0 velocity. The trick may be that the runner enters the 100m stretch at full velocity. Or the track goes steeply downhill. What about doping? What if the runner isn't a human being at all, but a purpose built machine...

You see that it can be quite difficult to work out every cheating possibility in advance. It is correspondingly difficult to say in advance what you would accept as proof. Every set of conditions you provide can be used as a basis for working out a cheating strategy. Take sports again as an example: The anti-doping rules have illustrated this for decades. Knowing what the rules are allows developing strategies to circumvent them. If the incentive is large enough this will most certainly be done in practice. This can only be counterbalanced by updating the rules according to the experiences made along the way.

Back to our topic: We have seen that at least for some "players", the incentive seems to be large enough to employ any cheating method they can find. Some people fight a bitter battle and feel compelled to use any device they can get hold of. In a climate like that, one proctor will probably not be enough. You will need an elaborate system of rules, and the first problem is to get everybody to accept them. We see how often it fails at that step already.

Many people don't want to go that far. They resolve to say: I don't care how and whether you can run the 100m in 7 seconds. I don't believe it, no matter what you say. My time is too precious to waste it on elaborate testing rules designed to prevent you from cheating. You can just fuck off.

Would that be close minded, even arrogant? Perhaps, but at that point, I have quite a bit of sympathy with such a stance. I find it perfectly reasonable. Being accused of politically motivated bias in such a situation doesn't really disturb me much. Not if the claim that I'm supposed to take seriously is so wide off any reasonable expectation that rather solid evidence would be called for.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 13:08:18
Hi jkeny,

I think you’ve shown more “perseverance in the face of adversity” than I did. ;-)
Although, many people are open-minded, and therefore neither hard-line subjectivists nor hard-line objectivists, “preaching to the choir” within either group doesn’t have much power to persuade.
It is the power to convince a skeptic that is most interesting in this thread.

Many here have said they are skeptical of certain claims but would be convinced that a difference that is heard is real if: a proctored* ABX is performed with p<##, where *proctor and ## need to be agreed upon. There is no chance of getting 100% being convinced. krabapple wants a proctor, Mr. Krueger doesn’t. Some want p<0.05, some p<0.01 (from other threads/forums).
Thanks on the perseverance comment
I believe that both krabapple & Arny want a proctor - Arny just reserves the option to invoke it when he sees fit - as many here do.

Quote
But you are an ABX skeptic, or at least you believe many are not performed well, and I would agree. But showing examples of bad ABX tests does not mean all are bad.
Sure, but the issue is we don't know what percentage are bad & I suspect that many are. On the other hand the attitude shown here is that they don't care about this. I wouldn't care either if null results were simply just discarded & of no importance but that is not the case. The accumulated body of null results is used as evidence of inaudibility. Look at ArnyK's jangling keys test - flawed as it was - the claim was that so many have done this test in 15 years & nobody has produced a positive result. So these flawed files were tested how many thousands of times over 15 years & not one person picked up the audible flaw. This body of null results was often cited as evidence that there is no audible difference with high-res. Irrespective of whether Amir's results are due to high-res differences, his repeated ABX positive test results show that there is an audible difference between the two jangling keys files that stood for 15 years as being audibly identical.

A similar thing happened for Ethan Winer's loopback test files which stood for a similar amount of time without any positive results. These files were the evidence that Winer used to claim that a soundblaster audio card is audibly indistinguishable from a professional audio system for recording. He looped back a recording through the card many times & put online test files extracted after different number of loopback generations as "proof" that many trips through D/A-A/D is not audible. Again positive results were reported around the same time & Winer then changed his files.

The point being - in both of these cases ACTUAL audible differences existed in the test files but this was undiscovered during the claimed many thousands of blind tests run during the 15 years previous. So what was the problem? Why no positive results over this time? Why, when some positive results are reported do others then find similar positive results?

This gives me a large question mark over the capability of such tests to reveal small differences & led me to ask what the level of false negatives is for ABX testing. I know from my own experience of running ABX & other blind tests how easy it is to get bored & lose focus & not actually listen. It's a difficult task to retain concentration on the same short piece of repeated music at the level of analytic hearing required in this form of listening. This lapse is often not even something that people are aware of - it's not like reading where you find that you need to re-read the same paragraph a number of times because your mind has wandered - in audio, a lapse in focus generally goes unnoticed. I figured that including some internal controls in the test could begin to reveal how prevalent this might be & build a profile of just how reliable these tests are.

There seems to be a great reluctance to address such a mechanism.
Quote
Under what conditions would you accept the result and allow it to convince you of something (e.g. no audible difference)? You have mentioned including controls. Would you accept a result with controls that showed no audible difference between two DACs? If you chose the listeners and the listening conditions and there were controls, would you accept the result? If not, where are you going with this? Before you request analysis of the probability of Type II errors, let’s state up front that that requires a measure, or at least sensible estimate, of the effect size, that is, the probability of noticing a difference in a trial for the population being tested (e.g. “critical listeners”, “experienced, trained listeners”, “humans”). But to do that you must have data from previous work to calculate the probability. You have to do the experiment before you do the experiment = impossible. The first step would be to produce one positive test and then the statistical calculations can begin. If you can’t describe what would convince you, then I wonder why you continue to pursue it. Why do you have such “perseverance in the face of adversity”?

Cheers.

As I said, once I see reasonable internal controls that can be used to give an indication of when someone is actually listening Vs when someone is not (for whatever reason) then I will accept the results of blind tests conducted by A.N. others. I even have to be aware of this in my own blind testing as I know this is a very insidious issue.

Let me give you an example - I find that this is somewhat like reading a book - normal reading is different to proof reading. In normal reading you may pick up some spelling/typo glitches but it's the story & flow that is of importance. In proof reading (which I find is the equivalent of blind testing) you are reading to pick up typos/spellings & grammar issues. It's very easy to lose focus doing this & regular breaks are needed. One also has to be aware of when you lose focus.

What I'm saying is that before I accept that the book has been properly proof read by a stranger, I would want to include misspellings/typos grammar mistakes throughout the book & if they weren't reported I would know that the proof-reading wasn't done properly. If some were found but not others I would know that focus had been lost at times, etc.

At the moment, for the majority of blind tests run by non-specialists, no-one has a way of judging the validity of the null results.
I have given here & in past threads some examples of how I would implement internal controls in ABX software & what they could be so I'm not sure why you say " If you can’t describe what would convince you, then I wonder why you continue to pursue it." I was hoping that internal controls might be considered a good idea & people would work together to come up with some workable solutions rather than every suggestion I made being shot down & the idea being dismissed as unimportant or dismissed in other ways. The reaction to my questioning of the validity of null results suggests to me that people have invested so much in these null results that they are unwilling to objectively examine the testing & find a way of separating out tests that should be eliminated from valid null results - they seem to feel threatened by this very concept.
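
As a rough sketch of the bookkeeping such internal controls would need (the proof-reading analogy above, with planted errors), assuming each trial is simply tagged as either a real comparison or a planted control with an agreed audible difference. The `Trial` record, the `summarize` function and the 90% pass mark are all hypothetical, not features of any existing ABX software.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    is_control: bool  # True if an agreed audible difference was planted
    correct: bool     # True if the listener answered this trial correctly

def summarize(trials, min_control_rate=0.9):
    """Tally real trials and planted control trials separately, and flag
    the run when too many planted controls were missed.
    min_control_rate is an arbitrary illustrative pass mark."""
    controls = [t for t in trials if t.is_control]
    real = [t for t in trials if not t.is_control]
    control_hits = sum(t.correct for t in controls)
    control_rate = control_hits / len(controls) if controls else None
    return {
        "test_correct": sum(t.correct for t in real),
        "test_total": len(real),
        "control_correct": control_hits,
        "control_total": len(controls),
        "listener_attentive": control_rate is not None
                              and control_rate >= min_control_rate,
    }

# Made-up run: 4 real trials with 2 planted controls mixed in.
run = [Trial(False, True), Trial(True, True), Trial(False, False),
       Trial(False, True), Trial(True, False), Trial(False, False)]
print(summarize(run))
```

A null result from a run flagged as inattentive could then be set aside rather than counted, which is the separation being asked for here; whether planted differences disturb the listener is a separate question this sketch does not address.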
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 13:19:47
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.
That would confuse a listener who was listening - if A or B suddenly became a previously unheard C!
By changing the volume of X every now & then you are introducing a small change which should be audible. I doubt this would confuse - how are hidden references used in blind tests? If on the other hand, it wasn't noticed, would this not indicate that the sound file was not being listened to analytically at that particular point in the test? If all occurrences of this control went unnoticed would it not suggest that there was a loss of focus (or some other reason for not hearing the audible difference) throughout the test?

Quote
However, the concept is well used in other ways that don't confuse, and called a hidden reference. It's not a new idea.


In the strictest sense, a blind test can't prove that something is inaudible. There are pages written about this. However, where someone claims to hear a difference, continues to believe they hear the difference during the blind test, but their answers show they could not detect the difference, that's about as close to proof that that person was imagining the difference as you can get.

Cheers,
David.

Sure hidden references are recommended for blind tests in the standards documents - why?
It isn't just included for no reason - it is a way of self-checking the test itself - something that is sadly missing in ABX testing
This is what I'm trying to get across here.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 13:37:45
....
Back to our topic: We have seen that at least for some "players", the incentive seems to be large enough to employ any cheating method they can find. ....

A good start in eliminating cheating is to look at examples of cheating that have been perpetrated in the past. Have you got these examples?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 13:40:49
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.

I already told you 2 months ago (in the thread about amir's demonstration that he has no clue about statistics) that dishonest people can still distort the results to their liking. But you aren't here to discuss or learn anything, just make noise. So /ignore.

Sorry, what has your reply got to do with the extract from my post you quoted?
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 14:09:45
A good start in eliminating cheating is to look at examples of cheating that have been perpetrated in the past. Have you got these examples?

Yes, although I am not in a position to prove that it was being done consciously and on purpose. It is often conceivable that the individuals were simply incompetent or negligent, but not malicious, but sometimes that's hard to believe, and I'm compelled to assume that there must have been an element of maliciousness. Sometimes I'd say that they were deliberately ignorant and obstinate. I can't look into other peoples' heads, but it doesn't matter for the end result.

Examples are: Fiddling with the statistics, trying to extract a significance where there isn't one. Moving the goalpost after the fact. Trying to selectively exclude unwelcome results using dubious arguments. Inventing reasons for dismissing a test after the fact, even though they previously had accepted the terms. Not revealing a clue which gave away the result, thereby pretending the test was valid when it wasn't.
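
One simple illustration of why selective reporting matters, assuming independent test runs by someone who is purely guessing, each judged at the same significance threshold. The function name and the numbers are only for illustration.

```python
def chance_of_a_fluke(runs, alpha=0.05):
    """Probability that at least one of `runs` independent guessing-only
    test runs comes out 'significant' at threshold alpha."""
    return 1 - (1 - alpha) ** runs

# If only the best run out of many gets reported, a "significant"
# result becomes likely even without any audible difference.
for runs in (1, 5, 20):
    print(f"{runs:2d} runs: {chance_of_a_fluke(runs):.3f}")
```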
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-03 15:17:02
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.
That would confuse a listener who was listening - if A or B suddenly became a previously unheard C!

However, the concept is well used in other ways that don't confuse, and called a hidden reference. It's not a new idea.


In the strictest sense, a blind test can't prove that something is inaudible. There are pages written about this. However, where someone claims to hear a difference, continues to believe they hear the difference during the blind test, but their answers show they could not detect the difference, that's about as close to proof that that person was imagining the difference as you can get.

Cheers,
David.


Thanks.

One would have thought none of this needed saying at this point.

But one would again have been wrong.

Those who imagine they are bravely tilting 'in the face of adversity' are going to keep showing up here, unaware of prior work, or mis-characterizing it, or insisting that they've invented the new work. Some may even actually know a thing or two, but not a thing or three.

Note to self, must keep that in mind.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-03 15:46:49
Sure, but the issue is we don't know what percentage are bad & I suspect that many are.


You've been a member here at HA since 2005...plenty of time to witness many, many posted ABX results.  Can you point to all the 'suspect' ones right here on HA?  Have you any sense of how many returned 'positive' (which I presume you're OK with) vs 'negative' (which is where all your suspicion seems to point) results?


Quote
On the other hand the attitude shown here is that they don't care about this. I wouldn't care either if null results were simply just discarded & of no importance but that is not the case. The accumulated body of null results is used as evidence of inaudibility. Look at ArnyK's jangling keys test - flawed as it was - the claim was that so many have done this test in 15 years & nobody has produced a positive result.
So these flawed files were tested how many thousands of times over 15 years



Good question.

Quote
& not one person picked up the audible flaw. This body of null results was often cited as evidence that there is no audible difference with high-res. Irrespective of whether Amir's results are due to high-res differences, his repeated ABX positive test results show that there is an audible difference between the two jangling keys files that stood for 15 years as being audibly identical.

A similar thing happened for Ethan Winer's loopback test files which stood for a similar amount of time without any positive results.



How many reports existed?

Quote
The point being - in both of these cases ACTUAL audible differences existed in the test files but this was undiscovered during the claimed many thousands of blind tests


You sure about that number you're throwing around?  Downloading some files does not mean tests were done and results were reported every time.

Btw, if you find all *that* suspect, what are we skeptics to make of it when suddenly *multiple* reports of positive difference, within a short period of time, appear online for differences previously mooted to be difficult if not impossible to discern?



Quote
run during the 15 years previous. So what was the problem? Why no positive results over this time? Why, when some positive results are reported do others then find similar positive results?

This gives me a large question mark over the capability of such tests to reveal small differences & led me to ask what the level of false negatives is for ABX testing. I know from my own experience of running ABX & other blind tests how easy it is to get bored & lose focus & not actually listen. It's a difficult task to retain concentration on the same short piece of repeated music at the level of analytic hearing required in this form of listening. This lapse is often not even something that people are aware of - it's not like reading where you find that you need to re-read the same paragraph a number of times because your mind has wandered - in audio, a lapse in focus generally goes unnoticed. I figured that including some internal controls in the test could begin to reveal how prevalent this might be & build a profile of just how reliable these tests are.


'Internal controls' can be a training run beforehand, where difference is introduced at the start then incrementally decreased until some threshold is reached.  This is not a new idea.  Neither is 'phantom switch', where A and B are actually the same, though the listener doesn't know it.  This is not a new idea either.

Best practice for an ABX type test (which does not admit 'internal' negative and positive controls -- the sort that would prove the subject is 'really listening' -- in the sense of including them *within a single test*) includes training the subjects beforehand. For a given A and B, the only way to implement 'controls' of the sort you demand would be to do *two* ABX 'training' runs, keeping A the same as your experimental A but changing B: in one, B 'must' differ from A (a positive is expected from the magnitude of the difference; this could be an incremental difference test, thereby also checking sensitivity), and in the other, B 'cannot' differ (phantom switch).  These would of course mean you are using a *different* B in some sense, than your experimental B.  They merely would demonstrate that the ABX setup works and that the subject's hearing is intact.
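
A minimal sketch of how the two training runs described above might be scored, assuming a one-sided binomial test against 50% guessing and a 0.05 threshold. The function names, the threshold and the scores are illustrative assumptions, not an established procedure.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more correct out of n under pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def check_training_runs(pos_correct, pos_n, phantom_correct, phantom_n,
                        alpha=0.05):
    """Return (positive_ok, phantom_ok).

    positive_ok: the run where B 'must' differ was detected, i.e. the
                 score is significantly above chance.
    phantom_ok:  the phantom-switch run, where B 'cannot' differ, did
                 NOT come out significantly above chance; if it did, the
                 setup may be leaking a clue.
    """
    positive_ok = p_at_least(pos_correct, pos_n) <= alpha
    phantom_ok = p_at_least(phantom_correct, phantom_n) > alpha
    return positive_ok, phantom_ok

# Hypothetical scores: 15/16 on the obvious difference, 8/16 on the phantom switch.
print(check_training_runs(15, 16, 8, 16))
```

Passing both checks says only that the setup works and the subject can hear an obvious difference; it says nothing about the experimental B.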



Quote
There seems to be a great reluctance to address such a mechanism.



So say you.  There are also comparison tests that are better suited to testing different propositions.  Your own ignorance of them is not a sign of 'reluctance'.

There does seem to be a great reluctance of DAC makers -- and champions of certain DACs -- to run and publish DBTs involving their gear, though.


Meanwhile there have been many, many attempts to critique ABX...perhaps the most recent being the 'cognitive load' ploy from Meridian (which is amusingly at odds with your critique -- if anything it suggests listeners are *trying too hard*). Your belief that an accumulation of nulls is due to many subjects not 'really listening' is simply that: your belief, absent some good evidence. It does not accord with either my personal ABX experiences or with what I have seen on this forum, which is perhaps the largest repository of ABX results online.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 15:55:43
By changing the volume of X every now & then you are introducing a small change which should be audible. I doubt this would confuse
With respect, there speaks the voice of one who doesn't know what they're talking about. You really haven't thought this through at all.

Quote
- how are hidden references used in blind tests?

...

Sure hidden references are recommended for blind tests in the standards documents - why?
It isn't just included for no reason - it is a way of self-checking the test itself - something that is sadly missing in ABX testing
This is what I'm trying to get across here.
Known audio problems (which aren't hidden references - apologies for using the word for the opposite concept!) are included in the double-blind medium scale listening tests carried out right here.

You're posting in the very same forum and sub-forum as the results from these tests. Go on - be a little bit curious and go and look at the one from last year (http://www.hydrogenaud.io/forums/index.php?showtopic=106911). FAAC q30 was the audibly different thing that everyone should have been able to spot. You might find some of my own comments aren't 100 miles away from yours. The difference being of course that I spent some time doing over 60 sets of double-blind comparisons before commenting.


It's strange to want to debate something that you know so little about. I can understand you wanting to ask questions to learn about it from people who invented it and people who have done it a lot. That would make sense. But to be so keen to find folks who understand something which you yourself have so little experience and understanding of, for the purpose of telling them how wrong and flawed the thing is. It's just weird.

In most real-life situations, most people would keep quiet until they'd learned a bit more.

Still, the faceless world of the internet does weird things to people. For better and worse.

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 16:00:36
A good start in eliminating cheating is to look at examples of cheating that have been perpetrated in the past. Have you got these examples?

Yes, although I am not in a position to prove that it was being done consciously and on purpose. It is often conceivable that the individuals were simply incompetent or negligent, but not malicious, but sometimes that's hard to believe, and I'm compelled to assume that there must have been an element of maliciousness. Sometimes I'd say that they were deliberately ignorant and obstinate. I can't look into other peoples' heads, but it doesn't matter for the end result.

Examples are: Fiddling with the statistics, trying to extract a significance where there isn't one. Moving the goalpost after the fact. Trying to selectively exclude unwelcome results using dubious arguments. Inventing reasons for dismissing a test after the fact, even though they previously had accepted the terms. Not revealing a clue which gave away the result, thereby pretending the test was valid when it wasn't.

And my suggestion for internal controls would eliminate some of this revisionism, don't you think? If it was shown in the results that the hidden references were differentiated in the test then it would go a long way towards showing that the test was at least sensitive enough to reveal the impairment level in these hidden references.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 16:11:14
By changing the volume of X every now & then you are introducing a small change which should be audible. I doubt this would confuse
With respect, there speaks the voice of one who doesn't know what they're talking about. You really haven't thought this through at all.

Quote
- how are hidden references used in blind tests?

...

Sure hidden references are recommended for blind tests in the standards documents - why?
It isn't just included for no reason - it is a way of self-checking the test itself - something that is sadly missing in ABX testing
This is what I'm trying to get across here.
Known audio problems (which aren't hidden references - apologies for using the word for the opposite concept!) are included in the double-blind medium scale listening tests carried out right here.

You're posting in the very same forum and sub-forum as the results from these tests. Go on - be a little bit curious and go and look at the one from last year (http://www.hydrogenaud.io/forums/index.php?showtopic=106911). FAAC q30 was the audibly different thing that everyone should have been able to spot. You might find some of my own comments aren't 100 miles away from yours. The difference being of course that I spent some time doing over 60 sets of double-blind comparisons before commenting.


It's strange to want to debate something that you know so little about. I can understand you wanting to ask questions to learn about it from people who invented it and people who have done it a lot. That would make sense. But to be so keen to find folks who understand something which you yourself have so little experience and understanding of, for the purpose of telling them how wrong and flawed the thing is. It's just weird.

In most real-life situations, most people would keep quiet until they'd learned a bit more.

Still, the faceless world of the internet does weird things to people. For better and worse.

Cheers,
David.

David, you seem like a reasonable person & I'm sorry if I come across as somebody who knows nothing about what I'm talking about - I believed that this discussion was the form of debate in which I would learn something but Ok, let me ask why it is considered inappropriate to include the internal controls I suggested in the Foobar ABX software? You say it will confuse but I can't see how - can you explain your thinking some more, please?

Maybe my suggested control in Foobar ABX is not practical or not workable or not useful, but so far I don't see this.

Irrespective of the practicality of my suggestion, is the principle of trying to separate valid from invalid null results objected to? I haven't seen this actually answered. If the principle is agreed, is there no interest in working towards a way of doing this?

I see this applies for the blind test I was directed to:
Quote
Kamedo2, please, correct "Post-screening":

Quote

If you rank the low anchor at 5.0, your result of the sample will be invalid.
If you rank the mid-low anchor at 5.0, your result of the sample will be invalid.
If you rank the low anchor higher than the mid-low anchor, your result of the sample will be invalid.
If you rank the reference worse than 4.5, your result of the sample will be invalid.
If you rank the reference worse than 5.0 on 25% or more of submitted results, all of your results will be invalid.
If you submit 25% or more invalid results, all of your results will be invalid.
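
The post-screening rules quoted above can be written down as a short check. This is only an illustrative sketch, not the script used for the actual listening test: it assumes a 1.0 to 5.0 rating scale on which 5.0 means "no audible difference" and "worse" means a lower number, and the names SampleResult, sample_is_valid and screen_listener are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleResult:
    """One listener's ratings for one sample (hypothetical structure).

    Assumed scale: 1.0 (very annoying) to 5.0 (imperceptible).
    """
    reference: float       # rating given to the hidden reference
    low_anchor: float      # rating given to the low anchor
    mid_low_anchor: float  # rating given to the mid-low anchor


def sample_is_valid(r: SampleResult) -> bool:
    """Apply the four per-sample rules from the quoted post-screening text."""
    if r.low_anchor == 5.0:              # low anchor rated as transparent
        return False
    if r.mid_low_anchor == 5.0:          # mid-low anchor rated as transparent
        return False
    if r.low_anchor > r.mid_low_anchor:  # anchors ranked in the wrong order
        return False
    if r.reference < 4.5:                # reference rated worse than 4.5
        return False
    return True


def screen_listener(results: List[SampleResult]) -> List[bool]:
    """Return one validity flag per submitted result for a single listener.

    The two listener-level rules discard everything if 25% or more of the
    results rate the reference below 5.0, or if 25% or more are invalid.
    """
    n = len(results)
    if n == 0:
        return []
    flags = [sample_is_valid(r) for r in results]
    reference_below_top = sum(1 for r in results if r.reference < 5.0)
    invalid = flags.count(False)
    if reference_below_top / n >= 0.25 or invalid / n >= 0.25:
        return [False] * n
    return flags
```

The nearest ABX equivalent would be a separate run against a known-audible B, since a single ABX trial carries no rating that could be screened in this way.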


I just can't see why the same attempt at valid/invalid test discrimination doesn't apply to ABX testing?

BTW, I'm under no illusion that I'm introducing anything novel - I have cited the ITU standards documents many times in reference to the inclusion of internal controls in blind tests - I just wanted to see how this could be applied to Foobar ABX tests

Maybe my test won't work & I haven't really thought through the practicality of using it - I know that ABX is a forced choice test but if the listener is made aware that some random hidden anchors/controls will be included & they should just register when they discern them, would this be workable?
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 16:31:00
In an ABX test (or any listening test, sighted or blind) you try to figure out what the difference is. Mostly (not always, but mostly) you then try to detect that difference again and again. That's how you pass an ABX test. That's how I do it, anyway.

A level difference is almost certainly not what you're looking for. Worse, a small level difference might not be perceived as a level difference, but as some other change. I've been tricked by this myself. It can sound like more or less bass, treble, clarity, sound stage - you name it. It's got to be comparatively big before you can unambiguously say "the main change, maybe the only change, is the level." By that stage, it's obvious, and even the person who wasn't really listening properly will notice it, so it probably doesn't help you. Worse though is that at lower levels, it could interfere with finding the exact difference that the attentive listener has homed in on.


You can't force someone to listen. But the context in which an ABX test makes most sense is when someone thinks they hear a difference in a sighted test (i.e. normal listening). Then you take the test to prove (first to yourself) that you really hear something. In that circumstance, I can't see why someone wouldn't listen.

I know some people do do it, but having failed to hear any difference in a sighted test, I've never felt the need to "prove" to myself that I don't hear a difference by failing an ABX test. What I have done sometimes is simply not been sure. Then I've done the ABX test, and carefully (not casually, but carefully) made my best guess when I've still not been sure, and looked at the results to see if there was anything in my hunches.

Proper psychoacoustic tests do the same thing BTW, varying the amplitude of the difference until they find the point at which your guesses are correct, say, 70% of the time. I've created, run, and taken part in those tests, and when you find yourself doing it, sometimes it's like magic right at that threshold because you can't "hear" the difference enough to be certain you heard anything, but you can guess, and those guesses are mostly right.
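
As a rough illustration of the kind of adaptive procedure described above, here is a minimal sketch of a "2-down, 1-up" staircase, a standard rule that settles near the level where about 70.7% of responses are correct. It is not claimed to be the procedure used in the tests David mentions; the function name and the respond callback are hypothetical.

```python
from typing import Callable, List


def two_down_one_up(
    respond: Callable[[float], bool],
    start_level: float,
    step: float,
    n_trials: int,
) -> List[float]:
    """Run a 2-down, 1-up staircase and return the level used on each trial.

    respond(level) is a hypothetical callback that presents one trial at the
    given difference level and returns True if the listener answered
    correctly. Two correct answers in a row make the task harder (smaller
    level); any wrong answer makes it easier (larger level).
    """
    level = start_level
    correct_streak = 0
    history: List[float] = []
    for _ in range(n_trials):
        history.append(level)
        if respond(level):
            correct_streak += 1
            if correct_streak == 2:
                level = max(level - step, 0.0)  # never go below "no difference"
                correct_streak = 0
        else:
            level += step
            correct_streak = 0
    return history
```

The threshold estimate is usually taken from the levels at which the track changes direction, once the early trials have been discarded.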

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 16:33:16
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.


That destroys the simplicity of the ABX test and thus creates a strong potential to decrease the sensitivity of listeners who are trying to do their best.

Simplest possible perceptual chore for the listener => best possible results.

There are other ways to address the problem such as by monitoring a systematic listener training program.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 16:39:01
There's a bit of a red herring here btw. You're (jkeny) raising the possibility of there being a genuinely audible difference, but the person doesn't listen carefully enough to notice it, yes? You're raising it in the context of blind testing. But it's equally applicable to sighted testing.

The difference being that any claims to hear a difference are unproven and hence effectively worthless in simple sighted testing, but statistically verified in (uncheated) double-blind testing. Hence the 100 tests on people who couldn't hear (or couldn't be bothered to hear) the difference in double-blind tests don't take away from the few double-blind tests where people did prove they could hear a difference. They're not damaging anything.

Whereas the unreliability of sighted tests damages them completely.

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 16:51:40
And my suggestion for internal controls would eliminate some of this revisionism, don't you think? If it was shown in the results that the hidden references were differentiated in the test then it would go a long way towards showing that the test was at least sensitive enough to reveal the impairment level in these hidden references.

I doubt it. I have encountered a large amount of creativity in inventing pretexts to excuse an unwelcome test result. Some of it is so blatant that most people would immediately dismiss it as a fabrication. Some are at least a little bit clever. I grant you that more internal controls can make it more difficult to mount a credible challenge of the test results, but it won't deter people from trying. If it needs to be explained away, people will explain it away. Heck, you are an example yourself: You ignore half of what is being written here in order to hang on to your failed notion of an invalid null result.

Or take the Meyer/Moran test as a high-profile example. It has attracted a large amount of criticism, most of it undeserved. Quite frequently people don't seem to even read the paper before they criticise it. Some of the criticism is laughable in its small-mindedness. Some criticism is directed at claims their paper hasn't made. Some criticism is cloaked in questions and suspicions, to avoid the accusation of slander while achieving the same result.

Quite separate from the question of the validity of the respective test, this show can be very illuminating. It gives you an impression of the motives and the personality of the people involved in the debate. This has probably done more to shape my opinion than the test results themselves. I wouldn't want to miss it, even though the mudslinging is sometimes not funny at all.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 16:54:33
In an ABX test (or any listening test, sighted or blind) you try to figure out what the difference is. Mostly (not always, but mostly) you then try to detect that difference again and again. That's how you pass an ABX test. That's how I do it, anyway.

A level difference is almost certainly not what you're looking for. Worse, a small level difference might not be perceived as a level difference, but as some other change. I've been tricked by this myself. It can sound like more or less bass, treble, clarity, sound stage - you name it. It's got to be comparatively big before you can unambiguously say "the main change, maybe the only change, is the level." By that stage, it's obvious, and even the person who wasn't really listening properly will notice it, so it probably doesn't help you. Worse though is that at lower levels, it could interfere with finding the exact difference that the attentive listener has homed in on.
Yes, I edited my post & added this just before you posted "Maybe my test won't work & I haven't really thought through the practicality of using it - I know that ABX is a forced choice test but if the listener is made aware that some random hidden anchors/controls will be included & they should just register when they discern them, would this be workable?"

Quote
You can't force someone to listen. But the context in which an ABX test makes most sense is when someone thinks they hear a difference in a sighted test (i.e. normal listening). Then you take the test to prove (first to yourself) that you really hear something. In that circumstance, I can't see why someone wouldn't listen.
Agreed but in a previous closed thread I had many of the posters telling me that if there was no difference heard in the first couple of ABX trials then it is perfectly acceptable to randomly guess the rest i.e not bother listening. So this is confusing to me - I would continue to give my full attention to all trials in the test until finished, otherwise I would consider it a test that only consisted of 1 or 2 trials - the rest being guesses that I used a random generator to complete.

Quote
I know some people do do it, but having failed to hear any difference in a sighted test, I've never felt the need to "prove" to myself that I don't hear a difference by failing an ABX test. What I have done sometimes is simply not been sure. Then I've done the ABX test, and carefully (not casually, but carefully) made my best guess when I've still not been sure, and looked at the results to see if there was anything in my hunches.
I agree, so I don't understand why people would use ABX for files/devices that they don't hear any differences between - yet I see lots of evidence of that in these posts & posts on other forums. I would like some way of teasing out all these invalid tests from the valid results & that's all I'm trying to get at.

Quote
Proper psychoacoustic tests do the same thing BTW, varying the amplitude of the difference until they find the point at which your guesses are correct, say, 70% of the time. I've created, run, and taken part in those tests, and when you find yourself doing it, sometimes it's like magic right at that threshold because you can't "hear" the difference enough to be certain you heard anything, but you can guess, and those guesses are mostly right.

Cheers,
David.
Yes, I've experienced that threshold where your gut feeling can actually be found to be statistically correct & maybe this is a good enough reason for doing ABX tests even if you haven't identified a difference? But I believe the motive & intent behind doing this is the important aspect in determining whether this is a genuine open-minded use of the test or not. 
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 17:04:48
I already suggested a way to include controls in ABX testing that randomly puts an agreed audible condition in random trials as one way to sense this occurrence.


That destroys the simplicity of the ABX test and thus creates a strong potential to decrease the sensitivity of listeners who are trying to do their best.
Is simplicity the overriding factor that trumps all else? After all there are many other blind tests that are not as simple as ABX testing & these are not objected to. I don't accept this reason as a good reason for not trying to differentiate possible invalid results 

Quote
Simplest possible perceptual chore for the listener => best possible results.
For the above reasons I don't accept this statement

Quote
There are other ways to address the problem such as by monitoring a systematic listener training program.
You are making assumptions that there is a listener training program which I suggest there usually isn't. Even if there is - this doesn't tell us anything about whether the listener lost focus during the ABX test - something I suspect is far more prevalent than seems to be acknowledged here. I know from my own tests & from the reports of others that as the test goes on this becomes a greater & greater issue. Has anyone done a statistical analysis across many ABX results which shows the number of correct guesses early in the test Vs later in the test? I would be interested in these results if anyone has done them.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 17:12:04
There's a bit of a red herring here btw. You're (jkeny) raising the possibility of there being a genuinely audible difference, but the person doesn't listen carefully enough to notice it, yes? You're raising it in the context of blind testing. But it's equally applicable to sighted testing.
But this isn't a discussion about the validity/accuracy/specificity/sensitivity of ABX testing Vs sighted testing - it's about ABX testing & its validity/accuracy/specificity/sensitivity

Quote
The difference being that any claims to hear a difference are unproven and hence effectively worthless in simple sighted testing, but statistically verified in (uncheated) double-blind testing. Hence the 100 tests on people who couldn't hear (or couldn't be bothered to hear) the difference in double-blind tests don't take away from the few double-blind tests where people did prove they could hear a difference. They're not damaging anything.

Whereas the unreliability of sighted tests damages them completely.

Cheers,
David.

Again this is not about sighted Vs blind tests but it does raise the issue of false positives & false negatives
Yes, sighted tests are well known to be prone to false positives
I'm suggesting that ABX tests are prone to false negatives & would like to disprove or verify this as I thought others might also be?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 17:20:29
By changing the volume of X every now & then you are introducing a small change which should be audible.


It is comments like this that tell me that I'm dealing with someone who lacks serious involvement in subjective testing. Given your track record for running the other way when the words are mentioned...

OK maybe you've done a half dozen tests that were in some sense blind. That puts you in the same category as a 15 year old who has just finished his driver's training course lecturing Jeff Gordon and  Fernando Alonso about how to drive a race car.

If there was a rule of rules about listening tests, it would be that the simplest test that gets the job done is the best test. The only reason why the simplest test possible, which is arguably a sighted evaluation, doesn't get the job done is the overwhelming number of false positives.

Listeners are like batters in baseball, they score well by handling whatever the listening test throws at them, but the simpler the test, the better they score all other things being equal.  Throwing in random curve balls is how pitchers win games, not batters. That's true even when they are in accordance with the rules of the game.

Quote
I doubt this would confuse



Thanks for admitting that you don't know the answer. If you were an experienced listener like many who have posted their exceptions to your suggestion, you'd know the answer: It might not confound the listener, but it would be very likely to reduce his effectiveness.

Quote
how are hidden references used in blind tests?


Not knowing this fairly basic point and trying to lecture experienced and successful listeners on how to do DBTs doesn't help your credibility.

How hidden references are used in blind tests is well known and easy to find out. If we could only throw people out of this game for being terminally intellectually lazy.

Here's a good starting point: ABC/hr article on HA (http://wiki.hydrogenaud.io/index.php?title=ABC/HR). If it was in that spot below you where the sun shines not, it couldn't be closer! ;-)


Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 17:28:47
But this isn't a discussion about the validity/accuracy/specificity/sensitivity of ABX testing Vs sighted testing
Implicitly it is. We're talking about audio equipment used to listen to music for pleasure. That's normal listening. Well, it is for me. Normal listening is "sighted". If I can't, at any level (even guessing based on barely conscious "hunches"), hear a difference during normal listening, then there is no point worrying that there might be some difference that I'm missing. That difference is irrelevant to the use case at hand.

If ABX is no worse than sighted listening in this respect (EDIT: false negatives), it's more than good enough for evaluating audio equipment used to listen to music for pleasure.

You're inventing some hypothetical increased level of sensitivity, that may or may not be possible but is surely irrelevant.


This is not a new discussion. It's been had here before.

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 17:32:56
Quote
There are other ways to address the problem such as by monitoring a systematic listener training program.
You are making assumptions that there is a listener training program which I suggest there usually isn't.


Speaks again to the post's author's lack of familiarity with how DBTs are done.

For example, in the series of jitter tests I posted on AVS that your buddy $mir flunked, there was a formal listener training program.

My PCABX web site implemented an online listener training facility that was both generalized and also customized for each different kind of test. It was online for a goodly number of years starting in Y2K and still can be found on the Wayback machine.

Listener training is described in the recent ill-fated Meridian AES Conference paper.

Listener training was discussed extensively in JJ's classic JAES article about the MPEG listening tests from a decade or more ago.

Listener training is discussed in ITU standards committee document BS1116 and sequels.

Etc., etc., etc.

So the above comments underscore what others have been saying - the problem is not with ABX; the problem is a vocal critic of blind tests who spouts off nonsensical suppositions from the viewpoint of zero intellectual curiosity and no actual reliable knowledge.

For another example of that, read here: Link to Ill-Informed Nonsense (http://www.aes.org/forum/?ID=416&c=2947)

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 17:48:20
This is getting trying & tiring - Can anyone link me to where I can find the answer to this simple question?
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-03 17:51:36
You're inventing some hypothetical increased level of sensitivity, that may or may not be possible but is surely irrelevant.


Agreed. I see no evidence that this increased level of sensitivity is even possible, let alone manifest. It seems to be 100% self-serving speculation.

All this wailing and gnashing of teeth about listening test sensitivity is a tacit admission of the elusive nature of the sonic benefits of most high end audio jewelry, especially DACs like the ones that our correspondent has invested his personal fortune in.

For about the first 25 years after we invented interactive ABX testing, the golden ears were telling us that their new wunder-amps were "Mind blowingly better sounding".

Now the goalposts have moved a whole long way and they are tacitly admitting that only the most careful of listening tests involving the goldenest of ears can be expected to have a positive outcome.

I can live with that! ;-)
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 18:38:21
This is getting trying & tiring - Can anyone link me to where I can find the answer to this simple question?
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?

Isn't it amusing how the guy completely ignores what doesn't fit his own delusion? At least the effort becomes trying and tiring, that means there is some effect...
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 18:50:16
......

If ABX is no worse than sighted listening in this respect (EDIT: false negatives), it's more than good enough for evaluating audio equipment used to listen to music for pleasure.
You think sighted listening might be prone to false negatives? Please explain

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 18:52:03
This is getting trying & tiring - Can anyone link me to where I can find the answer to this simple question?
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?

Isn't it amusing how the guy completely ignores what doesn't fit his own delusion? At least the effort becomes trying and tiring, that means there is some effect...

Care to answer, then?
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 19:07:28
Care to answer, then?

No. No point.

Care to present a practical, reliable way of telling a false negative from a true negative in an ABX test?
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-03 19:21:00
There's a bit of a red herring here btw. You're (jkeny) raising the possibility of there being a genuinely audible difference, but the person doesn't listen carefully enough to notice it, yes? You're raising it in the context of blind testing. But it's equally applicable to sighted testing.
But this isn't a discussion about the validity/accuracy/specificity/sensitivity of ABX testing Vs sighted testing - it's about ABX testing & its validity/accuracy/specificity/sensitivity



But do tell us, please, what your take is on the validity/accuracy/specificity/sensitivity of sighted 'testing' -- the de facto 'testing' standard in the audio hobby, the audio press, and the implicit 'method' promoted in 99.9% of audio product marketing, including yours.

C'mon, take a stand.  'They're known to be prone to false positives' is a start, but barely addresses the matter. 

Sighted audio 'testing' is 'known' to be unreliable; the 'method' is unacceptable to science. 

ABX testing is hypothesized (by you) to be. 

How much credibility do you accord to sighted 'test' results these days?
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 19:30:18
......

If ABX is no worse than sighted listening in this respect (EDIT: false negatives), it's more than good enough for evaluating audio equipment used to listen to music for pleasure.
You think sighted listening might be prone to false negatives? Please explain

It was your suggestion that people might not bother to listen properly. That possibility exists whenever people are supposed to be listening.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-03 19:35:09
This is getting trying & tiring - Can anyone link me to where I can find the answer to this simple question?
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?

There are stats for this (chances of failing a given length ABX test at a given p val when you detect the difference X percent of the time on average) but I don't have them to hand.
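
Those figures follow from the binomial distribution, so they can be computed rather than looked up. The sketch below is an illustration only, using the Python standard library; the function names are made up for the example, and it assumes a listener who genuinely hears the difference on some fraction of trials and guesses at 50% on the rest.

```python
from math import comb
from typing import Optional


def binom_tail(n: int, k: int, p: float) -> float:
    """Probability of at least k correct answers in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_to_pass(n: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest score whose chance of occurring by pure guessing is <= alpha.

    Returns None when even a perfect score cannot reach alpha (too few trials).
    """
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None


def p_correct_from_detection(d: float) -> float:
    """Per-trial hit rate if the difference is heard on a fraction d of trials
    and the remaining trials are coin-flip guesses (an assumed listener model)."""
    return d + (1 - d) * 0.5


def miss_probability(n: int, p_correct: float, alpha: float = 0.05) -> float:
    """Chance that a listener with the given per-trial hit rate still fails
    to reach significance in an n-trial ABX test."""
    k = min_correct_to_pass(n, alpha)
    if k is None:
        return 1.0
    return 1.0 - binom_tail(n, k, p_correct)


if __name__ == "__main__":
    # Example: 16 trials, listener hears the difference on 40% of trials.
    print(min_correct_to_pass(16))
    print(miss_probability(16, p_correct_from_detection(0.4)))
```

Running more trials lowers the miss probability for any listener who scores above chance, which is the usual way to make a null result more informative.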
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 19:56:00
Care to answer, then?

No. No point.

Care to present a practical, reliable way of telling a false negative from a true negative in an ABX test?

I already gave a way of using internal controls to self-test the listener/test - this will give 100% more information than we have at present
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 20:02:10
......

If ABX is no worse than sighted listening in this respect (EDIT: false negatives), it's more than good enough for evaluating audio equipment used to listen to music for pleasure.
You think sighted listening might be prone to false negatives? Please explain

It was your suggestion that people might not bother to listen properly. That possibility exists whenever people are supposed to be listening.

Ah, right, I see, but then wouldn't their knowledge of what they are listening to just bias them towards their expectations? So false negatives are eliminated, according to the theory of expectation bias & its influence on the listener.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-03 20:06:02
I seriously wonder why anyone is still bothering with him since this has already been discussed 2 months ago.

Imagine someone claiming that he can distinguish the colors red and green. Okay, you show him randomly selected red and green cards, but he fails to identify them correctly. We cannot accept his claim (yet) and have to assume that he cannot distinguish the colors, regardless if he either is colorblind, or didn't look at the cards, or deliberately chose to identify them incorrectly.

Case closed.


edit: His suggestion is something along the lines of throwing in a black card instead of the green one (for example) every once in a while. He assumes that people will correctly distinguish that when they are looking at the cards and incorrectly identify them if they are not looking at the cards ... and it should be patently obvious that this assumption is wrong.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 20:18:38
Again, I ask this:
I just can't see why the same attempt at valid/invalid test discrimination doesn't apply to ABX testing?
Quote
This applies for the blind test I was directed to:
Quote
If you rank the low anchor at 5.0, your result of the sample will be invalid.
If you rank the mid-low anchor at 5.0, your result of the sample will be invalid.
If you rank the low anchor higher than the mid-low anchor, your result of the sample will be invalid.
If you rank the reference worse than 4.5, your result of the sample will be invalid.
If you rank the reference worse than 5.0 on 25% or more of submitted results, all of your results will be invalid.
If you submit 25% or more invalid results, all of your results will be invalid.
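
To make the comparison concrete, here is a rough Python sketch of those screening rules as I read them. It is not the code used by that test; the class and function names are invented, and it assumes a 1.0 to 5.0 scale where 5.0 means "no audible difference".

```python
from dataclasses import dataclass

@dataclass
class SampleResult:
    # Hypothetical record of one listener's grades for one sample (1.0 to 5.0).
    low_anchor: float
    mid_low_anchor: float
    reference: float

def sample_is_valid(r):
    """Per-sample rules: the anchors must be heard and ordered sensibly."""
    if r.low_anchor == 5.0 or r.mid_low_anchor == 5.0:
        return False
    if r.low_anchor > r.mid_low_anchor:
        return False
    if r.reference < 4.5:
        return False
    return True

def valid_results(results):
    """Apply the per-sample rules, then the two 25% per-listener rules."""
    if not results:
        return []
    n = len(results)
    ref_downgraded = sum(r.reference < 5.0 for r in results)
    invalid = sum(not sample_is_valid(r) for r in results)
    # "25% or more" written with integers to avoid rounding trouble.
    if 4 * ref_downgraded >= n or 4 * invalid >= n:
        return []
    return [r for r in results if sample_is_valid(r)]
```

The point of the sketch is that every rule is a check against a known stimulus (an anchor or the hidden reference), which is the kind of internal control I am asking about.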
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-03 20:26:07
I already gave a way of using internal controls to self-test the listener/test - this will give 100% more information than we have at present

That barely scratches the surface of what you expect others to give you statistics for. But hey, you remember your own post, so do you also remember the multiple posts where it was explained to you that the concept of a false negative in ABX is bunk? Yet you proceed to even ask for statistics for something that is just a figment of your own imagination?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 20:31:14
I already gave a way of using internal controls to self-test the listener/test - this will give 100% more information than we have at present

That barely scratches the surface of what you expect others to give you statistics for. But hey, you remember your own post, so do you also remember the multiple posts where it was explained to you that the concept of a false negative in ABX is bunk? Yet you proceed to even ask for statistics for something that is just a figment of your own imagination?

Really? It must also be a figment of the ITU standards bodies' & Arnyk's imagination!
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 21:07:19
I already gave a way of using internal controls to self-test the listener/test - this will give 100% more information than we have at present

That barely scratches the surface of what you expect others to give you statistics for. But hey, you remember your own post, so do you also remember the multiple posts where it was explained to you that the concept of a false negative in ABX is bunk? Yet you proceed to even ask for statistics for something that is just a figment of your own imagination?

Just to add another persona similarly delusional - JJ (woodinville) who posted this here (http://www.hydrogenaud.io/forums/index.php?showtopic=98008&view=findpost&p=815194):
Quote
I'm asking for better tests, and yes, there should ALWAYS be positive and negative controls in a test, and no, they aren't that hard to add, and yes, you can add them in varying levels of positive control and get some very useful information. So you should. I'm standing absolutely firm on that position, because I see so many tests that I can't even evaluate the results coming to me in capacities as reviewer or editor, tests that have no way to relate them to other sets of results in any fashion. (no, I don't mean you should combine results)

.......

I am frankly surprised at the apparent offense taken to what I said. I'm simply describing standard practice.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-03 22:38:33
(sigh)

How many times have you been told that much of this has been discussed already?

And then *you* finally seek out some discussion  et voila,  it's a new thing.



NB: Much of *that* discussion ends up being about ranking tests (public codec comparison tests conducted via HA), where difference per se is not being tested; it's relative quality. E.g. ABC/HR.

NB: to the extent that positive and negative controls for *ABX* are discussed, it's what I said: for negative control, JJ proposes 'A vs A', which is what I called 'phantom switch'. For positive control, it's graded impairments. But:
If you insert random 'A vs A' trials into an otherwise 'A vs B' ABX test (where A and B really are different things), that is *a different test* -- either you are 'tricking' the subject (phantom switch) or you are asking them to take a test where at each trial they MAY OR MAY NOT be comparing two different inputs. If you run a separate ABX test where A and B are *always* the same, that is part of 'training' or 'test/subject validation'. The same applies to having graded difference levels within one test... that's not quite the same test as 'ABX'.

NB: Yet the point I made in 2012 still stands. JJ is talking from a pure research standpoint -- testing the *general*, abstract hypothesis 'there is no difference between A and B'. You design the best test you can for that. But much of the 'debate' on the interwebs and in the audio press is not abstract; it involves actual audio braggarts claiming that they ALREADY HEAR a difference, sighted, between an A and a B, and often a veil-lifting, *not subtle* difference. A 'subject' is claiming to *already* have trained themselves, in other words, to do something that seems extraordinary. We can test *that*, rather more specific, proposition too, if the subject agrees to it.

JJ may disagree with me, but I'd say *no training needed there*.  You just ABX test the claimant and his claim, as is.  Positive/negative controls for the purpose jkeny proposes, would be simply unnecessary. 

And frankly, it's the big claims made under comparatively routine conditions that comprise 'home listening' that I'm more interested in testing.  I don't particularly care if under extremely well-chosen circumstances, a very, very small difference can be detected by some recruits to an academic study.  There's a huge bridge to cross from there, to 'a veil was lifted when I switched DACs'

jkeny, you already claim you can hear such differences between your DACs and others. We don't need to train you to hear that. We don't need positive or negative controls for you; this isn't for publication, and you seem likely to listen *attentively* given the stakes. We just need to sit you down with two DACs you claim sound different, and see if you can consistently tell them apart just by hearing. With a proctor at hand to make sure you don't cheat.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-03 23:12:54
Ah, right, I see you just want to use ABX testing as a type of challenge, rather than an actual tool to determine what is audible?
In this case I can see why you are not interested in controls or really anything about anything that might bias the test results - your main interest is in shaming the person who claims to hear something that you disagree can be audible.

You break most of the good practise guidelines for test design but it does explain your thinking quite succinctly.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-04 00:09:49
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?


With proper knowledge of psychoacoustics, the results of a given listening test of any kind can be evaluated and any false negatives or positives can be estimated.

For example, existing knowledge of psychoacoustics will indicate that any positive results from  listening tests involving good DACs or amplifiers are false.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-04 00:36:16
JJ (woodinville) who posted this

Don't forget this (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-3.html#post31083466)

Quote
Quote
Originally Posted by jkeny
I've been suggesting for a long time now that If anybody wants to prove DBTs don't skew results towards nulls then include controls in them for false negatives - then we will have a measure of whether a null result is because the test is prone to false negatives or not, as the case may be. A perfectly logical & reasonable thing to do but so far just excuses put forth for not doing so. Until I see such controls the test is unreliable, in my mind

Well, there have been DBT's with controls that have shown striking sensitivity, in fact, DBT's with controls seem to show that pretty much uniformly, perhaps because people who use controls are good at testing.
HOWEVER, since speakers all do audibly sound different, this complaint is utterly, completely, and absolutely MEANINGLESS to this loudspeaker testing.

The fact that differences were heard in the test that is having mud slung at it shows conclusively that it was sensitive enough, and that mostly, the engineers who were listening didn't hear such huge differences when the speaker ID was obscured. That they did show differences completely refutes, for all time, the complaining about positive controls in this particular case.

The complaint is mistaken, the test is good. We're done here.
James D. (jj) Johnston


Oh yes and this (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-3.html#post31096410)
Quote
Quote
Originally Posted by jkeny
As I said we have no way of judging if the blind part of the test is attenuating differences
And, since there WERE differences shown, we know that they weren't. When listeners do hear a difference, they can focus on the differences, obviously, because they gave reliable answers. In that case, we know that there is very, very little, if any, "attenuation". You're trying to use my words that apply to threshold tests to disqualify something that was obviously not a threshold test.

So, in this case, the test has self-tested. You can argue otherwise until the cows come home, but that's how this one worked out.
James D. (jj) Johnston

Just in case you "forgot".
Now, remind us, what oh so critical controls did you use during your purported DAC "blind" test?

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-04 00:48:11
Hi AJ,
Just as one bad ABX doesn't spoil the whole bunch...

That's the straw the organic peddlers grasp at.

One bad Subjectivist (ML) doesn't spoil the whole bunch.

Don't consider him to be "bad". Or a subjectivist, except the completely distorted versions audiophiles claim to be. He actually had the stones (or more likely lack of cognizance of what he was falling into) to do a proctored test, unlike the shyster-peddlers.

I'm curious what JK's conditions to be persuaded would be. He has been posting concerns here.
Cheers, SAM

Yep, you are new at this. Good luck with that. 

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-04 07:34:21
Ah, right, I see you just want to use ABX testing as a type of challenge, rather than an actual tool to determine what is audible?


For characters like you, sure. For such as you -- and the audio hobby is *full* of them -- I'm interested in using ABX to see if *your* particular claim holds up. Because characters like you will see a 'no difference' lab result and say, well, *I* could still hear veils lifted with *my* gear, even if those deaf duffers couldn't.

Quote
In this case I can see why you are not interested in controls or really anything about anything that might bias the test results - your main interest is in shaming the person who claims to hear something that you disagree can be audible.


Who, me?   

Quote
You break most of the good practise guidelines for test design but it does explain your thinking quite succinctly.


Your thinking is not exactly opaque either, sirrah.

If standard discourse about audio quality was generally conducted at the level of academic science, I'd be all about academic difference testing.

But alas, it's not.

Hey, tell you what, when characters like you stop making grandiose claims about tiny-if-at-all-audible differences the *standard* in audio discourse, I'll lose interest in 'shaming' you.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-04 09:45:39
Ah, right, I see you just want to use ABX testing as a type of challenge, rather than an actual tool to determine what is audible?


You've missed an important point.  A tool that determines what is audible presents a serious challenge to people who are poorly informed about audibility.

Quote
In this case I can see why you are not interested in controls or really anything about anything that might bias the test results


You've missed yet another important point. It is the controls that make the test into a challenge. We know that people who hate DBTs such as yourself hate them because of the controls they implement to help manage bias.

Quote
your main interest is in shaming the person who claims to hear something that you disagree can be audible.


Your main interest is clearly in shaming any person who earnestly wants to find out which things are audible, but whom you disagree with because of your pecuniary interests.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-04 12:14:22
Thanks to all of you for being so frank & open.
I'm sure these posts will stand as a reference to which people who want to do perceptual testing will turn for the expertise contained here on how NOT to do it.
It's been very revealing & enlightening - thanks!
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-04 13:29:38
I'm sure these posts will stand as a reference to which people who want to do perceptual testing

Your facade of interest ends when asked about your own purported "blind" test where you "heard" your DAC emitting "organic sounds".
You also don't seem to apply the "must have controls to be valid" rules to all "blind" tests (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different/page51).
Must be a biz$ness thing.

It's been very revealing & enlightening - thanks!

When you're ready to reveal and enlighten us with your proctored and +/- controls blind test results of organically grown, biochemically engineered DACs, we'll be here for ya.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-04 13:43:40
Thanks to all of you for being so frank & open.
I'm sure these posts will stand as a reference to which people who want to do perceptual testing will turn for the expertise contained here on how NOT to do it.


There's a question that runs through my mind as I compose replies to threads like this - who is sincere and who will say just about anything to get a response.

Reviewing posts like this one;

AVS Reply (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-3.html#post31099354)

and this one:

Response from JJ (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-3.html#post31096410)

is strong evidence  that the problem is not your ignorance (even though there is a ton of it) but your blatant dismissal of all relevant and correct information in accordance with a destructive agenda.

Quote
It's been very revealing & enlightening - thanks!


If it's so enlightening to you, Mr. Keny, why do so many people, both Joe Audiophile and an internationally renowned expert, have to tell you the same thing over and over again?
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-04 14:06:07
This sums up Jkenys (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=279920&viewfull=1#post279920) position very clearly:
Quote
But as I said, this is usually a (blind) test for a specific difference - I get better desired, more comprehensive results with longer term listening peeking.
Blinding is an inhibitor to the "results" elicited from long term daydreams.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-04 14:07:36
If it's so enlightening to you, Mr. Keny, why do so many people, both Joe Audiophile and an internationally renowned expert, have to tell you the same thing over and over again?

I think he was being sarcastic. He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position. You can see from his treatment of JJ and the like how eagerly he is in pursuit of such snippets he can misapply. The actual enlightenment for him was most likely zero, and he never wanted any to start with.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-04 17:07:51
I think he was being sarcastic. He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position.
Nope, I wasn't being sarcastic - it was enlightening - I don't need to distort or ridicule your position, it's enough to just expose it untouched for all to see.
Quote
You can see from his treatment of JJ and the like how eagerly he is in pursuit of such snippets he can misapply. The actual enlightenment for him was most likely zero, and he never wanted any to start with.
JJ was right & I was wrong in that exchange!!
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-04 17:46:54
why do so many people [...] have to tell you the same thing over and over again?

Answer (http://www.econectados.com/wp-content/uploads/senal_troll.jpg).
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-04 20:08:12
He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position.
It would be easier to understand if he was a classic troll (which of course he might be), or if he had a financial interest in this.

As it is, I'm not sure who he would want to convince, or why. It's a bit like religious fervor. Or the daftest "I have to be seen to be right" Usenet argument.

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-05 00:04:17
He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position.
It would be more interesting...  ...if he had a financial interest in this.


That exists.

Link to announcement of JKenny's USB DAC (http://www.whatsbestforum.com/showthread.php?11946-In-For-Review-Our-Own-John-Kenny-s-Ciunas-USB-DAC!&p=216186&viewfull=1#post216186)


More:

Ciunas USB Web site - note URL (http://www.johnkenny.biz/ciunas-dac)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-05 03:39:18
He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position. You can see from his treatment of JJ and the like how eagerly he is in pursuit of such snippets he can misapply. The actual enlightenment for him was most likely zero, and he never wanted any to start with.



That has already happened earlier this afternoon:

Link to JKenny post from this afternoon (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-18.html#post31435625)

"
And if you want a good example of how not to run blind tests - just go to this thread on HA which was just closed. It comes from ArnyK & his crew. To avoid you having to drag yourself through the dross here's the jist of what is said:
- false negatives are of no importance in blind tests (most don't even admit to them or know what a false negative is), only false positives
- as a consequence no internal controls are necessary to determine the test's specificity/sensitivity
- & here's the really great one: Once you have done a few trials & heard no difference, there's no need to do any more listening, just select randomly for all subsequent trials  Isn't that great? That's about where greynol closed down the thread
"

The first lie is "Just go to this thread on HA which was just closed."  In fact the thread was closed more than a month ago.

The guy is really brave when nobody can correct his liberal and self-serving distortions of the truth!
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-05 07:41:08
That has already happened earlier this afternoon:

LOL!

Excellent, that's exactly what I expected. The guy is as shameless as he is clueless.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-05 11:18:50
It's rather sad, really. Add jkeny to that last statement (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108127&view=findpost&p=888811).
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-05 13:39:58
It's rather sad, really. Add jkeny to that last statement (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108127&view=findpost&p=888811).



Sad for John but probably good for the rest of us. The big tip off is that the probable purpose of his erroneous post on AVS was to distract people from his most recent appearance here. That thread didn't get closed so his disruptions were far better controlled. He made a number of pretty serious technical gaffes that revealed his lack of preparation to engage in serious talk about subjective testing.

BTW lest there be any doubt about his pecuniary purposes, from the registration of his web site:

- ADMIN EDIT: Personal information removed.

This appears to be a residence.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-05 13:55:08
For someone whom you consider so stupid & ignorant & who is obviously so wrong, you are devoting a great deal of time & effort trying to discredit me in your fatwah, instead of just letting the stupidity & ignorance in my posts speak for themselves. To paraphrase Shakespeare - you guys doth protest too much.

I would appreciate my address being removed from postings, please!
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-05 14:07:25
I would appreciate my address being removed from postings, please!

Too late. That's the DAC police shyster squad knocking on your door right now John. Don't be too paranoid to open it and see if it's actually the pizza guy.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-05 14:20:27
For someone whom you consider so stupid & ignorant & who is obviously so wrong, you are devoting a great deal of time & effort trying to discredit me in your fatwah, instead of just letting the stupidity & ignorance in my posts speak for themselves. To paraphrase Shakespeare - you guys doth protest too much.


John, I didn't put a gun to your head and make you post that incorrect and misleading information about HA on AVS.  You might be your own worst enemy.

I see you also decided to promulgate this little piece of falseness:

"
Not having any measure of type II errors means you have no way of judging the statistical power of the test i.e your results are of unknown validity - basically they represent the central ceremony in your religious belief system & like all such religious ceremony they have at their core a mystery which people dare not question or they are treated as heretics.
"

I told you that we know exactly how to judge the statistical power of any particular listening test by applying what we know about psychoacoustics, but that doesn't appear to fit into your religious/commercial agenda, especially the part about psychoacoustics predicting the audible futility of chasing after the latest greatest DACs.

John, I'd love to see you admit in public that it's only a type 2 failure if you know for sure that the negative outcome is false, and that almost all of the issues that high end audiophiles obsess over, and spend their money on with floobydust merchants such as high end DAC merchants, are known to be audibly futile according to modern psychoacoustical knowledge.

Quote
I would appreciate my address being removed from postings, please!


Friendly advice: If you check off the right box on your domain registration, you can keep that information out of the public eye. As things stand it was very simple to type in Whois.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-05 15:03:20
.............
Friendly advice: If you check off the right box on your domain registration, you can keep that information out of the public eye. As things stand it was very simple to type in Whois.

Thanks for that advice, Arny - I will look into it but I didn't think you could make private that information in the domain registration
Anyway, it's another thing having it posted on a forum & not something I would do.
I'm surprised you did this bearing in mind all the trauma you have been through on the rec.audio forum?

As to the rest of your post, I will let you dig your own hole - as you have demonstrated your adeptness at doing
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-05 15:37:15
I'm surprised you did this bearing in mind all the trauma you have been through on the rec.audio forum?

You're afraid of a rational person with >2 brain cells showing up at your door?
The AVRev interview (http://www.avrev.com/home-theater-preamplifiers/stereo-preamps/john-kenny-ciunas-usb-dac-review-4.html) where you shared your non-ITU daydreams (love the TT reference btw) also has your "address". You want that removed too?

Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-05 16:07:14
.............
Friendly advice: If you check off the right box on your domain registration, you can keep that information out of the public eye. As things stand it was very simple to type in Whois.

Thanks for that advice, Arny - I will look into it but I didn't think you could make private that information in the domain registration
Anyway, it's another thing having it posted on a forum & not something I would do.
I'm surprised you did this bearing in mind all the trauma you have been through on the rec.audio forum?


That was trauma? I guess you vastly underestimate what happened on RAO. I had people from RAO in my front yard to harass me, another driving around town with a firearm promising online to use it on me, and more threatening to confront me there, including JA.

My take is that if you put something on the web that publicly, you mustn't have cared. The matter is oh, so clear in the registration application information.

Of course in the process, you confirmed what I wanted to have confirmed...
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-05 17:17:29
He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position. You can see from his treatment of JJ and the like how eagerly he is in pursuit of such snippets he can misapply. The actual enlightenment for him was most likely zero, and he never wanted any to start with.



That has already happened earlier this afternoon:

Link to JKenny post from this afternoon (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-18.html#post31435625)


The date/timestamps are all very wrong on that AVSF  thread. 

No way did I make those posts 'today'. None of those posts are new.

And the thread was closed quite some time ago.
Title: How do you listen to an ABX test?
Post by: knucklehead on 2015-04-05 18:44:19
He probably wanted to say that he had mined enough quotes now which he can use to distort and ridicule our position. You can see from his treatment of JJ and the like how eagerly he is in pursuit of such snippets he can misapply. The actual enlightenment for him was most likely zero, and he never wanted any to start with.



That has already happened earlier this afternoon:

Link to JKenny post from this afternoon (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-18.html#post31435625)


The date/timestamps are all very wrong on that AVSF  thread. 

No way did I make those posts 'today'. None of those posts are new.

And the thread was closed quite some time ago.


It's where it switches from "Today" to date of post. The Today title stayed locked when the thread was locked.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-05 19:59:13
Given the fact he's selling something, that rather shifts the burden of proof!

In that context, this entire thread is a joke.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-05 20:36:50
Thank you admin, for removing my personal address.
I have looked at my domain name registration & it doesn't appear that I can remove this information from public access?

I see the thread continues to have a nasty, unpleasant & ad hominem flavour to it - far from the objective, scientific image that HA seems to want to cultivate.
Title: How do you listen to an ABX test?
Post by: kode54 on 2015-04-06 02:18:42
Thank you admin, for removing my personal address.
I have looked at my domain name registration & it doesn't appear that I can remove this information from public access?

I see the thread continues to have a nasty, unpleasant & ad hominem flavour to it - far from the objective, scientific image that HA seems to want to cultivate.


Depending on your domain registrar, you may be able to pay a few bucks extra per year for some sort of privacy service, which will proxy your registration information, displaying their information instead, and forwarding any contacts to you. This will protect you from just about everything, except for legal inquiries which have access to subpoena information.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-06 11:11:48
Depending on your domain registrar, you may be able to pay a few bucks extra per year for some sort of privacy service, which will proxy your registration information, displaying their information instead, and forwarding any contacts to you. This will protect you from just about everything, except for legal inquiries which have access to subpoena information.

Thank you, I will look into that
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-06 13:59:49
I see the thread continues to have a nasty, unpleasant & ad hominem flavour to it - far from the objective, scientific image that HA seems to want to cultivate.

Yes, as Amir et al taught you, you can always use the woe-is-me victimization card to evade all tough questions, any sort of real test of your daydreams, etc, etc.....all under the guise of being the true champion of science. "Tone" has ruined things again. On the way to the bank with the proceeds from peddling $750 organic DACs, with zero evidence they sound any different than say, an ODAC.
Despite the pathological "engineering" typical of "audiophile" products, making this a very real possibility.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-06 16:03:50
I'll give this one last shot.

ABX is just one tool, one of many blind listening methods, that can be used to check if a claim regarding audible differences (the alternative hypothesis) is likely* to be true.
Most objections are completely irrelevant to how ABX is being used in these public online tests.

And if the tool doesn't fit, then simply use another bias-eliminating (blind) listening method.


*) In online tests there is no supervisor. A positive log can be the result of random button smashing (e.g. passing an 8-trial test, i.e. getting at least 7 of 8 right, is bound to happen about every 28th attempt of completely random button smashing), cheating, a problem with the test setup (e.g. stupid Windows forcing resampling causing audible aliasing artifacts), a problem with the test files ... or genuine audible differences. So it is not a guarantee of a difference. If anything, it is confirmation that we should investigate further.
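
The "every 28th attempt" figure can be checked with a few lines of Python. This is only a sketch: it assumes the pass criterion is at least 7 correct out of 8 (the lowest 8-trial score whose guessing probability is under 5%), and the function name is made up.

```python
from fractions import Fraction
from math import comb

def chance_of_passing(trials, min_correct):
    """Exact probability of scoring at least min_correct out of trials
    by coin-flip guessing (each trial right with probability 1/2)."""
    hits = sum(comb(trials, k) for k in range(min_correct, trials + 1))
    return Fraction(hits, 2 ** trials)

p = chance_of_passing(8, 7)
print(p, float(1 / p))  # probability of a lucky pass, and attempts per lucky pass
```

Under that assumption the chance is 9/256, a bit more than one attempt in 28. A perfect 8 of 8 by guessing is rarer, at 1/256.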

Now if the people who produced positive logs behave in a demonstrably dishonest way, have a vested interest in producing positive results ... or for example evade the simple question of what difference they actually heard, or evade any further attempts to investigate what could have gone wrong, then of course any sane person will reject the results.

If a dowser told you he did a dowsing test in his backyard, and showed you a log with "positive results that prove that dowsing works", would you do anything other than double facepalming?
What would you think if the dowser then told you that you should also do the test, but if you fail there is never the possibility that dowsing doesn't work but that you just haven't trained hard enough?
What would you think if the dowser then told you that the rod he used doesn't even qualify for locating water, but his superior training allowed him to pass the test perfectly except for one trial where his dog disturbed him (*points at the failed trial in the log*)?
...
I-n-s-a-n-i-t-y.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-06 16:05:44
I see the thread continues to have a nasty, unpleasant & ad hominem flavour to it - far from the objective, scientific image that HA seems to want to cultivate.


Targeting the people that you disagree with as being intellectually dishonest, favoring religion over logic, etc. would seem to have a nasty, unpleasant & ad hominem flavour to it, wouldn't it?

But why should I put words into your mouth, John?

We on HA are quoted by you as saying:

Link to John Kenny's Libel of HA discussions (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive-18.html#post31435625)

" ...false negatives are of no importance in blind tests (most don't even admit to them or know what a false negative is), only false positives, as a consequence no internal controls are necessary to determine the test's specificity/sensitivity. " & here's the really great one: Once you have done a few trials & heard no difference, there's no need to do any more listening, just select randomly for all subsequent trials  Isn't that great? "

It is all your intentional lies and distortions of the truth, isn't it John?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-06 23:01:32
I'll try to summarise my position succinctly using some comparisons:

- Sighted tests, we all know, are prone to false positive results i.e. a listener hearing a difference when there isn't one ACTUALLY present - they are biased towards delivering positive results. AFAIK, we don't know the degree to which sighted tests are thus biased although we do know the psychological factors that give rise to this skew towards false positives. Also, we cannot use the fact that there are many times when no discernible difference is found in sighted listening to deny that the test itself is prone to false positives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

- ABX testing, eliminates many of the biases towards false positives but as a result introduces a greater risk of false negatives i.e. a listener not hearing differences when they are actually present - they are biased towards delivering negative results. AFAIK, we don't know the degree to which ABX tests are thus biased although we do know some of the factors that give rise to this skew towards false negatives. Also, we cannot use the fact that there are many times when a discernible difference is found in ABX listening to then deny that the test itself is prone to false negatives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

There are some conflicting statements made here about ABX null results. 1) They are of no consequence, disregarded, etc. 2) They are used as a body of evidence to show that the strong indication is that there is no audible difference between X & Y 3) They are used to "prove" the error of a positive sighted test. Now, I don't believe that it makes sense to try to use a test (ABX) which is biased towards false negatives (to an unknown extent) to show that a positive sighted test result is wrong. If we had some way of determining the sensitivity of the actual ABX test run itself, we would have a far better handle on this.

I believed this might be addressed by including a hidden control in the test itself. Whatever way it's done, I believe we need a way of examining how prone any given ABX test run is to false negatives. In other words I would like an internal control that showed how discriminating the listener & the test conditions were to revealing a small impairment. I would hope to be able to eliminate listeners who weren't listening (i.e. didn't actually take the test) - in either sighted or ABX listening.

Anybody who was deaf or had hearing impairment would, quite correctly be excluded from these listening tests because their results would be meaningless. Similarly, I would suggest that there are many other conditions that should exclude a listener's results - these conditions may not be revealed unless some internal control is used. Pre-training & pre-testing are a start towards this - hopefully eliminating unsuitable listeners - but it's only a start - it doesn't show if someone has lost focus & stopped listening during ABX testing. If they have lost focus & stopped listening, surely you would want to be aware of this?
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 01:09:40
- Sighted tests, we all know, are prone to false positive results i.e. a listener hearing a difference when there isn't one ACTUALLY present - they are biased towards delivering positive results. AFAIK, we don't know the degree to which sighted tests are thus biased although we do know the psychological factors that give rise to this skew towards false positives. Also, we cannot use the fact that there are many times when no discernible difference is found in sighted listening to deny that the test itself is prone to false positives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

And it is exactly that bias that rules out sighted tests as a way to determine whether there really is a difference, i.e. whether the claims made are true.

We do know many psychological factors, see /wiki/List_of_cognitive_biases (http://en.wikipedia.org/wiki/List_of_cognitive_biases). There can be many different biases at play based on your beliefs, your memories, your expectations, your peers, ... that can influence your perception.


- ABX testing, eliminates many of the biases towards false positives but as a result introduces a greater risk of false negatives i.e. a listener not hearing differences when they are actually present - they are biased towards delivering negative results. AFAIK, we don't know the degree to which ABX tests are thus biased although we do know some of the factors that give rise to this skew towards false negatives. Also, we cannot use the fact that there are many times when a discernible difference is found in ABX listening to then deny that the test itself is prone to false negatives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

This is pretty irrelevant in the case where someone claims to hear a difference (sighted) and is then asked to produce the evidence. That person will want to show that their claim is true and therefore will not be biased towards delivering a negative result. There is however the problem of cheating.

In more formal tests there is usually a careful selection of expert listeners; these listeners receive training to familiarize themselves with the test procedure; only the listeners' results that meet certain pre-defined rules will be analyzed; low anchors are included; and so on ...
For example in the multiformat tests done on this forum which I pointed you to 2 months ago, ABX can be used as a tool to train yourself to hear fine differences and make sure you are not fooling yourself. Failing to distinguish the originals from the impaired files will disqualify you from the test.


There are some conflicting statements made here about ABX null results. 1) They are of no consequence, disregarded, etc. 2) They are used as a body of evidence to show that the strong indication is that there is no audible difference between X & Y 3) They are used to "prove" the error of a positive sighted test. Now, I don't believe that it makes sense to try to use a test (ABX) which is biased towards false negatives (to an unknown extent) to show that a positive sighted test result is wrong. If we had some way of determining the sensitivity of the actual ABX test run itself, we would have a far better handle on this.

It depends on the test, which should be obvious if you apply just a tiny bit of brain power.

In a public test where everyone can download the files, it can be the case that not a single negative log is posted even though hundreds of people who did their best failed. A cheater could have posted a positive log without ever listening to the files. In such a test, positive logs tell us that there might indeed be something to the claim and that we need to investigate further. Plausibility of the claims, detailed descriptions of the heard differences and honest behavior are key.
If even self-proclaimed experts (that usually make claims similar to the tested one) participate in the test and cannot produce true positive results, then we can reject the claim in good conscience. If failure is constant, then it is not unreasonable to assume that the differences are generally too small for humans to detect.

In the case of a single person making a claim, as mentioned above, if the person cannot produce the evidence then we can reject the claim. For positive results the same remarks as above apply (plausibility ... honesty).

We don't need to show that a sighted test result is wrong. The sighted test result is the claim that needs to be supported with evidence.

All of this is really rather basic logical thinking, statistics, science ...


I believed this might be addressed by including a hidden control in the test itself. Whatever way it's done, I believe we need a way of examining how prone any given ABX test run is to false negatives. In other words I would like an internal control that showed how discriminating the listener & the test conditions were to revealing a small impairment. I would hope to be able to eliminate listeners who weren't listening (i.e. didn't actually take the test) - in either sighted or ABX listening.

BS-1116 states: "The “double-blind triple-stimulus with hidden reference” method has been found to be especially sensitive, stable and to permit accurate detection of small impairments."
Foobar2000's ABX component does allow for this kind of testing.

Your idea of introducing a more audibly impaired audio file at a random trial doesn't really help, as I've told you already 2 months ago. I can still randomly push buttons except for when I hear this special "control". It would give you the false impression that I listened, which I actually did not.


Anybody who was deaf or had hearing impairment would, quite correctly be excluded from these listening tests because their results would be meaningless. Similarly, I would suggest that there are many other conditions that should exclude a listener's results - these conditions may not be revealed unless some internal control is used. Pre-training & pre-testing are a start towards this - hopefully eliminating unsuitable listeners - but it's only a start - it doesn't show if someone has lost focus & stopped listening during ABX testing. If they have lost focus & stopped listening, surely you would want to be aware of this?

We're back to basic logic, hypothesis testing and the type of test.

No, I don't really care if someone who makes a claim is too tired to provide the evidence, and in more formal tests we have controls in place to detect this anyway.


Again, ABX is just a tool. You seem to think that it is this jack of all trades magical thing that does everything for you. Well, it isn't.


PS: I hope there are no grave mistakes, I'm tired.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 02:39:15
I'll try to summarise my position succinctly using some comparisons:

- Sighted tests,  I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias

Yeah, it's called blinding. You know, like ABX, etc.
Yet on irrational forums, you insist on long term peeking daydreams as ideal.

- ABX testing, eliminates many of the biases towards false positives but as a result introduces a greater risk of false negatives i.e. a listener not hearing differences when they are actually present - they are biased towards delivering negative results.

As evidenced by....? That's pure wishful thinking fallacy without showing data with those "missed" positives...which you can't.

I believed this might be addressed by including a hidden control in the test itself.

Great, let's see the one you used in your blind DAC test, where you claim to have heard some particular identifying "sounds".

I would suggest that there are many other conditions that should exclude a listener's results

Yes, strong pecuniary interests, computer software proficiency coupled with a sordid history of outright fabrications come to mind. Pretty much demands there be some proctoring for results to be taken seriously.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 03:44:59
Sighted tests, we all know, are prone to false positive results i.e. a listener hearing a difference when there isn't one ACTUALLY present - they are biased towards delivering positive results.


You compound your incredible track record of vastly overstating the challenges of DBTs for profit by glossing over the fact that sighted tests do little but generate false positives when differences are small. As I said in my last post on this topic, sighted tests are very useful when there are relatively large differences in accordance with our current understanding of psychoacoustics. However, when the differences are on the subtle side, sighted evaluations can be counted on to reliably create false impressions of large audible differences.

Quote
AFAIK, we don't know the degree to which sighted tests are thus biased


At this point everybody who isn't naive or doesn't have a financial stake knows that sighted evaluations involving subtle differences are so biased towards false positives that they can't even be properly called tests.

Quote
although we do know the psychological factors that give rise to this skew towards false positives.


Not only psychological but sociological, perceptual and technical factors skew sighted evaluations towards an avalanche of false positives when psychoacoustics says "no audible differences".

Quote
Also, we cannot use the fact that there are many times when no discernible difference is found in sighted listening to deny that the test itself is prone to false positives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.


Based on the context of your comments on HA, we know that your idea of "some controls" is overwhelmingly biased against everything that has been done so far to develop these controls, no matter how widely they are accepted. We have more than "some controls" already. We have exactly the controls we need to do sensitive, reliable listening tests that return results that agree with other relevant science such as psychoacoustics.

Quote
ABX testing, eliminates many of the biases towards false positives but as a result introduces a greater risk of false negatives i.e. a listener not hearing differences when they are actually present - they are biased towards delivering negative results.


This comment needs to be understood in light of the fact that you have repeatedly dismissed the reliable means we have to examine the incidence of false negatives.  I don't know whether those means are beyond your knowledge or you know about them but the lure of the almighty dollar/pound/euro has blinded you to their application.

Quote
AFAIK, we don't know the degree to which ABX tests are thus biased


More properly stated, John Kenny has this religious belief that there are vastly more false negatives than actually exist because ABX tests don't work well for selling what he wants to sell.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 05:44:21
And just to finish off my thoughts on this:
1) It's difficult to get an idea of how much sighted listening exaggerates differences & how much blind listening suppresses differences. The best handle I can find on this is the graph posted in Sean Olive's blog here (http://seanolive.blogspot.ie/2009/04/dishonesty-of-sighted-audio-product.html)
(http://4.bp.blogspot.com/_w5OVFV2Gsos/Sd5kUGjjhwI/AAAAAAAAAHw/j8vMfgoCNPw/s1600/BlindVsSightedMeanLoudspeakerRatings.png)
What we see in this graph are, in sighted listening some fairly distinctive preference differences shown among the speakers.
In blind listening these preferences all become squashed into one band & if you look - the error bars plotted for each point all pretty much overlap with one another -  signifying even less surety that there are actual differences being discerned in blind listening.
So what we are seeing in these graphs is, for speakers, which are generally agreed to have the greatest audible differences among the devices in the playback chain, that these differences all but collapse in blind listening. This doesn't give me much faith in the usefulness of this test to differentiate smaller differences than are found in speakers especially when we change the test to a forced choice test such as ABX

2) Because ABX testing relies on statistical analysis to interpret the results you need to be careful. You have to be clear what are valid tests & what aren't - you can't just bundle all null results into the statistical pool of results & then statistically analyse the outcome. This sort of behaviour will almost always guarantee a statistical result of near to 50% i.e guess work as it waters down any positive ABX results by ALL null results both valid & invalid. There should be some sort of validation process that excludes such invalid results.

3) Having looked at the video linked (http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/) to by Arny, a number of things stand out. Firstly, this guy is using his knowledge of microphone technology to hone in on where his experience tells him there might be an audible difference between the two microphones. So this is an expert listener & a very experienced expert listener. Having identified a part of the audio sample that he is confident he can hear the difference in, he then goes into ABX testing. Notice how he seldom uses just memory alone but often uses A/B instant switching to discern X (not what was contended earlier in this thread where memory was stated to be mainly used & A/B instant switching used as training). Even given the experience & expertise of this guy with these microphones, he scores 80% in the test (not 100%). Given a pool of other listeners taking the same ABX test who don't have his expertise & I can bet you that the accumulated statistical result will be far closer to 50% i.e guesswork. Just as a matter of interest if this figure was 65% instead of 80% what would your conclusion be?
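
The dilution point in 2) and the 65% question in 3) can both be put into rough numbers. This is only a sketch with an entirely hypothetical pool (one listener who really scores 80% and nine who guess, all running the same number of trials) and two hypothetical run lengths; none of these figures come from a real test, `guess_tail` is an illustrative helper name, and a plain two-choice ABX with a 50% guessing rate is assumed.

```python
from math import comb

def guess_tail(n, k, p=0.5):
    """Chance of k or more correct answers in n trials when only guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical pool: one listener at 80% and nine guessers at 50%,
# all contributing the same number of trials.
hit_rates = [0.80] + [0.50] * 9
pooled = sum(hit_rates) / len(hit_rates)
print(f"expected pooled hit rate: {pooled:.0%}")

# The same 65% score at two hypothetical run lengths.
p_20 = guess_tail(20, 13)
p_100 = guess_tail(100, 65)
print(f"13/20 by guessing: {p_20:.3f}")
print(f"65/100 by guessing: {p_100:.4f}")
```

With these made-up numbers the pooled rate lands close to chance even though one listener is clearly above it, and what a 65% score says depends heavily on how many trials sit behind it.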
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 09:33:58
What we see in this graph are, in sighted listening some fairly distinctive preference differences shown among the speakers.
Which shows how biased these tests are.


In blind listening these preferences all become squashed into one band & if you look - the error bars plotted for each point all pretty much overlap with one another -  signifying even less surety that there are actual differences being discerned in blind listening.
Your attempt to spin this by taking a single image and interpreting it however you seem to please doesn't work:
Quote
The psychological biases in the sighted tests were sufficiently strong that listeners were largely unresponsive to real changes in sound quality caused by acoustical interactions between the loudspeaker, its position in the room, and the program material. In other words, if you want to obtain an accurate and reliable measure of how the audio product truly sounds, the listening test must be done blind.
The sighted test is very insensitive to all kinds of what should be clearly audible differences and is therefore useless.


So what we are seeing in these graphs is, for speakers, which are generally agreed to have the greatest audible differences among the devices in the playback chain, that these differences all but collapse in blind listening. This doesn't give me much faith in the usefulness of this test to differentiate smaller differences than are found in speakers especially when we change the test to a forced choice test such as ABX
Complete and utter BS.
(http://3.bp.blogspot.com/_w5OVFV2Gsos/Sd5kUoW0LyI/AAAAAAAAAH4/94hRtdM6Ilw/s1600/BlindVsSightedPositionInteractions.png)

The sighted test is so biased that it completely masks the differences in sound quality. Besides, you still seem to think that the sighted test sets the ruler for sound quality - it DOES NOT, in fact it demonstrably fails. That's the whole point of blind testing.


2) Because ABX testing relies on statistical analysis to interpret the results you need to be careful. You have to be clear what are valid tests & what aren't - you can't just bundle all null results into the statistical pool of results & then statistically analyse the outcome. This sort of behaviour will almost always guarantee a statistical result of near to 50% i.e guess work as it waters down any positive ABX results by ALL null results both valid & invalid. There should be some sort of validation process that excludes such invalid results.
This false generalization has already been dealt with.


3) Having looked at the video linked (http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/) to by Arny, a number of things stand out. Firstly, this guy is using his knowledge of microphone technology to hone in on where his experience tells him there might be an audible difference between the two microphones. So this is an expert listener & a very experienced expert listener. Having identified a part of the audio sample that he is confident he can hear the difference in, he then goes into ABX testing. Notice how he seldom uses just memory alone but often uses A/B instant switching to discern X (not what was contended earlier in this thread where memory was stated to be mainly used & A/B instant switching used as training). Even given the experience & expertise of this guy with these microphones, he scores 80% in the test (not 100%). Given a pool of other listeners taking the same ABX test who don't have his expertise & I can bet you that the accumulated statistical result will be far closer to 50% i.e guesswork. Just as a matter of interest if this figure was 65% instead of 80% what would your conclusion be?
Straw man again, and again a demonstration that you have no clue about statistics. Like none.

Not gonna bother with the straw man, just the statistics:
X ~ B(n, p)
with n = 5 trials
p = 0.25 since we have 4 different files

P(X >= 4) = 1.562% which is statistically significant
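
For anyone who wants to check that figure, here is a small sketch using exact fractions. It takes n = 5 and p = 0.25 from the lines above; the variable names are just illustrative.

```python
from fractions import Fraction
from math import comb

n = 5                 # trials
p = Fraction(1, 4)    # chance of a correct guess with 4 candidate files

# P(X >= 4) = P(X = 4) + P(X = 5) for X ~ B(n, p)
tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in (4, 5))

print(tail)                  # exact fraction
print(f"{float(tail):.4%}")  # as a percentage
```

The exact value is 1/64, i.e. 1.5625%.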


edit: Besides, it is imho not hard to hear audible differences between the samples in question
Code:
foo_abx 2.0 report
foobar2000 v1.3.7
2015-04-07 10:50:52

File A: s.wav
SHA1: d44049cfc06e5ed659deac2f5dbf9405f2b98ae9
File B: t.wav
SHA1: 3ba262f47f0f7c3c605ab0803557cbd33da8a06b

Output:
DS : Primärer Soundtreiber
Crossfading: NO

10:50:52 : Test started.
10:51:14 : 01/01
10:51:23 : 02/02
10:51:29 : 03/03
10:51:37 : 04/04
10:51:45 : 05/05
10:51:52 : 06/06
10:51:58 : 07/07
10:52:08 : 08/08
10:52:08 : Test finished.

 ----------
Total: 8/8
Probability that you were guessing: 0.4%

 -- signature --
a5990f1e97a31a51dbb553d291702ffac735551d

This was done quickly on what is possibly one of the worst systems: a laptop with onboard audio and old ~$10 earbuds.
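
The "probability that you were guessing" line in such a report can be recomputed from the total. This is a rough sketch, assuming the reported figure is the one-sided binomial tail with a 50% guessing rate (that is my reading of the number, not something the report itself states); the regex and variable names are illustrative.

```python
import re
from math import comb

# Excerpt from the report above; only the "Total" line is needed.
report = """\
10:52:08 : 08/08
10:52:08 : Test finished.

Total: 8/8
"""

match = re.search(r"Total:\s*(\d+)/(\d+)", report)
correct, trials = int(match.group(1)), int(match.group(2))

# Chance of at least `correct` right answers out of `trials` by coin-flipping.
p_guess = sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials
print(f"{correct}/{trials}: probability of guessing {p_guess:.1%}")
```

For 8/8 this gives 1/256, which rounds to the 0.4% shown in the log.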
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-07 10:34:23
- Sighted tests, we all know, are prone to false positive results i.e. a listener hearing a difference when there isn't one ACTUALLY present - they are biased towards delivering positive results. AFAIK, we don't know the degree to which sighted tests are thus biased although we do know the psychological factors that give rise to this skew towards false positives. Also, we cannot use the fact that there are many times when no discernible difference is found in sighted listening to deny that the test itself is prone to false positives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

Those controls are called blind tests. They have repeatedly shown that non-blind testing can't be relied upon. You show one example yourself below with Sean Olive's result, where the sighted test yielded results that were quite different from the blind test. This is confirmed by "fake" tests, where no real difference exists, but sighted tests with their bias erroneously show a difference. There is a mountain of evidence, far more universal than just in audio alone, that confirms this. There really is no point trying to ignore all this and starting from scratch again. Sighted tests are a waste of time, unless they are being used to train for a blind test.

You also repeatedly try to make it look as if the blind test were entirely different from the sighted test. I believe that you are deliberately distorting this, in order to pit them against each other. Ideally, the only difference is that the blind test eliminates the cues that any bias would need in order to distort the results. All the other factors are identical as far as possible.

Quote
- ABX testing, eliminates many of the biases towards false positives but as a result introduces a greater risk of false negatives i.e. a listener not hearing differences when they are actually present - they are biased towards delivering negative results. AFAIK, we don't know the degree to which ABX tests are thus biased although we do know some of the factors that give rise to this skew towards false negatives. Also, we cannot use the fact that there are many times when a discernible difference is found in ABX listening to then deny that the test itself is prone to false negatives. I would be in favour of having some controls in these tests to get a handle on the level & influence of this bias.

If we are talking about a test with which you want to find the best performers, in order to establish the limits of audibility, there is no such thing as a false negative, because a negative result is the expected default outcome anyway. You are consistently trying to ignore that, too, despite repeated reminders, and try to suggest a symmetry between false positives and false negatives that is wholly inappropriate for the kind of test you are referring to. We are talking about the formal rules of the game here, not some non-formal impressions.

It is perhaps most easily demonstrated with an analogy in sports: If you are trying to find the limits of speed in 100m sprint, there are no false negatives, either. You either break the record or not. If not, it is a missed opportunity, nothing more. It doesn't matter whether you missed it due to a "genuine" lack of performance, or due to some handicap you can't be blamed for. You wouldn't even be able to tell those apart reliably. If you do break the record, people will rightly want to be sure that it wasn't a false positive, however. So there are quite a few rules and regulations and procedures built into the contest to ensure that false positives get detected and excluded as well as possible. The rules only counteract false negatives insofar as to ensure that no one is being given an unfair advantage over the other runners. Still, as simple as the problem might seem, the rules are quite elaborate, covering start and timing procedure, track condition, wind and weather condition, clothing, doping control, and more. This is a result of the amount of money and other interests involved, and the corresponding incentive to cheat.

In such a situation, the current record stands as long as it is not broken in a valid official run. This means that the default outcome and the default assumption is that the current record is and remains the best performance. The null result is the expected result, but a single positive result can beat it.

Now, it is true that there are pre-screening tests and an elaborate system of qualifying, which is there to ensure that only the best make it into the finals. You may say that this is a system to control false negatives, but it isn't. It is a system that makes the contest more interesting, and also more efficient, because it reserves the tightest controls and the greatest attention for those who are most likely to break the record. If the record were broken in a run where the other runners finished more than 5 seconds behind, it would still count.

In all this, the null result is formally assumed to be the default outcome, while breaking the record is assumed to be a possibility. This formal position notwithstanding, it is still quite reasonable to assume that 7 seconds will not be possible for a 100m run, and to doubt anybody's claim to have achieved that. This is not bias, because it doesn't have an influence on the actual test.

Is the analogy with the 100m run appropriate? Yes, because you are after the best performers, and the limits of perception. You are not after the abilities of the average man, this is very clear. You would immediately accept one single valid positive result as the new benchmark, against any number of valid negative results. As much as you are advocating symmetry between negative and positive validity, I'm very sure that you would instantly abandon that symmetry, once you have a valid positive result. That's a very basic dishonesty in your stance.

Quote
There are some conflicting statements made here about ABX null results. 1) They are of no consequence, disregarded, etc. 2) They are used as a body of evidence to show that the strong indication is that there is no audible difference between X & Y 3) They are used to "prove" the error of a positive sighted test. Now, I don't believe that it makes sense to try to use a test (ABX) which is biased towards false negatives (to an unknown extent) to show that a positive sighted test result is wrong. If we had some way of determining the sensitivity of the actual ABX test run itself, we would have a far better handle on this.

You are postulating a conflict that doesn't exist. See above. That there have been no successful runs until now which completed the 100m in 7 seconds, is of course a strong indication that 7 seconds can't be done. It would be foolish to say otherwise. As long as the runs which are done fail to achieve that, they are of course contributing to this indication. Yet they are of no particular consequence, because that was the default assumption anyway.

If anybody would indeed complete the 100m in 7 seconds, the question would of course arise whether it was a false positive. The equivalent of a sighted listening test here would be a run which lacked adequate controls. It is completely clear that such a run would not be accepted as having broken the record, as everybody (except perhaps those with a vested interest) would assume that something must have been amiss (whatever it was may be impossible to determine). So of course the controlled test (i.e. ABX) is used to judge the non-controlled test (i.e. the sighted test). While it doesn't strictly prove that the positive was false, it provides way enough reason to dismiss the non-controlled test.

All of this is perfectly reasonable, it seems to me, yet people on the audiophile side manage to get this blatantly and consistently wrong.

Quote
I believed this might be addressed by including a hidden control in the test itself. Whatever way it's done, I believe we need a way of examining how prone any given ABX test run is to false negatives. In other words I would like an internal control that showed how discriminating the listener & the test conditions were to revealing a small impairment. I would hope to be able to eliminate listeners who weren't listening (i.e. didn't actually take the test) - in either sighted or ABX listening.

Before you start doing this, you would have to provide a workable definition of what a false negative is in such a situation. The examples you have given so far are so trivial that they are unhelpful. Of course you want to exclude the deaf, that's the trivial bit, but if you are thinking about screening the listeners beforehand, you need to establish the exact criteria for admission, and the way to test them. You could test for ability to discriminate small level differences, for example, but who says that this is a relevant measure of suitability for the actual listening test? You could be accused of eliminating the wrong people. My experience tells me that this would almost certainly happen in the event that your test ends unsuccessfully.

In general, pre-screening requires that you have prior knowledge about what kind of effect you are trying to detect in your listening test. If you don't know what exactly it is that people claim to hear, you don't have that prior knowledge, and you can't devise suitable screening criteria.

You are working yourself into an impossibility. My impression is that this is not unwelcome to you, as you can use it to raise the bar for acceptable ABX tests so high as to make them practically impossible, hoping that this will effectively rehabilitate the sighted tests as an admittedly imperfect but acceptable alternative. That's a well known ruse. Because it is so well known, it only works on those who need very little convincing anyway.

Quote
What we see in this graph are, in sighted listening some fairly distinctive preference differences shown among the speakers.
In blind listening these preferences all become squashed into one band & if you look - the error bars plotted for each point all pretty much overlap with one another - signifying even less surety that there are actual differences being discerned in blind listening.
So what we are seeing in these graphs is, for speakers, which are generally agreed to have the greatest audible differences among the devices in the playback chain, that these differences all but collapse in blind listening. This doesn't give me much faith in the usefulness of this test to differentiate smaller differences than are found in speakers especially when we change the test to a forced choice test such as ABX

You are misreading the plot entirely. It is by no means an indication that the listeners weren't able to distinguish the speakers from each other. It is an indication of their relative performance level. It is entirely plausible that two speakers get similar performance ratings even when they are clearly distinguishable from each other.

Quote
2) Because ABX testing relies on statistical analysis to interpret the results you need to be careful. You have to be clear what are valid tests & what aren't - you can't just bundle all null results into the statistical pool of results & then statistically analyse the outcome. This sort of behaviour will almost always guarantee a statistical result of near to 50% i.e guess work as it waters down any positive ABX results by ALL null results both valid & invalid. There should be some sort of validation process that excludes such invalid results.

The most blatant errors I have seen committed in audio blind tests were all committed by audiophiles. You are right in recommending care here, but your example is completely made up, and the ones most in need of that advice are your own peers, and indeed yourself.

Quote
3) Having looked at the video linked to by Arny, a number of things stand out. Firstly, this guy is using his knowledge of microphone technology to hone in on where his experience tells him there might be an audible difference between the two microphones. So this is an expert listener & a very experienced expert listener. Having identified a part of the audio sample that he is confident he can hear the difference in, he then goes into ABX testing. Notice how he seldom uses just memory alone but often uses A/B instant switching to discern X (not what was contended earlier in this thread where memory was stated to be mainly used & A/B instant switching used as training). Even given the experience & expertise of this guy with these microphones, he scores 80% in the test (not 100%). Given a pool of other listeners taking the same ABX test who don't have his expertise & I can bet you that the accumulated statistical result will be far closer to 50% i.e guesswork. Just as a matter of interest if this figure was 65% instead of 80% what would your conclusion be?

I think this paragraph shows very well how extremely biased (hell-bent) and at the same time clueless you are. You had more than enough time to get at least some basic understanding of what you are talking about now, still you are able to pose such questions. This is fierce ignorance, against all reality.

ABX testing is specifically trying to avoid reliance on memory as much as possible. Hence the immediate switching at the listener's control. If you find that remarkable, then that's remarkable by itself. I don't see where previously in the thread anything else was suggested. A link would be most appropriate here.

Of course experience in what exactly one is looking for is very welcome and helps the test. Nobody I'm aware of has argued against training here. You would assume, however, that people who claim that they can hear a certain difference, are in fact trained already. On what ground would they otherwise make the claim? Training is used when the test designer has an idea what to search for, but the listeners can't be expected to be alert to that. If the listener is assumed to know what the difference is, but the test designer does not, what do you train for and how?

And the last one: 65% are of course a null result! It is the confidence level that matters here, and the usual required confidence level is 95% (i.e. the probability of success by pure guessing is less than 5%). With 10 trials, you need an 80% score to reach that confidence level, everything below that is a null result. That this is even a question shows me that you are still clueless regarding even the statistical basics of such tests, even more hollow must ring your attempt to give recommendations.

I leave it to you as homework, to work out how many trials would be needed for a 65% score to reach a 95% confidence level. That might give you a basic feeling for the relationships between confidence level, score and trial count.
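
For reference, the relationship between score, trial count and confidence level can be worked out with an exact one-sided binomial test. What follows is a minimal sketch, not taken from any ABX tool: the function names are made up for illustration, and it assumes that pure guessing means a 50% chance of being right on each trial. Under that assumption 8 of 10 correct has a guessing probability of about 0.055, which sits right at the 5% line rather than strictly below it, while 9 of 10 comes out at about 0.011.

```python
from math import comb
from typing import Optional


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def min_correct(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest number of correct answers whose guessing probability is below alpha.

    Returns None when even a perfect score cannot get below alpha.
    """
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) < alpha:
            return correct
    return None


def min_trials_for_score(score_percent: int, alpha: float = 0.05,
                         max_trials: int = 1000) -> Optional[int]:
    """Smallest trial count at which a score of at most `score_percent` percent is significant."""
    for trials in range(1, max_trials + 1):
        needed = min_correct(trials, alpha)
        # Integer arithmetic avoids rounding trouble when comparing percentages.
        if needed is not None and needed * 100 <= score_percent * trials:
            return trials
    return None


if __name__ == "__main__":
    for n in (10, 16, 20):
        print(n, "trials need", min_correct(n), "correct")
    print("trials needed for a 65% score:", min_trials_for_score(65))
```

Calling min_trials_for_score(65) answers the homework question under these assumptions. A different threshold or a two-sided test would shift the numbers.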

Seriously: Sit down and learn the basics at least. Try to suspend your extreme anti-ABX bias for at least some time until you know what it is all about. In your current state you are neither enlightening nor funny, and certainly completely unconvincing for everybody who has at least half a clue.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 10:35:18
And just to finish off my thoughts on this:
1) It's difficult to get an idea of how much sighted listening exaggerates differences & how much blind listening suppresses differences. The best handle I can find on this is the graph posted in Sean Olive's blog here (http://seanolive.blogspot.ie/2009/04/dishonesty-of-sighted-audio-product.html)
(http://4.bp.blogspot.com/_w5OVFV2Gsos/Sd5kUGjjhwI/AAAAAAAAAHw/j8vMfgoCNPw/s1600/BlindVsSightedMeanLoudspeakerRatings.png)
What we see in this graph are, in sighted listening some fairly distinctive preference differences shown among the speakers.


Straw man argument.

What we see in the reference cited above is a discussion of loudspeakers not DACs or amplifiers or even perceptual coders.

It is well known that among loudspeakers the technical differences are clearly audible, in accordance with the findings of psychoacoustics over the past two or more decades.

The cited listening tests are not ABX tests, and the goal of those tests is not the reliable identification of whether the sonic alternatives sound different, but which are preferred by listeners.

In short the post is either deceitful or just plain badly informed. In either case, it is not to be taken seriously.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 10:58:47
And just to finish off my thoughts on this:


Mr. Kenny, by "Finish off" it appears that you mean kill, destroy, and abandon for utter destruction.

Quote
2) Because ABX testing relies on statistical analysis to interpret the results you need to be careful. You have to be clear what are valid tests & what aren't -


Of course, Mr. Kenny. Please don't presume that everybody is as unfamiliar with statistics and experimental design as you have proven yourself to be, and continue to prove it again:

Quote
you can't just bundle all null results into the statistical pool of results & then statistically analyse the outcome.


Of course not, but this is not what happened. If there were a mixture of positive and negative results, as you would like to self-servingly and falsely pretend, and only the negative results were pooled, that would be an incorrect thing to do. What was really done was to pool the results of like experiments, whether the results were positive or negative.

The relevant AES paper is "10 Years of ABX Testing," David Clark, AES Preprint No.3167 K-1, October 1991
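
To illustrate the difference between pooling like experiments regardless of outcome and pooling only the nulls, here is a minimal sketch. The session tallies are invented for the example and are not taken from the Clark preprint or any other test; the function names are likewise hypothetical, and guessing is assumed to be a 50% chance per trial.

```python
from math import comb


def guessing_probability(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def pool(sessions):
    """Add up (correct, trials) tallies from sessions of the same experiment."""
    correct = sum(c for c, _ in sessions)
    trials = sum(t for _, t in sessions)
    return correct, trials


# Hypothetical (correct, trials) tallies for four runs of the same comparison.
sessions = [(12, 16), (7, 16), (9, 16), (14, 16)]

# Pooling every like session, whatever its individual outcome.
all_correct, all_trials = pool(sessions)

# Pooling only the sessions that were nulls on their own (the incorrect practice).
nulls_only = [s for s in sessions if guessing_probability(*s) >= 0.05]
null_correct, null_trials = pool(nulls_only)

if __name__ == "__main__":
    print("all sessions:", all_correct, "/", all_trials,
          "p =", guessing_probability(all_correct, all_trials))
    print("nulls only:  ", null_correct, "/", null_trials,
          "p =", guessing_probability(null_correct, null_trials))
```

With these made-up numbers the full pool is well below the 5% guessing probability, while the nulls-only pool is not, which is why selecting on outcome before pooling is the thing to avoid.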

Quote
3) Having looked at the video linked (http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/) to by Arny, a number of things stand out. Firstly, this guy is using his knowledge of microphone technology to hone in on where his experience tells him there might be an audible difference between the two microphones. So this is an expert listener & a very experienced expert listener. Having identified a part of the audio sample that he is confident he can hear the difference in, he then goes into ABX testing. Notice how he seldom uses just memory alone but often uses A/B instant switching to discern X (not what was contended earlier in this thread where memory was stated to be mainly used & A/B instant switching used as training). Even given the experience & expertise of this guy with these microphones, he scores 80% in the test (not 100%). Given a pool of other listeners taking the same ABX test who don't have his expertise & I can bet you that the accumulated statistical result will be far closer to 50% i.e guesswork. Just as a matter of interest if this figure was 65% instead of 80% what would your conclusion be?


Interesting that even John Kenny doesn't believe Amir's claims that the percentage right doesn't matter. Kenny based the entire argument above on percentage right, which is not a correct analysis. No matter which error one makes, whether the error demonstrated above of only looking at percentage right, or Amir's error of being obsessed with statistical confidence, reality is that both are important.

Kenny further ruins the credibility of his post by basing his conclusions on his speculations on what might have happened if...  Way to finish yourself off, John!
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 12:36:53
Glad to answer some of the less ad-hom questions or address some points raised:

- "there is no such thing as a false negative," This is something I was accused of falsely claiming you guys stated & yet it occurs again. Later you ask me for non-trivial examples of what a false negative is - well, I consider not listening during the test to be the source of many false negatives. As I've said before this could be due to fatigue, loss of focus or the reason given in the other closed thread "just use random guessing because life's too short" - I'm sure there are other possibilities I'm not thinking of but essentially it boils down to the test not being taken. As in the example of the 100m race - it's the equivalent of someone dropping out at 20 yards or someone walking the 100m or someone stating at the start that they cannot run it in 7 secs & not bothering - all these, if they were recorded in the pool of negative results instead of being eliminated would greatly skew the test results. Failures to ACTUALLY listen in an ABX trial are not revealed by the test - the result is just recorded.

- I know the confidence level is the important factor in statistics, but from what was stated here about the 65% in the Stuart AES paper I wondered if you guys got it. It seemed you paid more attention to the % guesses as an argument that the impairment resulting from digital filters was so small as to be inconsequential. I know statistics are tricky & I disliked them at college but it seems that mistakes are often made.

- Yes you are probably correct that preference tests results should not be extrapolated to difference test results - after all a similar preference could still mean that they don't sound alike but are equally preferred - I'll grant you that.

- "ABX testing is specifically trying to avoid reliance on memory as much as possible. Hence the immediate switching at the listener's control. If you find that remarkable, then that's remarkable by itself. I don't see where previously in the thread anything else was suggested. A link would be most appropriate here."  In the exchanges between SAM & Arny we have a lot of statements by Arny that short term, echoic memory is not so important culminating in his statement to me when I posted
Quote
"The two common reasons given for using ABX testing that I've seen promulgated on audio forums are that a) memory is unreliable & therefore short-term echoic memory is the only reliable way to do A/B comparisons.

Quote
Simply not true. A false claim! What said that?"..........Using short term echoic memory is one of the things that people may use to identify sounds until they adequately learn how to identify sounds all by themselves."
 
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 12:55:32
Glad to answer some of the less ad-hom questions or address some points raised:

- "there is no such thing as a false negative," This is something I was accused of falsely claiming you guys stated & yet it occurs again.


I'll take the absence of an actual link to an actual quote of that statement as tacit admission that the above is either made up, or taken out of context.

Of course there are such things as false negatives, and I've spent about 30 years of  my life trying to minimize them. The results of my personal efforts were mixed, but lots of other people have done good work in the area. Take it all together and we may even say that it is largely a solved problem. Not that even more reduction of false negatives is impossible, but that the problem is well above 90% solved.

Quote
Later you ask me for non-trivial examples of what a false negative is - well, I consider not listening during the test to be the source of many false negatives.


Illusions of omniscience and lightly veiled ad hominem  noted.

The question that no golden ear seems to want to answer is: Why go through the work of setting up and running a DBT and then not listen?  Reality is that the golden ears generally have no idea what is involved with doing a proper DBT because they avoid doing them. This is probably because doing proper DBTs is either so much more work than a casual sighted evaluation (which is all they seem to be able to muster), or they are incapable of doing the necessary technical work to do a DBT, no matter how much we simplify it.

In fact John you have no idea what goes on in most DBTs. Neither do I a priori because I'm easily as omniscience-challenged as you. But, I do see the evidence of other people's DBTs, the FB2K ABX logs, the samples, and other evidence.  Looks like some serious work went into most of it. Of course I know this well because I've done the same work. You haven't. What gives you the right to make such blanket statements without evidence to back it up, John?

Quote
As I've said before this could be due to fatigue, loss of focus or the reason given in the other closed thread "just use random guessing because life's too short"


I'll take the absence of an actual link to an actual quote of that statement as tacit admission that the above is either made up, or taken out of context.

Quote
- "ABX testing is specifically trying to avoid reliance on memory as much as possible. Hence the immediate switching at the listener's control. If you find that remarkable, then that's remarkable by itself. I don't see where previously in the thread anything else was suggested. A link would be most appropriate here."  In the exchanges between SAM & Arny we have a lot of statements by Arny that short term, echoic memory is not so important culminating in his statement to me when I posted
Quote
"The two common reasons given for using ABX testing that I've seen promulgated on audio forums are that a) memory is unreliable & therefore short-term echoic memory is the only reliable way to do A/B comparisons.
Quote
Simply not true. A false claim! What said that?"..........Using short term echoic memory is one of the things that people may use to identify sounds until they adequately learn how to identify sounds all by themselves."
 



So what? Yes, I said that. I see no discussion of what I said.

???????????
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 13:30:36
jkeny, you seem to define a false negative as any negative result where the participant maybe could have passed the test. You don't know that. We can turn this around and say that any positive result is a false positive if the person could have cheated.
So this gets us nowhere.

Again, if a self-proclaimed expert or group of such people makes a claim to hear a difference then it is automatically in their interest to run a blind test as conscientiously as possible. Not listening during the test is like shooting yourself in the foot if it is in your interest to provide evidence for your claim.

Flawed comparisons with flawed sighted tests, as you made above, don't help us either to establish if there truly is an audible difference for the participant.


"ABX testing is specifically trying to avoid reliance on memory as much as possible."

You are misunderstanding again (granted, pelmazo could have formulated that better). ABX doesn't force you to fast-switch, it enables you to do it. And typical ABX implementations don't force you to finish a trial within X minutes either.
So if you start your favourite ABX tool you should be able to make use of whatever types of memory you like. Obviously we need to make use of some memory if we compare files that are being played consecutively.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 13:46:11
JKeny has been offered free therapy (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-5.html#post2266298) that would address his questions, for almost 5 years, yet has continued to evade.
Clearly has no interest outside of biz$ness related ones.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 14:03:03
"If we are talking about a test with which you want to find the best performers, in order to establish the limits of audibility, there is no such thing as a false negative, because a negative result is the expected default outcome anyway. You are consistently trying to ignore that, too, despite repeated reminders, and try to suggest a symmetry between false positives and false negatives that is wholly inappropriate for the kind of test you are referring to. We are talking about the formal rules of the game here, not some non-formal impressions." From Pelmazo a couple of posts above your statement "I'll take the absence of an actual link to an actual quote of that statement as tacit admission that the above is either made up, or taken out of context."

- "Why go through the work of setting up and running a DBT and then not listen?" Arny, you keep jumping between DBTs & ABX - I'm specifically talking about ABX, right. You also fail to admit that it's not necessarily someone wilfully not listening (like you did in your posted ABX result) but rather a failure of the ABX system to pick up when people have lapsed in their attention, for instance.

-
Quote
"So you suggest that the correct way to do an ABX test is like Arny did - stop listening after 4 trials
Quote
Why not? He couldn't tell them apart. Why wasting lifetime better spent elsewhere?"
from here  (http://www.hydrogenaud.io/forums/index.php?showtopic=108127&view=findpost&p=889163)
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 14:22:13
From Pelmazo a couple of posts above your statement "I'll take the absence of an actual link to an actual quote of that statement as tacit admission that the above is either made up, or taken out of context."

Yes, you clearly took that out of context. Please learn how to use a web forum. If you quoted properly, a link to the original post would automatically appear. And quote at least the whole sentence if not the entire paragraph. You realize that quote-mining is considered a demonstration of dishonesty? Jesus, why is this explanation even required?!

Are you back at trolling? Or playing willfully obtuse like amir did? Well, at least you're not using italics and red/blue colors ...
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 14:34:25
- "So you suggest that the correct way to do an ABX test is like Arny did - stop listening after 4 trials
Why not? He couldn't tell them apart. Why wasting lifetime better spent elsewhere?" from here  (http://www.hydrogenaud.io/forums/index.php?showtopic=108127&view=findpost&p=889163)



Great job, John, of cherry picking data, character assassination, libel and ad hominem.

You are very fortunate that you don't get paid back in kind.

You are getting a pass, probably because of your lengthy track record for incorrigibility and long term demonstration of financially-induced learning disability. ;-)
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 15:05:43
jkeny, your objections already got completely destroyed and the flaws in your suggestions pointed out multiple times. As the kiddies would say: "u got rekt m8".

Why do you continue this nonsense? You have no evidence of it being a false positive. Arny even explicitly said on AVS that "my ears are as I have said not up to this sort of thing".
The whole thing was amir's stupid request of Arny to repeat the mp3 test to see for himself - and he did. If amir can hear differences with his shot hearing with in-ears that roll off the highs powered by laptop audio then great. Arny said he cannot, great. So what's the problem?

(Besides the obvious problem that amir was completely clueless about what a negative test result means for such a test - at least at that point in time .. but, well, given his learning resistance that probably still is the case. Same goes for you.)

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 15:07:02
The context doesn't change the meaning of the statement - "there is no such thing as a false negative". In other words all ABX trials are considered valid listening tests even Arny's null ABX results posted on AVS forum where he admitted that he wasn't listening in this ABX test, just randomly guessing. I'm sure there are many other occurrences of ABX trials in which the listener isn't listening but we have no controls or other way of revealing such instances. That is the sense of the above statement  "there is no such thing as a false negative" & the context doesn't change one whit the meaning of it.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 15:19:03
The context doesn't change the meaning of the statement - "there is no such thing as a false negative".


Of course it does, especially when what we seem to have above is a sentence fragment, not even a whole sentence.

Is this what happens to people who post too much on WBF? They turn into clones of Amir? ;-)

If so, I'm glad to have been banned from there via the back door!

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 15:19:32
"jkeny, you seem to define a false negative as any negative result where the participant maybe could have passed the test."
No, I deem a false negative as a trial where the listener was hampered in some way from actually doing the trial - it could be his pre-conceived belief that there is no difference to be found & he therefore hears no difference, it could be that he has lost focus & attention & therefore isn't actually listening attentively, it could be any number of things. The point is he has to choose an option because it is a forced choice test. This is counted as a valid choice when in fact it shouldn't be.

You already agree to eliminating certain results from the codec preference test that was linked to earlier. These were eliminated on the basis that the listener/setup was not discriminating enough to do the test. I fail to see why you don't concede that there are similar discrimination issues in the running of ABX tests but these results, which should be eliminated as invalid tests, are openly accepted as valid. The problem is that there is no way at the moment to differentiate valid from invalid ABX test & this won't happen unless some controls are included during the running of the ABX test
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 15:28:00
The context doesn't change the meaning of the statement - "there is no such thing as a false negative". In other words all ABX trials are considered valid listening tests even Arny's null ABX results posted on AVS forum where he admitted that he wasn't listening in this ABX test, just randomly guessing. I'm sure there are many other occurrences of ABX trials in which the listener isn't listening but we have no controls or other way of revealing such instances. That is the sense of the above statement  "there is no such thing as a false negative" & the context doesn't change one whit the meaning of it.

Completely false again. Maybe forums should require a reading comprehension pre-screening...

The context was that of the best performers competing, such as runners in a sprint that want to break the 100m record. In that analogy Arny is not a runner, he said he is not a good runner, but was still requested to participate in the sprint which he graciously did.

What do you take away from either a true negative or false negative anyway? That Arny couldn't hear a difference or didn't care enough. So what?
I believe I can successfully ABX a 256 kbps mp3 depending on the material (but even lower bitrate mp3's don't hinder me from enjoying the music), and I will try to provide the evidence to the best of my ability if challenged on this claim. Arny didn't claim that. He was merely skeptical about amir's logs, which is understandable given amir's demonstrable dishonesty and lack of credibility. Amir has only himself to thank that people are highly skeptical about what he posts.

This whole thing is no big deal at all. It's nothing. You are just blowing it up again because you're running out of the usual noise that you produce.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 15:29:46
"jkeny, you seem to define a false negative as any negative result where the participant maybe could have passed the test."
No, I deem a false negative as a trial where the listener was hampered in some way from actually doing the trial - it could be his pre-conceived belief that there is no difference to be found & he therefore hears no difference, it could be that he has lost focus & attention & therefore isn't actually listening attentively, it could be any number of things.


IOW Mr. Kenny "a false negative" is whatever you want it to be: " it could be any number of things."

Nice job of descending into Solipsism.

Quote
You already agree to eliminating certain results from the codec preference test that was linked to earlier. These were eliminated on the basis that the listener/setup was not discriminating enough to do the test.


It has been stipulated that false negatives can occur in just about any kind of listening test. There can be false negatives in sighted evaluations. So what?

Quote
I fail to see why you don't concede that there are similar discrimination issues in the running of ABX tests but these results, which should be eliminated as invalid tests, are openly accepted as valid.


There are potential discrimination issues in any listening test. Why beat a dead horse to death, then to the point of total maceration, then to the point of global distribution of the remains?



Quote
The problem is that there is no way at the moment to differentiate valid from invalid ABX test & this won't happen unless some controls are included during the running of the ABX test


Asked and answered any number of times in just the past few days.  There are many ways to differentiate valid from invalid listening tests of all kinds. For example all sighted evaluations relating to small audible differences are invalid on the well known grounds of their extreme propensity towards false positives.

Mr. Kenny, by continuing to beat on these dead horses you would appear to be forcing the thread to be closed. TOS 2 seems to apply, if nothing else.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 15:34:45
I'll restate my position again:
- ABX tests involve some level of false negatives in the results for a wide variety of reasons. I'm interested in the level of these false negatives. You guys are not.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 16:01:19
http://www.diyaudio.com/forums/everything-...ot-show-14.html (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-14.html)

Quote
Quote
Originally Posted by jkeny
I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes.


Originally Posted by Pano
You appear not to be interested in proof at all, but something else entirely.

Bottom line: This thread is a violation of the rules on more than one account. Mostly this one:
Quote:
Originally Posted by The Rules
Some threads become repetitive or conflict prone. The moderation team will, at its discretion, close these threads. Starting a new thread to discuss the same topic is prohibited.
This is clearly a continuation of your other thread and should not have been allowed in the first place. You have been offered well reasoned help from very technically competent and friendly forum members which you have dismissed at every turn. This is insulting to the members of this forum. It is also trolling.


On it goes 5yrs later....
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 16:08:10
No, I deem a false negative as a trial where the listener was hampered in some way from actually doing the trial - it could be his pre-conceived belief that there is no difference to be found & he therefore hears no difference, it could be that he has lost focus & attention & therefore isn't actually listening attentively, it could be any number of things. The point is he has to choose an option because it is a forced choice test. This is counted as a valid choice when in fact it shouldn't be.

First of all, that is not a false negative. Just because someone e.g. doesn't give full attention to a trial doesn't mean that they could have actually distinguished the files.

Secondly, so we should look into people's heads, their feelings, emotions, beliefs ... while they do listening tests? Even given the possibility of that, I'm sure you'd then complain that this invasive monitoring prevents you from hearing fine differences?

Have you not (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894883) understood (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894932) anything (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894952) that (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894964) others (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894967) and I have posted during the past 24 hours? How we deal with negative results?


You already agree to eliminating certain results from the codec preference test that was linked to earlier. These were eliminated on the basis that the listener/setup was not discriminating enough to do the test. I fail to see why you don't concede that there are similar discrimination issues in the running of ABX tests but these results, which should be eliminated as invalid tests, are openly accepted as valid. The problem is that there is no way at the moment to differentiate valid from invalid ABX test & this won't happen unless some controls are included during the running of the ABX test

Yes, how could you possibly rate the quality of the impaired file if you cannot even distinguish the files? But in that test ABX is the very tool used to convince yourself that you are not just guessing.
We have to start trying to distinguish the files somewhere. Is that so hard to get?
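
To put a number on "not just guessing": under the usual assumption that a guessing listener gets each ABX trial right with probability 0.5, the chance of reaching a given score by luck alone is a binomial tail. The sketch below is only an illustration of that arithmetic; the function name `abx_p_value` is made up for this example and is not taken from any ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided probability of getting at least `correct` answers right
    out of `trials` by pure guessing (each trial right with p = 0.5)."""
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Count the outcomes with `correct` or more hits, divide by all 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # 12 correct out of 16 trials
    print(abx_p_value(12, 16))
```

A small value only says that guessing is an unlikely explanation for the score; a large value says nothing more than that guessing cannot be ruled out.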

And for the 100th time, random logs on the Internet (especially by dishonest people) are never "accepted as valid". Have you not (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894883) understood (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894932) anything (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894952) that (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894964) others (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894967) and I have posted during the past 24 hours?

/sigh
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 16:31:48
I'll restate my position again:
- ABX tests involve some level of false negatives in the results for a wide variety of reasons. I'm interested in the level of these false negatives. You guys are not.


I agree with others who take exception to the sloppy and often false use of the phrase "False Negatives", particularly as we have often seen it practiced by paid representatives of the Unreliable Listening Test sector of Audiophilia.

As a practical matter what we really have is Questionable Negatives. These are trials with negative outcomes whose irreducibility we are far less than totally confident about.

The false claim authored by John Kenney above is a complete and total reversal of the true facts.

We of the reliable listening testing persuasion are vitally and actively engaged in identifying the incidence and the causes of Questionable Negatives, and reducing and/or eliminating them. This has been true for over 30 years. The ABX Test Development Team identified these major sources of Questionable Negatives during the initial development of Interactive ABX in April, 1977:

(1) Long delays during switching.

(2) Inability of listeners to train or retrain themselves to reliably detect the audible difference between A and B.

(3) Distraction by non-audible influences such as sight or the demeanor of others who thought they knew the identities of the unknowns.

(4) Distraction by audible but trivial influences such as level mismatch.

We were just about 100% effective in addressing these problems which remain generally unaddressed in sighted evaluations to this day.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 16:39:00
I'll restate my position again:
- ABX tests involve some level of false negatives in the results for a wide variety of reasons. I'm interested in the level of these false negatives. You guys are not.


Sure we're interested.

Please define:
- the test scenario:
What claims are being tested? Who is supposed to participate in it? Who is the target audience? ...

- the test itself:
What exactly is being tested? Who actually participates? What does the test procedure look like (the main part is ABX I guess)? Are all trials logged? Are all test results collected, or can participants randomly send in their logs when they feel like it? ...

- the analysis of the results:
How are positive and negative results analyzed and interpreted? When is the claim sufficiently evidenced? When can the claim be rejected? And everything in between ...

- what you mean by false negative:
So far your definitions were either unusably vague, nonsensical or unreasonable to work with. We cannot look into the brains and make judgments based on beliefs ... we can only try to minimize biases, not eliminate them entirely.

- how do potential false positives and false negatives affect the results and conclusion?

- how do you suggest we detect false positives and false negatives?
Please don't just say "add a control here". Be specific.


Given these details - or better yet a concrete example - we can have a sincere discussion of what is reasonably possible to improve.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-07 17:09:01
The context doesn't change the meaning of the statement - "there is no such thing as a false negative". In other words all ABX trials are considered valid listening tests even Arny's null ABX results posted on AVS forum where he admitted that he wasn't listening in this ABX test, just randomly guessing. I'm sure there are many other occurrences of ABX trials in which the listener isn't listening but we have no controls or other way of revealing such instances. That is the sense of the above statement  "there is no such thing as a false negative" & the context doesn't change one whit the meaning of it.

Others have already answered that, and I join them. I wrote this in the context of tests for the limits of perception, for which I used the analogy of the runners. You omitted this context and made it look as if I had denied the existence of false negatives in general. You are deliberately distorting what I wrote for a very transparent reason.

Even if I accepted your definition of a false negative, it still would be of no consequence in the mentioned context. Therefore, in that context, the concept of a false negative is useless - a red herring. A credible valid positive would override any number of false negatives obtained before - how much more can you want? Your insistence on distinguishing true from false negatives, apart from being completely impractical, is nothing but an attempt at distraction towards a useless concept.

I think it is clear what you actually would like to get: You would like to have the power and the pretext to dismiss any test result that you find inconvenient, even after the test has taken place. You'd like a blanket license to call anything you like a false negative. You admitted already that you would like to include reasons that only the listener himself can know about, such as "not really listening". You haven't yet offered a reliable and impartial way how that can be determined by others. The end result would be that there are no true negatives that you would approve. How stupid do you think everybody is?

The sports analogy really should be simple enough to comprehend. There's no harm when there are people participating who "are not really running", or are too fat, or too old, or whatever. That arguably happens frequently enough. Nobody will assume that they reduce in any way the established record, or hinder the establishment of a new record. At most, they make a run useless, because the chance that those participants would break the record was nil to start with. No harm in that. There's no harm either in pre-screening the participants to raise the chance of getting a new record. In fact it is a good idea, but it doesn't make other trials invalid. If you hope to break the record, you are free to organise a run, and the question of the validity of any of the previous runs is simply irrelevant.

To be clear: This does not mean that other kinds of test, which are not after the extremes of perception, but are for example conducted to find preferences between alternatives, could not suffer from false negatives. I never wrote that, never meant that and never wanted to imply it. Depending on the kind of test you conduct, false negatives can of course happen and be relevant. However, those tests are not the ones that are under discussion in conjunction with audiophile claims of audibility.

You might of course be unable to tell the different kinds of test apart. I don't think you can be so dumb, and I assume you are being deliberately unclear. For example, it can hardly have escaped you that the test by Sean Olive you tried to use for a counterargument, was in fact a completely different kind of test from the one we were discussing.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 17:23:02
No, I deem a false negative as a trial where the listener was hampered in some way from actually doing the trial - it could be his pre-conceived belief that there is no difference to be found & he therefore hears no difference, it could be that he has lost focus & attention & therefore isn't actually listening attentively, it could be any number of things. The point is he has to choose an option because it is a forced choice test. This is counted as a valid choice when in fact it shouldn't be.

First of all, that is not a false negative. Just because someone e.g. doesn't give full attention to a trial doesn't mean that they could have actually distinguished the files.
Sure, but we don't know if they could have distinguished the files, do we? They haven't actually done the test - they haven't actually listened - they may just as well not have turned up & just posted in a random set of results generated by a random number generator that get included in the pile of null results. That's why I suggest using some internal controls that they should pick up as different & if they don't we have some measure of their attentiveness, ability to discern a difference

Quote
Secondly, so we should look into people's heads, their feelings, emotions, beliefs ... while they do listening tests? Even given the possibility of that, I'm sure you'd then complain that this invasive monitoring prevents you from hearing fine differences?
As I said that's why I'm suggesting internal controls - to attempt a random monitoring of their continued trials to give some indicator, some handle, some measure of their actual attentiveness as the ABX test progresses.
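
The "internal controls" suggestion is never spelled out in the thread, so the following is only one hypothetical reading of it: some trials in a session use a pair with a known, clearly audible difference, and the session is flagged if the listener misses too many of those. The `Trial` record, the function names and the 0.8 cut-off are all assumptions made for this sketch, not part of any existing ABX software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    is_control: bool  # True if the pair had a known, clearly audible difference
    correct: bool     # True if the listener identified X correctly


def control_hit_rate(trials: List[Trial]) -> Optional[float]:
    """Fraction of control trials answered correctly, or None if there were none."""
    controls = [t for t in trials if t.is_control]
    if not controls:
        return None
    return sum(t.correct for t in controls) / len(controls)


def session_passes_controls(trials: List[Trial], min_rate: float = 0.8) -> bool:
    """Hypothetical screening rule: the session counts only if the listener
    caught at least `min_rate` of the control trials."""
    rate = control_hit_rate(trials)
    return rate is not None and rate >= min_rate
```

Note that a missed control says something about attention on that trial only; as pointed out above, it does not show that the listener could have distinguished the files under test.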

I asked this before - has anyone analysed ABX test results to see what, if any difference there is between the number of positive results in early trials Vs the number of positive results towards the end of trials. Nobody answered this - I expect it would reveal if there was any drop off in discrimination ability of the listener as the test progressed i.e. as the number of trials increased. It might give a handle on a number of questions about ABX tests.

I'm asking these questions because they have a foundation in my & others reported experience in doing ABX testing i.e. the fatigue & loss of attention that quickly creeps up on the listener when doing such tests.

Would you accept results from a test that you knew was done by a random number generator picking A or B?
Would you accept results where a random generator was used on some of the trials & a human listened on others?
What's the difference between this example & a human listener losing focus or interest in the test?
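
The early-versus-late question can at least be stated precisely for anyone who has a complete, ordered trial log. A minimal sketch, assuming the log is available as a list of booleans in trial order (the function name and the log format are assumptions for this example):

```python
from typing import List, Tuple


def early_late_hit_rates(results: List[bool]) -> Tuple[float, float]:
    """Split an ordered ABX log into first and second half and return
    (early_hit_rate, late_hit_rate). With an odd number of trials the
    extra trial goes to the second half."""
    n = len(results)
    if n < 2:
        raise ValueError("need at least two trials")
    half = n // 2
    early, late = results[:half], results[half:]
    return sum(early) / len(early), sum(late) / len(late)


if __name__ == "__main__":
    # Made-up log: 6 of the first 8 correct, 4 of the last 8 correct.
    log = [True] * 6 + [False] * 2 + [True] * 4 + [False] * 4
    print(early_late_hit_rates(log))
```

With the short sessions people usually run, each half holds only a handful of trials, so a difference between the two rates in a single log is weak evidence of anything; logs from many sessions would have to be pooled before fatigue could be separated from chance.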
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 17:28:53
Arny, I believe you are confusing false negatives with false positives & confusing yourself & others in the process
In your posts you also jump between ABX & DBTs in your replies & further confuse yourself & others - as I've said before I'm talking about ABX tests
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 17:31:23
Xnor, I've addressed all your questions before so I don't want to go down that particular rabbit hole that you are waving your pocketwatch over & stating "we are late for a very important date"
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 17:41:50
I get better, more comprehensive results with longer term listening.

I'm asking these questions because they have a foundation in my & others reported experience in doing ABX testing i.e. the fatigue & loss of attention that quickly creeps up on the listener when doing such tests.

 
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 17:42:40
jkeny, all of this will be a lot clearer when you finally define the things I asked you to (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992).
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 17:45:01
Xnor, I've addressed all your questions before so I don't want to go down that particular rabbit hole that you are waving your pocketwatch over & stating "we are late for a very important date"

No, you haven't. What you posted was very general and vague, and my attempts to explain how this naive black/white thinking doesn't work have obviously failed, so let's do this properly. Don't evade. Give us a specific test scenario. See this post (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992).

Otherwise I have to assume that you are not interested in an honest discussion but only in slinging mud.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 18:09:45
The context doesn't change the meaning of the statement - "there is no such thing as a false negative". In other words all ABX trials are considered valid listening tests even Arny's null ABX results posted on AVS forum where he admitted that he wasn't listening in this ABX test, just randomly guessing. I'm sure there are many other occurrences of ABX trials in which the listener isn't listening but we have no controls or other way of revealing such instances. That is the sense of the above statement  "there is no such thing as a false negative" & the context doesn't change one whit the meaning of it.

Others have already answered that, and I join them. I wrote this in the context of tests for the limits of perception, for which I used the analogy of the runners. You omitted this context and made it look as if I had denied the existence of false negatives in general. You are deliberately distorting what I wrote for a very transparent reason.

Even if I accepted your definition of a false negative, it still would be of no consequence in the mentioned context. Therefore, in that context, the concept of a false negative is useless - a red herring. A credible valid positive would override any number of false negatives obtained before - how much more can you want? Your insistence on distinguishing true from false negatives, apart from being completely impractical, is nothing but an attempt at distraction towards a useless concept.
I see your logic -  there is no such thing as a false negative until such a time as a credible positive result is returned - at which point all the previous null results become false negatives. You are looking at this issue at a more macro level than I am - I'm saying that we don't know the validity of the null results of an ABX test - there are no internal checks that allow us to validate the results returned - were they done by a human or by a random number generator?


Quote
I think it is clear what you actually would like to get: You would like to have the power and the pretext to dismiss any test result that you find inconvenient, even after the test has taken place. You'd like a blanket license to call anything you like a false negative. You admitted already that you would like to include reasons that only the listener himself can know about, such as "not really listening". You haven't yet offered a reliable and impartial way how that can be determined by others. The end result would be that there are no true negatives that you would approve. How stupid do you think everybody is?
No, I would like to see that the null results were coming from a test where a certain level of assurance that the test run itself is/was capable of some level of discernment. I don't buy the statements made here that null results are of no consequence when we all know that these are used as an edifice of evidence

Quote
The sports analogy really should be simple enough to comprehend. There's no harm when there are people participating who "are not really running", or are too fat, or too old, or whatever. That arguably happens frequently enough. Nobody will assume that they reduce in any way the established record, or hinder the establishment of a new record. At most, they make a run useless, because the chance that those participants would break the record was nil to start with. No harm in that. There's no harm either in pre-screening the participants to raise the chance of getting a new record. In fact it is a good idea, but it doesn't make other trials invalid. If you hope to break the record, you are free to organise a run, and the question of the validity of any of the previous runs is simply irrelevant.
Sure but if you use the number of failed attempts to support your contention that the record can't be broken then the number & quality of these failed attempts come under scrutiny. Quickly one will see that allowing people with broken legs & hobnail boots into the count of failed attempts undermines your credibility as to your claims.

Quote
To be clear: This does not mean that other kinds of test, which are not after the extremes of perception, but are for example conducted to find preferences between alternatives, could not suffer from false negatives. I never wrote that, never meant that and never wanted to imply it. Depending on the kind of test you conduct, false negatives can of course happen and be relevant. However, those tests are not the ones that are under discussion in conjunction with audiophile claims of audibility.

You might of course be unable to tell the different kinds of test apart. I don't think you can be so dumb, and I assume you are being deliberately unclear. For example, it can hardly have escaped you that the test by Sean Olive you tried to use for a counterargument, was in fact a completely different kind of test from the one we were discussing.
Sure, I used a preference test incorrectly in extrapolating it to how the results would appear in a test of difference.
I would still be interested in the level of false negatives in ABX tests
I would also be interested in the statistical level of positives Vs negatives throughout the trials in an ABX test - does this not have some interest for you?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:14:24
Otherwise I have to assume that you are not interested in an honest discussion but only in slinging mud.

...at which point someone might close the discussion since it's clearly only trolling.

Then the trolling party misinterprets this [intentionally (read: dishonest) or otherwise (read: difficulty with reading comprehension, to put it mildly)] in order to claim victory elsewhere.  He'll be sure to continue mining quotes, specifically ignoring truth such as what is stated in the previous sentence.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:17:13
I see your logic -  there is no such thing as a false negative until such a time as a credible positive result is returned - at which point all the previous null results become false negatives.

???
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 18:29:07
I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes.

I'm saying that we don't know the validity of the null results of an ABX test

But we do. We know no absolute proof exists and exactly what repeated null shows unequivocally.
That if any difference exists, it was below the threshold of the test AND, what triggered the test in the first place, the "glaringly obvious", is a disorder, better served by some sort of mental evaluation test, rather than frivolous "audio" tests.

I would still be interested in the level of false negatives in ABX tests

For sales purposes. Presumed to exist due to wishful thinking fallacy, while simultaneously completely uninterested in false positives, like fabricated online ABX logs, etc. that concur with beliefs/sales items.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:37:27
If they're so glaringly obvious then verification by way of producing valid positive ABX results by an independent third-party would be trivial.  Of course this assumes the first-party is acting in good faith by at least providing evidence using a test designed to remove biases.

"Night and day differences!" Placebophiles simply can't help themselves.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 18:41:56
Xnor, I've addressed all your questions before so I don't want to go down that particular rabbit hole that you are waving your pocketwatch over & stating "we are late for a very important date"

No, you haven't. What you posted was very general and vague, and my attempts to explain how this naive black/white thinking doesn't work have obviously failed, so let's do this properly. Don't evade. Give us a specific test scenario. See this post (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992).

Otherwise I have to assume that you are not interested in an honest discussion but only in slinging mud.

Ok, let's take a concrete example - Arny's original jangling keys test.
The test was an ABX test.
False positives were pretty well controlled by the actual design of the ABX test itself
No concern for nor any indication of the number of false negatives in the null results returned over the 15 year period that this test existed for
Arny can probably give us an idea of how many times this was downloaded
He might also be able to give us an estimate of the number of null results returned by this test.

So for 15 years, never having a positive result returned for it, this test remained as supporting evidence that anybody claiming high-res was audibly different from RB was mistaken.

Now, after Amir's positive results it was discovered that there were a number of audible issues with the Jangling keys files which would allow them to be differentiated.
Yet nobody in that 15 years actually found that tell & returned a positive result.
Why?

Was the test somehow suppressing the ability to discriminate this audible difference?
The null results returned in that 15 years were all false negatives as the listeners didn't hear the audible difference between these files.
That's an awful long stretch over which everybody who tried the test (including Arny, I presume) only ever reported false negative results - no positives.

So, what is the problem here? Is the test fundamentally insensitive at revealing such differences?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:44:30
Ok, let's take a concrete example - Arny's original jangling keys test.
The test was an ABX test.
False positives were pretty well controlled by the actual design of the ABX test itself

Sorry but no; they weren't.  These and subsequent samples attempting to correct glaring problems were all naively prepared.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-07 18:45:06
The answer to a negative (false or otherwise) is a true positive.

Easily delivered when there are night and day differences.

14 pages of argument seems less easy to me, but is preferred by some

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 18:45:49
All this talk about how I'm trying to discredit ABX testing for my financial benefit when you already stated that my potential customers are not the type who would be interested in ABX tests. So how do you square these two statements?
Some logic & consistency in your accusations would at least be welcome!
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:47:52
Look, he's retreating.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 18:48:38
Ok, let's take a concrete example - Arny's original jangling keys test.
The test was an ABX test.
False positives were pretty well controlled by the actual design of the ABX test itself

Sorry but no; they weren't.  These and subsequent samples attempting to correct glaring problems were all naively prepared.

Are we talking about the same thing - by false positives I mean the mistaken determination by a listener that two files are audibly different when they are actually identical files.
What do you mean?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 18:56:56
Wait, Arny, people have been swarming all over your jangling keys example for 15 years because all the necessary tools were readily available?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 19:13:50
I asked this before - has anyone analysed ABX test results to see what, if any difference there is between the number of positive results in early trials Vs the number of positive results towards the end of trials.


Typical of the intellectually lazy - to expect others to do their own homework for them.

Typical of someone who has a highly prejudiced and closed mind - to assert that perchance there is a given result, it is solely due to the cause that one has hypothesized.




Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 19:23:15
I asked this before - has anyone analysed ABX test results to see what, if any difference there is between the number of positive results in early trials Vs the number of positive results towards the end of trials.


Typical of the intellectually lazy - to expect others to do their own homework for them.

Typical of someone who has a highly prejudiced and closed mind - to assert that perchance there is a given result, it is solely due to the cause that one has hypothesized.

Wow, asking a question to find out if others had looked into this already is being intellectually lazy?
The lengths that some go to in this thread to bend something into an ad-hom attack is entertaining
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 19:25:19
Arny, care to give any stats on your original jangling keys file - number of downloads, number of ABX tests actually done on these files?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 19:27:37
Wait, Arny, people have been swarming all over your jangling keys example for 15 years because all the necessary tools were readily available?

Greynol, care to answer my previous post - what do you understand by false positives?
What "necessary tools" are you talking about in the above?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 19:34:43
Ok, let's take a concrete example - Arny's original jangling keys test.
The test was an ABX test.
False positives were pretty well controlled by the actual design of the ABX test itself


I wish that were true.

The test has been around for about 15 years and has been beset by false positives due to the very nature of the samples.  This has been going on right under your nose Mr. Kenny! The proof of that is contained in this thread which has been added to while the current thread was ongoing:

New Keys Jangling Samples & associated comments (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177)

Quote
No concern for nor any indication of the number of false negatives in the null results returned over the 15 year period that this test existed for


False claim again - and again the discussion of this has been going on right under your nose, Mr. Kenny. Please follow this link for proof: WBF Forum thread discussing Keys Jangling Samples (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=276574&viewfull=1#post276574)

Quote
He might also be able to give us an estimate of the number of null results returned by this test.


However the number of null results gives no clue as to the number of Questionable Negatives because psychoacoustically speaking, the test should return null results.

Quote
So for 15 years, never having a positive result returned for it,


False claim.  I simply am not privy to the actual number of positive or negative results. People had the samples to experiment freely, and AFAIK few of them reported their results either way.

Quote
this test remained as supporting evidence that anybody claiming high-res was audibly different from RB was mistaken.


That conclusion was based on a psychoacoustical analysis.

Quote
Now, after Amir's positive results it was discovered that there were a number of audible issues with the Jangling keys files which would allow them to be differentiated.


No, many of those had been known all along.

Quote
Yet nobody in that 15 years actually found that tell & returned a positive result.
Why?


This may surprise you Mr. Kenny, but nobody was required to provide any results back to me at all, and it appears that people exercised that option in droves.

Quote
The null results returned in that 15 years were all false negatives as the listeners didn't hear the audible difference between these files.


False claim that demonstrates the claimant's prejudices.

Quote
That's an awful long stretch over which everybody who tried the test (including Arny, I presume) only ever reported false negative results - no positives.


The presumption is erroneous and again demonstrates the claimant's prejudices.

Quote
So, what is the problem here? Is the test fundamentally insensitive at revealing such differences?


In accordance with what I understand about psychoacoustics, the test should produce negative results, therefore as far as I am concerned, any negative results that are obtained are likely to be True Negatives.

Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 19:41:46
All this talk about how I'm trying to discredit ABX testing for my financial benefit when you already stated that my potential customers are not the type who would be interested in ABX tests. So how do you square these two statements?

Careful, my psychiatric billing fees are much higher than my loudspeaker ones! 

Now, if I may summarize:
"Glaring difference" heard by Jkeny when "long term" low stress peeking, no need for ABX.
....but, if JKeny not allowed to peek and forced to "Listen with ears" for a highly strenuous 20 seconds during said ABX, no difference detected with statistical significance.
Problem with peeking?
NO, problem with ABX cognitive load stress and "False Negatives"!!!

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 19:42:18
Ok, let's take a concrete example - Arny's original jangling keys test.

Everything you said after that was blatantly false or lies, and you still evaded the most important questions.
See here (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992). Come up with a hypothetical test if you must. The good thing about your post is that it was so abysmal that it can only get better from here.


The null results returned in that 15 years were all false negatives as the listeners didn't hear the audible difference between these files.

Wow... Are you serious?
This is exactly the asinine BS why I've asked you to define this stuff (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992) clearly. If you did then you'd notice the complete nonsense coming out of your own mouth.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-07 20:27:47
Quote
This is getting trying &audng - Can anyone link me to where I can find the answer to this simple question?
Where can I find a measure of the percentage of false negatives in null results reported for ABX tests?

There are stats for this (chances of failing a given length ABX test at a given p val when you detect the difference X percent of the time on average) but I don't have them to hand.

I had a quick google to try to find this, but was hampered by the fact that a certain jkeny has posted about this topic in goodness knows how many audio forums, and google ranks your questions more highly than the answers!

IIRC the answers are given in the form I suggest above. No great help because you don't know how many answers will be correct on average. If that is what you are really asking, that's a "how long is a piece of string" question.
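For what it's worth, the kind of figure described above can be sketched with a plain binomial model. This is only an illustration under assumptions that are not from the thread: 16 independent trials, a 0.05 significance threshold, and a listener who answers each trial correctly with a fixed probability. The function names are made up for this sketch.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_score(n, alpha=0.05):
    """Smallest number of correct answers whose p-value under guessing is <= alpha.

    Returns None if even a perfect score cannot reach alpha (very short tests).
    """
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None

def miss_rate(n, p_correct, alpha=0.05):
    """Chance that a listener with per-trial hit rate p_correct still fails the test."""
    k = critical_score(n, alpha)
    if k is None:
        return 1.0  # the test cannot be passed at this length
    return 1.0 - binom_tail(n, k, p_correct)

# Assumed example: a 16-trial test at alpha = 0.05, for a few hypothetical hit rates.
for p_correct in (0.6, 0.7, 0.8, 0.9):
    print(f"hit rate {p_correct:.1f}: chance of failing = {miss_rate(16, p_correct):.3f}")
```

The catch is the one already mentioned: the output depends entirely on the assumed hit rate, which is exactly the number nobody knows in advance.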

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 20:36:00
I wish that were true.

The test has been around for about 15 years and has been beset by false positives due to the very nature of the samples.
Ah, really, I didn't know that there were false positives - can you point to any of these ABX results prior to Amir's 
Quote
This has been going on right under your nose Mr. Kenny! The proof of that is contained in this thread which has been added to while the current thread was ongoing:
The proof of what, exactly?
Quote
New Keys Jangling Samples & associated comments (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177)

Quote
No concern for nor any indication of the number of false negatives in the null results returned over the 15 year period that this test existed for

Quote
False claim again - and again the discussion of this has been going on right under your nose, Mr. Kenny. Please follow this link for proof: WBF Forum thread discussing Keys Jangling Samples (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=276574&viewfull=1#post276574)
Can you show me where in that thread false negatives are being discussed?


Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-07 20:44:59
I see your logic -  there is no such thing as a false negative until such a time as a credible positive result is returned - at which point all the previous null results become false negatives.

That's not my logic, it is your twisted version of it. My version: ...all previous null results remain exactly that: Null results. Failed attempts, to be ignored.

Quote
You are looking at this issue at a more macro level than I am - I'm saying that we don't know the validity of the null results of an ABX test - there are no internal checks that allow us to validate the results returned - were they done by a human or by a random number generator?

Yes, I got that, like probably everyone else. It still doesn't matter how the null results came about. You may be interested in the distinction, but that's your private obsession.

Quote
No, I would like to see that the null results were coming from a test where a certain level of assurance that the test run itself is/was capable of some level of discernment. I don't buy the statements made here that null results are of no consequence when we all know that these are used as an edifice of evidence

If they were used as an edifice of evidence, you would not be able to change that. People would continue no matter what you say. Do you really believe your diverging opinion would make any difference? It is much more plausible that you are trying to convince yourself of your own fabrications. That you are essentially talking to yourself.

I repeat that I do not see the null results as evidence, I see the continued lack of positive results as evidence. I have absolutely no idea how many null results there are. Most of them, I assume, I would never hear of. I'm sure there are vast numbers of null results that nobody bothers to report to anyone. I know I have done many myself. If I saw null results as evidence, I would have to pull the numbers out of thin air. That's not evidence by anyone's account. The lack of positive results over a prolonged period of time, particularly in hotly contested topics, however does constitute tangible evidence. This kind of evidence is completely independent of the number and "validity" of null results. Hence you could fiddle with the validity of null results as much as you want, it wouldn't change my account of the available evidence at all. You are simply barking up the wrong tree.

Quote
Sure but if you use the number of failed attempts to support your contention that the record can't be broken then the number & quality of these failed attempts come under scrutiny. Quickly one will see that allowing people with broken legs & hobnail boots into the count of failed attempts undermines your credibility as to your claims.

But I don't. I am not so stupid as to treat the performance of sumo wrestlers as being relevant for 100m sprints. You are implicitly accusing everybody of something that you are making up in your own mind. It is the lack of true positives that should occupy your mind. That's the elephant in the room.

Quote
Sure, I used a preference test incorrectly in extrapolating it to how the results would appear in a test of difference.
I would still be interested in the level of false negatives in ABX tests
I would also be interested in the statistical level of positives Vs negatives throughout the trials in an ABX test - does this not have some interest for you?

No, I think it is an obsession that leads away from the real issues. You are trying to not notice the elephant, so you affix your eyes (and try to affix everyone else's) to some irrelevant detail that you declare to be important.

Quote
Now, after Amir's positive results it was discovered that there were a number of audible issues with the Jangling keys files which would allow them to be differentiated.
Yet nobody in that 15 years actually found that tell & returned a positive result.
Why?

Perhaps the simplest explanation is so simple that you are missing it because it is so obvious:

Because they didn't find the tell.

You can either stumble across a tell by pure chance, or you have prior training that makes you look for it consciously. It is very easy to miss a subtle tell if you haven't got any idea whether there is one, and what it might be. This is completely normal and no sign of anything being wrong. This is one reason why your idea of including extra tests for determining the listener's acuity is so problematic. You can only include tests for known types of impairments or clues. If you don't know what you are trying to find, you can't devise a reliable test verification.

If some people are better at finding tells than others, it is simply because of their greater training in such things. You are much more likely to find a tell of a type that you have experienced before.

A classic example that has made ripples in the audiophile scene for many years, not least because of the aggressive peddling by Mr. Harley and Stereophile, is the "failure" of the listeners at Swedish Radio to uncover a flaw in an early psychoacoustic codec they were evaluating in a large scale blind test. Harley and Stereophile used the occurrence as an argument against blind tests, essentially claiming that if blind tests were any good, they would have uncovered that flaw. The flaw was actually uncovered by a now deceased scientist, Bart Locanthi, who allegedly found it in a non-blind test. The obvious explanation was that this had nothing to do with blind or non-blind testing, but was a case of someone just knowing what to search for. The codec was of course listened to in a non-blind fashion any number of times during its development, and if detecting the flaw had anything to do with blind vs. non-blind listening, it would have had to be uncovered even before the blind testing began. Hence Harley's argument was and remains a pure fabrication. He blamed the blind testing because that's what suited his agenda. In reality, both blind and non-blind tests suffer from the same problem: Subtle flaws may only be found by chance or when one knows what to search for.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 20:59:47
Ok, let's take a concrete example - Arny's original jangling keys test.

Everything you said after that was blatantly false or lies, and you still evaded the most important questions.
You asked for a concrete example to discuss, I gave you one & now you won't discuss it. Jeez, I guess the continual demands are meant to wear one down!
Quote
See here (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992). Come up with a hypothetical test if you must. The good thing about your post is that it was so abysmal that it can only get better from here.
So you don't want to discuss an actual ABX test as you asked me to produce?

The null results returned in that 15 years were all false negatives as the listeners didn't hear the audible difference between these files.
Quote
Wow... Are you serious?
By definition they must be false negatives - an audible difference that these listeners didn't discern - there were known structural problems with the files as Arny admits. The fact that listeners didn't hear these audible differences means they were false negatives - it's the definition of a false negative (even though you & others try to sully this definition). It was argued that Amir posted positive ABX results because of the structural problems in the files. So he heard the audible issues but all the listeners in the previous 15 years didn't hear this. Hence they all returned false negatives for whatever reason(s).
Quote
This is exactly the asinine BS why I've asked you to define this stuff (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992) clearly. If you did then you'd notice the complete nonsense coming out of your own mouth.
I've defined what a false negative is over & over again in this & in the other closed thread. It's amazing how much ducking & diving you guys do to avoid logic.
It's entertaining to see the same poster cropping up every now & then to ask for a definition of false negatives he has already been given before & whose definition is available on-line for anybody to look up. But like the continual demands for another example or another test or another whatever, it is designed to wear down the opposition.

Arny - this is truly a definition of intellectual laziness if you wanted an accurate one.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 21:24:16
The null results returned in that 15 years were all false negatives as the listeners didn't hear the audible difference between these files.



It would appear Mr. Kenny that you in your repeatedly demonstrated obsession with positive results have completely and utterly lost track of what the test was about.

Any differences that may have ever been heard from these files were either absolutely known to be artifacts of errors by the people who supervised or ran the tests, or suspected to be so.

If the test is done in an error-free environment and in an error-free manner then the results will be random guessing. All negative results are true not false.

You can find a demonstration of how seemingly infinitesimal errors can result in false positives here: Demo of how an infinitesimal error can be audible (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895047)
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 21:43:17
I wish that were true.

The test has been around for about 15 years and has been beset by false positives due to the very nature of the samples.
Ah, really, I didn't know that there were false positives - can you point to any of these ABX results prior to Amir's


Amir's were the first published positive results, and it is now known that all of the positive results that he observed were false positives for various known reasons.


Quote
Quote
This has been going on right under your nose Mr. Kenny! The proof of that is contained in this thread which has been added to while the current thread was ongoing:
The proof of what, exactly?
Quote
New Keys Jangling Samples & associated comments (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894177)

Quote
No concern for nor any indication of the number of false negatives in the null results returned over the 15 year period that this test existed for

Quote
False claim again - and again the discussion of this has been going on right under your nose, Mr. Kenny. Please follow this link for proof: WBF Forum thread discussing Keys Jangling Samples (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=276574&viewfull=1#post276574)
Can you show me where in that thread false negatives are being discussed?


Two threads were cited and since the second thread was about positive results, that is not the place to look.  Attempt to distract the discussion away from relevant facts noted.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 21:47:23
You asked for a concrete example to discuss, I gave you one & now you won't discuss it. Jeez, I guess the continual demands are meant to wear one down!

No, I asked for this (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992).

You said "arny's test" followed by a bunch of falsehoods and lies. Also, I am not omniscient and do not know what nonsense is going on in every other audio forum.

So you don't want to discuss an actual ABX test as you asked me to produce?

I asked you this (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992). How obtuse can you be?
Maybe you should use my post as "template" since your short term memory seems to be really short.


By definition they must be false negatives - an audible difference that these listeners didn't discern - there were known structural problems with the files as Arny admits.

More twisting. Does it not occur to you, that people actually just listened to the jangling keys part and ignored or even cut out the pure HF/IMD test tones that can cause damage to your system? Or that honest people do not abuse small time offsets between the files, because they are interested in the sound quality and not in how to abuse the ABX software?

All of this is why I asked you to clearly define this (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992). Why are you evading?


So he heard the audible issues but all the listeners in the previous 15 years didn't hear this.

Arny already explained this to you.
I, for example, didn't actually know until recently that Arny had put these files up for a "test". IIRC I saw the files being offered somewhere for personal experimentation quite a long time ago, or at least saw Arny mentioning jangling keys as a good candidate for HF testing.

This is why I asked you to answer these (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992) things. Actual answers would clarify all of your confusions.


I've defined what a false negative is over & over again in this & in the other closed thread. It's amazing how much ducking & diving you guys do to avoid logic.
It's entertaining to see the same poster cropping up every now & then to ask for a definition of false negatives he has already been given before & whose definition is available on-line for anybody to look up. But like the continual demands for another example or another test or another whatever, it is designed to wear down the opposition.

Here (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894932) you said "a listener not hearing differences when they are actually present". What do you mean by present? Present in measurements? What's your standard for defining what audible difference should be actually audible for somebody?
Here (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894987) you post more nonsense.

This is why I asked you to define these (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992) things as clearly as possible.


Do you get it by now?
Do you get it by now?
Here (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992) again, just for you!
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-07 21:59:40
Ah, really, I didn't know that there were false positives - can you point to any of these ABX results prior to Amir's

Whoa, Amir's results were false positives? How do you know? Were you there proctoring? How loud did he have to crank the quiet passages? What particular spectral analysis software was used?
TIA
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:02:35
Thanks pelmazo, your posts remain the most interesting & almost the least ad-hom

OK, so you don't care how or where the null results come from because you don't see the null results as evidence of anything but you see the continued lack of positive results as evidence, right?

I think you fail to see the psychological influence that a body of null results has on listeners - they are less motivated to find a difference if there is already a body of "evidence" that no difference has been heard. This is just human psychology - we want to be right. It is very easy to prevent someone hearing a difference that actually exists by building this expectation.

Someone asked me earlier in the thread, how come once one positive ABX result was reported, that others then appeared - it's because of this psychological factor. The surety was no longer in place that there was no audible difference - someone had let the genie out of the bottle

ABX tests are designed to eliminate certain types of biases but other biases are ignored or down-played & this is why I ask for controls in ABX testing. The specification for strict test design & controls in the ITU standards is an attempt to deal with these biases. Home administered ABX tests, missing any of these controls, fail in this regard.

Quote
It is very easy to miss a subtle tell if you haven't got any idea whether there is one, and what it might be. This is completely normal and no sign of anything being wrong.

Yes, I would say that a "tell" is just another "impairment". So what you are saying is that it's easy to miss an audible impairment if you haven't got any idea whether there is one. Almost guaranteed to miss if you have been told a body of null tests already exists or the test has been in existence for 15 years & no positive results in that time.

You raise another good point about training, which I agree with - in the previous 15 years people just didn't pick out any difference. Unless the listener has familiarity with the audibility of the impairment (or it is demonstrated to them), they are unlikely to discern it - hence the ITU recommendation for trained listeners - another missing factor in ABX tests.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:06:00
I wish that were true.

The test has been around for about 15 years and has been beset by false positives due to the very nature of the samples.
Ah, really, I didn't know that there were false positives - can you point to any of these ABX results prior to Amir's


Amir's were the first published positive results, and it is now known that all of the positive results that he observed were false positives for various known reasons.
So your statement "The test has been around for about 15 years and has been beset by false positives" is misleading as there are no positive results prior to Amir's recent positive results


Quote
Two threads were cited and since the second thread was about positive results, that is not the place to look.  Attempt to distract the discussion away from relevant facts noted.

I really have no idea what you're talking about?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:15:13
By definition they must be false negatives - an audible difference that these listeners didn't discern - there were known structural problems with the files as Arny admits.

More twisting. Does it not occur to you, that people actually just listened to the jangling keys part and ignored or even cut out the pure HF/IMD test tones that can cause damage to your system? Or that honest people do not abuse small time offsets between the files, because they are interested in the sound quality and not in how to abuse the ABX software?
Wrong on many counts:
- The test tones weren't in the original jangling keys files - they were introduced after Amir's posted positive ABX results to investigate IMD issues in the playback equipment.
- This was the same test Amir did - the jangling keys files without test tones
- your bringing up test tones is just a red herring
- If Amir heard differences in these files, it is there to be heard. The fact that you then do revisionism on his positive results & call them abuse is purely disingenuous. People are meant to listen for differences in ABX tests not just listen to " the sound quality" - I think you are mixing up preference testing with forced choice difference testing? 
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:20:58
Xnor, I've already given you Arny's jangling keys as the test - all you need to fill in your demands:
- the test scenario:
What claims are being tested? Who is supposed to participate in it? Who is the target audience? ...
-- the analysis of the results:
How are positive and negative results analyzed and interpreted? When is the claim sufficiently evidenced? When can the claim be rejected? And everything in between ...

If you're stuck filling in any of your demands, I'm sure Arny can help you fill out your questionnaire
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 22:23:44

There are stats for this (chances of failing a given length ABX test at a given p val when you detect the difference X percent of the time on average) but I don't have them to hand.



I'm not sure if you mean something like this...

X ~ B(n, p) with n = 16 trials

If we set P(X <= 11), which is the probability of failing the test, to 50% then success probability p needs to be 0.714, which is not that bad considering p = 0.5 would be random guessing.
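A quick way to check that figure is to solve for p numerically. The sketch below assumes, as the numbers above imply, that a 16-trial test is passed with 12 or more correct answers (so 11 or fewer is a fail) and that trials are independent. The function names are made up for illustration.

```python
from math import comb

def fail_probability(p, n=16, max_failing_score=11):
    """P(X <= max_failing_score) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(max_failing_score + 1))

def p_for_fail_probability(target, n=16, max_failing_score=11, tol=1e-9):
    """Per-trial success probability at which the chance of failing equals target.

    Plain bisection: the failure probability falls steadily as p rises from 0 to 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fail_probability(mid, n, max_failing_score) > target:
            lo = mid  # still failing too often, so p must be higher
        else:
            hi = mid
    return (lo + hi) / 2

p_half = p_for_fail_probability(0.5)
print(f"p for a 50% chance of failing: {p_half:.3f}")
print(f"chance of failing when purely guessing: {fail_probability(0.5):.3f}")
```

Changing `target` gives the hit rate needed for any other acceptable miss rate under the same assumptions.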
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 22:30:08
If you're stuck filling in any of your demands, I'm sure Arny can help you fill out your questionnaire


Nice evasion again.
I asked you, and there's more info needed from you to clear up your "problems" with false positives, negatives, ABX in general, etc.
You have not given me anything other than "arny's test" followed by falsehoods about it.

This really feels like amir's spiel all over again - in the time you made all these evasive posts you could have actually answered and demonstrated interest in an honest discussion. I guess you're not interested.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:35:29
If you're stuck filling in any of your demands, I'm sure Arny can help you fill out your questionnaire


Nice evasion again.
I asked you, and there's more info needed from you to clear up your "problems" with false positives, negatives, ABX in general, etc.
You have not given me anything other than "arny's test" followed by falsehoods about it.

This really feels like amir's spiel all over again - in the time you made all these evasive posts you could have actually answered and demonstrated interest in an honest discussion. I guess you're not interested.

Your continued invitation to follow you down the rabbit hole & introduce numerous McGuffins, is testament to what I said - "I'm interested in the level of these false negatives. You guys are not."
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 22:38:00
All this talk about how I'm trying to discredit ABX testing for my financial benefit when you already stated that my potential customers are not the type who would be interested in ABX tests.


As usual John, you've got the cart before the horse. The reason why your potential customers are limited to people who are not interested in ABX tests is because there are no ABX tests that verify most of your claims.

It seems logical that one or more positive ABX test(s) involving your equipment would vastly increase the size of the market for your products.

It seems logical that degrading public opinion in ABX tests would similarly vastly increase the size of the market for your products. It would overcome a major objection to your equipment.

Quote
So how do you square these two statements?


I don't. Instead I avoid the trap of relying on your false logic.

Quote
Some logic & consistency in your accusations would at least be welcome!


Likewise, I'm sure. ;-)
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:41:23
If you're stuck filling in any of your demands, I'm sure Arny can help you fill out your questionnaire


Nice evasion again.
I asked you, and there's more info needed from you to clear up your "problems" with false positives, negatives, ABX in general, etc.
You have not given me anything other than "arny's test" followed by falsehoods about it.

This really feels like amir's spiel all over again - in the time you made all these evasive posts you could have actually answered and demonstrated interest in an honest discussion. I guess you're not interested.

I'm quite happy to discuss these issues using Arny's jangling keys ABX test as the example & have been doing so but you seem to want some of your demands met - what's your problem?

You can get definitions of false positives (Type I errors) & false negatives (Type II errors) & bring them to the discussion if you like, I'm not stopping you!
Then we can discuss them with reference to Arny's jangling keys test
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 22:44:33
Wrong on many counts:
- The test tones weren't in the original jangling keys files - they were introduced after Amir's posted positive ABX results to investigate IMD issues in the playback equipment.


IOW they were designed to identify some common sources of false positives.

BTW there is evidence that Amir continues to brag about tests that were run on equipment that fails these IM tests.

He screamed like a stuck pig when they first came out, and why did he do that if his equipment all passed?

Quote
- If Amir heard differences in these files, it is there to be heard.


There is no evidence that Amir actually listened to the files provided for download, nor is there reliable evidence that the monitoring equipment he used passed my tests.

There is a version of FB2K ABX that addresses some of these concerns, but I know of no ABX logs that show that he used it with the keys jangling files.

Furthermore, the files I was providing up until this week were downsampled in a manner that may have produced audible artifacts.

All of the above would of course lead to false positives.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-07 22:49:52
If you're stuck filling in any of your demands, I'm sure Arny can help you fill out your questionnaire


Nice evasion again.


I don't think that it was a nice evasion. It struck me as being childish and crass.

Quote
I asked you, and there's more info needed from you to clear up your "problems" with false positives, negatives, ABX in general, etc.
You have not given me anything other than "arny's test" followed by falsehoods about it.


Yup and I rebutted those falsehoods and as is his habit, he just sorta forgot about all that.

Quote
This really feels like amir's spiel all over again - in the time you made all these evasive posts you could have actually answered and demonstrated interest in an honest discussion. I guess you're not interested.


So who learned from who? ;-)
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 22:56:23
As usual John, you've got the cart before the horse. The reason why your potential customers are limited to people who are not interested in ABX tests is because there are no ABX tests that verify most of your claims.

It seems logical that one or more positive ABX test(s) involving your equipment would vastly increase the size of the market for your products.
Wow, marketing advice - thanks - what's your expertise in marketing - the ABX comparator? I think I might just pass on this advice if you don't mind?


Quote
It seems logical that degrading public opinion in ABX tests would similarly vastly increase the size of the market for your products. It would overcome a major objection to your equipment.
Other than the biased view here, I haven't seen any major or otherwise objection to my equipment - Oh, I tell a lie, one reviewer (I think he only ever did one review) for Stereomojo said he hated the looks & design but "how does it sound? Revelatory". A twist in the usual bias of looks defining what we hear?
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 23:01:22
I'm quite happy to discuss these issues using Arny's jangling keys ABX test as the example & have been doing so but you seem to want some of your demands met - what's your problem?


Sure, since you seem incapable of clicking links:

Please define:
- the test scenario:
What claims are being tested? Who is supposed to participate in it? Who is the target audience? ...

- the test itself:
What exactly is being tested? Who actually participates? What does the test procedure look like (the main part is ABX I guess)? Are all trials logged? Are all test results collected, or can participants randomly send in their logs when they feel like it? ...

- the analysis of the results:
How are positive and negative results analyzed and interpreted? When is the claim sufficiently evidenced? When can the claim be rejected? And everything in between ...

- what you mean by false negative:
So far your definitions were either unusably vague, nonsensical or unreasonable to work with. We cannot look into the brains and make judgments based on beliefs ... we can only try to eliminate biases, not eliminate them entirely.

- how do potential false positives and false negatives affect the results and conclusion?

- how do you suggest we detect false positives and false negatives?
Please don't just say "add a control here". Be specific.



I do not know how you would answer these questions regarding "arny's test". You brought up "arny's test". You can assume that I know very little about it. So please .. go on ..
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-07 23:10:42
IOW they were designed to identify some common sources of false positives.
You keep using these terms incorrectly, Arny - a false positive is defined as a listener registering a difference when no actual audible difference exists. An audible difference existed in your two files therefore these weren't false positives - please stop confusing everybody with your wrong use of the term.

Quote
BTW there is evidence that Amir continues to brag about tests that were run on equipment that fails these IM tests.
what evidence?

Quote
He screamed like a stuck pig when they first came out, and why did he do that if his equipment all passed?
Because you used ultrasonic test tones which were close to 0dB & not anywhere near the lower level of ultrasonics that were in your jangling keys recording. In other words it was another MacGuffin

All the rest of your post is just more accusation lacking any evidence used in an attempt to discredit
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 23:19:18
Oh, I tell a lie, one reviewer (I think he only ever did one review) for Stereomojo said he hated the looks & design but "how does it sound? Revelatory". A twist in the usual bias of looks defining what we hear?


(http://stream1.gifsoup.com/view3/1290449/picard-facepalm-o.gif)
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-07 23:25:07
Thanks pelmazo, your posts remain the most interesting & almost the least ad-hom

If you weren't cherry-picking so much, I could actually appreciate that.

Quote
OK, so you don't care how or where the null results come from because you don't see the null results as evidence of anything but you see the continued lack of positive results as evidence, right?

Yes, that's how I described it.

Quote
I think you fail to see the psychological influence that a body of null results have on listeners - they are less motivated to find a difference if there is already a body of "evidence" that no difference has been heard. This is just human psychology - we want to be right. It is very easy to prevent someone hearing a difference that actually exists by building this expectation.

If it were that easy to deter people from "hearing differences", I would gladly do it.

Unfortunately it's the opposite: The motivation to find differences is all but invincible; audiophiles are obsessed with it. I have to say that you are making your "human psychology" up as you go. Given that my experience is the exact opposite, you'd have to provide quite solid evidence to convince me.

Quote
Someone asked me earlier in the thread, how come once one positive ABX result was reported, that others then appeared - it's because of this psychological factor. The surety was no longer in place that there was no audible difference - someone had let the genie out of the bottle

No, it was something else that was out of the bottle: The idea that the files contained a clue. Again, you are making up your theories as you go.

Quote
ABX tests are designed to eliminate certain types of biases but other biases are ignored or down-played & this is why I ask for controls in ABX testing. The specification for strict test design & controls in the ITU standards is an attempt to deal with these biases. Home administered ABX tests, missing any of these controls, fail in this regard.

I fail to see this as the most prevalent problem. There is a completely different elephant in the room, again: It is the notion that sighted tests are any good at all. If people took the problems of sighted tests seriously, they would be more conscious of potential problems in blind tests, too. Once their mindset is such that they dismiss sighted tests because of their obvious shortcomings, they are in a position to make reasonable conjectures about blind tests.

Quote
Yes, I would say that a "tell" is just another "impairment". So what you are saying is that it's easy to miss an audible impairment if you haven't got any idea whether there is one. Almost guaranteed to miss if you have been told a body of null tests already exists or the test has been in existence for 15 years & no positive results in that time.

You are again making this up as you go. The situation we are usually dealing with is completely different: We are frequently dealing with someone who is convinced to hear a difference, yet can't be brought to show any tangible evidence for it. They are completely unfazed by any amount of evidence to their disadvantage. So whatever psychological effect you have diagnosed here, the audiophiles seem to be immune to it.

Quote
You raise another good point - in the previous 15 years people just didn't pick out any difference because they about training which I agree with - unless the listener has familiarity with the audibility of the impairment (or it is demonstrated to them) , they are unlikely to discern it - hence the ITU recommendation for trained listeners - another missing factor in ABX tests

The factor isn't missing from ABX tests. In fact, even during the test the possibility to compare A with B any number of times, and for any duration, provides a training possibility, because you know that there must be a difference between them, if there is any at all. Often, ABX tests include extra training in advance of the test. Furthermore, you would expect that a person who claims to be able to hear a certain difference, would be trained to a certain extent already. So in practice it is again the opposite of what you are imagining: There frequently is no lack of training.

I would find it downright pretentious to suppose that people who claim to be able to hear certain differences, and claim to have shown that in the past and to be sure about it, as is the norm with audiophile claims, to lack training. If they are sure, who am I to object? On what grounds could I object? It is their perceived difference, if they don't know what they're talking about, who is? So they do the test and fail. We have seen that numerous times. What does that mean? That they should have been encouraged to train more? Perhaps, but it is not for me to tell them. The more immediate conclusion is that their self-confidence was unwarranted, and they should have prepared themselves better. Not my fault and not my responsibility.

In the face of audiophile claims, there is one thing you can't do: Try to nanny their test. That guarantees a brawl. I resolved to give them enough rope to hang themselves. Works admirably well.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-07 23:34:46
That was a nice attempt at explaining jkenny the obvious, but I reckon that it will still go over his head exactly as it did yesterday.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-07 23:43:26
I had a quick google to try to find this, but was hampered by the fact that a certain jkeny has posted about this topic in goodness knows how many audio forums, and google ranks your questions more highly than the answers!

Shh!  You might be on to something.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 00:33:27
Thanks again pelmazo - you put a lot of thought & time into your posts which is refreshingly different to the often knee-jerk reactions of many posters & posts here and I do enjoy reading your posts. It does challenge me rather than attack me (usually).

But you did say in a previous post that the reason listeners didn't hear, in 15 years, the audible issues in the jangling keys files was because they weren't sure there were any differences to find:
Quote
Because they didn't find the tell.

You can either stumble across a tell by pure chance, or you have prior training that makes you look for it consciously. It is very easy to miss a subtle tell if you haven't got any idea whether there is one, and what it might be.


So, in this case ABX tests are being used in a different challenge to the usual challenge to audiophiles that you now refer to. The usual audiophile challenge being "well you claim you can hear this particular aspect, now let's see". The jangling keys files you say were not discerned as different because they didn't find the tell but many say that the one "tell" is obvious - a click when one file is switched to but not another. How could this not be discernible?

As you say A/B switching (training possibility) is available to the listener
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 00:42:25
Thanks again pelmazo - you put a lot of thought & time into your posts which is refreshingly different to the often knee-jerk reactions of many posters & posts here and I do enjoy reading them. It does challenge me rather than attack me (usually).

He hasn't been here that long and likely has more patience with such obtuseness.  Also, he might not be as jaded as other members, some of whom think you can't possibly be this stupid and are instead being this way intentionally.

...for your own amusement (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108127&view=findpost&p=888790), perhaps?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 00:48:30
Thanks again pelmazo - you put a lot of thought & time into your posts which is refreshingly different to the often knee-jerk reactions of many posters & posts here and I do enjoy reading them. It does challenge me rather than attack me (usually).

He hasn't been here that long and likely has more patience with such obtuseness.  Also, he might not be as jaded as other members, some of whom think you can't possibly be this stupid and are instead being this way intentionally.

How come all your posts to me are ad-hom in nature, greynol & usually start others to follow suit?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 00:50:37
Be sure you review your first post on this forum.  You do little more than mire down useful discussions and are batting 1000 on this count.

You're evading pertinent questions, likely because they would help to put an end to your obfuscatory side-show, choosing to instead whine about being abused.

Classic.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 00:53:16
As usual John, you've got the cart before the horse. The reason why your potential customers are limited to people who are not interested in ABX tests is because there are no ABX tests that verify most of your claims.

It seems logical that one or more positive ABX test(s) involving your equipment would vastly increase the size of the market for your products.
Wow, marketing advice - thanks - what's your expertise in marketing - the ABX comparator? I think I might just pass on this advice if you don't mind?


I did a marketing study of the ABX Comparator at the time and projected a  tiny market.


Quote
Quote
It seems logical that degrading public opinion in ABX tests would similarly vastly increase the size of the market for your products. It would overcome a major objection to your equipment.
Other than the biased view here, I haven't seen any major or otherwise objection to my equipment - Oh, I tell a lie, one reviewer (I think he only ever did one review) for Stereomojo said he hated the looks & design but "how does it sound? Revelatory". A twist in the usual bias of looks defining what we hear?


I'll bet your critical thinking is so weak John that you don't even realize that your response was totally irrelevant and unresponsive to my comment.


A friend once taught me that I should not presume malevolence when simple incompetence would suffice as an explanation.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 00:55:12
I should consult your friend.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 00:59:04
Quote
He screamed like a stuck pig when they first came out, and why did he do that if his equipment all passed?
Because you used ultrasonic test tones which were close to 0dB & not anywhere near the lower level of ultrasonics that were in your jangling keys recording.


Just a friendly reminder that I disproved all that here:

Post showing that the peak levels of the keys jangling and test tones are similar (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894979)

(second notice)

and I showed that nearly infinitesimal errors in reproducing the keys jangling sound are highly audible here:

Posting showing that even microscopic errors can be highly audible (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895047)
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-08 01:02:04
I've defined what a false negative is over & over again in this & in the other closed thread. It's amazing how much ducking & diving you guys do to avoid logic.
It's entertaining to see the same poster cropping up every now & then to ask for a definition of false negatives he has already been given before & whose definition is available on-line for anybody to look up. But like the continual demands for another example or another test or another whatever, it is designed to wear down the opposition.

Here (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894932) you said "a listener not hearing differences when they are actually present". What do you mean by present? Present in measurements? What's your standard for defining what difference should be actually audible for somebody?
Here (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894987) you post more nonsense.

This is why I asked you to define these (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&st=350&p=894992&#entry894992) things as clearly as possible.


Could you please stop evading jkeny? Even when I ask questions directly about your "problems" you evade.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 01:17:16
I should consult your friend.



For example I don't think that this response: "how does it sound? Revelatory" can mean just about anything the person hearing it wants it to.

For example, the product could be Revelatory of its maker's technical incompetence and be honestly described by the reviewer.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 02:29:06
How come all your posts to me are ad-hom in nature, greynol & usually start others to follow suit?

Why have you been evading free therapy (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-7.html#post2266482) for 5 years?
Notice how nice guy SAM's approach...got exactly the same results as the "mean" folks - complete evasion?
Told ya he was new. 

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 02:37:23
No offence taken but I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes. No offence but I already have given this proof with links to the reviews. Have you read any? Anecdotal, yes & not worth a damn according to those who have never heard the unit - say this to one of the people who have a unit & they will laugh at your stupidity in requiring DBT.

John, if you think it's "stupidity" to require a DBT (like ABX), why are you concerned whether those ABX results may yield "False" negatives?

Reference (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-14.html#post2267178)

That is what you said, correct?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 02:45:23
If that's how he feels, why is he even here other than to troll?

Consult his first post (EDIT: on this forum or this discussion, though the latter ended up in the recycle bin*) to glean possible additional insight.

(*) let that be your first clue.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 03:03:18
If that's how he feels, why is he even here other than to troll?

Masochism? Therapy? Who knows with such minds.

Consult his first post to glean possible additional insight.

This thread or at site?

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 03:15:01
No offence taken but I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes. No offence but I already have given this proof with links to the reviews. Have you read any? Anecdotal, yes & not worth a damn according to those who have never heard the unit - say this to one of the people who have a unit & they will laugh at your stupidity in requiring DBT.

John, if you think it's "stupidity" to require a DBT (like ABX), why are you concerned whether those ABX results may yield "False" negatives?

Reference (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-14.html#post2267178)

That is what you said, correct?


"No offence taken but I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes."

Classic placebophile.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 09:52:55
Thanks again pelmazo - you put a lot of thought & time into your posts which is refreshingly different to the often knee-jerk reactions of many posters & posts here and I do enjoy reading your posts. It does challenge me rather than attack me (usually).

I'm certainly not putting in the time for your sake. It was clear from the start that it would be wasted on you. I'm not a beginner in audiophile debates. I know the signs of the typical debate tactics - they're not hard to identify. My goal is to make as clear as possible both the technical issues and the personality issues to everyone following the debate, if they are open-minded enough to think about that.

I'm glad I have succeeded in challenging you. That means my strategy is working. You are consistently trying to dodge the critical points. If you can't outright ignore them, despite trying hard, you attempt to steer the discussion in a different direction, even if that direction involves flattering me. You are like a fleeing rabbit, trying to maintain the illusion that you've got everything under control. The more obvious that is to everyone, except a few lost ones, the better. That's what rewards me.

Quote
How could this not be discernible?

This question just shows that you haven't understood my argument at all, and just repeat the already answered point. I'm not going to answer it again, I will just note that this is one of the more obvious signs of you being at the end of your wits.

You ARE deluded, but of course you are not able to see that yourself. You would first have to snap out of it. That's not likely to happen when you have hung your business on it.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 10:16:59
If that's how he feels, why is he even here other than to troll?


Let's put it this way. If he were working his @$$ off getting product out to customers...  ;-)

...and if he really had a serious interest in Science and finding the truth, he was handed a good road map for finding it here: Good road map by Xnor for finding truth about thread topic issue (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&view=findpost&p=895074).
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 13:00:40
Ah, pelmazo, it's a shame that you reverted to type.

I am also posting for the open-minded reasonable reader who might come across this thread

The definition of a false negative:
"a type II error is the failure to reject a false null hypothesis (a "false negative"). More simply stated, a type I error is detecting an effect that is not present, while a type II error is failing to detect an effect that is present."
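To put numbers on the two error types in the quoted definition, here is a minimal sketch for a single ABX run. It assumes the usual null hypothesis that the listener is guessing (50% per trial) and models a listener who really does hear something as being right with a fixed probability on each trial. The 16 trials, the 12-correct criterion, the 0.7 hit rate and the function names are all illustrative assumptions, not figures from this thread, and this covers only the statistical sense of "false negative", not the wider senses argued over here.

```python
from math import comb

def abx_p_value(correct, trials):
    """Type I side: chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def miss_rate(trials, needed, p_true):
    """Type II side: chance that a listener who is right with probability
    `p_true` per trial still scores fewer than `needed` correct."""
    return sum(
        comb(trials, k) * p_true ** k * (1 - p_true) ** (trials - k)
        for k in range(needed)
    )

# Hypothetical run: 16 trials, "pass" means 12 or more correct.
print(f"guessing passes with probability {abx_p_value(12, 16):.4f}")
print(f"a 70%-per-trial listener fails with probability {miss_rate(16, 12, 0.7):.4f}")
```

The first number is fixed by the pass criterion alone. The second depends entirely on the assumed per-trial hit rate, which nobody knows in advance.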

So what you guys won't admit is that all of the null results for Arny's jangling keys test in the 15 years of its existence were false negatives - the files were flawed & audibly different but no one picked that up.

It doesn't really matter what was the reason for this, does it - the audible difference was not picked up i.e false negative


The same applies to the Swedish Radio Example - where 60 expert listeners didn't hear the flaw in the codec over 2 years & 20,000 tests but Bart Locanthi did.

In both these cases we have glaring examples of audible impairments that are present but not detected in blind testing i.e. false negatives.


These examples show that there is a strong tendency towards false negative results i.e not hearing an audible difference that is present.

I'm interested in knowing how strong this tendency is in ABX tests & tried to suggest some ways of moving towards this but you all reject this effort
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 13:27:36
Ah, pelmazo, it's a shame that you reverted to type.

I am also posting for the open-minded reasonable reader who might come across this thread

You are anything but open-minded. That you think you are is part of your delusion. It is easy for other people to see how you are dodging the critical points.

Quote
The definition of a false negative:
"a type II error is the failure to reject a false null hypothesis (a "false negative"). More simply stated, a type I error is detecting an effect that is not present, while a type II error is failing to detect an effect that is present."

So what you guys won't admit is that all of the null results for Arny's jangling keys test in the 15 years of its existence were false negatives - the files were flawed & audibly different but no one picked that up.

It doesn't really matter what was the reason for this, does it - the audible difference was not picked up i.e false negative

We don't need lecturing what "false negative" means. The "false negatives" in Arny's example are normal. As long as nobody knows what to search for, the flaw is easily missed. It has nothing to do with blind or non-blind, it has something to do with training. Had Arny provided guidance as to what kind of flaw to search for, I'm sure many people would have found it by listening blind.

Quote
The same applies to the Swedish Radio Example - where 60 expert listeners didn't hear the flaw in the codec over 2 years & 20,000 tests but Bart Locanthi did.

In both these cases we have glaring examples of audible impairments that are present but not detected in blind testing i.e. false negatives.

They weren't detected in non-blind testing, either, for pretty much the same reason as with Arny's test.

Quote
These examples show that there is a strong tendency towards false negative results i.e not hearing an audible difference that is present.

Yes, this tendency exists in general, no matter whether you are listening blind or non-blind. It is a tendency to miss something subtle or not so subtle that you're not expecting. It is often surprising how large an effect can go unnoticed if people aren't expecting it. This is a common topic in perception. It is being taken advantage of by a lot of people including yourself, when they try actively to steer away the focus of people from one point towards another. Given how hard you are trying to exploit this here, it is very hypocritical indeed to pretend you were unaware of it.

Quote
I'm interested in knowing how strong this tendency is in ABX tests & tried to suggest some ways of moving towards this but you all reject this effort

Sure, because it has been described to you many times how unimportant this is in the face of much bigger errors in sighted tests, and given that false negatives are so inconsequential in an ABX test. We evidently can't get you to turn your attention towards the important factors, but you shouldn't be surprised that we are reluctant to waste our own effort on it. Especially since it is clear that your interest in this does not derive from genuine scientific curiosity, but from a pigheaded attempt to distract your and our attention away from the most important points.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 13:50:11
Quote
I'm interested in knowing how strong this tendency is in ABX tests & tried to suggest some ways of moving towards this but you all reject this effort

Sure, because it has been described to you many times how unimportant this is in the face of much bigger errors in sighted tests, and given that false negatives are so inconsequential in an ABX test. We evidently can't get you to turn your attention towards the important factors, but you shouldn't be surprised that we are reluctant to waste our own effort on it. Especially since it is clear that your interest in this does not derive from genuine scientific curiosity, but from a pigheaded attempt to distract your and our attention away from the most important points.

You make the claim I have highlighted in bold without any evidence to back it up. What I'm suggesting is that you include controls in the ABX tests that will provide the evidence for your claim. Until then, it's just a claim which I dispute. It is well known in the literature about blind testing that the more you control for false positives the more prone to false negatives you make the test. It's surprising that this is ignored here.
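The trade-off mentioned above can at least be shown in its purely statistical form. The sketch below assumes a fixed number of trials and a listener with a fixed per-trial hit rate, then tightens the pass criterion step by step. The values (16 trials, 0.7 hit rate) and the function names are hypothetical, and nothing here speaks to the procedural or psychological factors also being argued about.

```python
from math import comb

def tail(trials, needed, p):
    """P(at least `needed` correct out of `trials`) when each trial is right with probability p."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(needed, trials + 1)
    )

def tradeoff(trials=16, p_true=0.7):
    """For each pass criterion, return (needed, alpha, beta):
    alpha = false positive rate for a guesser, beta = false negative rate
    for a listener with the assumed hit rate p_true."""
    rows = []
    for needed in range(trials // 2 + 1, trials + 1):
        alpha = tail(trials, needed, 0.5)
        beta = 1.0 - tail(trials, needed, p_true)
        rows.append((needed, alpha, beta))
    return rows

for needed, alpha, beta in tradeoff():
    print(f"{needed:2d}/16 correct needed  alpha={alpha:.4f}  beta={beta:.4f}")
```

With the trial count held fixed, every step that lowers alpha raises beta. Adding trials is the standard way to reduce both at once.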

But I'm not just talking about the statistics, I'm talking about the nature of such tests & the tendency towards false negatives which is only really dealt with by formal scientific testing by experts in the field. The message here is that all tests are acceptable because "false negatives are so inconsequential in an ABX test" - I claim this to be rubbish & politician's speak.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 14:01:54
Ah, pelmazo, it's a shame that you reverted to type.

I am also posting for the open-minded reasonable reader who might come across this thread

The definition of a false negative:
"a type II error is the failure to reject a false null hypothesis (a "false negative"). More simply stated, a type I error is detecting an effect that is not present, while a type II error is failing to detect an effect that is present."

So what you guys won't admit is that all of the null results for Arny's jangling keys test in the 15 years of its existence were false negatives - the files were flawed & audibly different but no one picked that up.


That is incorrect.  There were no problems with the www.pcabx.com files that were audible to people who used  good monitoring equipment. The first potential problems came to light when people with lower quality monitoring equipment started reporting positive results. 

The former distribution of the files on the PCABX web site made some specific recommendations about monitoring equipment starting in Y2K:

"
What Makes A Good Sound System For PCABX?

(Minimum) Sound Blaster Live! or "Audigy" (better) sound card and Monsoon 1000 speakers, or equal or better competitive products from other manufacturers.
Headphones are a fine alternative to loudspeakers. PCABX's technical staff recommends the use of quality headphones such as the Sennheiser HD-580,  Sony MDR 7506 headphones, or Etymotic ER-4B earphones.

(Adequate or Better Analog) Midiman "Audiophile 24/96", Echo "Mia", or Turtle Beach "Santa Cruz" sound card; good Pioneer, Kenwood,  equivalent or finer stereo receiver or separate components and speakers equal or better to the KEF Q15's, NHT S1's, or competitive speakers from Polk, Paradigm or PSB and other manufacturers of better loudspeakers. Operate in pure stereo mode (no surround).

(Adequate Or Better Digital) Midiman DIO 2448,  "Audiophile 24/96", Echo "Mia", or Turtle Beach "Santa Cruz" sound card; good Pioneer, Kenwood,  equivalent or finer stereo receiver or separate components with digital input (i.e., a Dolby Digital receiver) and speakers equal or better to the  KEF Q15's, NHT S1's, or competitive speakers from Polk, Paradigm or PSB and other manufacturers of better loudspeakers. Operate in pure stereo mode (no surround).

(Professional) Midiman Delta Series, later-production Echo Layla24 or Gina24; DAL CardD Deluxe, LynxONE or LynxTWO  sound card, self-powered professional  studio monitors like those from Mackie, Event, Vergence/NHT Pro, JBL, etc.
"

It is quite clear that these recommendations were not followed in most if not all of the tests that were run by AVS participants.  Many of them used generic laptop on-board audio interfaces which are questionable.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 14:19:41
You make the claim I have highlighted in bold without any evidence to back it up. What I'm suggesting is that you include controls in the ABX tests that will provide the evidence for your claim. Until then, it's just a claim which I dispute. It is well known in the literature about blind testing that the more you control for false positives the more prone to false negatives you make the test. It's surprising that this is ignored here.

Since my claim is about the interpretation of ABX test results, it can't be backed up by changing the ABX test method. It is evidenced by the conclusions drawn from such tests. I already clearly stated my own interpretation, and there it is very clear that false negatives have no effect at all. A lot of other people draw their conclusions in similar ways.

Even if other people should derive wrong conclusions from false negatives, it would be downright idiotic to try to combat that by changing the number of false negatives. It would be treating the symptom instead of the cause. If people draw wrong conclusions, educate them how to draw the right ones. Don't manipulate the facts to make them draw the conclusion you find preferable.

But I dispute that people are drawing such wrong conclusions because of false negatives. People aren't quite as stupid as you think. Take away all false negatives, and the fact would still be there that audiophiles don't manage to produce true positives to back up their often hideous claims. That is where the whole thing crumbles. The conclusion would still be that the claims are bunk, because it does not rely on the false negatives.

Finally, who are you to expect that others are doing the work for you? If you think you can do better ABX tests, then do it, demonstrate it, and show in which way it is an improvement. You don't need us for that. If you managed to produce something worth looking at, come back and we can discuss something real. Until then, kindly refrain from pestering us with something we consider to be a genuine waste of time.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-08 14:23:21
jkeny, still evading after several pages. Still bringing up the same nonsense. Still not interested in having an honest discussion. Still just slinging the same old mud.

It starts to get really tedious.


People aren't quite as stupid as you think.

I don't think that he does.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 14:30:30
Quote
I'm interested in knowing how strong this tendency is in ABX tests & tried to suggest some ways of moving towards this but you all reject this effort

Sure, because it has been described to you many times how unimportant this is in the face of much bigger errors in sighted tests, and given that false negatives are so inconsequential in an ABX test. We evidently can't get you to turn your attention towards the important factors, but you shouldn't be surprised that we are reluctant to waste our own effort on it. Especially since it is clear that your interest in this does not derive from genuine scientific curiosity, but from a pigheaded attempt to distract your and our attention away from the most important points.

You make the claim I have highlighted in bold without any evidence to back it up.


If making claims without evidence is such a bad thing why do you do so incessantly, Mr. Kenny?

It is clear that you don't even know what a false negative is. We see the little boy who cries wolf but who has no reliable evidence that there have been any false negatives at all. Evidence that there were false negatives could take the form of actual reliable positives, but you seem to have none of those. Evidence that there were false negatives could take the form of psychoacoustic analyses that showed the existence of artifacts that were above the thresholds of hearing, but you seem to have none of those, either. Instead, we get religious testimonies and statements of faith in known highly unreliable evidence sources.

Quote
What I'm suggesting is that you include controls in the ABX tests that will provide the evidence for your claim.


The controls are already there, but you have dismissed them with no well-reasoned explanation or evidence. Your evidence has been disproven with fact, not merely disputed.

Quote
Until then, it's just a claim which I dispute. It is well known in the literature about blind testing that the more you control for false positives the more prone to false negatives you make the test.



Mr. Kenny, AFAIK you have never properly documented that false speculation, either. As things stand you merely assert problems, and your only evidence is assertions that have been proven false or are at best speculative.

Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 14:52:36
I am also posting for the open-minded reasonable reader who might come across this thread


Anyway, the point being that I wanted to hear what people thought of the idea that all that we hear can be measured? I don't believe so but I have an open mind.

One would be stupid or in denial to reject a significant body of people & their agreement on what they hear!

I & others have done numerous side by side comparisons of stock unit to modified unit - were they DBT, no - there is absolutely no need, the difference in sound is so glaringly obvious & noticeable from the first couple of notes. No offence but I already have given this proof with links to the reviews. Have you read any? Anecdotal, yes & not worth a damn according to those who have never heard the unit - say this to one of the people who have a unit & they will laugh at your stupidity in requiring DBT.

I get better, more comprehensive results with longer term (peeking) listening.


Dunning-Kruger effect (http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect).

Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-08 15:16:53
Let me give you a definition of a false negative, which jkeny could have done 100 times in the time he posted all that noise:
A false negative is an actual 'hit' that was disregarded by the test and seen as a 'miss': the test result indicates that a condition failed, while it actually was successful.

Simple example of a false negative: A pregnancy test is used properly (which is not as simple as just pushing a button) by a pregnant woman and it shows a negative result.
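
In ABX terms, the definition above can be put into numbers with a little binomial arithmetic. What follows is only a minimal sketch under stated assumptions: a 16-trial session, a 5% significance criterion, and a listener who gets each trial right with some fixed probability. None of these figures come from this thread, and the function names are made up for illustration.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_score(n, alpha=0.05):
    """Smallest number of correct answers whose pure-guessing p-value is <= alpha.

    Returns None if even a perfect score cannot reach alpha (too few trials).
    """
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None

def miss_probability(n, p_true, alpha=0.05):
    """Chance that a listener with per-trial hit rate p_true fails the session.

    For p_true > 0.5 this is the false negative (type II error) rate of the session.
    """
    k = critical_score(n, alpha)
    if k is None:
        return 1.0
    return 1.0 - binom_tail(n, k, p_true)

# Assumed example: 16 trials, 5% criterion.
print(critical_score(16))                 # 12 correct out of 16 are needed
print(round(binom_tail(16, 12, 0.5), 4))  # 0.0384, the guessing p-value at 12/16
for p_true in (0.6, 0.7, 0.9):            # hypothetical listener hit rates
    print(p_true, miss_probability(16, p_true))
```

A listener who hears the difference only some of the time (hit rate not far above 0.5) can fail a short session fairly often, while a listener who hears it reliably almost never does. A listener who drops the test into the toilet is outside what this arithmetic describes.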


One of jkeny's objections was people not actually listening, not giving their best, being distracted ...
1) That's like blaming a pregnancy test for showing 'negative' when a pregnant woman drops the test into the toilet.

2) As has been pointed out 100 times by now, that is why we usually let the people making the claim produce the evidence. The burden of proof is on the one making the claim. That's philosophy 101.
Same situation as above.
Woman: "I am pregnant, I can feel it and see it." (this literally 'sighted test' is the claim, not evidence, and she is interested in demonstrating that claim to be true)
Me: "I don't believe you. Do a test."
Woman: Uses the test incorrectly, then complains that the test itself is flawed.

Neither 1 nor 2 is a false negative.
That's why I asked jkeny countless times to define his terms clearly, for example what he means by "a difference is present", etc. but I guess we'll never know since he is not interested in an honest discussion.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-08 15:41:11
I am also posting for the open-minded reasonable reader who might come across this thread

You don't even seem to know what 'open-minded' means.
Watch: Open-mindedness (https://www.youtube.com/watch?v=T69TOuqaqXI) (YouTube video)


Quote
One would be stupid or in denial to reject a significant body of people & their agreement on what they hear!

No, one would be gullible by accepting these anecdotes without evidence. By definition.
And one would also be extremely closed-minded by not accepting the 'possibility' that many of these heard differences are a product of the brain (see biases) and not due to actual audible differences. In fact, this stance should be your default position, because it is.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 15:54:51
You make the claim I have highlighted in bold without any evidence to back it up. What I'm suggesting is that you include controls in the ABX tests that will provide the evidence for your claim. Until then, it's just a claim which I dispute. It is well known in the literature about blind testing that the more you control for false positives the more prone to false negatives you make the test. It's surprising that this is ignored here.

Since my claim is about the interpretation of ABX test results, it can't be backed up by changing the ABX test method. It is evidenced by the conclusions drawn from such tests. I already clearly stated my own interpretation, and there it is very clear that false negatives have no effect at all. A lot of other people draw their conclusions in similar ways.
Conclusions based on no known evaluation of the level of false negatives in ABX tests - just your opinion. I have given you evidence of two ABX tests run over a long time (2 yrs & 15 yrs) in which false negatives were the result. You say this is just training & could happen in sighted testing too - of course it could. So you now agree that the null results over 2yrs (20,000 tests) & 15 years are false negatives - this at least is progress. So let's see the level of false negatives in ABX tests & then we can begin to analyse what percentage are a result of lack of training & what percentage are due to other causes.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 16:01:33
Quote
One would be stupid or in denial to reject a significant body of people & their agreement on what they hear!

No, one would be gullible by accepting these anecdotes without evidence. By definition.
And one would also be extremely closed-minded by not accepting the 'possibility' that many of these heard differences are a product of the brain (see biases) and not due to actual audible differences. In fact, this stance should be your default position, because it is.

I don't know who you quoted but it wasn't me. But, here's the rub - you claim you are testing "actual audible differences" & yet you are using a test which you have no knowledge of the degree to which it suppresses ACTUAL audible differences.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 16:10:53
A degree which could be no more than zero, but as has been shown this is irrelevant.

You ignored another post again, BTW. It was just above the one you replied to; I don't see how you could have missed it.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 16:17:39
That is incorrect.  There were no problems with the www.pcabx.com files that were audible to people who used  good monitoring equipment. The first potential problems came to light when people with lower quality monitoring equipment started reporting positive results.

Wrong. In the rush to deny Amir was actually hearing a difference between high-res & RB your original files were analysed & found to have a timing offset between them which was advanced as the reason for Amir hearing differences - timing differences between the files are a known issue in ABX which you well know. There may have also been a noise level difference between the two files which was also claimed to be a tell.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 16:23:09
A degree which could be no more than zero,
Opinion again, based on zero evidence
Quote
but as has been shown this is irrelevant.
No, it hasn't been shown to be irrelevant - quite the opposite - the statement was made that the longer the timespan when no positive results are reported, the more convincing that there are no audible differences. So using a test which is skewed towards suppressing actual audible differences (false negatives) is VERY RELEVANT.

Quote
You ignored another post again, BTW. It was just above the one you replied to; I don't see how you could have missed it.
Did I? Is this an interrogation?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 16:36:51
That only serves as more evidence demonstrating a lack of intellectual honesty on jkeny's part.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 16:41:43
That's why I asked jkeny countless times to define his terms clearly, for example what he means by "a difference is present", etc. but I guess we'll never know since he is not interested in an honest discussion.

Your examples are laughable & not worth responding to except to say that they show your desperation

I have consistently given my understanding of a false negative but it doesn't get past your barrier of denial

"a difference is present" should be extended to "an audible difference is present" & this audible difference is not discerned. I have given you two real examples of blind tests that show there was an audible difference which went unreported for 2yrs & 20,000 tests in one case & 15 years & unknown number of tests in Arny's jangling keys test. Do you wish to deny this?

Making excuses for why they were unnoticed is not relevant to the ABX statistic I'm interested in - its proneness towards suppressing the discernment of actually audible differences - the false negative measure.

It really doesn't matter if the reason for the false negative is/was:
- lack of training
- lack of suitable playback equipment
- improper conditions of the test environment
- psychological stress
- fatigue or loss of focus
- etc.

What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

I've no problem with doing the converse on sighted listening tests, i.e. getting an actual measure of how prone a sighted test run is to false positives
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 17:10:27
Conclusions based on no known evaluation of the level of false negatives in ABX tests - just your opinion. I have given you evidence of two ABX tests run over a long time (2 yrs & 15 yrs) in which false negatives were the result. You say this is just training & could happen in sighted testing too - of course it could. So you now agree that the null results over 2yrs (20,000 tests) & 15 years are false negatives - this at least is progress. So let's see the level of false negatives in ABX tests & then we can begin to analyse what percentage are a result of lack of training & what percentage are due to other causes.

My conclusions don't rely on the level of false negatives! If my conclusions don't rely on the weather, do I still need to evaluate the weather to get your ok to draw a conclusion?

You are completely out of your wits, therefore you just try to force your way through with ignorance and mindless repetition. This is also evidenced by your continued suggestion that "we" look at the level of false negatives. Haven't I made it clear enough that you will have to do that without us?

And, by the way, when I was writing my last posting, it did cross my mind that I could be seen as going along with your terminology of a false negative. I briefly paused to think if I should put the term in apostrophes, to make clear that I'm not sharing your view. I knew it would be tempting for you to take this as my tacit agreement with your usage of the term, so I decided to check if I was right, and left it as it was. You confirmed my suspicion with unfailing consequence. You are so predictable, it is chilling.

Quote
Making excuses for why they were unnoticed is not relevant to the ABX statistic I'm interested in - its proneness towards suppressing the discernment of actually audible differences - the false negative measure.

Apart from the fact that nobody else is interested in the statistic you are interested in, for the obvious reason of it being irrelevant, you are showing your anti-ABX bias and agenda again. You claim without evidence that discerning actual audible differences is being suppressed by the ABX test method. That's quite a lot stronger than saying that audible differences may go unnoticed. The examples you offer certainly do not show that ABX testing suppresses anything like that.

Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.

Now that we have made abundantly clear that you will get no help at all from us for what you describe as being your interest, why don't you f*** off and do the work yourself? We won't hold you back from wasting your time on it, so why do you insist on wasting ours?
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 17:21:15
I don't know who you quoted but it wasn't me. But, here's the rub - you claim you are testing "actual audible differences" & yet you are using a test which you have no knowledge of the degree to which it suppresses ACTUAL audible differences.


That would be yet another unsupported false claim.

There have been numerous efforts using ABX tests to duplicate the measurements of various thresholds of hearing various sounds and artifacts which are documented in the annals of Science. The Threshold of Hearing per the Fletcher Munson curves, being one example. Spot checks suggest that ABX tests produce thresholds that are if anything a little lower.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 17:29:13
I've no problem with doing the converse on sighted listening tests i.e getting an actual measure of how prone a sighted test run is to false positives

How do you propose to do that?

John, we understand you lack the cognitive ability to grasp that we see right through the facade, but there is no way you are going to convince sane people that your sighted daydreams are the equivalent of ABX due to the possibility of those "false negatives" you've conjured up.
I know you can't comprehend this, but there it is. Your statements about the "stupidity" of those asking for controlled (blind) tests on those who "hear" "organic" DAC sounds, Santa Claus, deceased relatives, etc, in large numbers, reflect rather poorly on your position. Rational people are going to ask for evidence for exceptional claims.
"Organic" DAC "sound" that gets to the brain via means other than analog signals/soundwaves that are measurable, qualifies.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 17:32:04
Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not

Quote
Now that we have made abundantly clear that you will get no help at all from us for what you describe as being your interest, why don't you f*** off and do the work yourself? We won't hold you back from wasting your time on it, so why do you insist on wasting ours?

why don't you f*** off - yes that has been the underlying sentiment since the start - it's good to see it out in the open.
BTW, if you don't want to waste your time, just don't post & ignore this thread. I love the illogicality of this & see it also in people who complain about a TV programme as if someone was forcing them to view it.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 17:39:30
You claim without evidence that discerning actual audible differences is being suppressed by the ABX test method. That's quite a lot stronger than saying that audible differences may go unnoticed. The examples you offer certainly do not show that ABX testing suppresses anything like that.

Quoted for truth.

Quote
you f*** off and do the work yourself?

It's unfortunate that you gave him a distraction from what I quoted above.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 17:43:14
In other words how useful is the test?


Given all of the different Software-based ABX Comparators that have gone into widespread distribution in the past 15 years since I introduced PCABX...


Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 17:44:11
A degree which could be no more than zero,

Opinion again, based on zero evidence

The operative was "could."

In the meantime, you insist the degree was greater than zero with what is a strong likelihood of false positives as your evidence.  You'll have to do better than that.

Repeating refuted claims without providing real evidence  = intellectual dishonesty.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 17:45:16
I don't know who you quoted but it wasn't me. But, here's the rub - you claim you are testing "actual audible differences" & yet you are using a test which you have no knowledge of the degree to which it suppresses ACTUAL audible differences.


That would be yet another unsupported false claim.

There have been numerous efforts using ABX tests to duplicate the measurements of various thresholds of hearing various sounds and artifacts which are documented in the annals of Science. The Threshold of Hearing per the Fletcher Munson curves, being one example. Spot checks suggest that ABX tests produce thresholds that are if anything a little lower.

Yes, with well-controlled, normally lab run formal ABX tests that use people who have been trained in what to listen for & run by people with expertise in what they are doing - there is a good level of discernment & sensitivity.

Home-run ABX tests fall well short of these conditions by a large amount & hence are prone to a high level of false negatives as evidenced in your 15 years of null results with jangling keys files.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 17:46:53
Repeating refuted claims without providing real evidence = intellectual dishonesty.

BTW, peaches, you were already called out on a lack of understanding of type-ii errors.  Why did you avoid acknowledging it?
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 17:48:31
A degree which could be no more than zero,

Opinion again, based on zero evidence

The operative was "could."
Yes, supposition & opinion based on no evidence presented

Quote
In the meantime, you insist the degree was greater than zero with what is a strong likelihood of false positives as your evidence.  You'll have to do better than that.

Repeating refuted claims without providing real evidence  = intellectual dishonesty.

Arny's 15 year jangling keys test is plenty of evidence that such ABX tests deliver false negatives & you dispute this?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 17:49:30
I just did, yes.  Please go back and read the previous post I made when you were busy projecting my lack of evidence.

You seem to be losing your cool.  Everything OK over there?  You must have better things to do than get disabused over something you don't even care about.

Maybe you should seek solace listening to one of your organic DACs.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 17:55:46
Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not


I'm happy to stipulate that the above is true. 

ABX's utility for the purpose stated above is a well-known fact.

For example, in this paper authored by, among others, Robert Stuart of Meridian, that well-known producer of high-end DACs, AES Convention Paper 9174,
"The audibility of typical digital audio filters in a high-fidelity playback system":

"...ABX tests are viewed as the gold standard for objective measures of listening."

The corollary to the statement above would appear to be:
Any person who does not realize that a test which has as good a reputation for demonstrating subtle but audible differences (true positives) as ABX is very useful as a tool to reveal what is audible Vs what is not, must be insane.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 17:59:30
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not

Any sane person realizes that ABX tests do not have a preponderance of suppressing audible differences. This is purely your own fabrication, and the examples given do not support it.

If anything, the opposite holds true: Sighted tests have been credibly described* as being less sensitive than blind tests, because the clues which are inevitably there in a sighted test are likely to distract the listener away from the audible differences. That would be much more fittingly described as "suppression". Blind testing hence tends to remove such "suppression" by facilitating the listener's concentration on the actual differences.

You are barking louder, but it is still the wrong tree.

* "The psychological biases in the sighted tests were sufficiently strong that listeners were largely unresponsive to real changes in sound quality [...]. In other words, if you want to obtain an accurate and reliable measure of how the audio product truly sounds, the listening test must be done blind." (Sean Olive in a text that you have abused yourself a while ago)
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 18:07:31
Sean Olive in a text that you have abused yourself a while ago

I was just going to point out this irony, but you beat me to it.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 18:08:01
You claim without evidence that discerning actual audible differences is being suppressed by the ABX test method. That's quite a lot stronger than saying that audible differences may go unnoticed. The examples you offer certainly do not show that ABX testing suppresses anything like that.

Quoted for truth.


Only in your mind is the phrase "discerning audible differences is being suppressed" stronger than "audible difference may go unnoticed". Both the Swedish Radio tests & Arny's jangling keys test produced results in which "audible differences may go unnoticed" - when this occurs over extended time frames (2 yrs & 15 years) I consider that something is responsible for suppressing the identification of differences. In the case of the Swedish Radio files the artifact turned out to be an audible idle tone at 1.5 kHz.

So if a test, which is being used to examine "actual audible differences" as stated by xnor delivers results which include instances where "audible differences may go unnoticed" - it's not a very reliable or useful test, IMO.

If there is another agenda then state what your agenda is.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 18:14:07
Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not


I'm happy to stipulate that the above is true. 

ABX's utility for the purpose stated above is a well-known fact.

For example, in this paper authored by, among others, Robert Stuart of Meridian, that well-known producer of high-end DACs, AES Convention Paper 9174,
"The audibility of typical digital audio filters in a high-fidelity playback system":

"...ABX tests are viewed as the gold standard for objective measures of listening."

The corollary to the statement above would appear to be:
Any person who does not realize that a test which has as good a reputation for demonstrating subtle but audible differences (true positives) as ABX is very useful as a tool to reveal what is audible Vs what is not, must be insane.

Yes, Arny, as I said already, ABX tests that are run by people who know what they are doing are a lot different to home-run ABX tests. As I said before, perceptual testing should be confined to those who know what they are doing - otherwise you need to start using internal controls which all here patently refuse to do
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 18:14:37
My agenda is to identify intellectual dishonesty in order to help move the discussion forward.

Belligerently flogging the same impotent argument over and over is a perfect example of this.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 18:17:57
Yes, Arny, as I said already, ABX tests that are run by people who know what they are doing are a lot different to home-run ABX tests. As I said before, perceptual testing should be confined to those who know what they are doing - otherwise you need to start using internal controls which all here patently refuse to do

You have yet to demonstrate a need to alter a perfectly functional test protocol. In the meantime there exist other tests that have what you're asking for, which are more than suitable to demonstrate whether your DAC sounds organic and which don't include playing peekaboo. Your next assignment will be to yammer on through 10 pages of obfuscation in a futile attempt to discredit each of them as well.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 18:30:18
That is incorrect.  There were no problems with the www.pcabx.com files that were audible to people who used  good monitoring equipment. The first potential problems came to light when people with lower quality monitoring equipment started reporting positive results.

Wrong. In the rush to deny Amir was actually hearing a difference between high-res & RB your original files were analysed & found to have a timing offset between them which was advanced as the reason for Amir hearing differences - timing differences between the files are a known issue in ABX which you well know.


As usual, you're confusing speculation with facts.

I just looked at the two files.  You can do so too, here: Link to picture post in HA Uploads Forum (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895182)

The timing offset was on the order of 13 samples @ 96 kHz or about 150 microseconds or 0.15 milliseconds. It was inherent in the downsampling of the file to 44/16. I know of no psychoacoustic evidence that a delay this short is audible in an AB comparison of two different audio files.

Rule of thumb is that the time delay during switch over masks timing differences between the files until that delay is at least a millisecond or more. At 10 milliseconds it may be heard as a slight echo if the switching is nearly instantaneous.
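
As a quick check of the arithmetic above, a sample offset converts to time by dividing by the sample rate. This is only a minimal sketch, and the function name is made up for illustration. 13 samples at 96 kHz works out to roughly 135 microseconds, which is the same order as the figure quoted above and well under the one-millisecond rule of thumb.

```python
def offset_to_microseconds(samples: int, sample_rate_hz: float) -> float:
    """Convert a timing offset given in samples to microseconds."""
    return samples / sample_rate_hz * 1_000_000

offset_us = offset_to_microseconds(13, 96_000)
print(f"{offset_us:.1f} us")   # 13 / 96000 s, about 135.4 us
print(offset_us < 1_000)       # True: well under 1 ms
```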

Quote
There may have also been a noise level difference between the two files which was also claimed to be a tell.


Well one of the files had been converted to 16 bits and back again and this was part of the test.

However the current versions of these files address both items. ABX them at your convenience and prove me wrong!

You can download the files from here: Current 2496 files for ABXing (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=894877)
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 18:37:41
Only in your mind is the phrase "discerning audible differences is being suppressed" stronger than "audible difference may go unnoticed"

Small correction: Only in your mind is it not stronger. Does one need more evidence for your complete delusion?

Quote
Both the Swedish Radio tests & Arny's jangling keys test produced results in which "audible differences may go unnoticed" - when this occurs over extended time frames (2 yrs & 15 years) I consider that something is responsible for suppressing the identification of differences. In the case of the Swedish Radio files the artifact turned out to be an audible idle tone at 1.5 kHz.

I already wrote what I believe was responsible for the result at Swedish Radio. I am quite certain that it was not the blind testing. This should be trivial to see when one realizes that such codecs are being listened to in a sighted way any number of times in the lab during their development. So the flaw was not spotted in blind as well as non-blind tests, until Bart Locanthi came along and found it. I am equally certain that he would have found it in a blind test just as easily as in a non-blind test. It is a shame he can't testify anymore; I'm sure he would have supported my statement. It is a testament to your and Harley's reckless truth-bending that you continue to put the blame on blind testing. Both of you have no leg to stand on in this matter. It is pure anti-ABX propaganda.

Quote
So if a test, which is being used to examine "actual audible differences" as stated by xnor delivers results which include instances where "audible differences may go unnoticed" - it's not a very reliable or useful test, IMO.

That's the reality for any listening test. No test can offer you a guarantee that actual audible differences will always get detected. Blind tests, especially ABX, come closer than anything else, however.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 18:39:47
Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?

The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not


I'm happy to stipulate that the above is true. 

ABX's utility for the purpose stated above is a well-known fact.

For example, in this paper authored by, among others, Robert Stuart of Meridian, that well-known producer of high-end DACs: AES Convention Paper 9174,
"The audibility of typical digital audio filters in a high-fidelity playback system":

"...ABX tests are viewed as the gold standard" for objective measures of listening."

The corollary to the statement above would appear to be:
Any person who does not realize that a test which has as good a reputation for demonstrating subtle but audible differences (true positives) as ABX is very useful as a tool to reveal what is audible vs. what is not, must be insane.

Yes, Arny, as I said already ABX tests that are run by people who know what they are doing is a lot different to home run ABX tests. As I said before, perceptual testing should be confined to those who know what they are doing - otherwise you need to start using internal controls which all here patently refuse to do


As usual Mr. Kenny you are claiming falsely and in the process being libelous of several people who are acting in good faith.

One such person is the author of FB2K and its ABX plug-in who has added numerous controls that are so good that people like your good friend Amir seem to not be able to find the time to do ABX tests of my latest keys jangling files with their  new internal controls and the new FB2K internal controls in place. AFAIK he's only used the updated FB2K  for "slam dunk" type tests. 

Several others here have helped me improve the internal controls on my latest Version 5 Keys Jangling tests, and that has happened right under your nose, Mr. Kenny.

You wonder why nobody treats you with much respect Mr. Kenny? You have no right to expect so much better treatment than you give.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 19:15:13
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives)

Statistic for this "high preponderance" please...or said sane people will think this was posterior generated.
John, the statistics for this specious claim?

I said already ABX tests that are run by people who know what they are doing is a lot different to home run ABX tests

Excellent. Then we shall all summarily dismiss your and Amir's homespun ABX tests.
Thanks for concurring.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 19:33:57
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives)

Statistic for this "high preponderance" please...or said sane people will think this was posterior generated.
John, the statistics for this specious claim?


John needs to take this issue up with Robert Stuart who has put his name on this affirmation of ABX testing:

"...ABX tests are viewed as the gold standard" for objective measures of listening."

Quote
I said already ABX tests that are run by people who know what they are doing is a lot different to home run ABX tests

Excellent. Then we shall all summarily dismiss your and Amir's homespun ABX tests.
Thanks for concurring.


Since we have already dispensed with the fiction that ABX tests are more difficult or are based on different kinds of listening than other listening tests, we must then dispense with any "homespun" listening test whether done by Mr. Kenny, his close associate Amir or any of Kenny's homespun clients or reviewers.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 19:34:53
Btw, can anyone find any statistics for this purported radio test and Arny's key thingy?
I'm starting to think this may be yet another of John's daydreams, as he is rather prone to them.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 19:43:49
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives)

Statistic for this "high preponderance" please...or said sane people will think this was posterior generated.
John, the statistics for this specious claim?
Well, there's a statistical value (the preponderance or risk) for Type II errors (false negatives) that can be calculated for a test given the number of trials & the significance level being used. I'm sure you can find this, if you are interested.
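
A minimal sketch of that calculation, assuming every trial is independent and the listener has a fixed chance p_true of answering correctly. Note that p_true is an assumed parameter; nobody in this thread has supplied a measured value for it, and the function names are made up for illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_correct(n, alpha=0.05):
    """Smallest score whose p-value under pure guessing (p = 0.5) is <= alpha.

    Returns None when even a perfect score is not significant.
    """
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None


def type_ii_risk(n, p_true, alpha=0.05):
    """Chance that a listener with per-trial hit rate p_true misses the threshold."""
    k = critical_correct(n, alpha)
    if k is None:
        return 1.0  # the test can never reach significance
    return 1.0 - binom_tail(n, k, p_true)


if __name__ == "__main__":
    # Hypothetical listeners: the hit rates below are illustrative only.
    for p_true in (0.6, 0.7, 0.8, 0.9):
        print(f"16 trials, p_true={p_true}: risk={type_ii_risk(16, p_true):.3f}")
```

The risk depends entirely on the assumed p_true and the number of trials, so this gives a figure only for a hypothetical listener, not a general "preponderance" for ABX as a method.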

But apart from this Type II error risk value, there are other factors which are not based on statistical analysis. I've outlined some of these other factors already here (http://www.hydrogenaud.io/forums/index.php?showtopic=108668&view=findpost&p=895161). To know how prevalent these factors are would need to have some controls embedded in ABX tests which gave a measure of these. Something that nobody here is interested in doing so your question is circular logic.

Quote
I said already ABX tests that are run by people who know what they are doing is a lot different to home run ABX tests

Excellent. Then we shall all summarily dismiss your and Amir's homespun ABX tests.
Thanks for concurring.

Who are you trying to fool - you summarily dismissed these results from the get-go, anyway
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 19:44:35
Btw, can anyone find any statistics for this purported radio test and Arny's key thingy?
I'm starting to think this may be yet another of John's daydreams, as he is rather prone to them.

The Swedish Radio Tests (it's actually two) are described in an article by Christer Grewin and Thomas Rydén in 1991 (http://www.aes.org/e-lib/browse.cfm?elib=5396). That includes statistics. Since it is an AES article, it is behind a paywall.

Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-08 19:46:21
Since we have already dispensed with the fiction that ABX tests are more difficult or are based on different kinds of listening than other listening tests, we must then dispense with any "homespun" listening test whether done by Mr. Kenny, his close associate Amir or any of Kenny's homespun clients or reviewers.

The same answer applies to you as to AJ - Who are you trying to fool - you summarily dismissed these results from the get-go, anyway
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 19:46:57
John needs to take this issue up with Robert Stuart

Maybe John has 6 friends who say the same thing, thereby not only making it true by numbers, but "stupidity" to ask for actual statistics.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 19:47:18
To know how prevalent these factors are would need to have some controls embedded in ABX tests which gave a measure of these. Something that nobody here is interested in doing so your question is circular logic.

It is heartening to see how elegantly you avoid the alleged circular logic by assuming a high preponderance without any supporting evidence.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 19:51:42
To know how prevalent these factors are would need to have some controls embedded in ABX tests which gave a measure of these.

This sounds like a tacit admission that you have no evidence.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-08 19:53:00
Well, there's a statistical value (the preponderance or risk) for Type II errors (false negatives) that can be calculated for a test given the number of trials & the significance level being used. I'm sure you can find this, if you are interested.

Neither I nor any other sane person needs to do anything, as dictated by logic. The burden of proof lies squarely on you to show this statistical "preponderance" you claim. Your proof. The stats please.
Start reaching around... 



Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 19:57:40
Btw, can anyone find any statistics for this purported radio test


It is identified as an apocryphal story in several formal publications:

2012 "MP3: The Meaning of a Format" by Jonathan Sterne (http://dss-edit.com/prof-anon/sound/library/Sterne,%20Jonathan%20-%20MP3.%20The%20Meaning%20of%20a%20Format.pdf)
Duke University Press, p. 179

and

2004 "Golden Ears and Meter Readers: The Contest for Epistemic Authority in Audiophilia" by Dr. Marc A. Perlman, Associate Professor of Music, Brown University, published in Social Studies of Science

Abstract:

"
Scientific claims to knowledge and the uses of technological artifacts are both inherently contestable, but both are not usually contested together. Consumers of 'specialty' audio equipment (known as the 'high end'), however, connect both forms of resistance. These 'audiophiles' construct their own universe of meaning around their equipment; they cultivate a distinctive vocabulary and set of attitudes. In this they resemble other groups of users dedicated to supposedly antiquated technology. But they also engage in controversy to defend themselves against knowledge-claims that would delegitimize their universe of meaning. These debates concern recording formats or media (the relative merits of the compact disk [CD] and long-playing record [LP]), user 'tweaks' of purchased equipment, and the supposed audibility of differences between different brands of amplifiers, cables, or CD players. In all of these cases, audiophiles resist the claims of audio engineering by privileging their personal experiences, and they argue against scientific methodologies that seem to expose those experiences as illusory. Some of these patterns of epistemic contestation resemble those in non-musical domains (such as biomedicine). But audiophiles also make epistemic use of values crucial to their identity as music-lovers. They appeal to a common understanding of music as an exemplary locus of subjectivity, emotion, and self-surrender, in order to ward off the criticisms directed at them from a science they construe as objective, detached, and dispassionate.
"
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 20:08:07
Btw, can anyone find any statistics for this purported radio test and Arny's key thingy?
I'm starting to think this may be yet another of John's daydreams, as he is rather prone to them.

The Swedish Radio Tests (it's actually two) are described in an article by Christer Grewin and Thomas Rydén in 1991 (http://www.aes.org/e-lib/browse.cfm?elib=5396). That includes statistics. Since it is an AES article, it is behind a paywall.



The abstract of that paper is:

"The Swedish Broadcasting Corporation (SR) has performed subjective
assessments on low bit-rate audio codecs for ISO/MPEG/Audio. As it is likely
that the same codec can be used for DAB the evaluation is of great importance
for broadcasters. This paper presents the methodology, results and conclusions
from the two listening tests performed in July 1990 and April/May 1991."

I have the full text as a PDF, and neither Locanthi nor the Harley anecdote seems to be mentioned. The paper seems to be mostly about a formal comparison between two codecs, one called ASPEC and the other MUSICAM.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-08 20:12:20
Since we have already dispensed with the fiction that ABX tests are more difficult or are based on different kinds of listening than other listening tests, we must then dispense with any "homespun" listening test whether done by Mr. Kenny, his close associate Amir or any of Kenny's homespun clients or reviewers.

The same answer applies to you as to AJ - Who are you trying to fool - you summarily dismissed these results from the get-go, anyway


That is another false claim. The discussion of Amir's tests has been considerable, and your tests appear to be no better than sighted evaluations in accordance with your own account of them. That's conviction based on reliable evidence, not dismissal out of hand.
Title: How do you listen to an ABX test?
Post by: pelmazo on 2015-04-08 20:20:25
I have the full text as a PDF and Locanthi or the Harley anecdote don't seem to be mentioned.  The paper seems to be mostly about a formal comparison between two codecs, one called ASPEC and MUSICAM.

I'm not sure the Locanthi or Harley anecdote even existed at the time the paper was written.

A small bit of history, perhaps (correct me if I'm wrong, Arny - you might have been there): In the autumn of 1991 the AES convention in New York had audiophile guests, amongst them Mr. Harley, who delivered his "Listener's Manifesto". The conference where the Swedish Radio colleagues delivered their test report had happened shortly before, in September, so I assume that it also was a topic in New York. I wasn't there, but I assume that Harley attended a paper session where Locanthi related his own finding. Harley may not have understood much about the details of the tests, nor cared about them, but he got enough factoids on which he could construct his story.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-08 20:58:43
How come all your posts to me are ad-hom in nature, greynol & usually start others to follow suit?



Because your disingenuousness merits nothing better?
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 21:11:43
Quote
ad-hom

I like the abbreviated version.  It's as if he's regularly involved in discussions where it gets used a lot.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-08 21:20:52
Quote
What I'm interested in is how often these conditions arise & cause false negatives in ABX testing
In other words how useful is the test?
The test is useful regardless of the frequency of what you call false negatives. It is an opportunity for the claimants to support their claim with tangible evidence. If they manage to show a true positive, they have achieved that, completely irrespective of any false negatives there may be. I call that useful, and any sane person would.
Any sane person would realise that a test which has a high preponderance of suppressing audible differences (false negatives) is not very useful as a tool to reveal what is audible Vs what is not

Then again, it can be the very small magnitude of a difference , rather than *ABX testing inherently*, which results in the vast majority of people not being able to detect it, no? 

You do realize, don't you, that even after Amir  some online ninja reports (unproctored, but let's assume honest) detection of a difference at p<0.05, it doesn't mean everyone who ever took the test before, or will thereafter, will perform the same? (Though it is vastly curious to me that we had a veritable 'spate'  -- ok, actually just a small handful -- of such detections, after that one.  The implications of that are interesting.) 
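
A rough sketch of why a single p<0.05 report says little on its own once many people have a go: if everyone is purely guessing, some will still clear the bar. The figures below assume a 16-trial test passed at 12 or more correct, and independent attempts; the number of test-takers is made up for illustration.

```python
from math import comb


def guessing_p_value(n, correct):
    """Chance of scoring `correct` or better out of n by coin-flip guessing."""
    return sum(comb(n, i) for i in range(correct, n + 1)) / 2**n


def chance_any_guesser_passes(attempts, per_attempt_alpha):
    """P(at least one of `attempts` independent guessers reaches the threshold)."""
    return 1.0 - (1.0 - per_attempt_alpha) ** attempts


if __name__ == "__main__":
    alpha = guessing_p_value(16, 12)  # actual false-positive rate at 12/16
    for attempts in (1, 5, 20, 50):   # hypothetical numbers of test-takers
        print(attempts, round(chance_any_guesser_passes(attempts, alpha), 3))
```

This only covers pure guessing; it says nothing about whether any particular reported result was a guess.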

You seem to have your knickers in a twist with the idea of generalizing from multiple reports of : 'this subject's results support the null hypothesis' to 'if you think you hear this difference sighted, you're probably wrong' .    Do you have a similar problem with going from one or a few reports of 'this subject's results do not support the null hypothesis' to 'if you think you hear this difference sighted, you're probably right'?

I find ABX to be a very telling method indeed to answer whether an individual's current claim of 'I hear this' is based on them hearing a real difference.  And when multiple individuals 'fail', I conclude the difference is difficult if not impossible to hear by normal listeners under normal conditions.  A terrific example is 320kbps mp3 vs source.  I don't doubt that *some* can hear it, in some cases - and not surprisingly these have tended to be codec developers, or well 'trained' listeners, or using 'killer' samples, and the 'sensitives' tend to call the difference subtle -- and for those reasons I very much regard most people who dismiss any and all mp3, including 320kbps, as obviously, 'I can tell right away' deficient, as being full of... themselves. 


And that's the high end in a nutshell.
 
A few people (incidentally with obvious stake in the results) finding a 'tell' in Arny's jangly keys file or the AIX files -- moreover, a tell that apparently does not depend on actually possessing much upper-frequency hearing --  doesn't tell me jack sh*t about whether hi rez and redbook actually sound as different as has been COMMONLY, ROUTINELY, VOCIFEROUSLY, GRANDIOSELY claimed by hi-rez  advocates since the first 96/24 files appeared.

When you can show that your DAC is actually 'revelatory' to your excited reviewer when he's listening under DBT conditions, much less to the people that actually buy it, get back to me. (Hint: that sort of quality scale judgement would require a DB test other than ABX)
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 21:24:45
Quote
BTW, if you don't want to waste your time, just don't post & ignore this thread  I love the illogicality of this & see it also in people who complain about a TV programme as if someone was forcing them to view it.

Spoken from the person responsible for the off-topic shift in the discussion, no less.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-08 21:44:10
You claim without evidence that discerning actual audible differences is being suppressed by the ABX test method. That's quite a lot stronger than saying that audible differences may go unnoticed. The examples you offer certainly do not show that ABX testing suppresses anything like that.

Quoted for truth.


Only in your mind is the phrase "discerning audible differences is being suppressed" stronger than "audible difference may go unnoticed" Both the Swedish Radio tests & Arny's jangling keys test produced results in which "audible differences may go unnoticed" - when this occurs over extended time frames (2 years & 15 years) I consider that something is responsible for suppressing the identification of differences. In the case of the Swedish Radio files the artifact turned out to be an audible idle tone at 1.5 kHz.



Your use of conspiratorial language is amusing.

Would you call really subtle difference self-'suppressing'?

   





Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-08 22:00:21
Speaking of...

The implications of that are interesting.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-09 00:49:28
For an example of a DBT with a clear tell that leads to false positives (thanks, Mzil, for pointing this out) please see:

Post in uploads forum detailing false positive in AVS/AIX high resolution test files (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895219)

It should be noted that this problem was clearly pointed out long ago, but was never corrected. One can only presume that the goal of the test organizer was to produce misleading evidence that there were audible differences due to high resolution.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-09 12:44:50
For an example of a DBT with a clear tell that leads to false positives (thanks, Mzil, for pointing this out) please see:

Post in uploads forum detailing false positive in AVS/AIX high resolution test files (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895219)

It should be noted that this problem was clearly pointed out long ago, but was never corrected. One can only presume that the goal of the test organizer was to produce misleading evidence that there were audible differences due to high resolution.

Again, Arny, you prove yourself adept at mixing up definitions & introducing confusion - they are not false positives - let me again state the definition: false positives means that the listener discerned the files as audibly different when there was no actual difference. These files are different & audibly different - it's correct that the listener/test should discern these audible differences otherwise the test would be delivering false negatives.

It's also quite correct to identify what those underlying differences are & show that they are audible - and to show, as a result, that these particular audible differences are not directly related to one of them being high-res but more likely the result of the resampling process itself.

But please read up on the definition of false positives & negatives & stop confusing everyone, including yourself
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-09 13:13:54
For an example of a DBT with a clear tell that leads to false positives (thanks, Mzil, for pointing this out) please see:

Post in uploads forum detailing false positive in AVS/AIX high resolution test files (http://www.hydrogenaud.io/forums/index.php?showtopic=107570&view=findpost&p=895219)

It should be noted that this problem was clearly pointed out long ago, but was never corrected. One can only presume that the goal of the test organizer was to produce misleading evidence that there were audible differences due to high resolution.


Again, Arny, you prove yourself adept at mixing up definitions & introducing confusion - they are not false positives - let me again state the definition: false positives means that the listener discerned the files as audibly different when there was no actual difference.


No John, what you've been so kind as to do is to demonstrate once again that you have no clue, idea, or even passing interest in the proper design and execution of scientific experiments.

This thread has logged several well-intentioned efforts to raise your knowledge in this area above that which most are taught at the early primary school level, but  once again you have shown  that it was all for naught.

Quote
These files are different & audibly different - it's correct that the listener/test should discern these audible differences otherwise the test would be delivering false negatives.


Of course. In fact I'm willing to stipulate that the above is true. The problem is that what the listeners discern is irrelevant to an idea that you have repeatedly ignored - the idea that the experiment has a purpose which is not studying irrelevant influences caused by AIX's substandard preparation procedures. BTW I believe that the technical problem is due to the use of hardware resamplers, which most competent workers know is a poorer way to go than to use more widely used and accepted software resampling. The quality failure this file shows seems to be a good argument for avoiding recordings from this source.

I am amused that your good buddy Amir brags about his performance listening to this file, citing his performance with it as part of the body of evidence that he has "Conclusive "Proof" that higher resolution audio sounds different"

One of the consequences of the irrelevant influences is that the listener can reasonably be expected to be distracted from hearing any differences that are relevant to the purpose of the experiment.

Mr. Kenny, one might think that you might be interested in promoting listening tests that have the highest possible sensitivity, but obviously you aren't. You repeatedly ignore the fact that a distracted listener is a listener with reduced sensitivity.  That seems like self-destructive behavior for a DAC manufacturer to demonstrate over and over again.
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-09 13:54:20
Your examples are laughable & not worth responding to except to say that they show your desperation

That's cool, because they are analogies to some of your laughable objections. You don't need to admit to your desperation though, it's pretty obvious.


I have consistently given my understanding of a false negative but it doesn't get past your barrier of denial

No, you were demonstrably inconsistent.


I have given you two real examples of blind tests that show there was an audible difference which went unreported for 2yrs & 20,00 tests in one case & 15 years & unknown number of tests in Arny's jangling keys test. Do you wish to deny this?

We already have two examples? You haven't even provided crucial information (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992) for the first one yet.


Making excuses for why they were unnoticed is not relevant to the ABX statistic I'm interested in - its proneness towards suppressing the discernment of actually audible differences - the false negative measure.

Ask your favorite omniscient source. No, seriously.
How did you determine what is "actually audible" for other individuals with their audio setup? Before making a hasty reply, please see my response below to the quote directed at arny.


It really doesn't matter if the reason for the false negative is/was:
- lack of training
- lack of suitable playback equipment
- improper conditions of the test environment
- psychological stress
- fatigue or loss of focus
- etc.

What I'm interested in is how often these conditions arise & cause false negatives in ABX testing

You are still confusing things. ABX is a method for comparing two stimuli. Lack of training is not a problem of ABX per se, but the overall test. This is why I've asked you this (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992).

Lack of suitable playback equipment is again not necessarily a false negative, depending on how you define it. When an individual listens with his setup, and he cannot hear a difference despite his best efforts, then it is a true negative for that individual with his setup.
A DBT (which might make use of ABX) test includes the equipment and test environment. Given these conditions, if the participants cannot hear a difference despite giving their best, then we can again say these are true negatives depending on the definition.
Logically, we cannot argue from that that nobody can hear the claimed differences under any conditions, although multiple such failed tests with different individuals and conditions will increase the probability that this is true. But we don't even need to say that anyway, because it is you who e.g. made the first positive claim that a difference is generally audible. That's why it is your job to provide positive evidence for that claim.

Psychological stress, fatigue and loss of focus have already been dealt with several times in this thread. You can read about how this is dealt with in more formal tests in the literature. It again has little to do with ABX per se.



I've no problem with doing the converse on sighted listening tests i.e getting an actual measure of how prone a sighted test run is to false positives

Again, how did you determine what is "actually audible"?


Again, Arny, you prove yourself adept at mixing up definitions & introducing confusion - they are not false positives - let me again state the definition: false positives means that the listener discerned the files as audibly different when there was no actual difference. These files are different & audibly different - it's correct that the listener/test should discern these audible differences otherwise the test would be delivering false negatives.

No, it's again a matter of what the test expects (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=894992) from the listeners and what the precise definitions are.
Is the goal to find out differences in sound quality between the samples, or irrelevant differences which the ABX method can be very sensitive to depending on the implementation? Some people completely ignore the latter - they are just looking for differences in sound quality; others don't even notice it because they don't want to use the tool against itself, so to speak, or are ignorant about these sensitivities in certain ABX implementations so they won't even try e.g. setting the end of a segment such that small time differences turn into audible artifacts (this can still happen subconsciously and result in false positives), and so on.
Honest people will either not abuse these sensitivities or tell the test conductors if they deem it to be a problem for that particular test.

Amir, as a prominent example, demonstrated dishonesty by evading these issues for several pages and finally gave lame excuses. I wonder if something similar will happen in this thread.


But please read up on the definition of false positives & negatives & stop confusing everyone, including yourself

I don't think he is confused.


PS: The quote (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=108668&view=findpost&p=895152) which I took over from ajinfla's post seems to come from your banned diyaudio.com account. I don't like pulling in quotes from the past, because you could have changed your mind (unlikely in this case, but in principle..), but this was a general comment anyway.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-09 13:58:17
John, we're still waiting on those stats for the whole "preponderance" of "false negatives" claim for ABX you wishfully concocted.
Was it based on the "6 friends makes it fact" theory? A complete fabrication like this (http://www.avsforum.com/forum/86-ultra-hi-end-ht-gear-20-000/1136745-establishing-differences-10-volume-method-14.html)? Yet another daydream of yours?
It would be helpful moving the ABX discussion forward if you could clarify where you pulled that claim from, rather than leave it to speculation.
Now, if you were to utilize your same "preponderance" statistical model and apply it to your type positives for "more comprehensive" results "hearing" DACs, would they be mostly false, or true positives?
Is the minimum for a true positive "6 people said so"? Would it still be "stupidity" to question if only, say, 3-4 people "said so"?
Just curious how this all works, thanks.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-09 15:37:05
Speaking of...

The implications of that are interesting.


Well, Arny himself is now robustly detecting differences  (IM based, if I'm gleaning the right message from skimming that thread (http://www.hydrogenaud.io/forums/index.php?showtopic=107570), which I may well not be)  in his various keys jangling files (of whose many iterations I have now completely lost track).

Yes, the implications for this entire train wreck  episode remain interesting.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-09 16:08:24
Oh, you crazy mixed up kids - I know I'm no expert in perceptual testing & don't pretend to be but if these are the understandings you picked up in the primary school perceptual testing classes then you should go back there as you really weren't listening.

But actually, this was not what was taught - your problem is your learning motivation - you are so caught up in trying to make everything support your biased viewpoint that you twist everything into this blinkered perspective - so we have false positives being incorrectly defined as above. No problem with people making mistakes but when it is pointed out to them & they show a lack of knowledge & logic, it really does reveal the motivational barriers to learning. I suggest that you go onto some psychology forum & ask questions about perceptual testing

You don't even know what is the objective of the ABX test:
- is it to determine what is audible
- or is it to determine what a non-trained, inattentive, distracted listener with mediocre playback system can discern?

If you can answer that question for yourself you may realise some things.

The test environment & test procedure to answer the first question is hugely important to the test results & why the ITU standards specify many procedural & environmental recommendations for how to conduct such tests.

If the test objective is some undefined, amorphous objective whose only point that you all agree on is to prove someone is wrong then you will come up with the mistaken notions & silliness that you all demonstrate here - it is also why I'm suggesting that your ABX tests should have internal controls because they will all suffer from experimenter's bias  & this needs to be uncovered.

The sort of silliness & ignorance of how to do perceptual testing has permeated through this thread but it has finally culminated in some classic examples in the above posts:
- Arny has consistently mixed up false negatives & false positives & not even done so consistently- the definition changes to make what he believes is a point.
- "the idea that the experiment has a purpose which is not studying irrelevant influences caused by AIX's substandard preparation procedures" - demonstrating that you haven't the first clue about how to design &/or analyse tests & their results. Of course any flaws in the files need to be discovered & analysed - jeez!!
- "Mr. Kenny, one might think that you might be interested in promoting listening tests that have the highest possible sensitivity, but obviously you aren't. " It's always ironic when I get accused of the exact opposite of what I'm doing
- It does now make sense of an earlier post which I paraphrase here " we are not doing academic research" - in other words stop asking for good experimental design
- "Lack of training is not a problem of ABX per se, but the overall test"
- "Lack of suitable playback equipment is again not necessarily a false negative, depending on how you define it. When an individual listens with his setup, and he cannot hear a difference despite his best efforts, then it is a true negative for that individual with his setup." hehe
- "Psychological stress, fatigue, loss of focus has already been dealt with several times in this thread. You can read about how this is dealt with in more formal tests it in the literature. It again has little to do with ABX per se." hehe
- Another attempt at trying to redefine the meaning of "false negative" - "No, it's again a matter of what the test expects from the listeners and what the precise definitions are. Is the goal to find out differences in sound quality between the samples, or irrelevant differences which the ABX method can be very sensitive to depending on the implementation? Some people completely ignore the latter - they are just looking for differences in sound quality; others don't even notice it because they don't want to use the tool against itself, so to speak, or are ignorant about these sensitivities in certain ABX implementations so they won't even try e.g. setting the end of a segment such that small time differences turn into audible artifacts (this can still happen subconsciously and result in a false positives), and so on..
Honest people will either not abuse these sensitivities or tell the test conductors if they deem it to be a problem for that particular test." There is an audible difference between the files, right. Some people find this audible difference & some don't. Therefore the ones that don't are recording false negatives in their results. Easy!!
If they had reported that they heard a difference but weren't aware that it was a structural flaw in one of the files, you would be quick to point this out. This is not a false positive, btw - it is a correct positive as there is an audible difference between the files which they have discerned. Jeez, you guys just continue to bend any definition into whatever you want.
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-09 16:12:17
You don't project much, do you John?

In the meantime you're still flogging the same refuted arguments (thus continuing your transparent intellectual dishonesty).
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-09 16:13:28
Speaking of...

The implications of that are interesting.


Well, Arny himself is now robustly detecting differences  (IM based, if I'm gleaning the right message from skimming that thread (http://www.hydrogenaud.io/forums/index.php?showtopic=107570), which I may well not be)  in his various keys jangling files (of whose many iterations I have now completely lost track).

Yes, the implications for this entire train wreck  episode remain interesting.
Yes, I saw that & is he calling them false positives?
I also see that the quality of the playback equipment NOW MATTERS once the potential IMD acts as a tell to differentiate the files but according to XNOR the quality of the playback equipment doesn't matter once null results are returned.

Jeez, you guys! You can't even be consistently wrong 
Title: How do you listen to an ABX test?
Post by: greynol on 2015-04-09 16:17:36
Yes, I saw that & is he calling them false positives?

Issues were discovered with multiple versions of the samples.  These issues may result in false positives, yes.  Considerable effort was made to ascertain a description of the difference that was heard but nothing elucidating was ever provided.  Given this, and the overwhelmingly pathological evasiveness of the participants in question, I don't find it unreasonable to see people going a step further and actively discounting the results as false positives.

I also see that the quality of the playback  equipment NOW MATTERS once the potential IMD acts as a tell to  differentiate the files

It always mattered.

but according to XNOR the quality of the  playback equipment doesn't matter once null results are returned.

Not only is this correct, the reason why has been explained to you countless times already, not just in this discussion but also in the previous discussion which I had to close.
Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-09 16:20:04
Speaking of...

The implications of that are interesting.


Well, Arny himself is now robustly detecting differences  (IM based, if I'm gleaning the right message from skimming that thread (http://www.hydrogenaud.io/forums/index.php?showtopic=107570), which I may well not be)  in his various keys jangling files (of whose many iterations I have now completely lost track).

Yes, the implications for this entire train wreck  episode remain interesting.


Yes, I saw that & is he calling them false positives?


For the 987th time, it depends on the experiment that you are doing at the time.  Focus is important!

Quote
I also see that the quality of the playback equipment NOW MATTERS once the potential IMD acts as a tell to differentiate the files but according to XNOR the quality of the playback equipment doesn't matter once null results are returned.

It always did matter, it is just that depending on the purpose, one need not go the Golden Ear route to obtain suitable performance.

I see from a little searching, John, that you don't want anybody with test equipment to give your products a try. Do you even have any good test equipment other than a digital multimeter?  Do you even have one of those? Don't they take away your Golden Ears card if you obtain any non-trivial test gear?
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-09 16:29:28
jkeny seems to finally realize that this is all going over his head.
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-09 17:16:28
jkeny seems to finally realize that this is all going over his head.

He does seem to have an occasional flash into reality once in a blue moon.
Like when he said this in the DIY thread (http://www.diyaudio.com/forums/everything-else/171506-what-can-measurements-show-not-show-13.html#post2267133)
Quote
Originally Posted by jkeny
I'm no expert in digital audio

Epitomizes the "High End"/"Audiophile" designer. Having that "fresh", unencumbered outlook on issues of perceptual testing, DAC design, etc.....by not having a clue about them. Need a more "organic" DAC? Here, let's just fiddle around and try this.... 
Just blessed with being able to "hear" stuff, along with 6 friends, that those deaf sciency types always miss.

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-09 17:53:38
I see that a lot of things brought up in this thread had already been explained to him 5 years ago. 


edit: jkeny, here's a really simple question for you:
What do you believe is currently the best way to determine what is audible to an individual?
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-09 18:49:17
I see that a lot of things brought up in this thread had already been explained to him 5 years ago. 
Yes, but the whole ABX test "false negative preponderance" fact wish theory(?) seems relatively new.
Did I mention adaptation early in the thread?

edit: jkeny, here's a really simple question for you:
What do you believe is currently the best way to determine what is audible to an individual?
As recently as 8/14 (http://www.whatsbestforum.com/showthread.php?15255-Conclusive-quot-Proof-quot-that-higher-resolution-audio-sounds-different&p=279920&viewfull=1#post279920):
Quote
Originally Posted by jkeny
Yes, as I said...an informal blind test is just another angle - not necessarily a better angle, just another one. But as I said, this is usually a test for a specific difference - I get better, more comprehensive results with longer term "listening".
(involves lots of peeking, etc).

cheers,

AJ


Title: How do you listen to an ABX test?
Post by: Arnold B. Krueger on 2015-04-09 21:14:29
I see that a lot of things brought up in this thread had already been explained to him...


Late last year (11/17/14) on the Pink Fish Media forum:

A futile attempt at straightening out John Kenny (http://www.pinkfishmedia.net/forum/showpost.php?p=2456729&postcount=12)

Down a few posts and someone sagely answers:

"What makes you claim there isn't a similar concern? Most professional/research tests make sure they include low anchors and other calibration measures."

Just guessing, but it seems very likely that the phrase "Low anchor" flew right over John's head.

Later on John attacks me behind my back.

edit: jkeny, here's a really simple question for you:
What do you believe is currently the best way to determine what is audible to an individual?



Don't expect much of a response from John unless he can somehow work the phrase "ABX test false negative" into it.

Quote
I get better, more comprehensive results with longer term listening.


Yet another repetition of the oft-disproven lie that ABX tests can't involve long term listening.
Title: How do you listen to an ABX test?
Post by: 2Bdecided on 2015-04-09 23:27:57
Hasn't this thread finished yet?

The controls jk wants are explicitly included in the official listening tests put together here on HA.

The controls jk wants are implicitly included whenever someone demonstrates (or fails to demonstrate) via an ABX test any difference first heard by them in a sighted test.


What jk is complaining about isn't really anything to do with blind testing. It's that people report they can't hear a difference. Damn them. They should listen harder, or care more, or practice, or not tell anyone, or something.

I still haven't found those stats relating Type II errors (false negatives) to number of trials, p-value, and/or theta. Though "night and day" differences render any such calculations pointless.
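For anyone who wants rough numbers in the meantime, here is a minimal sketch of the textbook exact-binomial calculation, not a citation of any published table. It assumes independent trials, a one-sided test against guessing (p = 0.5), and that theta is the listener's true probability of answering a trial correctly. The function names are made up for illustration.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_score(n, alpha=0.05):
    """Smallest number of correct answers whose p-value under guessing is <= alpha.

    Returns None if even a perfect score cannot reach significance.
    """
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None

def type_ii_risk(n, theta, alpha=0.05):
    """Probability that a listener with true hit rate theta fails the test."""
    k = critical_score(n, alpha)
    if k is None:
        return 1.0  # the test can never come out positive
    return 1.0 - binom_tail(n, k, theta)

if __name__ == "__main__":
    # Print the Type II risk for a few assumed hit rates at 16 trials.
    for theta in (0.6, 0.7, 0.8, 0.9):
        print(theta, round(type_ii_risk(16, theta), 3))
```

With 16 trials and a 0.05 significance level, critical_score(16) comes out at 12 correct. The risk of a false negative then depends entirely on theta, which is the quantity nobody in this thread has pinned down.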

Cheers,
David.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-10 16:02:36
I see that a lot of things brought up in this thread had already been explained to him 5 years ago. 
Yes, but the whole ABX test "false negative preponderance" fact wish theory(?) seems relatively new.

Discussion  (albeit at a somewhat more sophisticated level) of Type II error in audio DBTs goes all the way back to Larry Leventhal's  arguments* in the mid  1980s, which Stereophile (http://www.stereophile.com/features/141/index.html) made as big a stink about then  as jkeny is now about 'false negatives'.

jkeny seems utterly unaware of any such 'issues' he himself has not 'discovered'.  People who reinvent the wheel are tiresome.



(* go here:
http://www.aes.org/e-lib/ (http://www.aes.org/e-lib/)

enter 'Leventhal' in the search box.

See all the AES papers about error rates in audio DBTs, including ABX.  Some by Leventhal, some not.


jkeny could have done this.  Didn't.)
Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-10 17:21:58
Meant new to him.
Yeah, of course it's all old news, which only adds to the entertainment 

cheers,

AJ
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-10 17:37:40
The problem is that he evaded providing the information needed to answer his own question.

It's a bit like asking: "how many vehicles crash?", and expecting a number such as 12.3%.
Title: How do you listen to an ABX test?
Post by: krabapple on 2015-04-10 18:41:38
The problem is that he evaded providing the information needed to answer his own question.

It's a bit like asking: "how many vehicles crash?", and expecting a number such as 12.3%.


A tactic practiced by Amir some online audio ninjas.  jkeny learns *some* things fast; other things, not so much.
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-10 20:12:15
I see that a lot of things brought up in this thread had already been explained to him 5 years ago. 
Yes, but the whole ABX test "false negative preponderance" fact wish theory(?) seems relatively new.

Discussion  (albeit at a somewhat more sophisticated level) of Type II error in audio DBTs goes all the way back to Larry Leventhal's  arguments* in the mid  1980s, which Stereophile (http://www.stereophile.com/features/141/index.html) made as big a stink about then  as jkeny is now about 'false negatives'.

jkeny seems utterly unaware of any such 'issues' he himself has not 'discovered'.  People who reinvent the wheel are tiresome.



(* go here:
http://www.aes.org/e-lib/ (http://www.aes.org/e-lib/)

enter 'Leventhal' in the search box.

See all the AES papers about error rates in audio DBTs, including ABX.  Some by Leventhal, some not.


jkeny could have done this.  Didn't.)
I already mentioned the Type II error statistics but of course, you didn't understand what I was talking about:
Quote
Well, there's a statistical value (the preponderance or risk) for Type II errors (false negatives) that can be calculated for a test given the number of trials & the significance level being used. I'm sure you can find this, if you are interested.

But apart from this Type II error risk value, there are other factors which are not based on statistical analysis. I've outlined some of these other factors already here. To know how prevalent these factors are would need to have some controls embedded in ABX tests which gave a measure of these.

What I'm talking about is a different approach to evaluating the preponderance of a test to Type II errors - different to the statistics that Leventhal explains very well. What I'm talking about is a more direct approach which uses internal controls as a means of measuring this aspect of a test rather than the more abstract statistical calculation. Both approaches are focussed on the same characteristic which boils down to an evaluation of the power of the test.
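As a rough illustration of the "internal controls" idea, the sketch below scores hidden control trials separately from ordinary trials. It is only a sketch under assumptions: the Trial record, its is_control flag and the summarize function are hypothetical, and no existing ABX tool is claimed to work this way.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    is_control: bool  # hypothetical flag: X carried a known audible change
    correct: bool     # listener gave the expected answer on this trial

def hit_rate(trials):
    """Fraction of correct answers, or None when there are no trials."""
    if not trials:
        return None
    return sum(t.correct for t in trials) / len(trials)

def summarize(trials):
    """Report control and regular trials separately.

    A low control hit rate suggests the session could not detect even a
    known difference, so a null result on the regular trials says little.
    """
    controls = [t for t in trials if t.is_control]
    regular = [t for t in trials if not t.is_control]
    return {
        "n_controls": len(controls),
        "control_hit_rate": hit_rate(controls),
        "n_regular": len(regular),
        "regular_hit_rate": hit_rate(regular),
    }
```

The control hit rate here is an empirical stand-in for the power of the test in that particular session. It measures the listener, setup and procedure together, and cannot tell them apart.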
Title: How do you listen to an ABX test?
Post by: xnor on 2015-04-10 22:13:31
What I'm talking about is a more direct approach which uses internal controls as a means of measuring this aspect of a test rather than the more abstract statistical calculation. Both approaches are focussed on the same characteristic which boils down to an evaluation of the power of the test.


What internal controls? Can you make a single concrete point, e.g. what these controls would look like, and what they would do for you?

Keep in mind that if those controls are just more easily detectable, randomly inserted trials you:
a) increase the length of the test
b) increase confusion
c) disturb listener concentration
...
all of these points will likely have a negative effect.

It seems you just want to dismiss some of the negative results that we already ignore to begin with in these informal tests and which are rarely posted anyway.

Title: How do you listen to an ABX test?
Post by: ajinfla on 2015-04-10 22:56:14
What I'm talking about is a different approach to evaluating the preponderance of a test to Type II errors - different to the statistics that Leventhal explains very well. What I'm talking about is a more direct approach which uses internal controls as a means of measuring this aspect of a test rather than the more abstract statistical calculation.

Oh, so now it's "evaluating the preponderance". Hmmm, that doesn't sound like you had any stats to go on before. Peddler "hunch" maybe?
Ok. Well, what are you waiting for? Christmas? 6 more friends? 
You've got an awful lot of retesting to do, so why are you wasting time here just squawking about this theory of yours?
Get busy!! 
Unless of course.....you have no plan, it's all a smokescreen and want others to do all the testing for you, much like the DIY thread.
This despite having at least 6-7 vetted organic hearers....and no guarantee sane people will pull the load for you or be able to "hear" this stuff anyway.
Hmmmm....
Title: How do you listen to an ABX test?
Post by: jkeny on 2015-04-10 23:25:12
What I'm talking about is a more direct approach which uses internal controls as a means of measuring this aspect of a test rather than the more abstract statistical calculation. Both approaches are focussed on the same characteristic which boils down to an evaluation of the power of the test.


What internal controls? Can you make a single concrete point, e.g. what these controls would look like, and what they would do for you?

Keep in mind that if those controls are just more easily detectable, randomly inserted trials you:
a) increase the length of the test
b) increase confusion
c) disturb listener concentration
...
all of these points will likely have a negative effect.

It seems you just want to dismiss some of the negative results that we already ignore to begin with in these informal tests and which are rarely posted anyway.

Ah, a real question, rather than an attack - thank you, much appreciated.
I already mentioned some. The important aspect of these controls would be that they are hidden controls, i.e. the listener isn't telegraphed in advance that this trial is a control trial.

In essence what I'm trying to do is insert some known audible differences randomly in the testing to check if the listener is able to discern this difference. So the listener is going through listening trials by evaluating if X is sample A or B. I'm suggesting that randomly the software introduces an X which is an A or B that has some change made to it that should be audible - it could be a change in volume of A. So a small volume increase is added to A & this is presented as X. The listener comparing X to