Topic: How do you listen to an ABX test?

How do you listen to an ABX test?

I'm wondering how people listen when doing ABX tests, specifically in what order they prefer to listen to the alternatives.

IOW:

ABX and done (next trial) (classic 1950 Munson and Gardner ABX testing)

ABX and then repeat as desired

XABXABX... repeat as desired

any other preferred ordering?

No preference, random as the spirit leads...

Why have you chosen the method that you use?

How do you listen to an ABX test?

Reply #1
I tried to do an ABX test once, but after a couple of trials the "cognitive load" was just so overwhelming my head started to spin, I became nauseous, and my vision became blurred, so I walked over to my fridge to relax with a cold soda. I opened it and saw both Coke and Pepsi, so my head exploded. Not a pretty scene. 

OK, seriously. If A and B are readily and discernibly distinguishable prior to the test:

X, I vote [since I already know what A and B are from the list order in the previous, pre-test screen], next trial...

X, I vote, next trial,

X, vote, next trial...repeat to end




If the difference is pretty easy, but not dead obvious:

X, A, "Did I hear any change at all?" I ask myself, vote, next trial...repeat to end




If the difference is subtle:

A, B, X, A "Did that last transition cause a change or not?", either vote or keep trying X, A, X, A, X, B, X, B etc, vote, next trial...repeat to end.

The concept of "juggling three things cognitively" never even occurs to me; I'm focusing instead on which pair of options has an audible change when I transition between them, and which pair doesn't. And if you think about it, the real question I'm asking my brain is "Was that last transition audible?" So from my perspective I'm only juggling ONE concept.

How do you listen to an ABX test?

Reply #2
I don't give up when I can't tell which is which, and I will maybe alternate longer listening and rapid ababababababababab, whatever feels like I have a chance. That goes on until I can actually hear something changing, or until my brain decides that it's enough and fools me into thinking I heard something different (he does it a lot, I must be boring him to death). Then I listen to X as a kind of way to check if I can identify what I thought I had, so it can be another drama with multiple switches again before the actual vote. 
Most of the time, relatively rapid ABABA switching is what gives me the best results: not crazy fast, but no more than 2 or 3 seconds each on a part where I hope to find something. At least that's what I noticed when going at it with different MP3 settings.
And listening to half a song and then switching is total bullshhhhh; I never got anything right that way.

Anyway, doing 20 trials does take me a good deal of time, mostly because I try to win!!!!!

How do you listen to an ABX test?

Reply #3
In the first few trials I normally switch between A, B, and X in whatever order I feel like at the time - or even all four if the mood takes me.

Once I have the differences down I tend to listen only to X and A, or only X and B (if there is an artifact in one of A or B, then I generally listen to the other to compare X to).

If I'm not confident about the difference I guess I use the starting method throughout the trial.

How do you listen to an ABX test?

Reply #4
I usually don't bother with X until I think I've identified the difference between A and B. I then try to find that difference between A and X or B and X.

How do you listen to an ABX test?

Reply #5
It may need to be pointed out that the ABX test as implemented by the various ABX Comparators is a vastly different test from the ca. 1950 ABX test. The major difference is interactivity. In the 1950 version of the test, even as practiced today, the three alternatives A, B, and X were very brief, often only a few hundred milliseconds long, presented once in that order, and then a choice was demanded. All test parameters are chosen in advance and are not under the control of the listener. This may be very appropriate for the purpose at hand, which usually relates to the audibility of short sounds such as vowel and consonant sounds, but it is vastly different from listening to music for enjoyment.

All modern ABX boxes that the author is aware of put the sonic alternative being listened to (which is a musical selection) under the full control of the listener. He may listen to the alternatives in any order and as many times as he desires. Furthermore, most modern ABX Comparators allow the listener to choose the sound segment he bases his decisions on from a longer selection of music.

It should be pointed out that while A & B have always been instantly available to the listener, they are not always referred to by the listener. A listener who is familiar with the test being run can "Run the X's", referring only to the randomized selections, and still obtain perfect or at least statistically significant accuracy.

In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X. However, X is not an alternative distinct from A and B; it is always one or the other, so there are at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A and B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous, and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

It does not really matter which known sound (A or B) is chosen as the reference to compare X to, because either choice can produce the same reliable outcome if the listener can reliably hear the difference between A and B.

Using either known sound as a reference is equally easy and convenient. Since the sounds are accessed via an instantaneous random-access technique and even the sound itself is under the control of the listener, either known sound can be juxtaposed as closely to the unknown sound as is needed or desired. Thus any claims that ABX requires the memorization of three different sounds are completely false. At most just one sound may need to be memorized.
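To make this concrete, here is a minimal sketch of that single-reference strategy. It is only an illustration of the idea, not code from any ABX Comparator; the sounds_same function is a hypothetical stand-in for the listener's real-time perceptual judgment.

Code:
def run_the_xs(xs, reference, sounds_same):
    # Single-reference strategy: the listener keeps (or memorizes) only one
    # known sound, here assumed to be A, and answers one yes/no question per
    # trial: "does this X sound like the reference?"
    # A "yes" maps to the verdict A; a "no" necessarily maps to B.
    return ["A" if sounds_same(reference, x) else "B" for x in xs]

Each trial therefore requires holding at most one sound in mind, which is the point being argued above.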

A representative explanation of ABX testing takes the form of a video that can be viewed online via this URL: http://www.homebrewedmusic.com/2014/07/30/a-new-abx-tool/ . This test is representative of the author's own experiences and the experiences of many others, going back over 30 years to the original ABX Comparator that the author built in the late 1970s.

The listener starts out by determining the sonic nature of the difference between A and B: he forms a hypothesis about the technical nature of the difference, listens to a musical selection, and uses a sample-editing feature common to ABX Comparators to isolate a note of music that he feels illustrates that difference. He initially compares the unknown sample at hand to either A or B and records his choices, but shifts to just listening to unknown samples and recording his results. He seems pleased with the accuracy and reliability of his results.

It is arguable that, especially over the later trials, the listener is not basing his conclusions on comparisons of memorized musical selections but rather relying on a single qualitative judgment about the spectral balance of each unknown sample. It is comforting to observe that the technical difference that was hypothesized can be confirmed by detailed technical measurements. In the author's experience this sort of thing is a very common practice among experienced ABX testers.

Therefore claiming that ABX testing necessarily involves memorization of musical selections may be false.

Modern theories about short-term memory suggest that approximately 7 such items can be remembered for 10 seconds or more. Since the workload in this ABX test example is just one item, and the duration of memory required can be very short, even less than a second, it should be easy enough.

The author suspects that ABX's reputation for being tough is the natural result of being compared to sighted evaluations, which are not actually tests at all. A real test is always more work than a sham.

How do you listen to an ABX test?

Reply #6
I do it similarly to mzil.

X is enough for easy tests, else X-A switching. If the difference between A and B is hard to detect I usually also do some additional X-B switching, or even some switching between A and B to refresh my memory of what to listen for.

"I hear it when I see it."

How do you listen to an ABX test?

Reply #7
In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X. However, X is not an alternative distinct from A and B; it is always one or the other, so there are at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A and B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous, and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

I'm not sure to which scientific literature you are referring, but in standard psychophysics/signal detection theory terminology, ABX is a 2AFC match-to-sample test. The answers the subject gives are chosen from 2 alternatives (A or B) and are forced ("I don't know", "I can't decide" and "it sounds like a third" are NOT allowed), thus 2AFC. Because they match X to one of the two samples, A or B, it is "match-to-sample".
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.
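To make the terminology concrete, here is a minimal sketch of a single forced-choice match-to-sample trial. It is only my own illustration, not code from any psychophysics toolkit; play and get_answer are hypothetical stand-ins for whatever playback and response-collection mechanism a real test harness would provide.

Code:
import random

def run_abx_trial(sample_a, sample_b, play, get_answer):
    # One 2AFC match-to-sample trial: X is secretly either A or B, and the
    # subject must answer 'A' or 'B' -- "I don't know" is not an option.
    x = random.choice([sample_a, sample_b])   # hidden random assignment of X
    play(a=sample_a, b=sample_b, x=x)         # let the subject audition A, B and X
    answer = get_answer()                     # the subject's forced choice
    if answer not in ("A", "B"):
        raise ValueError("forced choice: only 'A' or 'B' is accepted")
    return (x is sample_a) == (answer == "A")  # True if X was matched correctly

Scoring across many such trials is then just a matter of counting how often the return value is True.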

Of course, even forced-choice tests can be terminated at any time by the subject! ;-) It's not *that* forced.

How do you listen to an ABX test?

Reply #8
I usually don't bother with X until I think I've identified the difference between A and B. I then try to find that difference between A and X or B and X.



This is what I do too. With fooABX I sometimes also check out 'Y', after homing in on a difference between A and B.

How do you listen to an ABX test?

Reply #9
In the scientific literature ABX has been analyzed several different ways. Some observers have called it a 3AFC test, thinking that there are 3 alternatives: A, B, and X. However, X is not an alternative distinct from A and B; it is always one or the other, so there are at most 2 alternatives.

Others have called ABX a 2AFC test, thinking that there are 2 alternatives, A and B. However, the listening task is not one of choosing from among two alternatives, since the actual problem posed to the listener is whether X is a given known sound or not. If it is one known sound then it cannot be the other. The presentation of both known sounds is optional, instantaneous, and provided as a convenience for the listener to exploit as he will. ABX is in this way at most a highly desirable "Yes/No" test.

I'm not sure to which scientific literature you are referring, but in standard psychophysics/signal detection theory terminology, ABX is a 2AFC match-to-sample test. The answers the subject gives are chosen from 2 alternatives (A or B) and are forced ("I don't know", "I can't decide" and "it sounds like a third" are NOT allowed), thus 2AFC. Because they match X to one of the two samples, A or B, it is "match-to-sample".


As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and trials are not infrequently completed successfully without listening to either one.

Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.


Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...

I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question.  The remaining options were put in to make things easier for the listener.

Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW, people seemed to need to be refreshed as to what same and different sounded like.

Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?", so that all the listener had to do was listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening, because that was what he wanted to do, and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)

Quote
Of course, even forced-choice tests can be terminated at any time by the subject! ;-) It's not *that* forced.


The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

I have reviewed a goodly number of papers and other documents on the topic, and almost all do not provide any rules or even suggestions for organizing the sound samples or actually taking the test. 

In 1982, when Clark's paper was published, quite a bit of what is now known or suspected about long- and short-term memory was either not well known or not known at all. Therefore we agreed among ourselves not to constrain people with our possibly mistaken ideas about how to listen in an ABX environment. It is several decades later and more is known.

How do you listen to an ABX test?

Reply #10
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW, people seemed to need to be refreshed as to what same and different sounded like. Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time.
 

I disagree. If you want maximum sensitivity to tiny differences you should completely eliminate ANY need for the test subject, the listener, to either invoke their memory circuits or perform some task, even if you deem it to be mundane, like "These are called A, B, X, (and Y). Your task is to listen to all of them and then map out for me which should correctly be paired with each other." Yikes, that is making it way more complex than it needs to be, and it sounds like the convoluted arguments given by the scaremongers, like Stuart, to terrify people into thinking they will need to juggle many concepts at once in this FORCED task. (Forced being a scary word used to intimidate people into thinking they shouldn't download FB2K/ABX because they will be forced into something. IT WORKS. Only a handful of people actually did so and posted results in the "AIX records test" AVSForum thread, for example.)

I challenge anyone to find me any web reference which describes, for example, a true/false test as being a "2AFC test", even though it actually is one. Nobody wants to be forced into anything and THAT's why they, Stuart et al, have invoked the terminology, if you ask me.

Quote
That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other.
But they aren't mutually exclusive. I listen to A for a while. I listen to B for a while. I listen to a rapid-fire transition from A to B, and maybe B to A, and maybe even A to A and/or B to B, just to be sure I have a good feel for what zero difference will sound like when I transition during the perception-testing phase of the trial. AND THEN I'm ready for the actual perception test, when I rapid-fire switch between A and X and ask myself one very simple question: "Is there a difference?" Once I've made up my mind, after applying my focused listening attention and 100% concentration, only then do I have to switch gears and do some actual thinking, not just perceiving, and process the concept "OK, I heard no difference on that last transition I repeated over and over again, so that means I'm supposed to select this box to vote". Easy: no memory involved, no concept juggling, no cognitive load from being forced to map anything to anything else during the perception stage at all, and examples A, B, A to A, A to B, and B to B are all readily available both before and after the perception stage, should I choose to invoke them.


How do you listen to an ABX test?

Reply #11
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW, people seemed to need to be refreshed as to what same and different sounded like. Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time.
 

I disagree. If you want maximum sensitivity to tiny differences you should completely eliminate ANY need for the test subject, the listener, to either invoke their memory circuits or perform some task, even if you deem it to be mundane, like "These are called A, B, X, (and Y). Your task is to listen to all of them and then map out for me which should correctly be paired with each other."


I showed how as much of the above as possible is not necessary with an ABX Comparator. There is no need to listen to A, B, X, and possibly Y. All anybody who can actually reliably hear the difference at hand needs to listen to is either A or B (but not necessarily both) and compare the one they pick, even if only at random, to each X.

The process of doing an A/B test demands some means of evaluating the sound, but that means does not have to be based on comparing memorized sounds.

The golden ears can easily go on and on about obvious differences in sound quality. They obviously believe that they have cracked the code and know exactly what the audible differences are - they write pages of prose about just that. The tiny little problem they have is that even if they can read all that wonderful prose while they are listening, it does nothing to improve their actual accuracy as listeners. They are some of the most likely to be random guessers!

Furthermore, if a listener uses ABX and memory, he only needs to memorize one sound, probably one of the known sounds. When he listens he compares the sound of the unknown he is listening to with the sound he memorized, but he does the comparison in real time, so no memory of the unknown sound is required.

If you can reliably discern differences in sound quality, for example the difference between more bass and less bass (which the golden ears tell us in massive missives exists), then one should be able to use that ability to discern which of the knowns each X is: it either has the same amount of bass as the chosen known, or it has a different amount.

One of the tricks to ABX if there are any tricks at all is to realize that you can hear differences without any memory of the music.  If you use memory, you at worst need to memorize one musical sample, one of the two references.  The perceptual load is thus minimized either way. 

All the other stuff was put into ABX to facilitate learning what the audible difference is, and learning what the audible difference is is an irreducible part of the problem. The only reason why it isn't such an apparent problem with sighted evaluations is the fact that sighted evaluations aren't really tests. No test, no bother!

There are two stages to hearing differences - one is knowing what the difference is, and the other is applying that knowledge to the music at hand. If step two can't be executed to obtain a statistically significant score, then the results of step one are in doubt.

How do you listen to an ABX test?

Reply #12
As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and trials are not infrequently completed successfully without listening to either one.
Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.

Quote
Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...
I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question. The remaining options were put in to make things easier for the listener.

Well.... then you've changed the discrimination task to a detection task. Then it's a yes/no detection test, and it's inappropriate to call it ABX. ...perhaps AX, yes/no detection. This would (should) be used when different goals, analysis and type of results are planned.
Quote
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW, people seemed to need to be refreshed as to what same and different sounded like.
Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?", so that all the listener had to do was listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening, because that was what he wanted to do, and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)

It sounds as though you explored many things... but it also sounds as though some weren't really designed to give a meaningful result... you were just exploring. If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?

Quote
The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

Criticizing ABX as a tool is like criticizing a hammer: if used well, there's no reason for criticism; if used poorly, criticize the design, not the tool. I don't know to which ABX tests you're referring, but if the results are accepted by a journal, it is worth evaluating critically; if not, who cares?
Quote
In 1982, when Clark's paper was published, quite a bit of what is now known or suspected about long- and short-term memory was either not well known or not known at all. Therefore we agreed among ourselves not to constrain people with our possibly mistaken ideas about how to listen in an ABX environment. It is several decades later and more is known.

Well, yes, Massaro's work was in the mid-70s, but that and Cowan's work from the 80s show why the ITU standards recommend that an experienced researcher should do the experimental design, unless you're just messing around... in which case you can do what you want. :-)

EDIT: Sorry, I'm delayed. Have horrible internet on my cell phone... I'll get back in the loop in about 12 hours.

How do you listen to an ABX test?

Reply #13
As I subsequently pointed out, an ABX test can be and often is completed quite successfully without listening to both A and B, and trials are not infrequently completed successfully without listening to either one.
Quote
Whether the listener (subject) hears A and/or B (and/or X) once or multiple times, and whether it is timed or subject-chosen is up to the test designer, in line with a proper experimental design. For certain types of listening tests, your suggestion (subject-control) makes a lot of sense.

Quote
Agreed that the constraints on a particular ABX test can be added or relaxed per the preferences of the designer.  We have long known that whatever constraints we put on a test, someone would make a big thing about how our results would have proven their hobby horse theories if only...  We found that most people run out of energy for critical listening soon enough...
I don't see a response to the fairly frequent case where only A or B but not both are listened to, or the less frequent but documented cases where neither is listened to for all or part of the trials.

It is easy enough to reduce an ABX test trial to the simple question: "Does X sound like (pick one) A or B", which is a simple true or false question. The remaining options were put in to make things easier for the listener.

Well.... then you've changed the discrimination task to a detection task. Then it's a yes/no detection test, and it's inappropriate to call it ABX. ...perhaps AX, yes/no detection. This would (should) be used when different goals, analysis and type of results are planned.


The name ABX had nothing to do with the details of the listening task, because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.
Also remember that an ABX Comparator can and should be used as either a training tool or a testing tool.

I think it's funny when people demand precise orthodoxy in the development of something almost 40 years after it was developed. In this case more orthodoxy was possible, as there were papers published in 1975 that were miles ahead of us in terms of orthodoxy, but they appear in retrospect to have been headed down the primrose path. 

Quote
Quote
Our DBTing experiments actually started out by presenting a pair of sounds and asking: "Do they sound the same or different?" That was found to be too hard compared to allowing people to continually refresh their minds about what A and B sounded like as compared to each other. IOW, people seemed to need to be refreshed as to what same and different sounded like.
Once we gave them free control over A and B it became obvious that presenting two samples and then asking the same/different question was a waste of time. So we changed the question to the listener's choice of "Does X sound like A?" or "Does X sound like B?", so that all the listener had to do was listen to either A or B and then X. In many cases we let the user just identify X by its sound without any further listening, because that was what he wanted to do, and once we scored the results we found that he was doing it accurately.

These are the sorts of things that happen when you compare two things that actually sound different. ;-)


It sounds as though you explored many things... but it also sounds as though some weren't really designed to give a meaningful result... you were just exploring.


Actually, we were giving the Golden Ears much more credibility than they were found to deserve. We didn't know for sure that so many of their ideas couldn't possibly give a meaningful result.  In those days masking was not nearly as well understood as it is today. Many in those days thought that the Fletcher and Munson thresholds of hearing set the limits to audibility, but they imply much lower thresholds than we observed and we were concerned about that.

Quote
If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?


How do you know that without a DBT?  People say the darndest things, especially in a stereo shop. If I had a nickel for every time someone told me that A and B are so blatantly different that you don't even need to listen to both, and everybody ended up randomly guessing in a DBT...  ;-)  Ever hear the story about Steve Zipser and Tom Nousaine?

Quote
Quote
The point is that many critics of ABX, including most of the golden ears such as Hartley and the Meridian/Dolby gang, make up their own rules for ABX and then say it's a bad test because of those rules.

Criticizing ABX as a tool is like criticizing a hammer: if used well, there's no reason for criticism; if used poorly, criticize the design, not the tool. I don't know to which ABX tests you're referring, but if the results are accepted by a journal, it is worth evaluating critically; if not, who cares?


IME people may care more about many things that never get accepted to a journal than many things that do get accepted. ;-)


How do you listen to an ABX test?

Reply #14
As a newbie who's only tried a couple ABX tests, the general pattern I follow is:

I have "Keep playback position when changing tracks" unchecked in foo_abx, so whenever I click the A/B/X/Y buttons, playback jumps to the cue point and continues playing. At the start, I alternately click "Play A" and "B" on 1-2 second intervals, so I get a short section of the track looping and switching between A and B on each loop. Once I think I hear a difference, I might do X/A and X/B comparisons, Y/A, Y/B, etc.

Also, I found taking 5-second pauses is important - if A and B sound the same, stopping for 5 seconds seems to help sometimes.

In my limited experience, it doesn't feel like there's any memory involved when doing the test this way, you're just getting a stream of sound and listening for either a change or no change. I found it really important that pressing the A/B/X/Y buttons also jumps the playback back to the cue point. Otherwise, when you switch from A to B, you're hearing B but on a part of the clip you might not have heard for a while, so it feels like memory plays a bigger role.

How do you listen to an ABX test?

Reply #15
The name ABX had nothing to do with the details of the listening task, because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.
Also remember that an ABX Comparator can and should be used as either a training tool or a testing tool.

I think it's funny when people demand precise orthodoxy in the development of something almost 40 years after it was developed. In this case more orthodoxy was possible, as there were papers published in 1975 that were miles ahead of us in terms of orthodoxy, but they appear in retrospect to have been headed down the primrose path.
Wow! So in your usage, "ABX test" is any test done with an "ABX box". That's not a usage with which I'm familiar. If I measure current with an old VOM, is that measurement a "Volt-Ohm" or an "ampere" measurement? Who named the box "ABX" and why, if not to do an "ABX" test (2AFC match-to-sample)?
Everywhere else I've seen "ABX" used (but I haven't read everything! ;-), it means a 2AFC match-to-sample test.
I don't know what happened 40 years ago that you refer to, but psychophysics was created over 150 years ago by G.T. Fechner, and "forced-choice" (particularly 2AFC) and even "ABX" as an auditory implementation began over 60 years ago. Since Blackwell first described them in 1952, these methods have been refined and applied to all sensory systems. What happened 40 years ago?

I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.
Quote
Quote
If A and B are so blatantly different that you don't even need to listen to both, why are you doing a test?


How do you know that without a DBT?
I can easily distinguish two blatantly different sounds without a DBT, can't you? If the sounds are so dissimilar that an ABX test (the common usage, not yours) isn't needed (only an AX, as you describe), I doubt I need a DBT. Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?
Quote
Ever hear the story about Steve Zipser and Tom Nousaine?
No, what happened?

Quote
IME people may care more about many things that never get accepted to a journal than many things that do get accepted. ;-)
You are right: scientific validity and popularity are not the same. I misunderstood your goals... sorry :-)

How do you listen to an ABX test?

Reply #16
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?

In some cases there may be big differences in sound, but in others there will not be real audible differences.
"I hear it when I see it."

 

How do you listen to an ABX test?

Reply #17
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?

Sorry, I have to admit I'm rather confused about who is doing a DBT or ABX and why... my fault.
"a person claims" is exquisitly vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?

How do you listen to an ABX test?

Reply #18
So in your usage, "ABX test" is any test done with an "ABX box".


Not at all. For example you can do a sighted evaluation with an ABX box (ignore the X's), and that is obviously not an ABX test.

Quote
Everywhere else I've seen "ABX" used (but I haven't read everything! ;-), it means a 2AFC match-to-sample test.


That's the ca. 1950s test.

Wikipedia is wrong when it says: "A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference) followed by one unknown sample X that is randomly selected from either A or B. The subject is then required to identify X as either A or B. "  That is the 1950 test.

The test I invented in ca. 1975 works like this:

A subject is presented with the opportunity to listen to two known samples (sample A, the first reference, and sample B, the second reference) and an unknown sample X (which is either A or B) at will. To complete a trial, the subject must identify X as being either A or B, listening to all 3 samples in full, in part, or not at all, as he wishes. Repeating and editing samples is allowed and even encouraged, as long as the edits are applied precisely and equally to all 3 samples. Trials are repeated as many times as were initially planned. Standard statistical tests are used to determine whether the listener was successfully detecting a difference or just guessing.
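For concreteness, the standard statistical test here is typically a one-sided binomial test against chance, since a listener who is just guessing has a 0.5 probability of being right on each trial. A minimal sketch of that calculation follows; it is my own illustration, not the scoring code of any particular ABX Comparator or software tool.

Code:
from math import comb

def abx_p_value(correct, trials):
    # One-sided binomial p-value: the probability of getting at least
    # `correct` answers right out of `trials` trials by pure guessing (p = 0.5).
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: 13 correct answers out of 16 trials.
print(abx_p_value(13, 16))   # about 0.0106, below the usual 0.05 criterion

A score whose p-value falls below the chosen criterion is taken as evidence that the listener was detecting a difference rather than guessing.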

Quote
I don't know what happened 40 years ago that you refer to it, but Psychophysics was created over 150 years ago by G.T. Fechner and "forced-choice" (particularly 2AFC) and even "ABX" as an auditory implementation began over 60 years ago. Since Blackwell first describing them in 1952, these methods have been refined and applied to all sensory systems since then. What happened 40 years ago?


Some rebellious children of the 1960s took audio testing into their own hands because the audio establishment (The AES and IEEE, not the ASA) had failed, and somewhat independently created a reasonably disciplined and scientific test that gave as many advantages as is reasonably possible to the listener.

Quote
I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.


I have heard the opinion granted that you can do Science without wearing a straitjacket. ;-)


Quote
I can easily distinguish two blatantly different sounds without a DBT, can't you?


All the golden ears say that, and then they can't back their claims up as soon as we make them actually do a listening test instead of a sham (sighted evaluation).


Quote
Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?


That's what the golden ears say.

I say that claims of audibility are most easily judged for audible effects that have readily measurable relevant parameters, which among other things excludes lossy encoders. But it does include power amplifiers and cables.  For these things, the thresholds of hearing for easily measured artifacts are known or knowable. We've run a lot of these things through DBTs and we know what is clearly audible and what is clearly inaudible to a useful degree. Use measurements to judge them, because it is so fast and easy.

For everything else, in those cases where you have doubts that the audible effect is well-described by measurements, do DBTs.

Quote
Quote
Ever hear the story about Steve Zipser and Tom Nousaine?
No, what happened?


Steve Zipser was the owner of Sunshine Stereo, an audio store in a house on US 1 on the south side of Miami (an ironic name, because there were a lot of shady deals reported), who posted on the Usenet rec.audio.opinion forum back in the 1990s when it was a vibrant and relevant place. Every few months Steve would come up with some new wunder amplifier designed by Nelson Pass or someone like that, which sounded, as he would say, "Mind Blowingly Better". He would hoot and holler about it online every night for weeks, he would sell some, and he would get the new purchasers to post what a great amp and what a great dealer Steve was.

Nousaine was writing for a number of audio publications and did a certain amount of debunking both online and in print publications. He challenged Steve to a DBT which was talked about for weeks and eventually happened. Tom (who lived in the Detroit area) went down to Miami with a friend and an ABX box and set up an ABX test of Steve's current darling amplifier in Steve's favorite system in Steve's house with Steve's favorite recordings.  Steve of course did a brilliant job of guessing randomly.  Steve was pretty honest about the test and its results for about a week, but then he started making excuses and tried to spin the story.

Tom's life continued on until his untimely death just recently, probably related to his diabetes. He kept writing, getting published, and doing fun things. Steve's life went downhill. A number of months, maybe a year or more, after the DBTs, Steve was found dead in his home, or found alive but very ill by his wife, who called the EMS, and Steve was DOA. Stories vary, but that was the end of Steve, and shortly after that was the end of Sunshine Stereo. This can all be confirmed from posts in Google Groups.

The point is that the claim "This difference is so great that a DBT is unnecessary" was first made a few months after we started doing DBTs in the mid-1970s, and it has ended up biting a lot of people who believed it. It has also led to the conversion of some of them (like Tom Nousaine) from Golden Earism to Science.

How do you listen to an ABX test?

Reply #19
Quote
Quote

Quote

Ever hear the story about Steve Zipser and Tom Nousaine?

No, what happened?


Steve Zipser was the owner of Sunshine Stereo, an audio store in a house on US 1 on the south side of Miami (an ironic name, because there were a lot of shady deals reported), who posted on the Usenet rec.audio.opinion forum back in the 1990s when it was a vibrant and relevant place. Every few months Steve would come up with some new wunder amplifier designed by Nelson Pass or someone like that, which sounded, as he would say, "Mind Blowingly Better". He would hoot and holler about it online every night for weeks, he would sell some, and he would get the new purchasers to post what a great amp and what a great dealer Steve was.


To correct some factual errors:

Sunshine Stereo was at 9535 Biscayne Blvd, Miami Shores, FL 33138, which is on the north side of Miami.

The ABX test with Nousaine was on August 25th, 1997, and Zipser passed on Dec 31, 2000, so there were several years in between.

How do you listen to an ABX test?

Reply #20
"a person claims" is exquisitly vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?

Everyone interested in audio who wants to make informed buying decisions will eventually stumble upon such claims in any audio forum. Even here on HA, where there are strict rules, such claims will appear occasionally. In many other audio forums you will see this as the rule rather than the exception - and asking for evidence can even get you banned or your thread moved into a marginalized section. Or look into some audio magazines: not just the ridiculous cable ads but the articles themselves contain such claims.

Those are the sources where the average audio-interested Joe will get his "information" from. And while repeated assertion does not make something true, people will still tend to believe it.

"I hear it when I see it."

How do you listen to an ABX test?

Reply #21
The name ABX had nothing to do with the details of the listening task, because the details of that task were unknown until after the box was built and named. The name had to do with what the box did.

So in your usage, "ABX test" is any test done with an "ABX box".


Not at all. For example you can do a sighted evaluation with an ABX box (ignore the X's), and that is obviously not an ABX test.

Well, now I’m confused as to how you use “ABX”! Is it the type of test or the box? I see that ABX without X is not ABX. Do you see that ABX without B is not ABX?

Some rebellious children of the 1960s took audio testing into their own hands because the audio establishment (The AES and IEEE, not the ASA) had failed, and somewhat independently created a reasonably disciplined and scientific test that gave as many advantages as is reasonably possible to the listener.

Quote
I thought you presented ABX as a scientific method, and presented yourself as someone wanting to do scientifically valid tests (especially since you say "DBT" so much). If so, you wouldn't mock orthodoxy, when scientific validity often requires it.


I have heard the opinion granted that you can do Science without wearing a straitjacket. ;-)

I *don’t* want to argue about the definition of “do Science”, but I can tell you haven’t published anything. That is not meant as a slight; I just believe we use some words differently. Let’s step way back and ask “why would you do an ABX or DBT anyway?” I assume you want to convince someone of something. Maybe it’s you, before a purchase, wanting to be sure of your decision. You would not need to be concerned with scientific validity, any more than is needed to convince yourself. Very similar would be your friends or people who trust you. Do enough to convince them, given the assumptions they grant you. But if you want to convince strangers, skeptics or opponents, the power of scientific rigour, meaning the use of sound experimental design, is quite the opposite of a “straitjacket”; it “frees” you to draw conclusions that will be convincing to the skeptic. But that may not be your goal, which is fine. You and I live in different worlds.

All the golden ears say that, …

Quote
Any reasonable attempt to "blind" the listener would prove convincing and a rigorous test with a statistically significant number of trials is unneeded. I guess I'm making assumptions about the goals of the test. Clearly you don't wish to publish it... is it for a purchase, or to bolster or discredit someone, or...?


That's what the golden ears say.

HEY! My ears are made of flesh and bone and have an appropriate, non-metallic color. So I don’t care what golden ears say. My comments stand on their own. If you are sure you know the result of a test without doing it, why do it? There is a world of room between no test and a scientifically valid one. I would choose a useful balance of convenience and rigour, depending on my goal. That may include sighted, or “sort of” blind, or definitely blind but only 5 trials… on and on, depending on the goal.

I say that claims of audibility are most easily judged for audible effects that have readily measurable relevant parameters, which among other things excludes lossy encoders. But it does include power amplifiers and cables.  For these things, the thresholds of hearing for easily measured artifacts are known or knowable. We've run a lot of these things through DBTs and we know what is clearly audible and what is clearly inaudible to a useful degree. Use measurements to judge them, because it is so fast and easy.

For everything else, in those cases where you have doubts that the audible effect is well-described by measurements, do DBTs.

I question your expertise in understanding auditory perception, but I’m open to being convinced! :-) By the way, you say “we” a lot; is that a royal “we”, or who are your collaborators?

Quote
(snip)
It has also led to the conversion of some of them (like Tom Nousaine) from Golden Earism to Science.
That’s a nice ABX story. I like it. *That* is a good use of ABX. Do you have any stories with *your* ABX tests?

How do you listen to an ABX test?

Reply #22
"a person claims" is exquisitly vague though. Who? Why do they claim it? Are they selling something? Is it published in a high-impact peer-reviewed journal? I guess I can say I typically wouldn't accept anyone's claim without a better understanding as to why they make the claim.

Is there a common scenario that would apply?

Everyone interested in audio who wants to make informed buying decisions will eventually stumble upon such claims in any audio forum. Even here on HA, where there are strict rules, such claims will appear occasionally. In many other audio forums you will see this as the rule rather than the exception - and asking for evidence can even get you banned or your thread moved into a marginalized section. Or look into some audio magazines: not just the ridiculous cable ads but the articles themselves contain such claims.

Those are the sources where the average audio-interested Joe will get his "information" from. And while repeated assertion does not make something true, people will still tend to believe it.

Of course. That makes sense. Here (HA) you need to provide proof of a claim. Elsewhere, I would, and would recommend, ignoring implausible, unverified claims. And to protect those who can't ignore it, we should criticize false claims. But when we do so, we must not make similar mistakes. Our counter-claims should not be equally flimsy. Doing an ABX, where you "know" the result before doing it, introduces its own biases. Publishing a null result when you "know" there is no audible difference is suspect, easily and correctly challenged, and only confuses the person we're trying to help.

How do you listen to an ABX test?

Reply #23
It's hard to reason a person out of a position that he/she did not reason into. That's one reason why I've mostly given up on making counter arguments.

The burden of proof is on the one making the claim. Ask for evidence, over and over and over again, until either the person weasels out, manages to censor you, admits to failing to provide it or actually provides it.
Only in the last case we can investigate further to see if the claim really is true.
"I hear it when I see it."

How do you listen to an ABX test?

Reply #24
I can easily distinguish two blatantly different sounds without a DBT, can't you?

That's not the problem.
What if a person claims that some lossy codecs cause blatant differences and therefore no DBT is needed - you should just accept his claim.
What if it is amplifiers? Or cables? Digital cables? Different audio players? Different buffering settings in audio players?


"A person claims" is exquisitely vague though.


Only true if you haven't seen it happen in many places at many times. Many of us have been there and done that. I am surprised that you are surprised... ;-)

Start here: Neil Young Hates MP3s for fun and profit

The highest-profile example of this sort of thing we've seen lately is probably the story of Neil Young, the Kickstarter web site, and the Pono digital player, which you are invited to Google.