
Topic: How do you listen to an ABX test? (Read 344480 times)


Reply #225
................
Especially those with strong pecuniary interests.

cheers,

AJ

You talk about my pecuniary interests. So you think that my posting on here has something to do with my €500 DACs (my most expensive product).
I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers


Reply #226
I think by his [jkeny] logic, if some student were to fill out a true/false test in school without reading the questions and simply checking all the "true" boxes, this would prove that true/false tests "don't work".

You obviously know nothing about how tests are designed & administered to answer a specific question.

In the student test, the grading is designed to distinguish true knowledge from random guessing: if he ticks all the "true" boxes he will fail, just as he will if he ticks all the "false" boxes, or randomly ticks true & false boxes.

Get your test design right to answer the question under examination - this concept is patently missing from your thinking about ABX testing.


Reply #227
Guessing "true" on every question in a true/false test, without even reading the question, is the same as guessing B on every trial of an ABX test, without really listening to X. In both instances it is the fastest way to complete a test when the only concern is how quickly it can mechanically be completed, given the delay of being forced to give a response and then clicking a "move on to the next trial" button. I've completed such tests in this forum for exactly that purpose, not caring what the score was, since my only purpose, at the time for THAT test, was to check how quickly a test can be completed given the physical demands of moving one's cursor from place to place and clicking when appropriate.

I've also shown the fastest I can do where I actually listened to X and had to think and process the sound to make a decision, and I demonstrated that I wasn't just randomly guessing, in THAT test, by getting a perfect score. That's not as fast, though, as just guessing.


Reply #228
You talk about my pecuniary interests.

Yep, you're "In the biz". The biz that ABX is very bad for.

So you think that my posting on here has something to do with my €500 DACs (my most expensive product).

Yes and the "organic" way they "sound"....free of ABX. You said so yourself.

I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers

Actually I just sold a $20k set...without resorting to nonsensical claims about magic power supplies, interleaving, gold alchemy and smell of flowers, etc.
Or pounding the drums of doubt/rejecting blind/ABX testing.

cheers,

AJ
Loudspeaker manufacturer


Reply #229
Guessing "true" on every question in a true/false test, without even reading the question, is the same as guessing B on every trial of an ABX test, without really listening to X. In both instances it is the fastest way to complete a test when the only concern is simply to determine how fast a test can mechanically be completed due to the delay in being forced to give a response and then clicking a "move on to the next trial" button. I've completed such tests in this forum for exactly that purpose, not caring what the score was since my only purpose was to test for how quickly a test can be completed due to the physical demands of moving one's cursor from place to place and clicking when appropriate.
Yes, & without your admission there is no way to know that you randomly guessed. ArnyK didn't admit this initially when he posted his null ABX test results - he only admitted it eventually when questioned on his timing (btw, he beat you for fastest random key hitter).

The point is that there is no way of knowing from the published test results whether they are simply random guesses that could have been completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken, i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.


Reply #230
. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.
Nope. You don't get it. Not hearing a difference, on that day, with that person, with that system, with that song, doesn't prove anything. It's not really evidence one way or the other. It could have been guessing, a poor song selection, a poor listener, almost anything. You can't prove a negative. It's only when a person gets a good score, where a strong statistical difference is shown, that you start to say you have a suggestion of having found some pay dirt. But I'm not going to waste my time trying to explain this concept to you. Bye.


Reply #231
The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.

Yes, exactly. And that is how it should be. The null result is the default assumption anyway, so it doesn't matter how many results are being piled in there. You see, it all makes perfect sense! 

There is no alternative anyway. Or do you have a way to look into people's heads to determine whether they were deliberately cheating or otherwise unfit?


Reply #232
You talk about my pecuniary interests.

Yep, you're "In the biz". The biz that ABX is very bad for.
I have no problem with anybody doing personal blind tests - it's truth for them. When it comes to presenting such "evidence" for public dissemination as "evidence" for a particular stance then more stringent criteria & examination are required. 

Quote
So you think that my posting on here has something to do with my €500 DACs (my most expensive product).

Yes and the "organic" way they "sound"....free of ABX. You said so yourself.
I never posted that here - you pulled it off my website & posted it here. Much like your statement "Our products reflect the philosophy that loudspeakers should strive to sound like the real thing." Who are you trying to fool? Maybe yourself? We all know that audio playback is not about trying to sound like the "real thing" - that's as silly as it gets. Nobody was ever fooled that listening to playback is like a live event. Audio playback is about creating an illusion - an illusion that appeals to our auditory perception as somewhat realistic. There is no system that comes anywhere near being mistaken for a live musical event.

Quote
I guess your $8,500 speakers are 17 times more likely to be a pecuniary motivation for you to post here & try to discredit anything other than speakers

Actually I just sold a $20k set...without resorting to nonsensical claims about magic power supplies, interleaving, gold alchemy and smell of flowers, etc.
Or pounding the drums of doubt/rejecting blind/ABX testing.

cheers,

AJ

Well done! I'm sure we would all be interested in the blind test results of your $20,000 speakers Vs your $1,600 speakers - I don't see any posted on your website. So what does the extra $18,400 deliver?


Reply #233
Quote
Quote

. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.
Nope. You don't get it. Not hearing a difference, on that day, with that person, with that system, with that song, doesn't prove anything. You can't prove a negative. But I'm not going to waste my time trying to explain this concept to you. Bye.
Yep, I get it. Treating all null results as valid & piling them into a block of evidence designed to create an edifice of damning evidence - this is what your tactic is all about.

The point is that there is no way of knowing from the published test results that they are simply just random guesses completed by a deaf monkey. Similarly for many other situations where the test isn't actually taken i.e. there is no serious attempt at listening involved. Yet all these tests are treated as valid & lumped into the null "evidence" pile.

Yes, exactly. And that is how it should be. The null result is the default assumption anyway, so it doesn't matter how many results are being piled in there. You see, it all makes perfect sense! 
No, it doesn't make sense unless you ignore all null results & don't use them as a body of evidence. Yes, it all makes perfect sense if you have a particular position you want to advance & want to use this test, which is obviously rife with experimenter's bias.

Quote
There is no alternative anyway. Or do you have a way to look into people's heads to determine whether they were deliberately cheating or otherwise unfit?

Well designed tests use pre-screening, pre-training & internal controls to eliminate as many issues as possible - this will eliminate some listeners/playback systems from the test. Internal controls are used in well-designed tests to catch problems within the test itself. For instance, let's say an ABX test was comparing high-res Vs RB, so we have two files, A & B. Randomly, in some trials, the software could introduce a difference of 0.5dB (or whatever is considered an agreed audible difference) in X. The listener, if he is doing the test correctly, should be able to identify this difference. This would verify that the listener is actually listening on every trial & isn't just randomly guessing, or too tired & losing focus. The software can report these control trials separately from the other trials. The expected result for these controls is known, & if the listener doesn't get them right then his other results should be discarded.
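The control-trial idea described in that post can be sketched in a few lines of Python. This is purely illustrative: the function names, the 25% control rate and the 90% control-accuracy cutoff are my own assumptions, and no existing ABX software is implied to work this way.

```python
import random

def run_session(listener, n_trials=20, control_rate=0.25, seed=0):
    """Run a mock ABX session in which a random subset of trials are
    hidden controls carrying a known audible difference (e.g. 0.5 dB)."""
    rng = random.Random(seed)
    real, controls = [], []
    for _ in range(n_trials):
        is_control = rng.random() < control_rate
        answered_correctly = listener(is_control)   # listener hears X, answers A or B
        (controls if is_control else real).append(answered_correctly)
    return real, controls

def evaluate(real, controls, min_control_accuracy=0.9):
    """Report the score on genuine trials, but only if the known-difference
    controls were passed; otherwise discard the whole session."""
    if controls and sum(controls) / len(controls) < min_control_accuracy:
        return None                 # listener wasn't really listening: session invalid
    return sum(real), len(real)     # (correct, total) on the genuine trials
```

An attentive listener who catches the planted level offsets keeps their score; a random guesser misses roughly half the controls and has the session thrown out, instead of contributing a spurious null to the pile.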


Reply #234
I have no problem with anybody doing personal blind tests - it's truth for them.

We do when it's a complete farce - a facade, kept up because of the derisive laughter that admitting the listening was sighted would elicit. "Adaptation" goes beyond the audible variety.
Now all sorts of audiophiles are doing "personal" "blind tests" like the one I linked earlier and, of course, your claimed ones. While Stereophile et al reject them.

When it comes to presenting such "evidence" for public dissemination as "evidence" for a particular stance then more stringent criteria & examination are required.

Yup. Unless dealing with those who still hear Santa Claus and get enraged by those who doubt them.

I never posted that here - you pulled it off my website & posted it here.

You said it, you own it. 

Much like your statement "Our products reflect the philosophy that loudspeakers should strive to sound like the real thing." Who are you trying to fool? Maybe yourself? We all know that audio playback is not about trying to sound like the "real thing" - that's as silly as it gets. Nobody was ever fooled that listening to playback is like a live event. Audio playback is about creating an illusion - an illusion that appeals to our auditory perception as somewhat realistic. There is no system that comes anywhere near being mistaken for a live musical event.

Quite a bit of blathering to say nothing. Must be an "In the biz" thing. 

Well done! I'm sure we would all be interested in the blind test results of your $20,000 speakers Vs your $1,600 speakers

To verify what claim vs said $1600 speakers? What would be ABX'd? Now blind tests are valid?

I don't see any posted on your website.

Right. No claims about organic and all that crap.

So what does the extra $18,400 deliver?

Some pretty large, pretty loud, pretty speakers. 

cheers,

AJ
Loudspeaker manufacturer


Reply #235
So what does the extra $18,400 deliver?

Oh yes, deeper bass, higher output, more adjustability, fully active (10ch amplification), larger soundstage (indirect drivers), 6 cabinets, in-home setup by me.
Loudspeaker manufacturer


Reply #236
So what does the extra $18,400 deliver?

Some pretty large, pretty loud, pretty speakers. 

cheers,

AJ

OK, got it - $18,400 for extra loudness (which I can get by turning up the volume) & some hi-fi jewellery, though I'd get better bang for my buck (literally) if I spent it on my wife's jewellery.


Reply #237
@jkeny: Let me make it easier for you to understand with a hypothetical;

You claim that your DAC sounds better than a much less expensive DAC. I know of no reason why it should, so my assumption is that it does not.

You take an ABX test and fail - no change, I still assume that they sound the same.

Ten million more people try to ABX a difference, and they also fail - no change, I still assume that they sound the same.

It turns out that all of them were guessing randomly, or that they were all monkeys - still no change, I still assume that they sound the same.

One person passes an ABX test convincingly - finally a change - my original assumption was incorrect, and there is an audible difference (although this does not necessarily prove that yours sounds "better").
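The asymmetry in this hypothetical - piles of nulls leave the default assumption untouched, while one convincing pass overturns it - ultimately rests on the binomial p-value of a score. A quick sketch in plain Python (my own illustration, not anything posted in the thread):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided p-value: the probability of scoring at least `correct`
    out of `trials` by pure guessing (chance = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A perfect 10/10 is very unlikely to be luck:
print(round(abx_p_value(10, 10), 5))   # 0.00098
# A chance-level 8/16 is exactly what guessing predicts, which is why
# a null result carries so little evidential weight on its own:
print(round(abx_p_value(8, 16), 3))    # 0.598
```

Guessing, whether by a person or a monkey, can only ever produce scores whose p-values are unremarkable; it cannot manufacture the convincing pass that would change the conclusion.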



Reply #239
@jkeny: Let me make it easier for you to understand with a hypothetical;

You claim that your DAC sounds better than a much less expensive DAC. I know of no reason why it should, so my assumption is that it does not.

You take an ABX test and fail - no change, I still assume that they sound the same.

Ten million more people try to ABX a difference, and they also fail - no change, I still assume that they sound the same.

It turns out that all of them were guessing randomly, or that they were all monkeys - still no change, I still assume that they sound the same.

One person passes an ABX test convincingly - finally a change - my original assumption was incorrect, and there is an audible difference (although this does not necessarily prove that yours sounds "better").

If you don't know what to listen for, either because it hasn't been pointed out to you in sighted listening or because you can't successfully identify it in sighted listening, why would you go into blind testing? Are you expecting something to jump out at you in blind testing that you haven't already identified in a sighted test?

I would consider this very optimistic. This is not the way to enter into a blind test.

So the accumulation of null results is really just senseless & serves only one purpose - to build a body of "evidence" to support a particular stance. If you don't hear a difference between my DAC & another then just state that as the case & choose the cheaper one.

This pretence of the ABX test bringing something "extra" to this is just bunkum. What will happen in the ABX test is what has already been posted here in a closed thread - knowing that they haven't heard any difference in sighted listening, people really don't bother to listen & just hit random keys; "life's too short" was the excuse given by one.


Reply #240
So you would make the assumption that all of those failed ABX tests were because the subjects didn't know what to listen for - fair enough. I have no problem with that. But that still has zero effect on my initial assumption that there is no audible difference.

The difference is that I have no ax to grind - I would be just as happy to have my assumption proven wrong as not. It is, however, in your best interest to prove a difference, so you are the one that will be pursuing a positive ABX, not me.


Reply #241
Jkeny, roughly how many foobar ABX tests have you taken? Just that one you posted on AVS?





 


Reply #242
So you would make the assumption that all of those failed ABX tests were because the subjects didn't know what to listen for - fair enough.
No, I wouldn't state that & never did. Lack of training is just one of the many reasons why a "false" null result could be returned.
Quote
I have no problem with that. But that still has zero effect on my initial assumption that there is no audible difference.
Yes, but you haven't tried to design the test so that a null result (no audible difference) is more likely to mean there is no ACTUAL AUDIBLE difference to be heard, rather than that a difference exists but is masked by bad test design.

Quote
The difference is that I have no ax to grind - I would be just as happy to have my assumption proven wrong as not. It is, however, in your best interest to prove a difference, so you are the one that will be pursuing a positive ABX, not me.

As I said, I have no problem with anyone doing their own personal blind test on my DACs as I have done & this is fine for me & for most people who want to make buying decisions. The rest of this is just people playing science & aping what they think the grown-ups do in their laboratories but not really understanding much of it


Reply #243
Jkeny, roughly how many foobar ABX tests have you taken? Just that one you posted on AVS?

Attempting another diversionary tactic.


Reply #244
As I thought it would seem. Just one.


Reply #245
So as is done on this thread, let's summarise:

1) - it's claimed that a null result "proves" nothing

2) - it is admitted that an accumulation of null results is a strong indication of there being no ACTUAL audible difference

3) - thus the number of null results has a direct bearing on how strong this indication is perceived to be - the higher the number the stronger the indication

4) - treating all null results as valid & piling them all into the valid null results pile is knowingly skewing the overall number of null results towards (2) & (3) above


Reply #246
As I thought it would seem. Just one.

Ah, the old logic fallacy game, eh? When did you stop beating your wife, then?

You have a habit of making claims that you then ask "the accused" to disprove.


Reply #247
I think by his [jkeny] logic, if some student were to fill out a true/false test in school without reading the questions and simply checking all the "true" boxes, this would prove that true/false tests "don't work" and that the entire test methodology should be scrapped.


Pretty much.

jkeny seems to think a lot of people would 'randomly guess' for nefarious purposes -- to somehow game the cumulative results toward negative. That seems far-fetched to me, given the ABX trials I've seen time and again here on HA (including many positives, even for something like 320 kbps mp3 vs source). And surely the eager audiophiles who find themselves at a loss in some of the more famous audio DBTs could not have been purposely gaming the results towards *negative*. When John Atkinson failed his own amp DBT, was it some brilliant ploy to discredit DBTs? Don't think so.

So the 'nefarious guessing' complaint seems rather silly... but when even 'award winning' audio DBT results from Meridian et al. yielded only a very, very small difference, under highly specified conditions, what's a DAC salesman to do? 

So let's leave jkeny's desperate argument for a moment, and take a look at one of HA's own 'best practices' guides.

When subjects find themselves 'guessing' on ABX tests, I aver that it's typically without nefarious intent.  I'm sure many who have ever taken one and found they can't with 100% confidence say X is A or B during a particular trial, have resorted to their 'gut' or 'best guess'.  I know I have.   

But our own author of HA's sticky post about ABX tests, written in 2003 (or so), would not approve  -- though not for jkeny's reason:

Quote
3. The p values given in the table linked above are valid only if the two following conditions are fulfilled :

-The listener must not know his results before the end of the test, except if the number of trials is decided before the test.

...otherwise, the listener would just have to look at his score after every answer, and decide to stop the test when, by chance, the p value goes low enough for him.

-The test is run for the first time. And if it is not the case, all previous results must be summed up in order to get the result.
Otherwise, one would just have to repeat the serial of trials as much times as needed for getting, by chance, a p value small enough.
Corollary : only give answers of which you are absolutely certain ! If you have the slightest doubt, don't answer anything. Take your time. Make pauses. You can stop the test and go on another day, but never try to guess by "intuition". If you make some mistakes, you will never have the occasion to do the test again, because anyone will be able to accuse you of making numbers tell what you want, by "starting again until it works".


(bold black emphasis mine)

As best I can tell, English is/was not Pio's first language, and I think the term 'absolutely certain' there is very unfortunate. I do get the point that if you find you are *only* guessing from the get-go, with utterly no feeling that there might be a difference, and your confidence does not increase during the test, you might consider the test pointless and should just stop -- you can't hear the difference. (Though one might ask, what if you can *unconsciously* 'sense' a difference (vide Oohashi, et al)? You won't 'know' unless you complete the test!) I also get the point that if you have become fatigued, you should stop and resume again when you feel sharp. That's all to maximize your chance of hearing a real difference. But if you're feeling aurally alert *yet* you find yourself perhaps less than 'absolutely certain' that X is A (or X is B) at some point, don't stop, just finish the test as best you can. I would bet that has happened to all of us.
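Pio's first condition - fix the number of trials in advance, or don't look at your score mid-test - is easy to demonstrate by simulation. In the sketch below (my own, in plain Python, with all names hypothetical) a pure guesser peeks at the running p-value after every answer and stops the moment it looks 'significant':

```python
import random
from math import comb

def p_value(correct, trials):
    # One-sided binomial tail: chance of at least this score by pure guessing.
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def peeking_listener(rng, max_trials=40, alpha=0.05):
    """A pure guesser who checks the running p-value after every answer
    and stops the moment it dips below alpha. Returns True on a 'pass'."""
    correct = 0
    for n in range(1, max_trials + 1):
        correct += rng.random() < 0.5      # coin-flip answer
        if p_value(correct, n) < alpha:
            return True                    # stopped early, claims success
    return False

rng = random.Random(42)
rate = sum(peeking_listener(rng) for _ in range(1000)) / 1000
# `rate` comes out well above the nominal 5% false-positive level.
```

With the threshold at 0.05, the guesser here 'passes' far more often than 5% of the time, so a reported p-value is only meaningful if the stopping rule was fixed before the test began.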


Pio's reasoning is laid out further:


Quote
Of course you can train yourself as much times as you whish, provided that you firmly decide beforehand that it will be a training session. If you get 50/50 during a training and then can't reproduce this result, too bad for you. the results of the training sessions must be thrown away whatever they are, and the results of the real test must be kept whatever they are.

Once again, if you take all the time needed, be it one week of efforts for only one answer, in order to get a positive result at the first attempt, your success will be mathematically unquestionable ! Only your hifi setup, or your blind test conditions may be disputed. If, on the other hand, you run again a test that once failed, because since then, your hifi setup was improved, or there was too much noise the first time, you can be sure that there will be someone, relying on statistic laws, to come and question your result. You will have done all this work in vain.



His points about training and  about picking/choosing results, and earlier (not shown), about  deciding beforehand on trial number,  are valid, but I don't think the logic extends to banning any response that involves the 'intuitive'.


I would also expand on this:

Quote
4. The test must be reproducible.
Anyone can post fake results. For example if someone sells thingies that improve the sound, like oil for CD jewel cases of cable sheath, he can very well pretend to have passed a double blind ABX test with p < 0.00001, so as to make people talk about his products.
If someone passes the test, others must check if this is possible, by passing the test in their turn.


Reproducing the positive result with *other* subjects is one sort of verification; having the original subject replicate the result, under monitored conditions, would be another.  Though of course, a 'difference' that was only EVER demonstrated with *one* subject (n=1), would not be terribly significant for the sorts of claims routinely made in audio-land.
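Pio's rule quoted earlier - "all previous results must be summed up" - also says how a replication should be combined with the original: pool the trials, don't cherry-pick the better run. A small sketch (the 12/16 and 13/16 scores are invented purely for illustration):

```python
from math import comb

def p_value(correct, trials):
    # One-sided binomial tail under the guessing hypothesis (p = 0.5 per trial).
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical scores: an original 12/16 run and a monitored
# replication of 13/16 by the same subject.
original = (12, 16)
replication = (13, 16)

# Reported alone, the original run is suggestive but not overwhelming:
p_orig = p_value(*original)                          # ~0.038
# Pooled per Pio's rule, the combined 25/32 is far stronger evidence:
p_pooled = p_value(original[0] + replication[0],
                   original[1] + replication[1])     # ~0.0011
```

Pooling is what makes the replication meaningful: two modest runs combine into one strong result, whereas reporting only the best of several attempts is the "starting again until it works" abuse Pio warns about.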


Reply #248
No, results presuppose that a "test" was actually taken. Take the ABX null results posted here by mzil, where he just randomly guessed - do you consider that he "took a test" & delivered "results"?

It doesn't matter. No one except him could tell the difference. That's why your whole idea of an invalid null result is bunk.

And let me add this: The fact that it doesn't matter is an important quality of such a test. It speaks for the ABX test method, and is a major factor in its usefulness.

Quote
If I sat a monkey down in front of the keyboard would I most likely get the same null results? Would you count the monkey's results among the null results?

I am not supposed to judge the monkey. If he produces a null result, I count it as a null result. If he produces a result that deviates significantly from chance, I count it accordingly. If I did anything else I'd be rigging the test.

Quote
Would you consider the results produced by a deaf person as a valid test? What about someone who demonstrated a hearing impairment in the audible area being tested? What about someone who has demonstrated that they are pre-biased to not hearing any differences? What about someone who is so tired that they aren't focussed? What about playback equipment that is unsuitable for revealing differences? Do I need to go on?

No, you needn't go on. I would count all of them as valid results. Doing anything else would put my own judgement above their results. I would effectively override their test results, thereby making the test invalid. Any test is invalid if the test administrator is allowed to override the test results of selected participants. Isn't that abundantly clear?

Now, it is true that a hearing test conducted with monkeys or deaf people might be regarded as pointless. That is an unfortunate consequence of a poor test design. If you aren't interested in the hearing abilities of monkeys or deaf people, you should exclude them from the test before the start. Once they are in, they are in - this doesn't make the test invalid. It may merely make it useless, depending on what the question was. This is so by design, I have to emphasize it again! It is not a fault or deficiency of ABX, quite the opposite, it is a major factor of its usefulness.

Quote
Yes, & the important word here is "result" - you have to decide what is a result & what should be eliminated from "results". In your statements you are including everything in results, even the deaf monkey's "results".

Yes, that is quite deliberately so. You are trying to ridicule my position, but you miss that it isn't ridiculous at all. It is the opposite: Your position puts the designer/administrator of the test into a position where he can manipulate the criteria for result acceptance to his liking, potentially even after the fact. That's what I would call invalid!

And you wonder why other people don't trust you as a test designer/administrator? Amusing!

Wow, I'm glad that you laid all this out in black & white for all to see - it really proves the experimenter bias that underlies the thinking by many DBT supporters on this forum.


Reply #249
Yep, I get it. Treating all null results as valid & piling them into a block of evidence designed to create an edifice of damning evidence - this is what your tactic is all about.

The null results are predominantly provided by the audiophiles themselves. There's very little need for a tactic here. It suffices to give them enough rope to hang themselves. They reliably will.

Quote
No, it doesn't make sense unless you ignore all null results & don't use them as a body of evidence. Yes, it all makes perfect sense if you have a particular position you want to advance & want to use this test, which is obviously rife with experimenter's bias.

I don't use the null results as a body of evidence. I've said that before, but you don't appear to take it seriously. I treat the null result as the expected result anyway, hence I don't need extra confirmation by more null results. This position is inherent in the test method, and needs to be.

The most damning evidence that confirms my position isn't the null results. It is the way audiophiles react to the null results. In other words how they completely fail to get the point of the tests, and how they come to misrepresent the test fundamentals, in a futile attempt to change the interpretation in their favor.

Quote
Well designed tests use pre-screening, pre-training & internal controls to eliminate as many issues as possible - this will eliminate some listeners/playback systems from the test. Internal controls are used in well-designed tests to catch problems within the test itself. For instance, let's say an ABX test was comparing high-res Vs RB, so we have two files, A & B. Randomly, in some trials, the software could introduce a difference of 0.5dB (or whatever is considered an agreed audible difference) in X. The listener, if he is doing the test correctly, should be able to identify this difference. This would verify that the listener is actually listening on every trial & isn't just randomly guessing, or too tired & losing focus. The software can report these control trials separately from the other trials. The expected result for these controls is known, & if the listener doesn't get them right then his other results should be discarded.

No problem. If you think you want to do a test like that, go ahead and do it. If it helps avoid wasted test effort, by avoiding null results, I'm all for it. It is definitely ok to do pre-test screening and remove inadequate testers. If you remove testers based on their performance in the actual test, I want to know exactly how that's determined in an impartial way, in order to be sure this isn't being used to skew the test.

However, be aware that such measures only have the effect you desire if the extra control trials measure the property you are after in the real test. For example, if you are trying to find the minimum detectable level difference, your control trials will introduce a known level difference. If you are looking for something entirely different, or even an unknown kind of difference, the control trials may not have any benefit at all.

However, don't expect too much of such a test. It won't magically yield the result you crave. Don't be surprised if it confirms the objectivist position. Be prepared to produce another null result which goes onto the pile.

Quote
Wow, I'm glad that you laid all this out in black & white for all to see - it really proves the experimenter bias that underlies the thinking by many DBT supporters on thsi forum.

You're welcome. Except it doesn't prove any bias. It just describes how ABX testing works. There is not and cannot be any symmetry between positive and negative results. Your expectations are completely off, and your insistence just shows your ignorance.

Now, please do me a favor and disseminate my statement as widely as possible as "proof" that the objectivists are biased. You show how tempting it is for you. It would only reinforce my conviction, and my amusement.