ABX questions


2004-03-27 17:21:30

Hi, I have a few quick questions for the knowledgeable people here about the Hide Results button in computer ABX programs.

1) What was the original intention of the Hide Results button?

2) Should it be a personal choice whether to hide the results or not?

3) Which gives fairer results, to hide the results or not?

4) In properly designed listening tests, are the results hidden or shown during the test?

5) (off-topic) Shouldn't FILEABX be listed in the FAQ section as an ABX program?


Reply #1 – 2004-03-27 17:39:31

Quote

3) Which gives fairer results, to hide the results or not?

Depends on how your test is set up: Do you fix the number of total trials before you start the test, or do you decide when to stop based on your results? In the first case both tests yield the same statistical significance (showing the results is of no use to a guessing listener), while in the second case the confidence value is a little skewed.

(There are, however, ways to compensate for this effect, see "profiles" in the more or less recent statistics threads.)


Reply #2 – 2004-03-27 17:59:38

The point about hiding results during the test is this:

The p-values that all recent ABX programs show give you the correct "probability to reach the current score (or better) by guessing" only if the number of trials was fixed before the test started. If the tester is allowed to stop the test whenever he likes (e.g. when a certain p-value is reached), the real "probability ..." is bigger. I tried to explain this in the 1st post of this thread.
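To put numbers on this, here is a minimal Python sketch. The function names are my own and are not taken from any ABX program. p_value is the usual one-sided binomial tail for a guesser (50% per trial), and prob_guesser_ever_passes computes exactly how often a guesser would see a displayed p-value below alpha at some point, assuming he stops the moment that happens and gives up after max_trials.

```python
from math import comb


def p_value(correct: int, trials: int) -> float:
    """Probability of scoring `correct` or better in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def prob_guesser_ever_passes(alpha: float, max_trials: int) -> float:
    """Exact chance that a guesser sees p_value < alpha at some point within
    max_trials, assuming he stops the test as soon as that happens."""
    alive = {0: 1.0}  # correct count -> probability of paths not yet stopped
    passed = 0.0
    for n in range(1, max_trials + 1):
        # Advance every surviving path by one trial (right or wrong, 50/50).
        nxt = {}
        for k, pr in alive.items():
            for k2 in (k, k + 1):
                nxt[k2] = nxt.get(k2, 0.0) + pr / 2
        # Paths whose displayed p-value is now below alpha stop as "passed".
        alive = {}
        for k, pr in nxt.items():
            if p_value(k, n) < alpha:
                passed += pr
            else:
                alive[k] = pr
    return passed


print(p_value(7, 7))                        # 0.0078125
print(p_value(10, 11))                      # 0.005859375
print(prob_guesser_ever_passes(0.01, 11))   # 0.01123046875
```

So with a target of 0.01 and free stopping, a guesser already passes with probability 23/2048 (about 0.0112) within 11 trials, and the figure can only grow as more trials are allowed.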

One way to avoid this is to force the tester to enter the number of trials he's going to perform at the beginning of the test; another is to hide the results.

So the advantage of hiding the results is that the p-values are correct; the disadvantage is (obviously) that the tester gets no feedback and, e.g., might not notice quickly enough that he is becoming fatigued.

A rule-of-thumb workaround would be to "perform one trial extra", i.e. if you want to reach a "probability ..." of 0.01, you don't stop after reaching 7/7 or 10/11 (-> p-value < 0.01) but perform one additional trial, i.e. you stop if 8/8 or 11/12 is reached. This method works well enough for < 30 trials.

A safer method would be to use the little program I uploaded here to calculate "stop points" before you start the test (or use the examples from the post), e.g.:
You want to reach a "probability ..." of 0.01. Now depending on how the stop points are calculated (you have the choice: are you aiming for a quick success such as 7/7, or do you expect a hard fight, scoring e.g. 36/50 in the end?), the test will be finished successfully e.g. at

Quote

1. Stop point: (8/8) C-Value: 0.00390625
2. Stop point: (11/12) C-Value: 0.00585938
3. Stop point: (13/15) C-Value: 0.00769043
4. Stop point: (16/19) C-Value: 0.00844574
5. Stop point: (18/22) C-Value: 0.00913858
6. Stop point: (21/26) C-Value: 0.00942713
7. Stop point: (23/29) C-Value: 0.00969638
8. Stop point: (26/33) C-Value: 0.00981075
9. Stop point: (29/37) C-Value: 0.00986504
10. Stop point: (31/40) C-Value: 0.00991874
11. Stop point: (34/44) C-Value: 0.0099426
12. Stop point: (36/47) C-Value: 0.00996601

Or you give up without reaching one of those scores and the test is failed.
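For anyone who wants to check a stop-point list like the one quoted above, here is a rough Python sketch. It is not the uploaded program, and it rests on an assumption of mine: that "C-Value" means the cumulative probability that a pure guesser has hit that stop point or any earlier one. Under that reading it reproduces the first three quoted values. The function name is hypothetical.

```python
from fractions import Fraction


def cumulative_guess_probability(stop_points):
    """stop_points: list of (correct, trials) pairs with increasing trials.
    Returns, for each stop point, the exact probability that a guesser has
    reached it or any earlier stop point (assumed meaning of "C-Value")."""
    alive = {0: Fraction(1)}  # correct count -> probability, not yet stopped
    reached = Fraction(0)
    results = []
    n = 0
    for need, at_trial in stop_points:
        # Play forward to the trial number of this stop point.
        while n < at_trial:
            nxt = {}
            for k, pr in alive.items():
                for k2 in (k, k + 1):
                    nxt[k2] = nxt.get(k2, Fraction(0)) + pr / 2
            alive = nxt
            n += 1
        # Everyone with the required score (or better) stops successfully.
        reached += sum(pr for k, pr in alive.items() if k >= need)
        alive = {k: pr for k, pr in alive.items() if k < need}
        results.append(reached)
    return results


for value in cumulative_guess_probability([(8, 8), (11, 12), (13, 15)]):
    print(float(value))
# 0.00390625
# 0.005859375
# 0.0076904296875
```

The point of the list is that the last value stays below the target (0.01 here), so the whole set of stop points together is no easier for a guesser than a single fixed-length test at that level.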

I hope this answers most of your questions, otherwise feel free to ask.


Reply #3 – 2004-03-28 04:04:17

Thanks guys for your replies.


Reply #4 – 2004-03-28 06:32:43

One more quick question while we're at it

There are two flavors of WinABX. There is ABX and there is ABA. I think ABA might be better when it comes to avoiding the influence of fatigue, because with ABA you can reach the target p-value quicker than with ABX, so fatigue is less of a factor. What do you think?


Reply #5 – 2004-03-28 06:48:56

Quote

One more quick question while we're at it

There are two flavors of WinABX. There is ABX and there is ABA. I think ABA might be better when it comes to avoiding the influence of fatigue, because with ABA you can reach the target p-value quicker than with ABX, so fatigue is less of a factor. What do you think?

Possibly, although you might have to put in more effort to figure out which sample is the odd man out. For obvious artifacting, it is more efficient than ABX, but then again, there isn't much use in ABX'ing obvious artifacts anyway. It's debatable whether or not this is more effective for defects which are at the edge of perception. Try them both and decide which way works best for you.

ff123
