Regarding the sequential ABX test problems mentioned, I would be more comfortable with something like this: you always know your "classic" p-value, but the value you need to reach to pass the test with confidence depends on the number of trials performed. Since I'm not very good at statistics: would it be possible to calculate the required p-values (or something similar) as a function of the number of trials performed?
I know, but I would prefer an alternative method like the one I suggested, one that does not impose a fixed number of trials.
Quote: E.g. You want to reach 95% confidence (in the classical sense) and stop as soon as this condition is satisfied. Now the following are your win conditions: 5/5, 7/8, 9/11, 10/13, 12/16, 13/18, ... So the probability of passing this test by guessing is not just 0.05 but something like: P(5/5) + P(7/8 and not 5/5) + P(9/11 and neither 5/5 nor 7/8) + ..., which tends to 1.
Are you sure? It's counterintuitive to me (as are many statistics, but anyway).
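The win conditions listed above can be reproduced with a few lines of Python. This is only a sketch, assuming the "classical" test is a one-sided binomial test against pure guessing (p = 0.5) that passes when the p-value is at or below 0.05; the function names are made up for illustration.

```python
from math import comb

def p_value(correct, trials):
    """Chance of getting at least `correct` right in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct(trials, alpha=0.05):
    """Smallest score that reaches p <= alpha, or None if no score can."""
    for correct in range(trials + 1):
        if p_value(correct, trials) <= alpha:
            return correct
    return None

def win_conditions(max_trials, alpha=0.05):
    """Points where one more wrong answer becomes affordable, as (correct, trials)."""
    conditions = []
    most_wrong = -1
    for n in range(1, max_trials + 1):
        k = min_correct(n, alpha)
        if k is not None and n - k > most_wrong:
            most_wrong = n - k
            conditions.append((k, n))
    return conditions

print(win_conditions(18))
# [(5, 5), (7, 8), (9, 11), (10, 13), (12, 16), (13, 18)]
```

Each pair is the first trial count at which an additional mistake still leaves the classic p-value at or below 0.05.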
It's P(5/5) + P((7/8 or 8/8) and not 5/5) + P((9/11 or 10/11 or 11/11) and none of 5/5, 7/8, 8/8) + ...
The chances are interdependent: failure on the first condition influences success on the second one, and so on.
A silly test is to write a simulation that keeps guessing in ABX. If you are right, it has to pass eventually.
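Such a simulation could look roughly like the sketch below. It assumes a one-sided binomial test at 0.05 that is checked after every single trial, and it has to cap the number of trials somewhere (the `max_trials` parameter, like all names here, is just an illustrative choice). If the claim is right, the pass rate should keep growing as the cap is raised.

```python
import random
from math import comb

def p_value(correct, trials):
    """Chance of getting at least `correct` right in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def needed_correct(max_trials, alpha=0.05):
    """needed[n] = smallest score with p <= alpha after n trials (None if impossible)."""
    needed = [None] * (max_trials + 1)
    for n in range(1, max_trials + 1):
        for c in range(n + 1):
            if p_value(c, n) <= alpha:
                needed[n] = c
                break
    return needed

def pass_rate(max_trials, runs=10000, alpha=0.05, seed=1):
    """Fraction of pure guessers who hit p <= alpha at some point up to max_trials."""
    rng = random.Random(seed)
    needed = needed_correct(max_trials, alpha)
    passes = 0
    for _ in range(runs):
        correct = 0
        for n in range(1, max_trials + 1):
            if rng.random() < 0.5:  # a guess is right half of the time
                correct += 1
            if needed[n] is not None and correct >= needed[n]:
                passes += 1  # the guesser stops as soon as the test is "passed"
                break
    return passes / runs

for limit in (5, 30, 100):
    print(limit, pass_rate(limit))
```

With a cap of 5 the only way to pass is 5/5, so the rate should sit near 1/32; larger caps give the guesser more chances to stop at a lucky moment.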
A description of the formulas used to derive the two lines is in that monster statistics thread. I think we discarded this because it was less sensitive than the profile method. However, it does have the advantage of simplicity, and the number of trials doesn't have to be fixed beforehand.
IIRC, I had some reservations about it because I didn't understand the implied calculations. I suspect that they could be only approximations. But it would definitely be possible to construct a test of infinite length. The only problem is that the test would be really hard at later stages. (The above sum must converge to e.g. 0.05, so each following term has to be smaller and smaller.)
So we always needed to know the upper limit of trials involved.
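One possible way to get "a different p-value to reach, depending on the trials", once an upper limit is known, is sketched below. It is not the profile method or the two-line method from the statistics thread; it simply searches for a single stricter per-look threshold so that the overall chance of passing by guessing stays at or below 0.05. The function names and the bisection approach are assumptions made for illustration.

```python
from math import comb

def p_value(correct, trials):
    """Chance of getting at least `correct` right in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def pass_probability(max_trials, alpha=0.05):
    """Exact chance that a guesser sees p <= alpha at some point up to max_trials."""
    alive = {0: 1.0}  # score -> probability of that score without having passed yet
    passed = 0.0
    for n in range(1, max_trials + 1):
        nxt = {}
        for c, prob in alive.items():
            for c2 in (c, c + 1):  # next guess is wrong or right, 50/50
                nxt[c2] = nxt.get(c2, 0.0) + prob / 2
        alive = {}
        for c, prob in nxt.items():
            if p_value(c, n) <= alpha:
                passed += prob  # test passed here, listener stops
            else:
                alive[c] = prob
    return passed

def per_look_alpha(max_trials, overall=0.05):
    """Bisect for a per-look threshold that keeps the overall false-pass rate <= overall."""
    lo, hi = 0.0, overall
    for _ in range(40):
        mid = (lo + hi) / 2
        if pass_probability(max_trials, mid) <= overall:
            lo = mid
        else:
            hi = mid
    return lo

for limit in (10, 20, 30):
    print(limit, per_look_alpha(limit))
```

The listener would then watch the classic p-value as usual but only count the test as passed once it drops to the printed threshold for the agreed upper limit.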
Yes, there is a method. I think I posted a graph of it once here, but I must have deleted it earlier. Here it is again:
Example: The probability of passing a "traditional" 0.95 test by guessing, when one is allowed to stop at any point up to 30 trials, is 0.129! (You can test this with my Excel sheet from above.)
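For anyone without the Excel sheet, the same kind of figure can be checked with a short exact calculation. The sketch below assumes a one-sided binomial test that passes when p <= 0.05 and that the listener stops at the first trial where this happens; the sheet may define things slightly differently, so small deviations from the quoted number would not be surprising.

```python
from math import comb

def p_value(correct, trials):
    """Chance of getting at least `correct` right in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def pass_probability(max_trials, alpha=0.05):
    """Exact chance that a guesser sees p <= alpha at some point up to max_trials."""
    alive = {0: 1.0}  # score -> probability of that score without having passed yet
    passed = 0.0
    for n in range(1, max_trials + 1):
        nxt = {}
        for c, prob in alive.items():
            for c2 in (c, c + 1):  # next guess is wrong or right, 50/50
                nxt[c2] = nxt.get(c2, 0.0) + prob / 2
        alive = {}
        for c, prob in nxt.items():
            if p_value(c, n) <= alpha:
                passed += prob  # stop as soon as the win condition is met
            else:
                alive[c] = prob
    return passed

for limit in (5, 8, 30):
    print(limit, round(pass_probability(limit), 4))
```

Already at a limit of 8 the result is 1/32 + 5/256 = 13/256, slightly above 0.05, because 7/8 can be reached by sequences that missed 5/5.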