Why does a single sample generate only one test trial?
2008-12-02 21:44:24

So I participated briefly in Sebastian's test (thanks!) - I picked castanets (by accident). Once I found like three different artifacts in a single encode, a really curious thing happened: the test was no longer blind, and my ratings were 99% subjective. Because I ABX'd everything, ABC/HR trusted everything I did, and I wound up trying to average ratings across all the artifacts to yield a final rating for that encode. There was real listening behind the numbers, but I'm not terribly happy with the results. And I probably missed listening to a few artifact/encoder combinations in some places, etc.

What my experience suggests to me is that, for encoder configurations which have "pointlike" artifacts - they do not continuously distort, but rather have a countable set of audible distortions - the tests could be reoriented to test those specific positions in the files, rather than the entire files. This happens for all modern codecs above 96 kbps. In practice, this means telling listeners exactly where the artifacts are supposed to be, and running ABX tests and ratings on an artifact-by-artifact basis rather than on a sample-by-sample basis.

I see the following advantages to this scheme:

- Far larger number of trials available from the same number of samples, for improved statistical power
- Lower total listening time for the same power of results
- Gentler learning curve for newbie listeners - "the artifact's right here, all you have to do is ABX it and rate it"
- Because the listener is no longer required to subjectively aggregate different artifact ratings into a single sample rating, statistical power should further improve.

My blinders are on right now as to exactly what the disadvantages to this scheme might be. Comments?
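
To put a rough number on the "more trials, more statistical power" point, here is a minimal sketch. It assumes each ABX trial is an independent 50/50 guess when the listener hears no difference, so the one-sided p-value is just a binomial tail. The function names and the trial counts are made up for illustration; this is not how ABC/HR itself scores anything.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by pure guessing.

    Assumption: trials are independent and a guess is right half the time.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the guessing outcomes with `correct` or more right answers.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials

def min_correct_for_significance(trials: int, alpha: float = 0.05):
    """Smallest number of correct answers whose p-value is <= alpha.

    Returns None if even a perfect score cannot reach alpha.
    """
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) <= alpha:
            return correct
    return None

# Hypothetical comparison: 8 trials on one whole sample versus
# 24 trials spread over three separately tested artifacts.
for n in (8, 24):
    print(n, min_correct_for_significance(n))
```

Under those assumptions, 8 trials need 7 correct (87.5%) to get under p = 0.05, while 24 trials need 17 correct (about 71%), so the same listener can afford more misses when there are more trials to draw on.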