List of typical problems and shortcomings of "common" audio t
Reply #1 – 2015-12-15 23:37:25
I guess you are talking about blind tests. My comments:

1) ABX software may just decode all files before the test even starts, so the problem is mainly a difference in the decoded format (sample rate, bit depth, possibly even channel count or PCM vs. DSD).

2) Yes, but you don't even need to forge logs. See spectrum analyzers.

3) I consider cutoffs part of the codec. Different codecs make different tradeoffs, but that's exactly what you want to test: whether there is an audible difference.

4) Yeah, and I think this is commonly overlooked.

5) Hmm, you mean like ABC/HR? Have you checked out the multiformat listening tests, their exclusion criteria (like getting N samples wrong) and their statistical analysis? It's pretty solid if you have enough people and samples.

7) There's been some discussion about p-values and ABX comparator results. Without some idea about statistics the results are indeed not trivial to understand.

That's why it is supposed to be a blind test: you shouldn't "see" which is which.

Another one: sometimes files are created by an incompatible codec (or even decoded by one, which would be point 1 again), which can e.g. result in added silence at the beginning.

Another big one: differences between the files that you didn't even want to test for. For example, there have been several invalid resampler tests in the past. The guys wanted to test whether the resampler's filter caused audible differences, but didn't notice that some resamplers introduced a time delay. On fast switching between tracks this could give away which is which. It could even have happened without the participants noticing it as such, just perceiving some vague difference when switching.

It's a tragedy when such mistakes are not detected during the preparation of the test but only much later, or not at all, and conclusions are still drawn as if it were a valid test. Some dishonest people won't even acknowledge the flaws...
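To make two of the points above a bit more concrete, here is a rough Python sketch, not taken from any particular ABX tool. `abx_p_value` is a hypothetical helper that computes the usual one-sided binomial p-value for an ABX score, i.e. the chance of doing at least that well by pure guessing. `estimate_delay` is an equally hypothetical brute-force cross-correlation that looks for a constant sample offset between two decoded files, the kind of unintended time delay mentioned for the resampler tests. Both assume you already have the trial counts and the decoded samples as plain Python numbers.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX result.

    Probability of getting at least `correct` answers right out of
    `trials` when every answer is a coin flip (p = 0.5).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def estimate_delay(reference, candidate, max_lag):
    """Estimate a constant offset (in samples) between two signals.

    Tries every lag in [-max_lag, max_lag] and returns the one with the
    highest cross-correlation. A positive result means `candidate` is
    delayed relative to `reference`. Brute force, so only suitable for
    short excerpts.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(candidate):
                score += r * candidate[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    # 12 correct out of 16 trials: p is roughly 0.038
    print(round(abx_p_value(12, 16), 3))

    # Made-up toy signals: the second is the first shifted by 3 samples.
    ref = [0, 0, 1, 2, 3, 0, 0, 0, 0, 0]
    cand = [0, 0, 0, 0, 0, 1, 2, 3, 0, 0]
    print(estimate_delay(ref, cand, max_lag=5))
```

A nonzero delay found this way before the test starts is a hint that the files should be time-aligned first, otherwise fast switching may reveal which is which regardless of what you actually wanted to test.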