New Public Multiformat Listening Test (Jan 2014) Reply #150 – 2013-12-13 14:19:27

Quote from: Kamedo2 on 2013-12-11 20:06:25
> It's almost the same as the squashed version. You've said that "All the information about variability that you get from multiple listeners is forever gone", but I can say that data is not lost by the squashing.

This is a strange result to me. The multiple listeners per sample give you information on how stable each sample score is, i.e. they tell you the uncertainty on the rating of the samples. So you are concluding this information does not help to establish the uncertainty of the final scores? We know that squashing the scores means the error on those ratings becomes lower, but why does knowing the *distribution of the error* not help you in the conclusion?

Imagine I gave you a list of 30 samples, each of which had been listened to by 1M listeners, so the error on each score would be extremely small. Then I give you another list of 30 samples, each of which has been listened to by only 1 listener. Is your confidence in the (eventual) means of both lists really the same as long as the mean values are the same? That seems weird.

In the calculation of the variability of the eventual mean, I would expect a weighting term related to the per-sample error. The variability of the eventual mean (i.e. the spread of all samples over the codec average) should not increase as much when we add a sample whose mean could be pretty far off from reality as when we add a sample that we know we measured pretty accurately. I would also expect the mean itself to be weighted towards measurements with more certainty.
(This is pure intuition speaking - maybe there's a mathematical result that firmly explains why this isn't needed or correct.)

Maybe it doesn't end up mattering because the listener variability per sample, and hence the resulting variance, is actually fairly equal over all samples?

I want to think a bit more about this and play with some simulations, because it seems so strange.
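For what it's worth, the weighting term I describe above is essentially standard inverse-variance weighting. A minimal sketch (the function and its arguments are my own naming, not anything from the test's analysis):

```python
import numpy as np

def weighted_codec_mean(sample_means, sample_sems):
    """Inverse-variance weighted mean of per-sample scores.

    sample_means: estimated mean rating of each sample
    sample_sems:  standard error of each sample mean
                  (smaller = more listeners = more certainty)
    """
    w = 1.0 / np.asarray(sample_sems, dtype=float) ** 2
    mean = np.sum(w * np.asarray(sample_means)) / np.sum(w)
    # standard error of the weighted mean, assuming independent samples
    sem = np.sqrt(1.0 / np.sum(w))
    return mean, sem

# A sample rated by many listeners (tiny SEM) pulls the codec mean
# much harder than a sample rated by a single listener (large SEM).
m, s = weighted_codec_mean([4.0, 3.0], [0.05, 0.5])
```

Here the second sample barely moves the result, exactly because we know we measured it badly.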
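The kind of simulation I mean could start like this - a toy model (all parameter values invented for illustration) where listener scores are normal around a true per-sample mean, and we watch how much the "squashed" codec average fluctuates as the number of listeners per sample changes:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_samples=30, n_listeners=1, n_trials=2000,
             between_sd=0.5, listener_sd=0.8):
    """Spread of the codec average when each sample mean is
    estimated from n_listeners listeners (toy model)."""
    codec_means = []
    for _ in range(n_trials):
        # true per-sample means, scattered around a codec quality of 4.0
        true_means = 4.0 + rng.normal(0.0, between_sd, n_samples)
        # individual listener scores for each sample
        scores = true_means[:, None] + rng.normal(
            0.0, listener_sd, (n_samples, n_listeners))
        sample_means = scores.mean(axis=1)   # the "squashed" scores
        codec_means.append(sample_means.mean())
    return np.std(codec_means)

sd_1 = simulate(n_listeners=1)      # 1 listener per sample
sd_100 = simulate(n_listeners=100)  # 100 listeners per sample
```

If my intuition is right, sd_1 should be noticeably larger than sd_100, i.e. the per-sample measurement error does leak into the uncertainty of the codec average rather than vanishing in the squashing.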