So I ran Guruboolez's data through the analyzer again, but with Friedman analysis this time (in case of an ANOVA computation failure...).

However, MPC no longer wins over Megamix q6. This time, the analyzer says there is one chance in two of getting this result by chance!

I have simulated the addition of more results (i.e. samples) by simply duplicating the scores obtained for the first 10 samples.

With 70 results (= 10 x 7), the Friedman conclusion is:

http://ff123.net/

Friedman Analysis

Number of listeners: 70

Critical significance: 0.05

Significance of data: 0.00E+00 (highly significant)

Fisher's protected LSD for rank sums: 43.386

Ranksums:

MPC-q5   MGX-q6   MP3-V2   MGX-q5.9 MGX-q5.5 MP3-V3
385.00   343.00   220.50   213.50   185.50   122.50

---------------------------- p-value Matrix ---------------------------

         MGX-q6   MP3-V2   MGX-q5.9 MGX-q5.5 MP3-V3
MPC-q5   0.058    0.000*   0.000*   0.000*   0.000*
MGX-q6            0.000*   0.000*   0.000*   0.000*
MP3-V2                     0.752    0.114    0.000*
MGX-q5.9                            0.206    0.000*
MGX-q5.5                                     0.004*
-----------------------------------------------------------------------

MPC-q5 is better than MP3-V2, MGX-q5.99, MGX-q5.5, MP3-V3

MGX-q6 is better than MP3-V2, MGX-q5.99, MGX-q5.5, MP3-V3

MP3-V2 is better than MP3-V3

MGX-q5.99 is better than MP3-V3

MGX-q5.5 is better than MP3-V3

Even with the same batch of results repeated 7 times, MPC still can't be called better than Vorbis -Q 6 with confidence, even though 56 samples were better with MPC and only 14 were better with Vorbis... Weird.

Significance is reached only with 8 copies of the same results:

http://ff123.net/

Friedman Analysis

Number of listeners: 80

Critical significance: 0.05

Significance of data: 0.00E+00 (highly significant)

Fisher's protected LSD for rank sums: 46.381

Ranksums:

MPC-q5   MGX-q6   MP3-V2   MGX-q5.9 MGX-q5.5 MP3-V3
440.00   392.00   252.00   244.00   212.00   140.00

---------------------------- p-value Matrix ---------------------------

         MGX-q6   MP3-V2   MGX-q5.9 MGX-q5.5 MP3-V3
MPC-q5   0.043*   0.000*   0.000*   0.000*   0.000*
MGX-q6            0.000*   0.000*   0.000*   0.000*
MP3-V2                     0.735    0.091    0.000*
MGX-q5.9                            0.176    0.000*
MGX-q5.5                                     0.002*
-----------------------------------------------------------------------

MPC-q5 is better than MGX-q6, MP3-V2, MGX-q5.99, MGX-q5.5, MP3-V3

MGX-q6 is better than MP3-V2, MGX-q5.99, MGX-q5.5, MP3-V3

MP3-V2 is better than MP3-V3

MGX-q5.99 is better than MP3-V3

MGX-q5.5 is better than MP3-V3

Now, if I assume that the scores I had initially planned to add to the first batch of 10 results will not really differ from those first 10, I would need to find and test about 70 additional samples before I could claim that MPC is superior to Vorbis "megamix" -q 6,00 without risking banishment. Forget guruboolez's test: I have other things to do with my life.
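To see why duplicating the results helps so slowly, here is a rough Python sketch. It assumes the rank-sum LSD is a normal-approximation threshold, z * sqrt(N * k * (k + 1) / 6), with N listeners and k codecs. That formula is my inference from the printed figures (it gives back 43.386 for N = 70 and 46.381 for N = 80), not something taken from the analyzer's source. The function names and the per-batch difference of 6 (42 / 7 = 48 / 8) are only for illustration.

```python
import math

Z_CRIT = 1.959964  # two-sided 5% point of the standard normal distribution


def rank_sum_se(n_listeners, n_codecs):
    """Assumed standard error of a difference between two rank sums."""
    return math.sqrt(n_listeners * n_codecs * (n_codecs + 1) / 6)


def rank_sum_lsd(n_listeners, n_codecs):
    """Assumed least significant difference between two rank sums."""
    return Z_CRIT * rank_sum_se(n_listeners, n_codecs)


def pairwise_p(diff, n_listeners, n_codecs):
    """Two-sided p-value for a rank-sum difference (normal approximation)."""
    z = abs(diff) / rank_sum_se(n_listeners, n_codecs)
    return math.erfc(z / math.sqrt(2))


def first_significant_copies(diff_per_batch, batch_size, n_codecs, max_copies=100):
    """Smallest number of identical batches whose rank-sum gap exceeds the LSD."""
    for copies in range(1, max_copies + 1):
        n = batch_size * copies
        if diff_per_batch * copies > rank_sum_lsd(n, n_codecs):
            return copies
    return None


K = 6              # codecs in the test
DIFF_PER_10 = 6.0  # MPC-q5 minus MGX-q6 rank sum per batch of 10 results

for copies in (7, 8):
    n = 10 * copies
    diff = DIFF_PER_10 * copies
    print(n, round(rank_sum_lsd(n, K), 3), round(pairwise_p(diff, n, K), 3))

print(first_significant_copies(DIFF_PER_10, 10, K))
```

Under that assumption the gap between two codecs grows in proportion to the number of copies, while the LSD grows only with its square root, so a small per-batch difference needs many repetitions before it crosses the threshold.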

With ANOVA analysis, the situation is less pathetic:

http://ff123.net/

Blocked ANOVA analysis

Number of listeners: 20

Critical significance: 0.05

Significance of data: 0.00E+00 (highly significant)

---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total              119        90.75
Testers (blocks)    19         7.35
Codecs eval'd        5        52.03     10.41    31.50  0.00E+00
Error               95        31.38      0.33
---------------------------------------------------------------

Fisher's protected LSD for ANOVA: 0.361

Means:

MPC-q5   MGX-q6   MGX-q5.9 MP3-V2   MGX-q5.5 MP3-V3
3.82     3.15     2.34     2.30     2.23     1.88

---------------------------- p-value Matrix ---------------------------

         MGX-q6   MGX-q5.9 MP3-V2   MGX-q5.5 MP3-V3
MPC-q5   0.000*   0.000*   0.000*   0.000*   0.000*
MGX-q6            0.000*   0.000*   0.000*   0.000*
MGX-q5.9                   0.826    0.546    0.013*
MP3-V2                              0.701    0.023*
MGX-q5.5                                     0.057
-----------------------------------------------------------------------

MPC-q5 is better than MGX-q6, MGX-q5.99, MP3-V2, MGX-q5.5, MP3-V3

MGX-q6 is better than MGX-q5.99, MP3-V2, MGX-q5.5, MP3-V3

MGX-q5.99 is better than MP3-V3

MP3-V2 is better than MP3-V3

If the next 10 samples I test get the same scores as the first 10, then I could conclude that MPC is superior.
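For anyone who wants to check the ANOVA figures by hand, here is a small Python sketch that recomputes the mean squares, F and the LSD from the table above. It assumes the usual randomized-block formula LSD = t * sqrt(2 * MS_error / n); I have not checked this against the analyzer's code, but it gives back the printed 0.361. The t critical value (about 1.9853 for 95 degrees of freedom) is hard-coded from a t table, and the variable names are mine.

```python
import math

# Values copied from the ANOVA table above.
ss_codecs, df_codecs = 52.03, 5
ss_error, df_error = 31.38, 95
n_blocks = 20  # "Number of listeners" in the output

ms_codecs = ss_codecs / df_codecs  # mean square for codecs
ms_error = ss_error / df_error     # mean square for error
f_stat = ms_codecs / ms_error

# Assumption: two-sided 5% critical value of Student's t with 95 df,
# taken from a t table rather than computed here.
T_CRIT = 1.9853

# Assumed LSD formula for a difference between two codec means.
lsd = T_CRIT * math.sqrt(2 * ms_error / n_blocks)

means = {"MPC-q5": 3.82, "MGX-q6": 3.15, "MGX-q5.5": 2.23, "MP3-V3": 1.88}

print(round(ms_codecs, 2), round(ms_error, 2), round(f_stat, 2), round(lsd, 3))
print(means["MPC-q5"] - means["MGX-q6"] > lsd)    # gap of 0.67 between the top two
print(means["MGX-q5.5"] - means["MP3-V3"] > lsd)  # gap of 0.35 between the bottom two
```

The 0.67 gap between MPC-q5 and MGX-q6 is well above the LSD, while the 0.35 gap between MGX-q5.5 and MP3-V3 falls just short, which fits the 0.057 in the p-value matrix.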

May I suggest dropping the "Friedman/non-parametric Fisher" analysis for ABCHR scores? That could be helpful for testers...