EAqual results for the AAC@128v2 listening test

Hello.

Jan S. gave me the idea of running the AAC@128kbps listening test through Eaqual to see how close ODG results would come to the real results provided by listeners. So here they are:

Code:
                Nero    Real    Faac    iTunes  Compaact
BigYellow       -0.53   -0.56   -0.46   -0.62   -0.67
BodyHeat        -0.63   -0.70   -0.62   -0.62   -0.90
DaFunk          -0.54   -0.55   -0.68   -0.52   -0.75
gone            -0.65   -0.58   -0.67   -0.56   -0.85
Hongroise       -0.30   -0.09   -0.55   -0.06   -0.23
Mahler          -0.63   -0.66   -1.03   -0.56   -0.95
mybloodrusts    -0.54   -0.84   -0.89   -0.84   -1.19
NewYorkCity     -0.85   -0.71   -0.54   -0.68   -0.94
OrdinaryWorld   -0.79   -0.76   -0.75   -0.76   -0.99
Quizas          -0.60   -0.55   -0.76   -0.53   -0.69
velvet          -0.85   -0.91   -1.48   -0.81   -0.66
Waiting         -0.73   -0.76   -0.80   -0.72   -1.10
              -----------------------------------------
Sum             -7.64   -7.67   -9.23   -7.28   -9.92    
Average         -0.636  -0.639  -0.769  -0.606  -0.826
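
For anyone who wants to recheck the Sum and Average rows, here is a minimal Python sketch. The numbers are copied from the table above; the names `CODECS`, `ODG` and `column_totals` are made up for this example and are not part of Eaqual.

```python
# ODG values copied from the table above, one row per sample,
# in the column order Nero, Real, Faac, iTunes, Compaact.
CODECS = ["Nero", "Real", "Faac", "iTunes", "Compaact"]
ODG = {
    "BigYellow":     [-0.53, -0.56, -0.46, -0.62, -0.67],
    "BodyHeat":      [-0.63, -0.70, -0.62, -0.62, -0.90],
    "DaFunk":        [-0.54, -0.55, -0.68, -0.52, -0.75],
    "gone":          [-0.65, -0.58, -0.67, -0.56, -0.85],
    "Hongroise":     [-0.30, -0.09, -0.55, -0.06, -0.23],
    "Mahler":        [-0.63, -0.66, -1.03, -0.56, -0.95],
    "mybloodrusts":  [-0.54, -0.84, -0.89, -0.84, -1.19],
    "NewYorkCity":   [-0.85, -0.71, -0.54, -0.68, -0.94],
    "OrdinaryWorld": [-0.79, -0.76, -0.75, -0.76, -0.99],
    "Quizas":        [-0.60, -0.55, -0.76, -0.53, -0.69],
    "velvet":        [-0.85, -0.91, -1.48, -0.81, -0.66],
    "Waiting":       [-0.73, -0.76, -0.80, -0.72, -1.10],
}


def column_totals(table):
    """Add up each codec's ODG over all samples."""
    totals = [0.0] * len(CODECS)
    for row in table.values():
        for k, value in enumerate(row):
            totals[k] += value
    return totals


sums = column_totals(ODG)
averages = [s / len(ODG) for s in sums]

for name, s, a in zip(CODECS, sums, averages):
    print(f"{name:10s} sum={s:.2f}  avg={a:.4f}")
```

The sums agree with the Sum row. The Average row looks truncated to three decimals rather than rounded (for Nero, -7.64 / 12 is about -0.6367, shown as -0.636).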


The Pearson correlation between the two data sets is 0.698884. Quoting ff123:

Quote
0.9 would be excellent correlation
0.7 would be a fairly good correlation
0.5 would be fair
0.3 would be weak

I expect about 0.7
   <-- The guy is Nostradamus reincarnated


If there is enough interest, I can do the same for a few other tests I conducted.

Regards;

Roberto.


Reply #2
Interesting.

The correlation coefficient you calculated was for 60 values (12 samples x 5 codecs)?

What happens if you separate the data by codecs or by samples?

ff123

Reply #3
Quote
The correlation coefficient you calculated was for 60 values (12 samples x 5 codecs)?

It was actually calculated on averages. (5 values)

Is it supposed to output a different coefficient if all the 60 values are used?

Quote
What happens if you separate the data by codecs or by samples?


I will try that next.

Reply #4
I would think it's best to use the 60 values in calculating the Pearson r rather than the 5 averages.

For those who might not know, the Pearson r is a measure of how linear the relationship is between human ratings and EAQUAL ratings.
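
As a rough illustration of that definition, here is a minimal Python sketch of the Pearson r. The function name `pearson_r` and all the numbers are hypothetical; none of them come from the listening test. It shows that an exact straight-line relationship gives r = 1 even when the two scales differ, while a curved but still monotonic relationship gives less than 1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sequence makes sxx or syy zero and r undefined.
    return sxy / sqrt(sxx * syy)


# Hypothetical listener scores on the 1-5 scale (illustration only).
listener_scores = [4.8, 4.5, 4.1, 3.6, 3.0]

# A made-up metric that is an exact straight-line function of the scores.
odg_linear = [0.5 * s - 2.9 for s in listener_scores]

# A made-up metric that ranks items identically but follows a curve.
odg_curved = [-(5.0 - s) ** 2 for s in listener_scores]

r_linear = pearson_r(listener_scores, odg_linear)
r_curved = pearson_r(listener_scores, odg_curved)

print(f"linear: {r_linear:.3f}")
print(f"curved: {r_curved:.3f}")
```

The same function can be fed either the 60 individual values or the 5 averages; the two results are generally not the same.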

ff123

Reply #5
Also, I would expect the Pearson r to be somewhat better for the 64 kbit/s test because more of the 1-5 rating scale is covered. Correlation tends to be lower when the range of rating values is restricted.
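
The range-restriction effect can be seen on contrived numbers. The sketch below is an illustration under stated assumptions, not data from any test: a hypothetical "true" score runs over the whole 1-5 scale, and a hypothetical prediction equals that score plus a fixed wobble of +/-0.4. The wobble is identical everywhere, yet r drops when only the top of the scale is kept.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical "true" scores from 1.0 to 5.0 in steps of 0.1 (41 items).
true_scores = [1.0 + 0.1 * i for i in range(41)]

# Hypothetical predictions: the true score plus an alternating +/-0.4 error.
predicted = [s + (0.4 if i % 2 == 0 else -0.4)
             for i, s in enumerate(true_scores)]

# Full 1-5 range.
r_full = pearson_r(true_scores, predicted)

# Restricted range: only the last 11 items, i.e. scores from 4.0 to 5.0.
r_restricted = pearson_r(true_scores[30:], predicted[30:])

print(f"full range:       {r_full:.2f}")
print(f"restricted range: {r_restricted:.2f}")
```

The prediction error is the same size in both cases; only the spread of the scores changes, and with it the share of the total variance that the linear trend explains.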

ff123

Reply #6
Maybe it is a coincidence, but your data seems to indicate that for better codecs (according to the listening test) the correlation is lower.


Well, in fact it would be logical.

Reply #7
Quote
Also, I would expect the Pearson r to be somewhat better for the 64 kbit/s test because more of the 1-5 rating scale is covered. Correlation tends to be lower when the range of rating values is restricted.

ff123

My humble guess is that it will be worse, since Eaqual seems to be format-biased AFAIC.
I recall Vorbis giving positive values, etc.

Edit: that is why I suggested testing AAC first.

Reply #8
Interesting. Can you also post the results for the other contending codecs (from the earlier test)? If you have PEAQ and Opticom Opera, could you post their results as well?

Reply #9
Quote
Interesting. Can you also post the results for the other contending codecs (from the earlier test)? If you have PEAQ and Opticom Opera, could you post their results as well?


I can try to produce results later. It's quite a pain to do so because there's no way to automate the process - you must run Eaqual on each single test stream - and besides you have to detect offsets...
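
On the offset problem: one common way to estimate how many samples a decoded file lags behind the reference is to look for the peak of their cross-correlation. The sketch below is only an illustration of that idea, not necessarily the method used for these results; `best_lag` and the toy signals are hypothetical, and the brute-force loop is practical only for short excerpts.

```python
def best_lag(ref, test, max_lag):
    """Return the lag (in samples) at which `test` best lines up with `ref`.

    A positive result means `test` is delayed relative to `ref`.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag, over the overlapping samples only.
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(test):
                score += r * test[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Toy signals: the "decoded" signal is the reference delayed by 2 samples.
reference = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0]
decoded = [0.0, 0.0] + reference

offset = best_lag(reference, decoded, max_lag=4)
print(offset)

# Drop the leading samples so both signals start at the same point.
aligned = decoded[offset:] if offset > 0 else decoded
```

Once the offset is known, the decoded stream can be trimmed (or the reference padded) before the two files are compared.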

PEAQ is almost the same thing as Eaqual, with the difference that it is much slower.

And sure, I will use Opera if you buy it for me. Last time I checked, it cost several thousand dollars.

Reply #10
If you calculate the correlation based on the five averages, all you can say is that the averages have a correlation of 0.7 - it says nothing about the correlation between the underlying data sets.

For example, take the following two negatively correlated data sets, made up of 2 listening sessions:
Code:
                 Encoder A    Encoder B
sample 1.1           5            7
sample 1.2           7            5
sample 1.3           6            6
sample 2.1           4            6
sample 2.2           5            5
sample 2.3           6            4
---------------------------------------
avg session 1        6            6
avg session 2        5            5


So here the encoders are negatively correlated (approx. -0.5), i.e. a good sample for encoder A is a bad sample for encoder B, but if you work on the averages, they appear perfectly positively correlated.

Ok, I know, the data sets are totally contrived, but the point still stands that correlations (and many other statistics) are not valid if performed on summary data (like averages).
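
The figures in this example can be checked with a short Python sketch. The six ratings per encoder are the ones from the table above; the helper names `pearson_r` and `session_means` are made up for the illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def session_means(values, size=3):
    """Average consecutive groups of `size` ratings (one group per session)."""
    return [sum(values[i:i + size]) / size
            for i in range(0, len(values), size)]


# Ratings from the table: samples 1.1-1.3, then samples 2.1-2.3.
encoder_a = [5, 7, 6, 4, 5, 6]
encoder_b = [7, 5, 6, 6, 5, 4]

# Correlation over the six individual ratings.
r_samples = pearson_r(encoder_a, encoder_b)

# Correlation over the two session averages only.
r_averages = pearson_r(session_means(encoder_a), session_means(encoder_b))

print(f"individual samples: {r_samples:.3f}")   # -0.455
print(f"session averages:   {r_averages:.3f}")  # 1.000
```

With only two averaged points per encoder, r can only come out as +1 or -1, so the averaged figure carries no information about how the individual ratings relate.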

This is a minor statistical point, but so much of what we do here relies on stats that we might as well try to get it right   

Anyway, love everything else you do,

regards,
Matt