Re: Personal blind listening test – MultiCodec at ~192 VBR kbps
Reply #21 – 2020-10-08 11:40:26
Thanks very much, Igor, for this high-bit-rate test! It is very helpful for me, as the developer of exhale, since my hearing is not good enough anymore for testing such high rates myself. My 2 cents:

First cent

> but if xHE-AAC didn't bomb on Fatboy and got a 5 like AAC-LC, Musepack and Opus, it would've just come third. Again, we could make similar "what ifs" observations about all the encoders. It's just that "killer sample" - from your personal listening test - is not handled well by xHE-AAC compared to how well it handled all the other audio samples.

> I see your point. And Yes, we are here basically speculating. I'll try to put it this way. "What if" Opus was being lucky on those difficult samples? ... but on all of them?

IMHO these are very good observations and discussions, and both issues (one lower-quality score on exhale, all-identical scores on Opus) are the two extremes of the same problem (outliers), which proper statistical analysis addresses.

For those not familiar with such things: when your statistical sample size is small (as is the case here, 1 listener times 12 samples), it is important to consider possible effects caused by the small sample size, in order to avoid drawing conclusions which are not, let's say, statistically solid. Luckily, it's very easy to do a "robust" statistical analysis: there's a publicly available tool called Friedman by ff123, which Kamedo2, for example, regularly uses for his personal tests.
I fed this tool with a text file containing the first 13 lines of the text box that Kamedo2 posted yesterday:

MP3   xHE-AAC  Vorbis  AAC-LC  Musepack  Opus
3.5   5.0      5.0     5.0     5.0       5.0
4.5   4.0      4.6     5.0     5.0       5.0
3.8   4.7      4.8     5.0     5.0       5.0
5.0   5.0      5.0     5.0     5.0       5.0
4.2   4.8      4.9     4.9     4.9       5.0
4.2   4.9      4.9     5.0     5.0       5.0
4.8   5.0      5.0     4.7     4.9       5.0
5.0   5.0      5.0     5.0     5.0       5.0
5.0   5.0      5.0     5.0     5.0       5.0
4.7   5.0      4.8     5.0     5.0       5.0
4.8   5.0      5.0     4.9     4.8       5.0
4.0   4.9      4.8     4.8     4.8       5.0

No matter which option of Friedman.exe I choose (blocked ANOVA, Friedman/Fisher, Tukey's HSD), I always get this report:

AAC-LC is better than MP3
Musepack is better than MP3
Opus is better than MP3
Vorbis is better than MP3
xHE-AAC is better than MP3

Note that it does not say that Opus is better than xHE-AAC or Vorbis, and Kamedo2 privately confirmed to me that I'm using the tool correctly. That's what I mean by "robust statistical analysis". Essentially, Friedman.exe's comments on your discussion could be phrased as: yes, Opus may be "lucky" on at least one sample, and exhale may be "unlucky" on at least one sample (though it can't say which sample). It also means that the plots should be taken with a grain of salt, which is often the case. As Garf once put it on this forum: "Basically, the graphics suck, but they look cute"

Usually, when you add more samples (like Guruboolez and Kamedo2 did) or listeners (as was done in HA's recent public tests), those issues tend to disappear. Nonetheless, exhale's quality on Fatboy could be a bit higher even to my ears, which brings me to my

Second cent

Indeed, Igor, the reason is an excessive chain of transients in the first half of Fatboy. A few weeks ago I had a possible solution for this, but at the lower bit-rates it degraded the audio quality on some other samples, so I decided not to follow up on that issue. And, as Kamedo2 mentioned, at 96 kbit/s, where the overall audio quality is lower, the score on Fatboy doesn't degrade much more (3.5), see his personal test.
So I could try to "fix" Fatboy only for CVBR mode 8 or 9, but are people really using these modes much? This would be a fix specific to the first few seconds of that single sample; I don't know of any other sample where this issue occurs. If so, I could try to come up with a "test.m4a" file on the weekend to check whether the fix works.

Chris
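For readers who would like to see what a rank-based analysis of the score table above involves, here is a minimal Python sketch. It is not the code of Friedman.exe and does not reproduce its pairwise "X is better than Y" report; it only computes the textbook Friedman chi-square statistic (with the usual correction for tied scores) and the per-codec rank sums. The names `CODECS`, `SCORES`, `average_ranks` and `friedman` are made up for this illustration, and the threshold 11.07 is the standard 5% chi-square critical value for 5 degrees of freedom.

```python
from collections import Counter

# Scores copied from the table in the post: one row per sample,
# columns in the order listed in CODECS.
CODECS = ["MP3", "xHE-AAC", "Vorbis", "AAC-LC", "Musepack", "Opus"]
SCORES = [
    [3.5, 5.0, 5.0, 5.0, 5.0, 5.0],
    [4.5, 4.0, 4.6, 5.0, 5.0, 5.0],
    [3.8, 4.7, 4.8, 5.0, 5.0, 5.0],
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    [4.2, 4.8, 4.9, 4.9, 4.9, 5.0],
    [4.2, 4.9, 4.9, 5.0, 5.0, 5.0],
    [4.8, 5.0, 5.0, 4.7, 4.9, 5.0],
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    [4.7, 5.0, 4.8, 5.0, 5.0, 5.0],
    [4.8, 5.0, 5.0, 4.9, 4.8, 5.0],
    [4.0, 4.9, 4.8, 4.8, 4.8, 5.0],
]

# 5% critical value of the chi-square distribution, 6 codecs - 1 = 5 d.o.f.
CHI2_CRITICAL_5DF = 11.07


def average_ranks(row):
    """Rank one sample's scores (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and row[order[end + 1]] == row[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman(scores):
    """Return (rank sum per codec, tie-corrected Friedman chi-square statistic)."""
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(row) for row in scores]
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(k)]
    stat = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    # Tie correction; it would be zero (and the test undefined) only if
    # every sample had identical scores for all codecs.
    ties = sum(t ** 3 - t for row in scores for t in Counter(row).values())
    correction = 1.0 - ties / (n * (k ** 3 - k))
    return rank_sums, stat / correction


rank_sums, statistic = friedman(SCORES)
for codec, rank_sum in zip(CODECS, rank_sums):
    print(f"{codec:9s} rank sum {rank_sum:5.1f}")
print(f"Friedman statistic: {statistic:.2f} (5% critical value: {CHI2_CRITICAL_5DF})")
```

A statistic above the critical value only says that the codecs differ somewhere; deciding which pairs differ needs a post-hoc comparison such as the ones Friedman.exe offers, which this sketch leaves out. Note how the ranking treats each sample separately, so a single low score such as the 4.0 on one sample costs a codec a few rank points rather than dominating an average.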