[span style='font-size:14pt;line-height:100%'][u]TEST #4: --preset standard listening test[/u][/span]
This test is probably the most difficult one. I’m not even sure that it will reveal useful or significant results in the end. I know that the --standard preset is rarely fully transparent to my ears (see for example my own listening test performed last summer on non-critical samples), but in order to find those small (and not so small) differences, I generally need to be highly concentrated and to take many rests. Testing high-quality encodings therefore requires a lot of time, and even more motivation.
Currently, I’m not motivated enough to perform a complete and careful listening test, which would imply plenty of successful ABX tests. The following test consequently has some important limitations:
• ABX test results are not necessarily significant. To limit listening fatigue and loss of motivation, I’ve decided to limit the number of trials to 8 (rarely more, and in some cases fewer), which allows only one mistake if the result is to stay significant. The latest ABC/HR beta from ff123 doesn’t allow a second test: once the results are revealed to the user, the game is over. If the results are not good enough, or if you want to test another range or problem, you can’t launch a second test (except by using the training mode, which is enough for the tester but, I’m afraid, not for the reader). It means that the ratings I gave to the encoded files are not necessarily backed by ABX results.
• ABX comparisons of the encoded files against each other are missing too.
• When the differences between encodings were too small to be identified quickly, I gave the same rating to all similar files rather than losing time and patience hunting for tiny differences. In other words, some files get the same rating, but it doesn’t mean that the files are objectively or even perceptually (in some conditions) identical.
The 20 samples used for this test are unchanged: the two ff123 suites and a few additional samples. Hardware & software are also the same as before.
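To make the "one mistake only" remark concrete, here is a minimal Python sketch of the usual one-sided binomial calculation for an ABX session. The function name abx_p_value is mine, and I'm assuming the usual 5% threshold and pure 50/50 guessing as the null hypothesis; it is an illustration, not the exact computation done by ABC/HR.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    by pure guessing (one-sided binomial test, p = 0.5 per trial)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# With 8 trials, see how many mistakes keep the result under 5%.
for correct in (8, 7, 6):
    p = abx_p_value(correct, 8)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{correct}/8 correct: p = {p:.4f} ({verdict} at 5%)")
```

Under these assumptions 7/8 stays below 5% while 6/8 does not, which is why an 8-trial session tolerates a single mistake.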
[u]SETTINGS[/u]
For this difficult test, I had to drastically limit the number of challengers. The first idea was simply to pit 3.90.3 --alt-preset standard against 3.97a6 --preset standard (corresponding to -V 2). But I was not completely at ease with that. Recent tests at lower bitrates clearly revealed that --vbr-new (used by default for fast mode) performs better than the default VBR mode. Is it still true for higher-bitrate encoding? To answer this question, I’ve tested three settings instead of the two initially planned:
• lame 3.90.3 --alt-preset standard
• lame 3.97a6 -V 2
• lame 3.97a6 -V 2 --vbr-new
The average bitrate for the whole suite is 207 kbps (3.90.3), 201 kbps (3.97a6) & 202 kbps (3.97a6 --vbr-new).
[u]RESULTS[/u]
Sample            3.90.3      3.97a6      3.97a6
                  standard    -V 2        -V 2 vbr-new
ATrain            5.0         3.0         4.0
BachS1007         5.0         5.0         5.0
BeautySlept       3.5         2.0         2.5
Blackwater        4.5         3.0         4.5
castanets2        1.0         1.0         1.0
dogies            4.2         3.0         3.7
FloorEssence      3.5         2.5         4.5
fossiles          3.0         4.0         4.0
SinceAlways       3.0         2.0         3.7
Layla             3.0         4.0         4.5
LifeShatters      4.5         4.0         5.0
LisztBMinor       3.0         4.5         4.5
macabre           4.0         3.5         4.5
MidnightVoyage    4.0         3.5         4.0
Orion II (2.1)    4.0         3.0         2.5
rawhide           4.0         3.5         4.5
thear1            4.5         4.5         4.5
TheSource         3.0         2.5         4.0
Waiting           2.0         1.5         3.0
wayitis           3.0         2.5         4.0
Mean              3.59        3.13        3.90
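For anyone who wants to check the last row, the averages can be recomputed from the individual ratings with a few lines of Python. This is only a sketch: the scores are copied from the table, while the names SCORES, ENCODERS and column_means are mine.

```python
from statistics import mean

# Column order: 3.90.3 standard, 3.97a6 -V 2, 3.97a6 -V 2 --vbr-new
ENCODERS = ("3.90.3 --alt-preset standard", "3.97a6 -V 2", "3.97a6 -V 2 --vbr-new")

# Ratings copied from the results table (decimal commas written as points).
SCORES = {
    "ATrain": (5.0, 3.0, 4.0),
    "BachS1007": (5.0, 5.0, 5.0),
    "BeautySlept": (3.5, 2.0, 2.5),
    "Blackwater": (4.5, 3.0, 4.5),
    "castanets2": (1.0, 1.0, 1.0),
    "dogies": (4.2, 3.0, 3.7),
    "FloorEssence": (3.5, 2.5, 4.5),
    "fossiles": (3.0, 4.0, 4.0),
    "SinceAlways": (3.0, 2.0, 3.7),
    "Layla": (3.0, 4.0, 4.5),
    "LifeShatters": (4.5, 4.0, 5.0),
    "LisztBMinor": (3.0, 4.5, 4.5),
    "macabre": (4.0, 3.5, 4.5),
    "MidnightVoyage": (4.0, 3.5, 4.0),
    "Orion II (2.1)": (4.0, 3.0, 2.5),
    "rawhide": (4.0, 3.5, 4.5),
    "thear1": (4.5, 4.5, 4.5),
    "TheSource": (3.0, 2.5, 4.0),
    "Waiting": (2.0, 1.5, 3.0),
    "wayitis": (3.0, 2.5, 4.0),
}


def column_means(scores):
    """Average each encoder's ratings over all samples."""
    return [mean(column) for column in zip(*scores.values())]


MEANS = column_means(SCORES)
for name, value in zip(ENCODERS, MEANS):
    print(f"{name:30s} {value:.3f}")
```

The unrounded averages agree with the 3.59 / 3.13 / 3.90 shown in the table once rounded to two decimals.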
Click here for log files
[u]PERSONAL ANALYSIS of results[/u]
• First, I have to say that the test was not as difficult as I feared. The first results were encouraging enough to convince me to carry on and not give up, thanks to the problems heard with the -V 2 setting, which were obvious enough to be distinguished first from the reference and then from the other contenders.
• lame 3.90.3 --standard preset is obviously better than 3.97a6 --standard preset. There are a few exceptions (fossiles.wav & Layla.wav: pre-echo issues and some distortions [metallic colour] on percussive signals; LisztBminor.wav: ringing, fluctuating noise). Lame 3.90.3 is close to transparency, with minor problems and rarely annoying artifacts (except pre-echo, inherent to mp3 limitations).
• The --vbr-new switch represents (again) a big quality step compared to the standard -V 2 (3.97a6) profile. It sounded worse only once, with the micro-attacks sample (Orion II.wav). Compared to -V 2 --vbr-new, -V 2 suffers from a typical artifact: a lack of musical substance, which degrades precious detail in some instruments, mainly cymbals (false-sounding, and also pre-echo). Additional pre-echo is also noticeable, at least with non-critical samples (castanets2.wav is a case apart). The 3.97a6 -V 2 encodings were most often easier to ABX. In other words, --vbr-new mode leads to cleaner encodings, sharper sound and ringing-free files.
• lame 3.90.3 --standard is apparently inferior to lame 3.97a6 -V 2 --vbr-new. But I can’t be fully affirmative, for two reasons:
- Statistically, the Friedman analysis tool reaches another conclusion: according to it, both encoders are tied, despite the overall superiority of the alpha encoder. Even with a less strict significance level (10%), only the ANOVA analysis leads to the conclusion that 3.97a6 --vbr-new is superior.
- I’m a bit perplexed when I look at the table of results. At the beginning, lame 3.90.3 was always ranked better than (or identical to) its contenders (cf. green cells). But after 6 samples, lame 3.97a6 always appeared as the best (with two exceptions out of a total of 14 samples). It looks strange. It’s hard to believe that the samples most favorable to 3.90.3 are all grouped at the beginning of the series. Coincidence? Or did my sensitivity change during the test? Did my attention shift to other problems? It’s possible, I don’t really know.
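The order effect described in the second point can be put into numbers with a small Python sketch. It compares 3.90.3 with 3.97a6 --vbr-new on the first 6 samples and on the remaining 14, in the order of the results table. The split at 6 comes from the text above; the names OLD, NEW and summarize are mine, and how ties are counted is a choice, so the number of "exceptions" depends on it.

```python
# Ratings in table order (ATrain ... wayitis).
OLD = [5.0, 5.0, 3.5, 4.5, 1.0, 4.2, 3.5, 3.0, 3.0, 3.0,
       4.5, 3.0, 4.0, 4.0, 4.0, 4.0, 4.5, 3.0, 2.0, 3.0]  # 3.90.3 standard
NEW = [4.0, 5.0, 2.5, 4.5, 1.0, 3.7, 4.5, 4.0, 3.7, 4.5,
       5.0, 4.5, 4.5, 4.0, 2.5, 4.5, 4.5, 4.0, 3.0, 4.0]  # 3.97a6 -V 2 --vbr-new

# Positive difference = the --vbr-new encoding was rated higher.
DIFFS = [new - old for old, new in zip(OLD, NEW)]


def summarize(diffs):
    """Return (new rated higher, tied, old rated higher, mean difference)."""
    new_better = sum(d > 0 for d in diffs)
    tied = sum(d == 0 for d in diffs)
    old_better = sum(d < 0 for d in diffs)
    return new_better, tied, old_better, sum(diffs) / len(diffs)


FIRST = summarize(DIFFS[:6])   # samples 1 to 6
LAST = summarize(DIFFS[6:])    # samples 7 to 20

for label, (up, same, down, avg) in (("first 6", FIRST), ("last 14", LAST)):
    print(f"{label}: vbr-new higher {up}, tied {same}, 3.90.3 higher {down}, "
          f"mean difference {avg:+.2f}")
```

The sketch only describes the pattern; it cannot tell whether it comes from the samples themselves or from a change in the listener during the session.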
Anyway, at the end of this test, I’m sure about two things:
• lame 3.97a6 --preset standard (V2) has serious quality issues
• lame 3.97a6 --preset fast standard (V2 vbr-new) will replace lame 3.90.3 --preset standard for my personal use, whatever the chosen priority:
- security/quality: lame 3.97a6 seems to offer better quality than 3.90.3 most of the time
- speed: lame 3.97a6 --vbr-new is more than twice as fast as lame 3.90.3, approaching or surpassing the speed of modern audio formats like musepack, WMAPro, Vorbis or AAC (QT & Ahead)¹.
- efficiency: using lame 3.97a6 will reduce the size of most encodings, with better results than 3.90.3
[u]STATISTICAL ANALYSIS of results[/u]
Table:
3.90.3 3.97a6 3.97a6new
5.0 3.0 4.0
5.0 5.0 5.0
3.5 2.0 2.5
4.5 3.0 4.5
1.0 1.0 1.0
4.2 3.0 3.7
3.5 2.5 4.5
3.0 4.0 4.0
3.0 2.0 3.7
3.0 4.0 4.5
4.5 4.0 5.0
3.0 4.5 4.5
4.0 3.5 4.5
4.0 3.5 4.0
4.0 3.0 2.5
4.0 3.5 4.5
4.5 4.5 4.5
3.0 2.5 4.0
2.0 1.5 3.0
3.0 2.5 4.0
• ANOVA (5% significance level):
Number of listeners: 20
Critical significance: 0.05
Significance of data: 5.84E-004 (highly significant)
---------------------------- p-value Matrix ---------------------------
           3.90.3    3.97a6
3.97a6ne   0.096     0.000*
3.90.3               0.015*
-----------------------------------------------------------------------
3.97a6new is better than 3.97a6
3.90.3 is better than 3.97a6
• ANOVA (10% significance level):
Number of listeners: 20
Critical significance: 0.10
Significance of data: 5.84E-004 (highly significant)
3.97a6new is better than 3.90.3, 3.97a6
3.90.3 is better than 3.97a6
3.97a6 < 3.90.3 < 3.97a6 --vbr-new.
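For readers who want to see where the ANOVA figures come from, here is a rough Python sketch. I'm assuming the tool runs a blocked (two-way, no replication) ANOVA with samples as blocks and encoders as treatments; that is my guess at the model, not something stated in the tool output. It stops at the F statistic and the error mean square, because the p-value itself needs an F distribution that the standard library doesn't provide. All names (ROWS, blocked_anova) are mine.

```python
from math import sqrt

# One row per sample: (3.90.3, 3.97a6, 3.97a6 --vbr-new), copied from the table.
ROWS = [
    (5.0, 3.0, 4.0), (5.0, 5.0, 5.0), (3.5, 2.0, 2.5), (4.5, 3.0, 4.5),
    (1.0, 1.0, 1.0), (4.2, 3.0, 3.7), (3.5, 2.5, 4.5), (3.0, 4.0, 4.0),
    (3.0, 2.0, 3.7), (3.0, 4.0, 4.5), (4.5, 4.0, 5.0), (3.0, 4.5, 4.5),
    (4.0, 3.5, 4.5), (4.0, 3.5, 4.0), (4.0, 3.0, 2.5), (4.0, 3.5, 4.5),
    (4.5, 4.5, 4.5), (3.0, 2.5, 4.0), (2.0, 1.5, 3.0), (3.0, 2.5, 4.0),
]


def blocked_anova(rows):
    """Return (F statistic, error mean square, error degrees of freedom)."""
    n, k = len(rows), len(rows[0])            # n samples (blocks), k encoders
    grand = sum(sum(r) for r in rows) / (n * k)
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    row_means = [sum(r) / k for r in rows]
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_treat = n * sum((m - grand) ** 2 for m in col_means)
    ss_block = k * sum((m - grand) ** 2 for m in row_means)
    ss_error = ss_total - ss_treat - ss_block
    df_error = (k - 1) * (n - 1)
    mse = ss_error / df_error
    return (ss_treat / (k - 1)) / mse, mse, df_error


F_STAT, MSE, DF_ERROR = blocked_anova(ROWS)
SE_DIFF = sqrt(2 * MSE / len(ROWS))   # standard error of a difference of two means

print(f"F(2, {DF_ERROR}) = {F_STAT:.2f}, MSE = {MSE:.3f}")
for label, diff in (("vbr-new vs 3.90.3", 0.310),
                    ("vbr-new vs 3.97a6", 0.770),
                    ("3.90.3 vs 3.97a6", 0.460)):
    print(f"{label}: t = {diff / SE_DIFF:.2f}")
```

Under that assumption F comes out a little above 9 on (2, 38) degrees of freedom, which fits the "highly significant" verdict, and the three t ratios rank the pairs in the same order as the p-value matrix.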
• TUKEY PARAMETRIC (5% significance level):
Number of listeners: 20
Critical significance: 0.05
Tukey's HSD: 0.443
(…)
-------------------------- Difference Matrix --------------------------
           3.90.3    3.97a6
3.97a6ne   0.310     0.770*
3.90.3               0.460*
-----------------------------------------------------------------------
3.97a6new is better than 3.97a6
3.90.3 is better than 3.97a6
• TUKEY PARAMETRIC (10% significance level):
Number of listeners: 20
Critical significance: 0.10
Tukey's HSD: 0.384
(…)
3.97a6new is better than 3.97a6
3.90.3 is better than 3.97a6
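The Tukey verdicts follow a simple rule: a pair of encoders is declared different when the gap between their mean ratings exceeds the HSD value. A minimal Python sketch, using only the differences and HSD values printed above (the names DIFFERENCES, HSD and significant_pairs are mine):

```python
# Mean-rating differences from the Difference Matrix (first name is the higher one).
DIFFERENCES = {
    ("3.97a6new", "3.90.3"): 0.310,
    ("3.97a6new", "3.97a6"): 0.770,
    ("3.90.3", "3.97a6"): 0.460,
}

# Tukey's HSD as reported by the tool for each significance level.
HSD = {0.05: 0.443, 0.10: 0.384}


def significant_pairs(level):
    """Pairs whose mean difference exceeds the HSD at the given level."""
    return [pair for pair, diff in DIFFERENCES.items() if diff > HSD[level]]


for level in (0.05, 0.10):
    for better, worse in significant_pairs(level):
        print(f"{level:.2f}: {better} is better than {worse}")
```

At both levels the 0.310 gap between 3.97a6new and 3.90.3 stays below the HSD, so Tukey keeps those two tied, while the ANOVA at 10% does not.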
[span style='font-size:8pt;line-height:100%']¹ Speed comparison based on a single track (length: 20 minutes) on an AMD Duron 800 CPU:
• MP3 lame 3.90.3² --preset standard = x2.05 [188 kbps]
• MP3 lame 3.97a6² -V 2 --vbr-new = x5.21 [186 kbps]
• MPC musepack 1.14 --standard = x6.11 [174 kbps]
• OGG vorbis aoTuV beta 3² Q6 = x4.51 [182 kbps]
• AAC Ahead 2.9.9.999 ‘fast’ VBR ::normal:: = x2.95 [208 kbps]
• WMApro 9.1³ VBR single pass Q90 = x5.92 [196 kbps]
² John33 compilation.
³ through dBpowerAMP
[/span]
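As a quick sanity check of the "more than twice as fast" claim, the speed figures from the footnote can be turned into encoding times. This Python sketch assumes that "x2.05" means 2.05 times real-time playback speed, which is my reading of the notation; the names SPEEDS, TRACK_MINUTES and LAME_RATIO are mine.

```python
# Encoding speed as a multiple of real time (assumed meaning of the "x" figures).
SPEEDS = {
    "lame 3.90.3 --preset standard": 2.05,
    "lame 3.97a6 -V 2 --vbr-new": 5.21,
    "musepack 1.14 --standard": 6.11,
    "vorbis aoTuV beta 3 Q6": 4.51,
    "AAC Ahead 'fast' VBR normal": 2.95,
    "WMApro 9.1 VBR Q90": 5.92,
}

TRACK_MINUTES = 20  # length of the single test track

for name, speed in SPEEDS.items():
    print(f"{name:32s} x{speed:.2f} -> {TRACK_MINUTES / speed:5.2f} min")

# Relative speed of the two lame versions.
LAME_RATIO = SPEEDS["lame 3.97a6 -V 2 --vbr-new"] / SPEEDS["lame 3.90.3 --preset standard"]
print(f"3.97a6 --vbr-new is {LAME_RATIO:.2f} times as fast as 3.90.3")
```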