Can someone please upload Fraunhofer's 1994 encoder (l3enc 0.99) here? Roberto's original page expired.
ResampAudio.exe -s 8000 -f cutoff=0.087 -D A-law -F WAVE ha_aac_test_sample_2010.wav ha_aac_test_sample_2010_a-law8.wavResampAudio.exe -s 44100 -D integer16 -F WAVE ha_aac_test_sample_2010_a-law8.wav ha_aac_test_sample_2010_a-law.wavdel ha_aac_test_sample_2010_a-law8.wav
I don't understand why two low anchors would be needed. Wouldn't it better to let the "mid" anchor define where the the lower end of the scale is? Then there would possibly be a bit wider scale for the contenders. Ideally the low anchor would then get 0-3 and the contenders 2-5. IMHO, it would be enough that there is one low anchor that can be detected easier than the actual contenders.
Also, I don't understand why some old/mediocre MP3 encoder/setting would make a better low anchor than FAAC. FAAC would nicely represent the basis of the more developed AAC encoders. [...]
In general, it would be desirable that all encoders, including the low anchor, are easily available so that anyone can reproduce the test scenario (for verifying the authenticity of the results) or test different samples/encoders using/including the tested encoders and settings in order to get comparable personal results. Also the procedure to decode and split the test sample should be reproducible by anyone.
Ideally the low anchor would then get 0-3 and the contenders 2-5. IMHO, it would be enough that there is one low anchor that can be detected easier than the actual contenders.
As the name implies, the lower anchor shall define the lower end of the scale and should give the listeners an idea of what we mean by "bad quality" (range 0-1).
Use of two anchors follows the MUSHRA methodology and is an attempt at making the grading scale of this test more absolute. After all, all encoders in this test sound quite good compared to old/simple encoding techniques or lower bit rates. As the name implies, the lower anchor shall define the lower end of the scale and should give the listeners an idea of what we mean by "bad quality" (range 0-1). The hope then is that this reduces the confidence intervals (grade variance) for the other coders in the test, including the mid anchor (which should end up somewhere in the middle of the grading scale).
Actually, it seems it doesn't. In my first informal evaluation, I noticed that FAAC is tuned very differently than the other AAC encoders in the test (less pre-echo, more warbling), and it seems LAME@96kb emphasizes the artifacts of the codecs under test (pre-echo, warbling on tonal sounds, etc.) better than FAAC@64. Btw, the bandwidth of LAME@96 is close enough to the codecs under test (around 15 kHz).
The ITU-R five grade impairment scale that is used is between 1 (Very Annoying) and 5 (Imperceptible).Bad quality would be in range 1-2, probably closer to 1.
Of course the situation is different in a 128 kbps AAC test, but there is a danger that the two anchors will occupy the grades 1-4 and the actual contenders will get 4-5 and once again be more or less tied even though the testers actually could hear clear differences between the contenders.
I see. I didn't actually try to do that kind of complex cross-comparison so you know more about this than I. You could have posted the explanation earlier...
What do you think about getting GXLame in as a low anchor (or even a competitor in a non-AAC test)? It's a low-bitrate MP3 encoder, so it just might fit the bill somewhere between V0-V30 (V20 averages 96kbps and defaults to 44kHz).
faac.exe --shortctl 1 -c 15848 -q 50 -w ha_aac_test_sample_2010.wav
Can someone point me to an Intel MacOS X (fat) binary of the latter?
Quote from: The Sheep of DEATH on 27 April, 2010, 01:20:59 AMWhat do you think about getting GXLame in as a low anchor (or even a competitor in a non-AAC test)? It's a low-bitrate MP3 encoder, so it just might fit the bill somewhere between V0-V30 (V20 averages 96kbps and defaults to 44kHz).When I have time, I'll certainly blind-test GXLame against LAME (because I'm interested in your work). However, assuming GXLame sounds better than LAME at low bit rates, I still tend towards LAME as anchor for this test. Here's why: unlike the codecs under test, anchors are supposed to produce certain artifacts, not avoid them. Chris
Quote from: muaddib link=msg=0 date=Also it would be beneficial to create tutorial with each,single,small step that proper test must consist of.Do you mean a tutorial for the listeners on "what the rules are" and how to proceed before and during the test? That sounds good. Will be done.
Also it would be beneficial to create tutorial with each,single,small step that proper test must consist of.
If you're an experienced listener and feel that your approach to a high-bit-rate blind test is radically different from my recommendation, please let me know about the difference.
I actually don't care about ABX probabilities but simply mux encoded and raw audio into L and R channels so that I can hear both signals simultaneously.
and you might not hear certain artifacts which are clearly audible if you listen to both channels of the codec signal.