Good luck with the test Igor.
Are you planning to use existing software (ABC/HR for Java) for the test or something new? ABC/HR's development is dead, unfortunately, and there were some problems in my last test requiring the installation of JRE 1.5 (which some people with JRE 1.6 found annoying).
Definitely AAC. Nero vs. new Nero alone will bring in a lot of interest. Add in Apple's latest, with its true VBR options. And it'd be interesting to see CT/Dolby in the mix, since it's been left out of so many AAC discussions. And I agree with you, it should be in the ~96k range.
An AAC test would be interesting indeed. But at 96 kbps or so, you would have to test both AAC LC and HE-AAC, since they offer about the same average quality. With about 4 encoders under test (Apple, 2x nero, Dolby, ...), this would give 8 codecs under test. That's too many in my opinion (risk of overload and fatigue).
How about 112 kbps or so? There you can be relatively sure that LC is better than HE on average, i.e. you would need to test only LC.
Chris
Regarding the samples, muaddib once had an idea to create a samples DB that should be divided into problem samples and regular samples. When preparing a test, one could pick X samples from that DB based on lottery numbers so that people don't complain that sample Y was selected with the purpose of letting encoder A appear better than encoder B. Additionally, samples collected especially for the respective test could be used like it was done in the past.
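To make the lottery idea concrete, here is a minimal sketch of how such a draw could work. The function name, the two-pool layout and the seed format are all assumptions for illustration, not anything muaddib specified: the publicly known lottery numbers are used as the seed, so anyone can rerun the draw and check that the organizer got the same samples.

```python
import random


def draw_samples(problem, regular, n_problem, n_regular, seed):
    """Draw samples from two pools using a publicly verifiable seed.

    `problem` and `regular` are lists of sample names (hypothetical DB layout).
    `seed` is a string built from the lottery numbers (assumed format).
    """
    rng = random.Random(seed)
    # Sort first so the result does not depend on the order in which
    # the DB happens to list the samples.
    picked_problem = rng.sample(sorted(problem), n_problem)
    picked_regular = rng.sample(sorted(regular), n_regular)
    return picked_problem, picked_regular


if __name__ == "__main__":
    # Made-up sample names and made-up lottery numbers, for illustration only.
    problem_pool = ["problem_01", "problem_02", "problem_03", "problem_04"]
    regular_pool = ["regular_01", "regular_02", "regular_03", "regular_04", "regular_05"]
    print(draw_samples(problem_pool, regular_pool, 2, 3, "4-8-15-16-23-42"))
```

The draw is only convincing if the sample list is frozen and published before the lottery numbers are known. Note also that the exact picks can differ between Python versions, so the version used would have to be announced as well.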
As far as I can see, people are interested in an AAC test and the following codecs: ... I think 4-5 codecs are already enough. Not more. What do you think?
Quote from: Sebastian Mares on 26 December, 2009, 01:36:00 PM
Good luck with the test Igor.

That sounded sarcastic.
4. Bitrates. 96-100 kbits/s? It's possible to perform the test at 128 kbits/s with very hard samples. Aemese-like samples will be very easy to distinguish from the original.
True VBR mode isn't available in the Windows version. Many people couldn't use it.
About Nero AAC: I don't see the point of testing the two last releases. Otherwise why wouldn't we test two different implementations of other competitors? If Nero developers have released this new encoder it's because it was tested, ready to use, and therefore solid and trustworthy.
Don't put too many competitors in the arena: the more encoders you have, the harder it is to rate them accurately. In the end, many contenders will only bring statistical noise and the test will end with no clear winner. And don't forget the anchors: they're really essential to avoid or limit discrepancies.
Ideally, and for a public listening test, I would go for 2 competitors and 2 anchors. But it won't be very attractive to many people. So 3 competitors and 2 anchors is probably the most doable configuration.
Agreed. But for completeness' sake: Fraunhofer is currently finalizing quality tunings on their encoder which have been going on for about two years. Release is scheduled for end of January. If there is any interest, I can ask if it's possible to provide an evaluation encoder for this test.
Chris
Quote from: IgorC on 26 December, 2009, 02:39:19 PM
True VBR mode isn't available in the Windows version. Many people couldn't use it.

It's available in Quicktime Pro.
Quote from: IgorC on 25 December, 2009, 10:06:34 PM
4. Bitrates. 96-100 kbits/s? It's possible to perform the test at 128 kbits/s with very hard samples. Aemese-like samples will be very easy to distinguish from the original.

If this is already clear enough, then forgive me, but I think the test needs to have a very defined and limited goal with respect to bit rates. Once you start adding parameters and/or choosing optional items, it can become quite unclear what is equivalent or fair from one encoder to another, especially if you are dealing with command-line vs. GUI.
Quote from: guruboolez on 26 December, 2009, 03:06:25 PM
Don't put too many competitors in the arena: the more encoders you have, the harder it is to rate them accurately. In the end, many contenders will only bring statistical noise and the test will end with no clear winner. And don't forget the anchors: they're really essential to avoid or limit discrepancies.
Ideally, and for a public listening test, I would go for 2 competitors and 2 anchors. But it won't be very attractive to many people. So 3 competitors and 2 anchors is probably the most doable configuration.

I can't disagree here. Also, Sebastian's public tests indicate that 3 (maybe 4; it should be discussed) competitors is a good balance, even more so taking into account that today's AAC encoders provide good enough quality at 100 kbps to make ABXing more difficult.
Possible high anchor:
1. LAME -V4 or -V3. I think -V5 is too risky to be a high anchor.
2. Nero or Apple 128 kbps
Possible low anchor:
In my opinion the low anchor shouldn't be that bad. LAME ~V7 (~100 kbps) or ABR 100.
So I guess, in the end, my vote goes out to a multi-format test featuring 1 carefully selected AAC codec, Vorbis (aoTuV?) and WMA (Professional?), with LAME at the same bitrate as low anchor (edit) and lossless original as high anchor.
In other words, I believe codecs have evolved to such a point where, competing against 96k AAC, imho, picking a high anchor is moot.
I would like to see Quicktime's new true VBR encoder on the test, even if it's a Mac OS X exclusive. Also having FAAC on the test as a low anchor would be interesting.
There is no reason to compare two Nero versions with other encoders. But many people here use the Nero encoder intensively and want to know if there is any improvement (more specifically in the quality area). You also partially compared two versions of Nero in your last test. I compared two versions of Nero too in my previous test. But there are many encoders to test, and switching to the latest Nero encoder would be a reasonable option. I will make a poll to see what people prefer.
High anchor. Do we really need it?
Imho, only to keep people from rating most (all?) competitors an unoriginal 4.5 score. As said, that may almost only be achieved by contrasting them to the original, lossless samples. But on the other hand, of course, that would make the test even more difficult to take.
Why not add the ffmpeg AAC encoder to this test?
It looks too strong for a low anchor. I'm also against a dramatically poor low anchor, but using one of the most advanced MP3 encoders at the same bitrate as the main contenders is something of a risk. But the comparison would be interesting, I confess... For the sake of the test I'd rather lower the bitrate or use a very old (ISO AAC maybe?) encoder at 100 kbps. Or maybe half the bitrate with HE-AAC(v2)?
- High anchor: Lossless original. Edit: no further high anchors to minimize listening time.
- Post-screening rules: Remove all listeners from analysis who a) graded the high anchor lower than 4.5, b) graded the low anchor higher than the high anchor.
What do you think?
Chris
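For what it's worth, the two post-screening rules could be applied mechanically along these lines. This is only a sketch under assumptions: the data layout (one score per listener per codec, e.g. a mean over samples) and the key names `high_anchor` and `low_anchor` are made up for illustration, and whether the rules should instead be applied per sample is left open.

```python
def post_screen(results, high="high_anchor", low="low_anchor", min_high=4.5):
    """Return only the listeners who pass both post-screening rules.

    `results` maps listener name -> {codec name: score}; this layout and
    the anchor key names are assumptions, not part of the proposal.
    """
    kept = {}
    for listener, scores in results.items():
        # Rule a) the high anchor was graded lower than 4.5.
        if scores[high] < min_high:
            continue
        # Rule b) the low anchor was graded higher than the high anchor.
        if scores[low] > scores[high]:
            continue
        kept[listener] = scores
    return kept


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {
        "listener_1": {"high_anchor": 5.0, "low_anchor": 2.1},
        "listener_2": {"high_anchor": 4.0, "low_anchor": 1.5},
        "listener_3": {"high_anchor": 4.5, "low_anchor": 4.8},
    }
    print(sorted(post_screen(example)))
```

Fixing the rules and the threshold before the results come in keeps the screening from being tuned toward a particular outcome.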