HydrogenAudio

Hydrogenaudio Forum => Listening Tests => Topic started by: rjamorim on 2003-08-22 07:52:47

Title: Pre-Test thread
Post by: rjamorim on 2003-08-22 07:52:47
Hello.

As most of you already know, I am planning to start a 64kbps public listening test in September.

So here are the planned details. Nothing is definitive so far:

The test starts on September 3rd and ends on September 14th.

The codecs that will be tested are:
- Ahead HE-AAC "Streaming :: Medium" VBR profile, high quality.
- Ogg Vorbis post-1.0 CVS -q 0
- MP3pro codec in Adobe Audition, VBR quality 35, high quality, m/s and is stereo, no CRC, no narrowing.
- WMAv9 Standard 64kbps (there's no PRO version at such bitrate, AFAIK)
- Real Audio Cook 64kbps (I didn't investigate other settings yet. Comments welcome)

The samples that will be tested have been announced on this (http://www.hydrogenaudio.org/forums/index.php?showtopic=12358) thread. If you have concerns/comments about the sample suite choice, please post there.

The test results will be calculated the same way my former tests were. I don't plan to include bitrates in the formula. Comments are welcome now (they are of no use to me after the test has been started).

I haven't decided about anchors yet (my guru is traveling), but someone suggested that I use LAME ABR 128 as higher anchor, so that we can verify which of these codecs really delivers on the marketing claim of "sounds like MP3 at half the bitrate".

Then, the lower anchor would be a standard 3.5kHz lowpass, like it is done on most formal tests at low bitrates.

So, I'd like to know your thoughts on what I planned.

Thanks for your attention;

Roberto.
Title: Pre-Test thread
Post by: elmar3rd on 2003-08-22 09:13:17
I wonder if mp3PRO VBR is reliable. FhG fastenc VBR seems to be badly tuned (that's what I read here). So maybe we should consider using CBR instead.
Title: Pre-Test thread
Post by: tigre on 2003-08-22 09:16:52
Quote
The codecs that will be tested are:
...

What about ATRAC 3 Plus (Net MD)? Should be interesting IMO.

Quote
...
someone suggested that I use LAME ABR 128 as higher anchor, so that we can verify which of these codecs really delivers on the marketing claim of "sounds like MP3 at half the bitrate"

Great idea.

Quote
Then, the lower anchor would be a standard 3.5kHz lowpass, like it is done on most formal tests at low bitrates.
It's not completely clear to me what this is good for. Are the lowpassed samples supposed to be rated "1" while LAME ABR 128 is "5"? To me some artifacts (ringing, "underwater-warbling") sound more annoying than even a 3.5kHz lowpass. If I got the concept of "anchors" right, why not use something like 64kbps MP3 as lower anchor (maybe not --alt-preset 64 but something "tuned" like CBR@full stereo + no resampling)?

EDIT: Just noticed the bitrates from the other thread:
MP3Pro :::::: 66.5   
HE-AAC ::::: 65.3   
Ogg Vorbis :: 61.8

I've suggested it before but didn't find any reply, so again (sorry):
What keeps you from adjusting Vorbis' -q setting to get a 65.x bitrate?
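For reference, average bitrates like the ones quoted above follow directly from encoded file size and sample duration. A minimal Python sketch, assuming the function names and any sizes or durations passed to them are hypothetical illustrations and not values from the thread:

```python
def average_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate of one file in kbps (1 kbps = 1000 bits per second)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_s / 1000


def mean_bitrate_kbps(files) -> float:
    """Duration-weighted mean bitrate over (size_bytes, duration_s) pairs."""
    total_bits = sum(size * 8 for size, _ in files)
    total_time = sum(duration for _, duration in files)
    if total_time <= 0:
        raise ValueError("total duration must be positive")
    return total_bits / total_time / 1000
```

Raising or lowering a quality setting until `mean_bitrate_kbps` lands near a target is the manual adjustment being proposed here; whether that is appropriate for the test is the open question.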
Title: Pre-Test thread
Post by: tigre on 2003-08-22 09:22:03
Quote
I wonder if mp3PRO VBR is reliable. FhG fastenc VBR seems to be badly tuned (that's what I read here). So maybe we should consider using CBR instead.

The tests at http://www.soundexpert.info/ give similar results @ 64kbps (CBR better than VBR). Here the Nero MP3Pro encoder was used. I have no idea if these results are reliable though.
Title: Pre-Test thread
Post by: elmar3rd on 2003-08-22 09:38:05
Beat me, but what about MP3 (LAME or fastenc) at 64 kbps mono?
OK, it's easy to ABX, but that's not the point in a 64 kbps-test. The question is, how annoying it is in comparison.
IMHO, 64 kbps mono really is a choice for streaming and some portable devices.
Title: Pre-Test thread
Post by: askoff on 2003-08-22 11:17:01
Is it possible to have QuickTime AAC files also in this test?

EDIT: And what about PNS for Nero? In a quick test I heard that it is a useful option at low bitrates.
Title: Pre-Test thread
Post by: LadFromDownUnder on 2003-08-22 11:44:25
With regard to the WMA codec (you are correct that there is no 64k Pro profile), are you intending to use 64k CBR, or 64k VBR (2 pass), Roberto?  CBR would better suit streaming, but given the other codec profiles, the VBR (2 pass) mode would be a fairer comparison.

Doug
Title: Pre-Test thread
Post by: rjamorim on 2003-08-22 16:59:21
Quote
I wonder if mp3PRO VBR is reliable. FhG fastenc VBR seems to be badly tuned (that's what I read here). So maybe we should consider using CBR instead.

Hrm, I really have no idea.

Quote
What about ATRAC 3 Plus (Net MD)? Should be interesting IMO.


Hrm... I don't know. Does it implement m/s or IS stereo? (Atrac3 doesn't)

And where is it used? AFAIK, not even minidisc units play it.

Besides, I had a very bad experience with SonicStage, and I'm not sure I want to try to install it on my fresh system :/ (My HDD recently crashed and I bought a new one, just finished reinstalling everything)

Quote
Are the lowpassed samples supposed to be rated "1" while lame ABR 128 is "5"


The point of using anchors is to put results into perspective, by applying a process that doesn't vary depending on sample complexity. A lowpass is the same for a sonata and for castanets.

That perspective is valid both for the test participant and for the final results.
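To illustrate why a lowpass anchor is content-independent, here is a minimal sketch of a fixed FIR low-pass filter in plain Python. The windowed-sinc design, the 101-tap length and the function names are assumptions made for illustration; the thread does not say how the anchor would actually be produced.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Hamming-windowed sinc FIR coefficients, normalised to unity DC gain."""
    fc = cutoff_hz / sample_rate_hz  # normalised cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response, with the x == 0 limit handled explicitly.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def apply_fir(samples, taps):
    """Direct-form convolution; the output has the same length as the input."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i - k
            if j >= 0:
                acc += t * samples[j]
        out.append(acc)
    return out
```

The same coefficients, for example `lowpass_taps(3500, 44100)` for CD-rate audio, are applied to every sample regardless of its content, which is what makes the result usable as a fixed reference point.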

Quote
If I got the concept of "anchors" right - why not use something like 64kbps mp3 as lower anchor (maybe not --alt-preset 64 but something "tuned" like CBR@full stereo + no resampling)?


Yes, that's another option. That's why I started this thread

Quote
I've suggested it before but didn't find any reply, so again (sorry):
What keeps you from adjusting Vorbis' -q setting to get a 65.x bitrate?


I don't know. I was under the impression that people would only use -q 0 for their encoding needs at 64kbps, so a test at -q 0.2 wouldn't be of much use to them. (I think)

But, indeed, it might end up being the best solution.

Quote
IMHO, 64 kbps mono really is a choice for streaming and some portable devices.


Stereo -> Mono downmixing is preprocessing, and no preprocessing should happen on these tests. (at most, fade in/out where the samples are cut)
So, if MP3 is tested, it must be on stereo. Or downmix for all encoders and do a mono test. Else, we're also comparing apples and oranges. (damn, I hate that metaphor)

Quote
Is it possible to have QuickTime AAC files also in this test?


Probably not. No matter how good it is, I doubt it can compete with AAC + SBR. I'm choosing the best codec for each format, and for MPEG4 audio at low bitrates, it's probably Ahead.

Quote
EDIT: And what about PNS for Nero? In a quick test I heard that it is a useful option at low bitrates.


True, but I don't think it would be good for HE AAC. PNS is only applied to the AAC part, so, according to Menno, there might appear a weird "hole" in the frequencies.

Still, I'll ask Ivan about his thoughts on using PNS.

Quote
With regard to the WMA codec (you are correct that there is no 64k Pro profile), are you intending to use 64k CBR, or 64k VBR (2 pass), Roberto? CBR would better suit streaming, but given the other codec profiles, the VBR (2 pass) mode would be a fairer comparison.


Yes, I will probably go with VBR. Forgot to mention that in the first post.

Thanks a lot for your comments.

Best regards;

Roberto.
Title: Pre-Test thread
Post by: bond on 2003-08-22 17:44:41
thanks for this test rjamorim!

as vorbis 1.0.1 should be released in september perhaps you should use this version for your test

Quote
Real Audio Cook 64kbps (I didn't investigate other settings yet. Comments welcome)

i am sure if you ask karl lillevold (the guy from real on doom9) he will find out which realaudio codec (cook, atrac.. + last versions available) and which settings should be used to reach the best results with realaudio at 64kbps
Title: Pre-Test thread
Post by: tigre on 2003-08-22 17:47:33
Quote
Quote
What about ATRAC 3 Plus (Net MD)? Should be interesting IMO.


Hrm... I don't know. Does it implement m/s or IS stereo? (Atrac3 doesn't)

I suggested ATRAC3Plus because it seems to perform quite well at 64kbps at www.soundexpert.info listening test so far - and it has similar hardware support as Ogg Vorbis ATM. 

The most detailed technical descriptions I could find are here (http://www.sony.net/Products/ATRAC3/atrac/atrac3plus/index.html) and here (http://www.sony.net/Products/ATRAC3/atrac/atrac3plus/index1.html).

Seems like the newest big improvement is the use of dynamic bit allocation between channels (full stereo, no m/s or is) instead of double mono. 

As this format is even more closed than WMA (no plugins for encoders/players available etc.) and hardware players are (and will be) only available by one company I'd say it's a waste of resources to include it in the test.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-23 03:06:33
Quote
as vorbis 1.0.1 should be released in september perhaps you should use this version for your test

Well, there are some things I would need to know first.

First, will it happen at the beginning of September or near the end?

Second, will it include updates that make it worth the wait? I.e., will coding at 64kbps and nearby bitrates be improved?

I can wait for a new release, no problem, but if it deals with other issues, I see no reason to postpone.

And, of course, I'm expecting they will deliver the update on time. :B

Quote
i am sure if you ask karl lillevold (the guy from real on doom9) he will find out which realaudio codec (cook, atrac.. + last versions available) and which settings should be used to reach the best results with realaudio at 64kbps


I already talked to him on other tests (he actually participated on both), and he said he would be able to help me.

Atrac3 won't be a good option. It doesn't even offer m/s stereo coding - all streams are encoded like Dual Mono. I would expect such a codec to be even worse than MP3 at these bitrates.

Regards;

Roberto.
Title: Pre-Test thread
Post by: kotrtim on 2003-08-23 03:43:40
Quote
WMAv9 Standard 64kbps (there's no PRO version at such bitrate, AFAIK)

Why not use quality 10 to 25 of WMAPRO 9 44 kHz, 2ch, 24-bit?

I've tested this before: quality 10 at ~50 kbps sounds nicer than the standard 64kbps.
At quality 25 the size will be ~64 kbps, but most of the time a few kb smaller than WMA9 standard 64 kbps.
Title: Pre-Test thread
Post by: Dibrom on 2003-08-23 03:44:18
Quote
What keeps you from adjusting Vorbis' -q setting to get a 65.x bitrate?

I don't think it really makes sense to do this because it doesn't mirror a real world usage scenario.  People are not going to use a different -q setting per sample to reach a set bitrate every single time they encode a different file.  Instead, they pick a quality setting and stick with it.  It may turn out that on this sample set, Vorbis averages a little low, but on another sample set, it's going to be the opposite.  Given this and the fact that -q0 is widely recognized as giving "64kbps" (even the Xiph guys seem to support this on a wide scale), this is the setting that IMO should be used.

As Roberto pointed out earlier, and I agree, I think that adjusting the settings away from the common incarnations, just to "set" the bitrate on this test, calls into question its credibility.  IMO, it's one thing to adjust settings downward to try and reach a set bitrate (as was done with MPC in the previous test), since this should only really have the effect of worsening the results, but it's entirely different to adjust the settings upward to compensate for a lack of accuracy in encoding.  If Vorbis, or any of these other codecs, happen to use too few bits per sample in their VBR modes in any given case, that points to a possible flaw in the encoding scheme and any adjustment around this just goes to hide the very issues that we are trying to discern in the first place.

This test is about quality, and the test subjects are VBR coders.  The points of the test are to measure fidelity at a given quality mode, with bitrate being used only as a rough guideline and mode of classification (not implicit comparison).  People should realize that very important fact and accept the implications that come along with it (possible VBR pitfalls) in the context of this test.  And finally, in the test results, we should be interested in the representation of -- and significance in relation to -- real world usage scenarios rather than technicalities beyond the concerns of the majority of the readers (something like using -q0.x vs -q0).
Title: Pre-Test thread
Post by: rjamorim on 2003-08-23 03:48:30
Quote
I suggested ATRAC3Plus because it seems to perform quite well at 64kbps at www.soundexpert.info listening test so far - and it has similar hardware support as Ogg Vorbis ATM. 

Well, even though it has some hardware support, it's barely usable anywhere else. You can only use it on Windows, using that $%#&! SonicStage software, and there's DRM.

Even though Vorbis still lacks good hardware support, you can encode and play it almost everywhere, software and operating system-wise.

Quote
Seems like the newest big improvement is the use of dynamic bit allocation between channels (full stereo, no m/s or is) instead of double mono. 


Man, what's wrong with these Sony engineers? :B

Channel redundancy coding is not rocket science, it's been around in audio formats for more than 15 years now.

Quote
As this format is even more closed than WMA (no plugins for encoders/players available etc.) and hardware players are (and will be) only available by one company I'd say it's a waste of resources to include it in the test.


I would also agree with that. Heck, even VQF probably still has a bigger user base than Atrac3 plus.

Regards;

Roberto.
Title: Pre-Test thread
Post by: ExUser on 2003-08-23 04:36:17
Quote
As Roberto pointed out earlier, and I agree, I think that adjusting the settings away from the common incarnations, just to "set" the bitrate on this test, calls into question its credibility.

Not at all. This should give the same result as 2-pass ABR, just done manually, instead of automatically.

I do understand where Roberto and co. are coming from and saying that the mode used should be the VBR mode that produces an average of "x" kbps for a broad spectrum of music. The problem is that we're not testing that broad spectrum of music. We're testing  several individual samples. In terms of equality, ABR works much better for a small, focused listening test like this one, merely because it provides both a single, standard testing methodology, and uniform bitrates across samples.

The tester should make every effort to provide as little difference between codecs as possible. Roberto did not do that with the 128 test. Instead, he threw together a mismatch of VBR and ABR, rationalizing his use of VBR with the broad-spectrum tests. Here I see ABR is going to be put up against VBR and CBR, if I understand correctly. This seems even more ludicrous than before. I understand the reasons behind it, that there's no quick and easy way to solve this problem, but it remains a problem. It remains an inequality between codecs. This is the problem I have with the theory behind these tests.

I understand that some people do not agree with me. But that is where I stand, and there seem to be people that agree with me. *shrugs*
Title: Pre-Test thread
Post by: wildboar on 2003-08-23 05:01:32
Quote
The tester should make every effort to provide as little difference between codecs as possible. Roberto did not do that with the 128 test.

On the contrary.  The tester should make every effort possible to represent each codec in the state in which it was designed to function, whether it be ABR, CBR, VBR.  You don't lock any particular codec into a certain mode just to make it "fair."

If you were testing the abilities of different supercars would you disable the front wheels of the 4 wheel drive models just because there were also entries that only had rear-wheel drive?  I don't think so.

This is the only way to conduct a test which best represents real world applications.
Title: Pre-Test thread
Post by: Dibrom on 2003-08-23 05:05:30
Quote
Quote
As Roberto pointed out earlier, and I agree, I think that adjusting the settings away from the common incarnations, just to "set" the bitrate on this test, calls into question its credibility.

Not at all. This should give the same result as 2-pass ABR, just done manually, instead of automatically.

I understand that some people do not agree with me. But that is where I stand, and there seem to be people that agree with me. *shrugs*

You seem to have missed the main point of my post:

This test should reflect a real world usage scenario as much as possible.

Artificially tuning the settings does not do this, not to mention that 2-pass ABR doesn't even exist for these codecs (with the exception of WMA maybe? even so, this is not the case being called into question).

Quote
I do understand where Roberto and co. are coming from and saying that the mode used should be the VBR mode that produces an average of "x" kbps for a broad spectrum of music. The problem is that we're not testing that broad spectrum of music.  We're testing  several individual samples.


This is arguable.  We are testing several individual samples comprising a broad spectrum of music.  The two conditions are not mutually exclusive.  Granted we are not testing every type of music, but that is impossible in any case, and if we're going to nitpick on that point, then we might as well forgo the test entirely.

No matter what the case here, people are going to have to accept that these codecs are not being tested under every condition (genre of music, or even sample for that matter) possible.  That means that the results, same as with all tests, have to be taken with a grain of salt, and within context.

Quote
In terms of equality, ABR works much better for a small, focused listening test like this one, merely because it provides both a single, standard testing methodology, and uniform bitrates across samples.


I disagree.  For one, most ABR modes are based upon VBR (so they share many possible flaws).  They are further encumbered though by catering to bitrate first, and quality second (or quality restrained or superseded by bitrate if you prefer).

As I understand it, simply by virtue of emphasizing VBR as the method of choice in this test, the point is to measure quality first and bitrate second here.  I see no implicit virtue in testing ABR over VBR either -- it would seem to me to be a rather arbitrary choice and one not directly tied to the test's focus which, once again, should be related to provided quality in a real world usage scenario.

Most ABR modes in audio codecs are not entirely accurate in terms of bitrate either, so there is no guarantee that using such a mode will provide any more stable of a foundation than VBR in the first place.

Quote
The tester should make every effort to provide as little difference between codecs as possible.


No, the only real necessity is that the tester attempt to equalize the perceived classification of the codecs being tested, only making adjustments where gross mismatches occur.

This means that the absolute bitrate is not nearly as relevant as the wide-scale average bitrate (going beyond samples in this test) or usage scenarios.

What I believe we should be focusing on first is, again, the perceived classification (-q0 Vorbis is said to compete with other codecs at 64kbps avg mode, for example), and the average bitrate second, not worrying about the absolute sample-by-sample bitrate.

Quote
Roberto did not do that with the 128 test. Instead, he threw together a mismatch of VBR and ABR, rationalizing his use of VBR with the broad-spectrum tests.


I cannot speak directly for Roberto, but I believe this is because he was approaching the test from the perspective that I just laid out above.

Quote
It remains an inequality between codecs. This is the problem I have with the theory behind these tests.


The problem is that these codecs aren't equal, and so any test including them cannot expect the conditions of comparison to be perfect down to the attribute level of every single codec, only varying by common value.  These codecs are different in abilities and behavior, and so if we are to test them and compare them meaningfully and representatively, we need to focus on measuring a generalized and abstracted point (quality levels) not a specific and particular point (performance at exact bitrates).
Title: Pre-Test thread
Post by: rjamorim on 2003-08-23 05:09:05
Quote
This seems even more ludicrous than before.

Why, thank-you for your kind words.

Let me quote my guru here, Darryl Miyaguchi:

"For those who would have done it differently: the opportunity is still there for you! If you can dish out the criticism, can you stand to take it too?"



I won't really bother answering the other points you made. They have been explained time and time again, with very good justifications, by ff123, JohnV, Garf, me...

You didn't try to invalidate any of these justifications, like the ones Wildboar and Dibrom kindly just pointed out. Instead, you just repeated what people have been repeating time and time again ad nauseam.

If you can't understand what people have posted in this (http://www.hydrogenaudio.org/forums/index.php?showtopic=11936) thread explaining why your points of view are flawed, I see no hope really.

Edit: interestingly, I replied to one post by you in that thread. Seems you missed it completely.
Title: Pre-Test thread
Post by: ExUser on 2003-08-23 06:09:57
@wildboar:
The multi-pass ABR method I described would work perfectly for a VBR system like Vorbis's or Musepack's. It is in no way akin to your analogy. I understand the need to mix ABR and CBR here. I do.

The analogy falls flat, though, because it is nothing more than superficial. It doesn't relate at all to the test other than in a very, very general sense. And it's not adequate for discussion.

Perhaps something more adequate in describing the difference between ABR/VBR/CBR is as follows:

We're studying the genetics of the tail lengths of dark cats trying to find a breed that has the longest tail, whilst remaining suitably dark. Due to some bizarre reason, we can only pick breeding pairs that are dark, and we study the tail lengths of the kittens.

So, do we pick parent cats that have:

A great majority black, but several pure white kittens? (VBR)
All dark kittens, with some that are black, some that are lighter? (ABR)
Or kittens that are all a uniform shade of dark grey? (CBR)

If all the kittens' coat colours were properly weighted (however that is done) and averaged, and they all worked out to the uniform shade of dark grey of the CBR kittens, would all breeding pairs be acceptable as parents to be tested?

It adequately analogizes in my mind. I may not have explained it thoroughly enough, so I hope you all can catch my drift.

@dibrom:
Thank you for the in-depth response, foremost. I did decide to pick one specific area and debate that. I got your main thrust, and I understand the need for similarity to the real-world.  What I meant to do was describe the way an equivalent of a 2-pass ABR mode could be achieved using a numerical quality selector.

I'm going to forgo debating every single point of your response. You raised several more points that are opinion-based, and we could argue them all day and achieve nothing.

You did make some good non-opinionated points, though. I'll pick a few that stand out to me.

Quote
What I believe we should be focusing on first is, again, the perceived classification (-q0 vorbis is said to compete with other codecs at 64kbps avg mode, for example)


It has this perceived classification? My understanding was that the coders intended -q0 Vorbis to work out to 64kbps on average, not compete with a 64kbps average codec. Furthermore, it was my understanding that everyone acknowledges that -q0 may be a little off in one direction or another.

The name of the test states that it's testing 64kbps codecs. I'd think that this implies that perceived classification does not enter into the picture; rather, that 64kbps should be the bitrate, or something thereabouts.

Quote
No, the only real necessity is that the tester attempt to equalize the perceived classification of the codecs being tested, only making adjustments where gross mismatches occur.


What's the point of the test then, if the codecs aren't on even ground? Attempt to equalize, so, in other words, set the codecs up so there's as little difference between them as possible? That's what I meant, if not what I said. We agree here.

Quote
we need to focus on measuring a generalized and abstracted point (quality levels) not a specific and particular point (performance at exact bitrates).


But ABR does not do that. ABR shifts the focus to the latter, not the former. CBR doubly so. This is exactly the point that bothers me.

I understand that Roberto's aiming for real-world results, and thus I can see why he does not wish to use anything other than the encoding methods directly available through the encoders, but there are some problems, that I think can have some detrimental effect on the overall test.

Dibrom, I apologize if I glossed over your message. It was long, and there were plenty of issues for me to address, so if I missed something, please tell me.

@rjamorim:
Quote
Why, thank-you for your kind words.

Let me quote my guru here, Darryl Miyaguchi:

"For those who would have done it differently: the opportunity is still there for you! If you can dish out the criticism, can you stand to take it too?"


Forgive me for wording that as strongly as I did. I didn't mean to be unkind, I meant to emphasize that I would have done things in a different manner, had I been the organizer. I'm not. You're putting in a great amount of time and effort to set the test up and to defend it. You have my respect for that, and I do not mean to seem otherwise. I can take criticism; I thrive on it. I do not presently get as much of it as I would like. I also greatly appreciate what you're doing. I forgot to emphasize that. If you weren't taking the time to do it presently, it wouldn't be getting done. That said, I had problems with the way the test was performed, although I think the LAME 128ABR anchor and the reasoning behind it was a stroke of genius, and added a human touch to all the dry science and ABX tests.

I read through the 128 test's explanations. I disagree with some of their assertions. Ultimately, what we're dealing with here is a difference of opinions. I suppose I've made enough noise about this to last for some time. I won't bring this topic up again, I just hoped I could make a difference and explain to other people the way I saw things. We'll see how you all take my cat analogy.

EDIT: And yes, I did see that response. I just disagree.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-23 06:17:19
Quote
I think the LAME 128ABR anchor and the reasoning behind it was a stroke of genius

Of course it was. Darryl suggested it
Title: Pre-Test thread
Post by: den on 2003-08-23 08:33:16
Quote
Atrac3 won't be a good option. It doesn't even offer m/s stereo coding - all streams are encoded like Dual Mono. I would expect such a codec to be even worse than MP3 at these bitrates.


Umm sorry, but just for the sake of accuracy, this is not correct. I can not speak for ATRAC3plus, but ATRAC3 @ 66 kbits uses joint stereo. Whether it uses it well is another story... 

There are clear references to this both on the Sony site, and in my MD manual. One quote, "ATRAC3 in LP4 mode encodes audio in "joint-stereo" mode, encoding the left and right channels in one step (i.e. jointly) and exploiting the similarity between channels to increase compression..."

All the same I think leave out ATRAC3 unless you are still looking for that low anchor! 

Den.
Title: Pre-Test thread
Post by: tigre on 2003-08-23 08:38:59
Quote
I don't think it really makes sense to do this because it doesn't mirror a real world usage scenario.  People are not going to use a different -q setting per sample to reach a set bitrate every single time they encode a different file.  Instead, they pick a quality setting and stick with it.  It may turn out that on this sample set, Vorbis averages a little low, but on another sample set, it's going to be the opposite.  Given this and the fact that -q0 is widely recognized as giving "64kbps" (even the Xiph guys seem to support this on a wide scale), this is the setting that IMO should be used.

This sounds reasonable to me. The weak point I see here is the real world scenario. There are several theoretically thinkable ways to get (=measure) figures about bitrate-wise behaviour of the tested VBR codecs under average real world conditions. E.g. taking statistics about sold records of different genres, encode huge numbers of samples and calculate an average bitrate weighted by the statistics ... . All possibilities that come to my mind here are just too much effort.
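The sales-weighted average described above could be computed along these lines. This is only a sketch: the function name, the genre labels and any shares or bitrates fed to it are hypothetical, since no such statistics appear in the thread.

```python
def weighted_average_bitrate(genre_stats):
    """Weighted mean bitrate in kbps.

    genre_stats maps a genre name to (share, mean_bitrate_kbps), where share
    is that genre's weight (for example its fraction of records sold).
    Shares do not need to sum to 1; they are normalised here.
    """
    total_share = sum(share for share, _ in genre_stats.values())
    if total_share <= 0:
        raise ValueError("shares must sum to a positive number")
    weighted = sum(share * kbps for share, kbps in genre_stats.values())
    return weighted / total_share
```

The arithmetic is trivial; the effort lies in collecting trustworthy shares and encoding enough material per genre to get a stable mean bitrate.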

So there are two possibilities left:
1) Taking "Ogg vorbis -q0 averages 64kbps" as best possible assumption because "it's widely recognized as true" OR
2) Changing overall -q setting for the test as I've suggested.

Both have their problems:
1) We try to set up a test to get hard, comparable figures out of human subjective perception by double blind testing, statistical analysis etc., but we choose codec settings based on an assumption that is nothing more than "widely recognized as true". That could lead to an incalculable uncertainty in the results we get.

2) We "adjust" average bitrates, but we don't know how closely they mirror a real world scenario either (it's very hard to define "real world scenario" anyway), which leads to a similar uncertainty.

Quote
As Roberto pointed out earlier, and I agree, I think that adjusting the settings away from the common incarnations, just to "set" the bitrate on this test, calls into question its credibility.

As I tried to explain above, both possibilities have similar problems about what you call credibility here.

Quote
IMO, it's one thing to adjust settings downward to try and reach a set bitrate (as was done with MPC in the previous test), since this should only really have the effect of worsening the results, but it's entirely different to adjust the settings upward to compensate for a lack of accuracy in encoding.  If Vorbis, or any of these other codecs, happen to use too few bits per sample in their VBR modes in any given case, that points to a possible flaw in the encoding scheme and any adjustment around this just goes to hide the very issues that we are trying to discern in the first place.

I don't understand why you refer to Vorbis needing fewer bits on some types of music as "lack of accuracy in encoding". One could also say "Vorbis is very good at encoding this type of music because it reaches a certain quality level needing fewer bits than when encoding other music". Isn't this what VBR is about? So *not* adjusting Vorbis' bitrate could be seen as punishment for the good performance.

Quote
This test is about quality, and the test subjects are VBR coders.  The points of the test are to measure fidelity at a given quality mode, with bitrate being used only as a rough guideline and mode of classification (not implicit comparison).  People should realize that very important fact and accept the implications that come along with it (possible VBR pitfalls) in the context of this test.

Unfortunately the relationship between VBR bitrate and measured quality (1-5) isn't defined mathematically, e.g. there's no linear correlation. In situations like this a test can only deliver comparable results if only one parameter is measured while the others are fixed to the same level. (You can't compare how much fuel different cars need per 100 miles by letting each one drive at different speeds). Because of this bitrates have to be as close as possible, otherwise it'd be useless to measure quality.

Quote
And finally, in the test results, we should be interested in the representation of -- and significance in relation to -- real world usage scenarios rather than technicalities beyond the concerns of the majority of the readers (something like using -q0.x vs -q0).

As it's very hard to get a widely accepted definition of "real world scenario" (e.g. mine would consist of > 50% Latin music  ) and even harder to get total averaged numbers about the bitrate behaviour of VBR codecs, we should take what we know for sure and assume our set of test samples is a mirror of the "real world" IMHO. Both possibilities have their pros and cons - and I can understand and will accept (do I have a choice?  B) ) if rjamorim decides to stick with -q0.

edit: grammar, typos, clarification
Title: Pre-Test thread
Post by: S_O on 2003-08-23 11:18:31
For Real Gecko Audio you should use the "64 kbps Stereo Music RA8" setting (codec flavor 24). (Warning: there is also a "64 kbps Stereo Music" setting, but it uses an older codec version!) The easiest way is to download the newest Helix Producer, create your audience file (you can use a GUI) and run the application from the CLI (or use a GUI).
Real only offers CBR at the moment.

For real-life testing I'm suggesting the following; for ABR codecs this could be very useful and much more real-life:
Encode the entire song and then cut the sample out, so the encoder can decide the bitrate distribution over the entire song, not just for one little part. So for the critical parts, which are tested, more bits can be used, while for the other parts fewer bits can be used. The average bitrate is then 64kbps over the entire song, not over this 20-second sample. But also, if the VBR algorithm fails, maybe fewer bits are used for this part. Encoding the entire song is much more real-life and shows better how good the VBR algorithm is.
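
To make the difference concrete, here is a minimal Python sketch of the idea. The per-second bit counts are made-up numbers and `average_kbps` is a hypothetical helper, not part of any encoder; it only shows how an excerpt can sit well above the target even though the whole song averages 64 kbps.

```python
def average_kbps(bits_per_second, start=0, end=None):
    """Average bitrate in kbps over seconds [start, end) of an encoded stream."""
    window = bits_per_second[start:end]
    return sum(window) / len(window) / 1000.0

# Hypothetical encoder output: bits spent in each second of a 60-second song.
# The quiet intro and outro are cheap, the busy middle section is expensive.
song = [48_000] * 20 + [96_000] * 20 + [48_000] * 20

whole_song = average_kbps(song)        # what the encoder was targeting
excerpt = average_kbps(song, 20, 40)   # the 20-second test sample

print(f"whole song: {whole_song:.1f} kbps")
print(f"excerpt:    {excerpt:.1f} kbps")
```

With these invented figures the song as a whole lands on the target while the excerpt alone uses noticeably more, which is exactly the case where encoding only the excerpt would give a different result.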

For mp3pro, I suggest using the VBR quality mode; it gives good results (in CEP 2). There is no reason to use CBR. The VBR algorithm works fine, sometimes better than others, because it doesn't reduce the bitrate dramatically in parts with low volume. This is good for classical music.
Title: Pre-Test thread
Post by: guruboolez on 2003-08-23 12:54:30
Some comments :

atrac3plus would be nice to test. Sony is putting some marketing investment into this format. I recently saw some portable CD/mp3 players that read atrac3plus; there are flash memory players based on this new audio format too. Nevertheless I don't expect to see any other manufacturer supporting a Sony format, and therefore the audience for this format may be very limited in the future. Unfortunately, including atrac3plus is a real pain. Encoding/decoding through SonicStage, capturing through TotalRecording, and editing the offset and precise file length with CEP for 12 samples is in my opinion real torture. And I don't forget that all samples have to be flacced in order to be integrated into the public archive, increasing the size, download time and server bandwidth. Sad...


wma9 at 64 kbps is probably the most used encoder/setting at this bitrate, used by a lot of people who have a small USB mp3 player. Not on HA of course... but I often read positive comments and propaganda claiming wma9@64 kbps = mp3@128, at least in nomad conditions. Including wma 'standard', with lame 128 as anchor, is a nice idea, and will be a good, official reference for fighting this optimistic equation. Of course, I would like to see wma9pro performance too, even at 50 kbps, and compare it to wma9 'standard'. I was positively surprised by the performance of the format with classical music (but horrified with loud music), but I didn't try many samples...
Due to the lack of a CLI decoder for both WMA/WMApro, I suppose it would be difficult to include both encoders :/
(note : there is a VBR setting for wma9 standard - is it possible to consider it ?)


AAC - HE-AAC : a plain AAC encoding, as opposed to a HE-AAC one, may show some surprises (I don't know) : SBR is a nice tool, but some adverse effects are not impossible (they exist, I'm sure). Can we add an encoding ? File size isn't an issue (faad2 is already present), but the number of contenders is one, maybe... Why not Nero ABR, with PNS ? Or maybe Sorenson encodings
Title: Pre-Test thread
Post by: rjamorim on 2003-08-23 16:17:33
Quote
Umm sorry, but just for the sake of accuracy, this is not correct. I can not speak for ATRAC3plus, but ATRAC3 @ 66 kbits uses joint stereo. Whether it uses it well is another story... 

Well, that's what Karl Lillevold told me, I don't know...

Quote
All the same I think leave out ATRAC3 unless you are still looking for that low anchor! 


Hehe. OK.

Quote
Encode the entire song and then cut the sample out


That's not a real possibility because, of all the 12 samples, I only have one of them in its entirety. It would require that people send me the entire songs for each sample. And then I would be accountable for piracy. You get the problem? :B

Quote
atrac3plus would be nice to test. Sony is putting some marketing investment into this format. I recently saw some portable CD/mp3 players that read atrac3plus; there are flash memory players based on this new audio format too. Nevertheless I don't expect to see any other manufacturer supporting a Sony format, and therefore the audience for this format may be very limited in the future. Unfortunately, including atrac3plus is a real pain. Encoding/decoding through SonicStage, capturing through TotalRecording, and editing the offset and precise file length with CEP for 12 samples is in my opinion real torture. And I don't forget that all samples have to be flacced in order to be integrated into the public archive, increasing the size, download time and server bandwidth. Sad...


Right. I am still not sure there is/will be much demand for Atrac3plus, and given that it would only increase the mess... :/

Besides, keep in mind that if I go with the idea of 5 encoders + 2 anchors, that already means 7 tests for each sample. Few people have patience for that.

Quote
wma9 at 64 kbps is probably the most used encoder/setting at this bitrate, used by a lot of people who have a small USB mp3 player. Not on HA of course... but I often read positive comments and propaganda claiming wma9@64 kbps = mp3@128, at least in nomad conditions. Including wma 'standard', with lame 128 as anchor, is a nice idea, and will be a good, official reference for fighting this optimistic equation. Of course, I would like to see wma9pro performance too, even at 50 kbps, and compare it to wma9 'standard'. I was positively surprised by the performance of the format with classical music (but horrified with loud music), but I didn't try many samples...
Due to the lack of a CLI decoder for both WMA/WMApro, I suppose it would be difficult to include both encoders :/


Right. I guess either format would do for the test, but I'm not willing to add both. And for the same reasons pointed above: burden on the listeners, files must be delivered in FLAC format... (not to mention people saying I'm favouring MS by adding two WMA flavors)

So, it boils down to weighing which one is of more interest for this test. What do you people think?

Quote
(note : there is a VBR setting for wma9 standard - is it possible to consider it ?)


Yes, but it's worth thinking: Will people get interested in VBR 64? I am under the impression that people are mostly using CBR 64, so it's closer to a "real world scenario". Opinions? Ideas?

Quote
AAC - HE-AAC : a plain AAC encoding, as opposed to a HE-AAC one, may show some surprises (I don't know) : SBR is a nice tool, but some adverse effects are not impossible (they exist, I'm sure). Can we add an encoding ? File size isn't an issue (faad2 is already present), but the number of contenders is one, maybe... Why not Nero ABR, with PNS ? Or maybe Sorenson encodings


Well, the only issue preventing such from happening is, as I said before, burden on participants. Do you think people won't get tired with 8 samples to test? What is your opinion?

Best regards;

Roberto.
Title: Pre-Test thread
Post by: phong on 2003-08-23 17:45:35
I think a 64kbps test is going to be less tiring than 128kbps per sample because it is much easier to distinguish them.  But seven or eight versions to listen to may still be too many.

As for the bitrate thing - we could do the same thing that was done last time - encode tons of CDs at each of the quality levels around -q0 and see what REALLY equates to 64kbps.
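
A rough Python sketch of that bookkeeping, assuming you already have the file size and duration of every encoded track for each setting. The function names, the setting labels and all the numbers below are invented for illustration; nothing here comes from real encodes.

```python
def average_bitrate_kbps(files):
    """Overall bitrate in kbps for a list of (size_in_bytes, duration_in_seconds).

    Total bits divided by total time, so long tracks weigh more than short ones.
    """
    total_bits = sum(size * 8 for size, _ in files)
    total_seconds = sum(duration for _, duration in files)
    return total_bits / total_seconds / 1000.0

def closest_setting(results, target_kbps=64.0):
    """Return the setting whose overall bitrate is nearest to the target."""
    return min(results,
               key=lambda q: abs(average_bitrate_kbps(results[q]) - target_kbps))

# Made-up numbers: {setting label: [(bytes, seconds), ...]}
results = {
    "-q0":   [(1_400_000, 200), (1_600_000, 200)],
    "-q0.3": [(1_500_000, 200), (1_700_000, 200)],
    "-q0.6": [(1_600_000, 200), (1_800_000, 200)],
}

for setting, files in results.items():
    print(setting, round(average_bitrate_kbps(files), 1), "kbps")
print("closest to 64 kbps:", closest_setting(results))
```

Summing bits and seconds before dividing is a deliberate choice: averaging the per-track bitrates instead would let a 30-second interlude count as much as a 10-minute track.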

Oh, and I've been working on writing an ABC/HR clone for Linux.  It may be good to go in time for the test.  Are you interested in having that available or would it be too much trouble to put together and test two packages?  It's written in Python with wxPython for the GUI and pygame (SDL) for the audio so it should be very portable (in case there are any Mac users out there that want to use it).
Title: Pre-Test thread
Post by: Dologan on 2003-08-23 18:40:00
Hmm... I frankly don't understand the ABR/CBR/VBR nitpicking that just keeps arising again and again. Unless it raises compatibility issues, most people (myself included) don't give a damn if the sample has a 56.2, 74.7 or 64.0 kbps avg. bitrate as long as it is in a certain tolerable range.
I think all this could be avoided simply by changing the test name from "64 kbps test" to something more like "64kbps-range test" or "low bitrate test". Some people just take the "64 kbps" part too much to heart.  Ok, I know this is raised due to "fairness" issues, but these have been discussed at length in favour of letting the codecs do what they are good at and not crippling them to a constrained, unnatural setting, since that would be unfair, too.
A suggestion to better design the number of codecs/samples for the test: Make a poll about how much time you would be willing to spend on the test (for those who would consider participating) and then choose a combination of codecs/samples that best suits the results. IMHO it would be better to have a comparison of few codecs but with small error bars that allow reliable conclusions, than to obtain an entire battery of codecs nobody in this forum uses with error bars so large that barely any significant conclusions can be reached. Besides, making a large test would probably bias the results in favour of the listening preferences of patient people with lots of time, which may or may not be different for other kinds of people. (ok, this might sound crazy, but who knows about the influence of personality on annoyance thresholds?)

Regards,
~Dologan
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 01:01:11
Quote
A suggestion to better design the number of codecs/samples for the test: Make a poll about how much time you would be willing to spend on the test (for those who would consider participating) and then choose a combination of codecs/samples that best suits the results.

Problem is, you can't even possibly imagine how much time someone will spend on the test. In the AAC test, I had Garf's results 2 hours(!) after I officially started the test. And JohnV submitted his last results few hours before the test closure.

What's the amount of time someone spends testing a sample? 2 minutes? 30 minutes? Besides, if it's a problem sample at low bitrate, the person will surely spend less time than if it's an easy sample at high bitrates.

Quote
I think all this could be avoided simply by changing the test name from "64 kbps test" to something more like "64kbps-range test" or "low bitrate test". Some people just take the "64 kbps" part too much to heart.


That makes sense, indeed, but most of the people that are criticizing the test aren't doing this because the bitrate deviates, but because, due to the bitrate deviation, some codecs might end up more "favoured" than others. In that case, even changing the test name wouldn't appease them.

Regards;

Roberto.
Title: Pre-Test thread
Post by: guruboolez on 2003-08-24 01:14:59
Testing eight different files for each sample doesn't annoy me at such low bitrates. But other people may well get bored. So,

isn't it possible to create two kinds of archives :
- essential encoders (wma, ogg, he-aac, mp3pro)
- additional encoders, for curious people (wma9 pro, real, aac)

People who want to participate in the test would have to send results for the first pack, and if they want to investigate further, they can evaluate the encodings included in the second package. By doing that, you won't annoy or frighten people with too many encodings, and you will get some interesting results for additional codecs, without starting another test. It seems to be a good compromise between completeness and respect for the testers.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 01:41:36
That really sounds like a great idea to me.

(Although I didn't think about it hard enough to pick up eventual flaws in it)

Of course, the "official" 64kbps test results will be the ones featuring the "essential" codecs, and somewhere in the official page there'll be a link to a "subtest" featuring the essential codecs + additional ones.

That separation is needed because the tests can't be merged together if the amount of listeners isn't the same at each sample. To start with, the ANOVA error margin would be different for each case.
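
As a simplified illustration of why the listener count matters (this is only the plain standard error of a mean, not the ANOVA or Friedman calculation used for the real results, and the ratings are invented):

```python
import math
import statistics

def standard_error(scores):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

# Made-up ratings with a similar spread but different listener counts.
essential = [2.0, 3.0, 4.0] * 12   # 36 listeners rated this codec
additional = [2.0, 3.0, 4.0] * 3   # only 9 listeners rated this one

print(f"essential:  n={len(essential)}, SE={standard_error(essential):.3f}")
print(f"additional: n={len(additional)}, SE={standard_error(additional):.3f}")
```

The codec rated by fewer listeners ends up with a wider margin, so a single shared error bar across both groups would misrepresent one of them.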

Besides, ABC/HR is limited to 8 sliders. You are already suggesting 7 codecs, not counting the anchors. Do you have any idea how to circumvent that issue? I don't think doing two separate test setups would be the right way, but that needs to be discussed. I would personally think the right way would be one test setup = essential codecs and the other = essential + additional codecs, and not one setup = essential and the other = additional. (I don't know if I'm making myself clear...)


Heh, that would make it harder for me to process the results, but I'm inclined to oblige and see how things turn out.

Comments? Ideas?

Regards;

Roberto.
Title: Pre-Test thread
Post by: guruboolez on 2003-08-24 01:59:46
Is it worth putting two different anchors in this test ?
For the 128 kbps listening test, the anchor was needed to protect lame mp3 from an exaggerated rating. Here, there is no (known) encoder to protect from (known) stronger competitors. Why not remove this one ?
On the other side, I'd like to see mp3@128 as bottom anchor. This anchor is more than a "dead file" : it's a popular reference, and at the end of the test, we can draw some conclusions about the relation between this file and the other competitors. It's very important to give a point of comparison : some people are obsessed with the idea of maintaining 128 kbps quality at half the bitrate, and this test, with mp3@128 included, is the occasion to give strong answers (superior to pseudo-scientific waveform comparisons...) to these people (and it would be good advertising for HA.org).
Honestly, is a 3.5 kHz lowpassed wav file really needed here ?
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 02:51:51
Quote
Is it worth putting two different anchors in this test ?

Well, I don't know. That's open to debate still. Unfortunately, the biggest authority I know of in listening tests is somewhere in Thailand :B

Quote
For the 128 kbps listening test, the anchor was needed to protect lame mp3 from an exaggerated rating. Here, there is no (known) encoder to protect from (known) stronger competitors. Why not remove this one ?


Well, as already explained somewhere, the Anchor isn't there only to protect rankings, but also to put things into perspective across the entire sample suite.

Quote
On the other side, I'd like to see mp3@128 as bottom anchor. This anchor is more than a "dead file" : it's a popular reference, and at the end of the test, we can draw some conclusions about the relation between this file and the other competitors. It's very important to give a point of comparison : some people are obsessed with the idea of maintaining 128 kbps quality at half the bitrate, and this test, with mp3@128 included, is the occasion to give strong answers (superior to pseudo-scientific waveform comparisons...) to these people (and it would be good advertising for HA.org).


Oh, sure, MP3 is definitely in.

Quote
Honestly, is a 3.5 kHz lowpassed wav file really needed here ?


Well, indeed, maybe not.

I'm just trying to figure out how to sort results, given that some of them will contain the essential codecs, others will contain the essential + additional. It can surely be done by hand, but given I expect this to be my biggest test to date, it'll be a PITA. And then you guys can't expect results delivered a few hours after the test closure. :B

Any idea?

R.
Title: Pre-Test thread
Post by: ErikS on 2003-08-24 03:56:06
One idea how to add more codecs to the test would be to make three different test suites. One where wma pro is in, another where real is in but not wma pro, and the third where aac would be in but none of the above. Then when a person downloads the suite, the server randomly gives one of the three packages. This way everybody will test the core codecs but you will still have some results for the additional ones.
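
A minimal Python sketch of that server-side choice. The codec labels and the `pick_package` function are hypothetical names used only to show the idea; how the download server would actually be scripted is not specified here.

```python
import random

CORE = ["he-aac", "mp3pro", "vorbis", "wma"]   # tested by everybody
EXTRAS = ["wma-pro", "real", "aac"]            # one of these per package

def pick_package(rng=random):
    """Return the codec list for one download: all core codecs plus one random extra."""
    return CORE + [rng.choice(EXTRAS)]

# Simulate a few downloads.
for _ in range(3):
    print(pick_package())
```

Over many downloads each extra codec should end up with roughly a third of the listeners, while the core codecs are rated by all of them.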
Title: Pre-Test thread
Post by: tigre on 2003-08-24 07:47:52
Would it be an option to use the 8 sliders for tested codecs only and the higher anchor (lame @128kbps) while the lower anchor (no matter if lowpassed or a crappy encoding) can be provided seperately so people can listen to it without using ABC/HR? It should be so obvious what's wrong compared to the original that ABXing this one isn't necessary - and it'll get a fixed rating (= 1 ?) anyway (if I get the idea of anchors right).
Title: Pre-Test thread
Post by: S_O on 2003-08-24 12:14:37
Quote
That's not a real possibility because, of all the 12 samples, I only have one of them in its entirety. It would require that people send me the entire songs for each sample. And then I would be accountable for piracy. You get the problem? :B
What if they don't upload it here to HA for the public, and just send the song to you? That's not piracy; in Germany it's allowed to make a private copy for relatives and friends (this could have changed with the new copyright law).
Another idea is that they encode the samples themselves: you send them the exact settings (batch file for CLI encoders), then they decode the result, cut the decoded files and send them to you.
I think that's very important for real-life testing; there has to be a way to make it possible.

For the codecs I think these should be tested:
HE-AAC
mp3pro
Ogg Vorbis
WMA
Real Gecko
Lame mp3 --preset 128
(atrac3plus?)
That's 6 (7) codecs; everybody can test that.
Title: Pre-Test thread
Post by: bond on 2003-08-24 19:37:59
I don't think people will get bored testing 64kbps-quality files. Hey, a 128kbps test where you can't hear any differences is more boring imho
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 20:02:31
Quote
One idea how to add more codecs to the test would be to make three different test suites. One where wma pro is in, another where real is in but not wma pro, and the third where aac would be in but none of the above. Then when a person downloads the suite, the server randomly gives one of the three packages. This way everybody will test the core codecs but you will still have some results for the additional ones.

Man, if you could only imagine the mess it'll be to process the result files... :B

At the time being, I use a very useful tool created by ff123. It takes the results file in a text list and sorts them in a table that is usable in his Friedman tool.

If we go with adding a different codec for each sample package, I would have to edit ALL result files by hand. You'd have to expect the results a week after the test is over.

In this aspect, Guru's suggestion would be easier to implement.

Quote
Would it be an option to use the 8 sliders for tested codecs only and the higher anchor (lame @128kbps) while the lower anchor (no matter if lowpassed or a crappy encoding) can be provided seperately so people can listen to it without using ABC/HR? It should be so obvious what's wrong compared to the original that ABXing this one isn't necessary - and it'll get a fixed rating (= 1 ?) anyway (if I get the idea of anchors right).


Not really a fixed rating. As you noticed on the 128kbps test, Blade was an anchor, and still it got scores higher than 1.

So, anyway, I'll probably just ditch the bottom anchor.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 20:10:28
Quote
What if they don't upload it here to HA for the public, and just send the song to you? That's not piracy; in Germany it's allowed to make a private copy for relatives and friends (this could have changed with the new copyright law).

It would be illegal in nearly the entire rest of the World, including Brazil. :-/

Quote
Another idea is that they encode the samples themselves: you send them the exact settings (batch file for CLI encoders), then they decode the result, cut the decoded files and send them to you.


Well, is HE AAC, WMA, and Real even "cuttable"?

And, when cutting MP3pro, there's no risk of teh SBR part getting b0rked?

Quote
I think that's very important for real-life testing; there has to be a way to make it possible.


Well, it is possible. That's what they do in formal listening tests. But I don't have the resources to conduct a formal listening test (which usually costs 4-digit dollars)

Quote
For the codecs I think these should be tested:
HE-AAC
mp3pro
Ogg Vorbis
WMA
Real Gecko
Lame mp3 --preset 128
(atrac3plus?)
That's 6 (7) codecs; everybody can test that.


Real Gecko? 

Yes, I agree completely with the first 6 codecs. And I'm not too fond of featuring atrac3plus. First, because I don't see it getting as mainstream as the others, mostly due to Sony's (understandable) paranoia on security and so on (DRM, etc.)

Quote
I don't think people will get bored testing 64kbps-quality files. Hey, a 128kbps test where you can't hear any differences is more boring imho


Indeed. IMO, anything above 7 codecs is too much for "every participant". That's why I would maybe go with Guru's idea of offering a superset of samples using non-essential codecs.

Regards;

Roberto.
Title: Pre-Test thread
Post by: Gecko on 2003-08-24 20:42:37
I think dividing the test into an essential and a non essential part is an excellent idea. Guruboolez' proposed division into essential/additional makes sense to me.

What about the rating scale of abc/hr? I believe this issue was brought up in the aftermath of the last test, but I don't remember the answer. Personally, I am fine with the wording and I wouldn't be able to come up with better alternatives, but I believe that many people have trouble with the scale. The reason behind this may be the nonlinearity of the scale and the use without context. (I guess that's what the anchors are for.) I'm not sure if people rate the samples against the original wav or if they rate them in the context of using ~64k samples ("Actually, this sample sounds like crap, but hey, it's only 64k"). Maybe this should be made more clear. Maybe someone should write a small text how to do proper rating, give examples. This could yield more accurate results.

Another issue is the number scale. People value numbers too much. Perhaps they should be removed from the interface and only be output to the result file. This way people would focus more on the describing words and their meaning than on the numbers.
Title: Pre-Test thread
Post by: S_O on 2003-08-24 21:20:21
Quote
Well, is HE AAC, WMA, and Real even "cuttable"?

And, when cutting MP3pro, there's no risk of teh SBR part getting b0rked?

Real is cuttable (there is rmeditor, a CLI application from Helix). WMA should be cuttable too, but I don't know a good tool for that. AAC is cuttable with BeSplit, but I don't know if it also copies the ancillary data correctly. For mp3pro there is a problem (also for mp3) because of the bit reservoir, but that should only affect the first frame, so no real problem. And if a cutting tool like mp3directcut doesn't work correctly with SBR, a simple hex editor should work. Does someone know if there is SBR in all frames, or only in some? It could be essential for decoding that there is SBR in the first frame; otherwise SBR isn't detected.
Quote
It would be illegal in nearly the entire rest of the World, including Brazil. :-/
F*cking laws! But I noticed something illegal in your old test: You distributed binaries of faad, lame and blade. This is not legal in some countries (like the USA), either.
Quote
Well, it is possible. That's what they do in formal listening tests. But I don't have the resources to conduct a formal listening test (which usually costs 4-digit dollars)
Even if you bought all these discs it would be less than 100€/$ (of course even that would be too much). If the sample owners encoded/cut the files themselves it would be the easiest and legal way. If you don't trust them or they are not able to do it alone, you could do it yourself on their PCs using NetMeeting remote control.
Quote
Real Gecko?
The codec is named "Gecko" in the Real papers, because "Real Audio" can be any codec used by Real (Sipro Voice Codec, DolbyNet, Atrac3 etc.). Because its FourCC is "cook" (this comes from the name of the codec developer, "Ken Cooke"), it is also often called that.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 21:31:21
Quote
But I noticed something illegal in your old test: You distributed binaries of faad, lame and blade.

Well, there I was breaking patents that people don't give much of a damn anyway. It's not nearly as bad as breaking copyright. :-/


Quote
Even if you bought all these discs it would be less than 100€/$ (of course even that would be too much).


Haha, even if I had all that money (I don't), most of the music that will be featured can't be found in this hellhole of a country

Quote
If the sample owners encoded/cut the files themselves it would be the easiest and legal way. If you don't trust them or they are not able to do it alone, you could do it yourself on their PCs using NetMeeting remote control.


Well, I believe few would allow a stranger to remotely control their PCs :B

And the issue remains: Can we cut MP3pro, AAC+SBR and WMA?

Problem here is that we can't use that process for some samples and not for others, that would make the test biased from the start.

Another issue: If these people were to cut the samples themselves, they would need to own at least Adobe Audition (mp3pro) and Nero 6 (HE AAC). And not all of them own it, and some aren't as morally unrestrained as me as to go and get a ju4r3z version :B

Quote
The codec is named "Gecko" in the Real papers, because "Real Audio" can be any codec used by Real (Sipro Voice Codec, DolbyNet, Atrac3 etc.). Because its FourCC is "cook" (this comes from the name of the codec developer, "Ken Cooke"), it is also often called that.


Yeah, I call it Cook myself. Whatever floats your boat...

Regards;

Roberto.
Title: Pre-Test thread
Post by: S_O on 2003-08-24 21:57:29
Quote
Well, I believe few would allow a stranger to remotely control their PCs :B
With NetMeeting you can see everything the other person does and you can always terminate the remote control just by pressing one key.
Quote
Well, there I was breaking patents that people don't give much of a damn anyway. It's not nearly as bad as breaking copyright. :-/
Since the sample is only for you, and the program you offer is for everybody, I think this is different: breaking patent law x-thousand times is much worse than breaking copyright law 12 times. Also, it doesn't matter if you delete the uncut samples afterwards.
Quote
And the issue remains: Can we cut MP3pro, AAC+SBR and WMA?

mp3pro: possible. I just cut a file in a hex editor somewhere (at a frame beginning) and it was decodable correctly with SBR. Since aac is cuttable, and mp3pro is cuttable, HE-AAC should also be cuttable.
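
For anyone who would rather not hunt for frame beginnings by eye, here is a small Python sketch that lists candidate positions. It only looks for the 11-bit MPEG audio sync pattern (a 0xFF byte followed by a byte whose top three bits are set); `find_sync_offsets` is a hypothetical helper, the same pattern can also occur by chance inside audio data, and the sketch knows nothing about the bit reservoir or SBR data, so treat the offsets as candidates only.

```python
def find_sync_offsets(data):
    """Return byte offsets where an MPEG audio frame header could start.

    A header begins with 11 set bits: 0xFF, then a byte with its top 3 bits set.
    """
    offsets = []
    for i in range(len(data) - 1):
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            offsets.append(i)
    return offsets

# Tiny synthetic buffer: one plausible header at offset 1, a lone 0xFF at offset 5.
sample = bytes([0x00, 0xFF, 0xFB, 0x90, 0x00, 0xFF, 0x00])
print(find_sync_offsets(sample))
```

A real cutter would additionally parse the header to get the frame length and check that the next frame starts where expected before trusting a candidate.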
Quote
Another issue: If these people were to cut the samples themselves, they would need to own at least Adobe Audition (mp3pro) and Nero 6 (HE AAC). And not all of them own it, and some aren't as morally unrestrained as me as to go and get a ju4r3z version :B
You don't care about ju4r3z, but you care about copyright???
Title: Pre-Test thread
Post by: rjamorim on 2003-08-24 22:23:22
Quote
You don´t care about ju4r3z, but you care about copyright???

I don't care about anything. :B

I'm just saying that I doubt people would want to publicly go around breaking copyrights to send their tracks to me. I don't care, but they might care.

The problem with cutting HE AAC is that it's inside an MP4 container, and you can't just go around chopping the container; you must take the MP4 headers, etc. into consideration.
Title: Pre-Test thread
Post by: Dologan on 2003-08-25 05:34:47
Roberto, are you sure you can't merge the results of the core+extra codecs together and still do valid statistics?
Unfortunately, I have forgotten most of the statistics lessons I took over a year ago, so I would have to dig into a book to refresh my asleep neurons; but IIRC it wasn't that big of a deal if during an experiment one of the mice in one of the test groups died; so I suppose it must be analogous if some codecs don't have the same number of testers...

~Dologan
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 05:47:46
Well, the biggest problem I see here is using ff123's friedman tool to perform the statistical analysis.

It can accept text files in this format:

Code: [Select]
mp3    aac    vorbis    wma    mpc
2.5    4.2    4.0    3.6    4.5
3.1    4.0    4.3    4.0    5.0
4.0    5.0    4.2    4.5    5.0


But not in this one:

Code: [Select]
mp3    aac    vorbis    wma    mpc    real    vqf
2.5    4.2    4.0    3.6    4.5    
3.1    4.0    4.3    4.0    5.0    3.5
2.0    3.0    3.2    2.5    3.5  5.0


I can't fill the columns with some null value, because there isn't one.

And I wouldn't even know where to start calculating the statistics, so friedman.exe is a must-have.

Besides, I would need some proof that mixing together the results won't bias the test, so that I can show something to eventual critics.
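
For what it's worth, here is a Python sketch of one way to pre-process such a ragged table before handing it to a tool that wants complete columns. It assumes the scores are tab-separated so that an empty cell can be told apart from a missing trailing one; `parse_results` and `common_codecs` are hypothetical helpers, not part of friedman.exe, and dropping columns this way says nothing about whether the merged statistics would be unbiased.

```python
def parse_results(text):
    """Parse a tab-separated score table; empty cells become None."""
    lines = [line for line in text.splitlines() if line.strip()]
    codecs = lines[0].split("\t")
    rows = []
    for line in lines[1:]:
        cells = line.split("\t")
        cells += [""] * (len(codecs) - len(cells))   # pad short rows
        rows.append([float(c) if c.strip() else None for c in cells])
    return codecs, rows

def common_codecs(codecs, rows):
    """Keep only the codecs that every listener rated."""
    keep = [i for i in range(len(codecs))
            if all(row[i] is not None for row in rows)]
    return [codecs[i] for i in keep], [[row[i] for i in keep] for row in rows]

# Made-up example: the second listener also rated "real", the first did not.
table = "mp3\taac\treal\n2.5\t4.2\t\n3.1\t4.0\t3.5\n"
codecs, rows = parse_results(table)
print(common_codecs(codecs, rows))
```

The reduced table has no gaps, so it is in the shape the first example above uses; the extra codecs would then need a separate analysis on the subset of listeners who rated them.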
Title: Pre-Test thread
Post by: Dologan on 2003-08-25 06:34:57
Hmm... I see...
When are you planning to start the test? I don't want to promise anything, but if I have time the next few days, I guess I could freshen my memory with some statistics books and see if it would be possible for us to perform a statistical analysis with unequal groups that doesn't bias the test in some way. The analysis then would not be as simple as running a little program, but would certainly be more complete imo. What do you say?

~Dologan
Title: Pre-Test thread
Post by: Gabriel on 2003-08-25 08:10:07
*I think that Guru's idea of 2 sets is interesting but unfortunately probably bad in our case. I am afraid that only experienced listeners would pick the second group. As we know that those listeners give lower ratings (as demonstrated in the 128kbps test), the ratings of both groups would probably not be comparable.

*I am not sure if Atrac-3 is really useful, considering both the user base and the fact that the new portable players from Sony are now able to use other formats.

*Perhaps plain AAC should be considered, as it is decodable right now by some hardware players, while HE-AAC is not

*If a lower anchor has to be used, why not Lame --preset 64? (mp3 is still widely used at low bitrates for Shout/Ice streaming)

*I think that the high number of codecs is not such an issue, compared to the 128 test. In this case it will be easier for a broad range of listeners.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 08:55:07
Quote
When are you planning to start the test?

September 3rd. Of course, I would need that information a little before.

Quote
What do you say?


Well, if it doesn't turn out terribly difficult (i.e., I won't take a week to sort out results), fine, I can go for it.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 09:01:50
Quote
*I think that Guru's idea of 2 sets is interesting but unfortunately probably bad in our case. I am afraid that only experienced listeners would pick the second group. As we know that those listeners give lower ratings (as demonstrated in the 128kbps test), the ratings of both groups would probably not be comparable.


Good point.

Quote
*I am not sure if Atrac-3 is really useful, considering both the user base and the fact that the new portable players from Sony are now able to use other formats.


Indeed, Atrac3 is nearly out of the test.

Quote
*Perhaps plain AAC should be considered, as it is decodable right now by some hardware players, while HE-AAC is not


Maybe, but HE AAC is definitely in. It's actually the main reason for this test, since the other codecs (Vorbis, MP3pro, WMA std) haven't changed much since ff123's test.

Quote
*If a lower anchor has to be used, why not Lame --preset 64? (mp3 is still widely used at low bitrates for Shout/Ice streaming)


Maybe, but I'm more inclined to leave the bottom anchor out.

Quote
*I think that the high number of codecs is not such an issue, compared to the 128 test. In this case it will be easier for a broad range of listeners.


Well, that's OK for me, I'm not the one that is going to take the test. :B

Thanks for your thoughts.

Regards;

Roberto.
Title: Pre-Test thread
Post by: ErikS on 2003-08-25 11:17:18
* If someone modifies ff123's tool you use for the analysis to accept null values, would you consider including the additional codecs?

* I don't like the idea of not using a lower anchor, and I don't like the idea of using lame encoded files as anchors. Why? Because anchors should be fixed. If someone wants to redo this test next year to see how much the codecs have improved, he needs to be able to use the exact same anchors as in this test. Lowpass is very fine in this regard, and BladeEnc is also pretty safe since it hasn't changed in the last five years or so. But lame is still evolving slowly, so lame anchors should be avoided if possible. Or, if you really want to use lame as an anchor, you should save the exact version and the settings you used together with the test results. The lower anchor is needed to put things in perspective, IMO. Without it, I think the scale of ratings would vary more than if it was included.
Title: Pre-Test thread
Post by: tigre on 2003-08-25 12:16:34
Quote
Quote
*I think that Guru's idea of 2 sets is interesting, but unfortunately probably bad in our case. I am afraid that only experienced listeners would pick the second group. Since we know that those listeners give lower ratings (as demonstrated in the 128kbps test), the rankings of the two groups would probably not be comparable.


Good point.

A solution could be to "normalize" the rankings of each listener by calculating his/her total average of rankings for the 1st (smaller) group of samples. Then all rankings (also those for the 2nd (extended) group of samples) should be multiplied by a certain factor so that the average (1st group) becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)
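To make the idea concrete, here is a rough Python sketch of that normalization. It assumes each listener's ratings arrive as two plain lists (the shared 1st group and the optional extended 2nd group). The function name `normalize_listener` and the example ratings are made up for illustration; they are not part of ff123's tools or of any real submission.

```python
def normalize_listener(group1, group2, target_mean=3.0):
    """Scale one listener's ratings so that the group-1 mean equals target_mean.

    group1: ratings for the shared (smaller) sample group
    group2: ratings for the extended sample group (may be empty)
    The same factor is applied to both groups.
    """
    if not group1:
        raise ValueError("group 1 must contain at least one rating")
    mean1 = sum(group1) / len(group1)
    if mean1 <= 0:
        raise ValueError("group-1 mean must be positive")
    factor = target_mean / mean1
    return [r * factor for r in group1], [r * factor for r in group2]


# Hypothetical listeners: one rates harshly, one rates leniently.
harsh_g1, harsh_g2 = normalize_listener([2.0, 1.0, 3.0], [1.5, 2.5])
lenient_g1, lenient_g2 = normalize_listener([4.0, 5.0, 3.0], [4.5])

print(harsh_g1, harsh_g2)
print(lenient_g1, lenient_g2)
```

One thing the sketch makes visible: a plain multiplication can push a scaled rating above 5.0 (a listener with a group-1 mean of 2 who gave one sample a 4 ends up at 6), so the scaled numbers no longer live on the original scale. That is the kind of side effect that would need checking before using this.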
Title: Pre-Test thread
Post by: phong on 2003-08-25 12:48:28
Another problem with using lame as an anchor is that it might not serve its function.  A lower anchor should rank last for each sample.  Disregarding the blade anchor, lame did not do that in the 128kbps test.  If a lower anchor is to be used (and I tend to think that would be a good idea, but I am not a statistics expert), I would be in favor of using blade again.  Unlike a simple lowpass, it produces a spectrum of different kinds of artifacts, which I think is a more realistic baseline to work from.

Also, since there haven't been any responses about my Linux ABC/HR clone, I assume there's no interest.  Anyone who's interested can let me know, otherwise I'll probably not devote as much time to it as I would otherwise.
Title: Pre-Test thread
Post by: elmar3rd on 2003-08-25 14:48:15
In statistics, an anchor can be a middle value, not the highest and not the lowest.
I think, in a listening test an anchor is a weighting for the results of every listener to make them more comparable. It is to prevent that some listeners only use the 4-5 range while others take the full range.
It is also useful to check that a participant submits serious results.

Therefore, Blade at 64 kbps would do it.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 17:47:24
I think we're making a mess out of this test...


OK, people, please give me suggestions. If you want a lower anchor, and lame 128 shouldn't be an anchor, which would be the codecs featured, in your opinion, including anchors?

Remember we're limited to 8 codecs.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 17:52:24
Quote
A solution could be to "normalize" the rankings of each listener by calculating his/her total average of rankings for the 1st (smaller) group of samples. Then all rankings (also those for the 2nd (extended) group of samples) should be multiplied by a certain factor so that the average (1st group) becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)

My fear is that such a messy way of calculating would open lots of possibilities for critics and the like to flame my test.

Not to mention that calculating the resulting scores would be nightmarish (I mean, it would be very hard, and human errors might creep in, since I would no longer be using only ff123's tools for the calculation; I would have to do several calculations myself)
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 18:02:47
Quote
EDIT: And what about PNS for nero? With quick test i heard that it is useful option to use low bitrates.

Ok. Just talked with Ivan, he said PNS is automagically disabled when you use HE AAC. Probably because it actually decreases quality, I guess.
Title: Pre-Test thread
Post by: tigre on 2003-08-25 18:28:42
Quote
Well, as already explained somewhere, the Anchor isn't there only to protect rankings, but also to put things into perspective across the entire sample suite.


Probably it does more good than bad to define fixed rankings for anchors but to put things into perspective it would be good IMO to suggest at least a range for the ranking of the anchors.

Taking this into account the codecs tested should be

higher anchor:
1. lame --preset 128; suggested ranking around "4" (around could mean e.g. +/-1)

lower anchor:
2. lame --preset 64; suggested ranking around "1" - "2" (I don't know how realistic this suggested ranking is, as I haven't tested --preset 64 much so far.)
OR:
2. something transcoded, e.g. WMA9@64kbps -> MP3Pro@64kbps (could have some educational value)

3. Ahead HE-AAC

4. Ogg Vorbis

5. MP3pro

6. WMAV9

7. Real Audio Cook

8. AAC? ATRAC3Plus? WMA8? ...? I'd say AAC because of hardware support.
Title: Pre-Test thread
Post by: phong on 2003-08-25 18:52:25
Ok, these are my personal feelings:

I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

I don't care if there is or is not a lower anchor.  I guess I don't fully understand the significance of multiple anchors, or their importance.  I'm also perfectly happy doing 8 codecs.  With the 128 test, that would have been exhausting.  No codec is that close to transparent at 64k, so it will be much easier and less fatiguing to do more samples.  If there is going to be a lower anchor, I would prefer blade.  I think lame at 64k would be surprisingly competitive.

As far as WMA vs. WMA pro, you're screwed any way you do it.  If you use WMA only, people will complain that you didn't include the best version.  If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware.  Including both seems like a waste.

Sony's claiming that Atrac3 sounds really good and are marketing the crap out of it, which is a claim that should be tested.  However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.  I'd say the same is true of Real Audio, but it's actually got some popularity somehow.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-25 20:48:27
Quote
I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

Right, so let's try to make things easier: This is definitely in:

-Lame --ap 128
-Ahead HE AAC Streaming :: Medium
-Vorbis -q 0 (or -q 0.2?)
-Adobe Audition MP3pro quality 40

This is up for discussion:

-Real Audio Cook/Gecko 64kbps
-WMA (std or pro? CBR or VBR?)
-Bottom anchor (lowpass? blade? lame?)

This is probably out, but can also be discussed:

-Atrac3plus
-QuickTime AAC LC

I think a good compromise between those that want to test as much as possible and participants that don't want to spend too much time taking the test is to take what's definitely in plus what's up for discussion, and leave out LC AAC and Atrac3+.

That would also make the statistical calculation of the results much easier and less prone to criticism. There would be no more odd packages with a different sample in each of them, and no special packages for those that want to test more.

Quote
As far as WMA vs. WMA pro, you're screwed any way you do it. If you use WMA only, people will complain that you didn't include the best version. If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware. Including both seems like a waste.


True. I think another point would be that WMA std was already tested in ff123's test. Yes, it was v8, but I'm not very confident that v9 improved much. Anyone care to try? If it's nearly the same as v8, I might as well go with Pro.

Quote
Sony's claiming that Atrac3 sounds really good and are marketing the crap out of it, which is a claim that should be tested. However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.


Haha. Right.

I created a special VirtualPC Win98 partition to install SonicStage, in case someone really wants it. But I'm inclined to let it alone.

Regards;

Roberto.
Title: Pre-Test thread
Post by: Gabriel on 2003-08-26 08:08:35
Real: I think that it is not needed
wma: I think that v9 std should be included, as it is marketed for portable devices
atrac3plus: do not think that we need it
aac-lc: I think that it should be included, as it can be decoded by portable devices.

Even if some codecs (wma and aac) were already tested in previous tests, I think that they should be included.

If a higher anchor is included (mp3-128), I think that a lower anchor should be included too. Otherwise, there is a risk that most of the rankings would be in the bottom range. With a lower anchor, the interesting competitors will probably be more balanced.

As a lower anchor, I think that lame 64 would be good: mp3 is still used for streaming, so it is something that is really used (Blade64 and lowpass are probably not used that much...)
If the exact version number and parameters are mentioned, this lower anchor is still reproducible.

So my choice would be:
lame 128 (higher anchor)
lame 64 (lower anchor)
vorbis
mp3pro
he-aac
aac
wma
Title: Pre-Test thread
Post by: Digga on 2003-08-26 09:50:15
Quote
If a higher anchor is included (mp3-128), I think that a lower anchor should be included too. Otherwise, there is a risk that most of the rankings would be in the bottom range.

That thought sounds wise to my ears. IMO, either put one lower and one higher anchor in, or make explicitly clear that the results may be a little down the ladder.

I would also like to see wma included, as it is one of the two main formats advertised for portable music (...so MS would say it's as good at 32kbps as mp3 at 64...). Let's see if they're right :-)
Title: Pre-Test thread
Post by: Digga on 2003-08-26 09:50:56
uups, double post...
Title: Pre-Test thread
Post by: LadFromDownUnder on 2003-08-26 11:06:07
Roberto, given the MS claims about WMA9 (not WMA8), and its current semi-support (most existing hardware devices support the capabilities of the WMA8 bit-stream, which is more constrained), and increasing industry support, it has to be included. 

Also, the MS claims about quality are VBR/ABR based, not CBR, and most of the other codec configurations are VBR/ABR configurations.  I suggest ABR (VBR 2-pass) with the standard codec (the pro codec will only realize 64k VBR through quality based configuration which would be more difficult to constrain than ABR).

By the way, thanks for doing all this, Roberto.  Whilst many folk will criticise, few will bother doing anything worth criticising.

Doug
Title: Pre-Test thread
Post by: rjamorim on 2003-08-26 17:36:51
OK, so let's try this:

-HE AAC
-Vorbis
-MP3pro
-WMA Std
-AAC-LC
-Real Audio
-Lame 128 as high anchor
-Lame/Blade 64 as bottom anchor

Any criticism?
Title: Pre-Test thread
Post by: Digga on 2003-08-26 19:45:30
That combination looks really good. This way

- you can look at the differences between he-aac and lc-aac at the given bitrate
- compare mp3 and wma in detail
- see how vorbis holds up against all of them and how it is possibly beaten by aac...
- and in the end get an impression of whether mp3(pro!!) really still is the medium of choice in a low bitrate scenario

The anchors are well chosen, since almost everybody knows how mp3 'should' sound.

I'll say, let's do it that way. Though the codec combination looks good to me, I really have no big clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).
Title: Pre-Test thread
Post by: elmar3rd on 2003-08-26 19:51:59
I agree.

Of course, 8 codecs will take some time (time to listen, time to relax the ears during the test).
But if someone's time is short, it's better to reduce the number of tested samples in that personal case than to reduce the number of codecs for the whole test.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-26 19:57:14
Quote
I really have no big clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).

What treatments? Preprocessing?

Preprocessing IST VERBOTTEN!

Quote
But if someone's time is short, it's better to reduce the number of tested samples in that personal case than to reduce the number of codecs for the whole test.


True. Besides, the test will last for 11 days and two whole weekends. I reckon people will need less time to listen to this test's 8 samples than the 128kbps test's 6 samples.

Anyway, still looking for criticism before I officialize that as the sample suite.

(Also, I'll wait some time for a reply I'm expecting from a codec developer)

Thanks for all the suggestions and criticism.

Best regards;

Roberto.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-26 20:02:33
BTW, some more questions that need to be answered:

-Which AAC LC codec will we use: Apple (ABR 64kbps) or Ahead (VBR, Radio or Tape preset, I don't remember which)?
-Blade or Lame at 64kbps for the bottom anchor? Or maybe even FhG?

IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (badly?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade has Intensity Stereo coding.

Regards;

Roberto.
Title: Pre-Test thread
Post by: music_man_mpc on 2003-08-26 20:08:50
Quote
IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (badly?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade has Intensity Stereo coding.

I'm interested too.  I say go with FhG for the bottom anchor.
Title: Pre-Test thread
Post by: phong on 2003-08-26 20:22:50
The list looks good from where I'm standing.  I'll also agree that featuring 64k mp3 at its best is going to provide the most interesting results.
Title: Pre-Test thread
Post by: n68 on 2003-08-26 20:26:16
Ciao...

read @ the portal.
sounds familiar.: *general phong*
Title: Pre-Test thread
Post by: rjamorim on 2003-08-26 20:29:03
Quote
read @ the portal.
sounds familiar.: *general phong*

Too late. You replied, and now it's gone. 
Title: Pre-Test thread
Post by: phong on 2003-08-26 21:52:38
Huh?
Title: Pre-Test thread
Post by: brett on 2003-08-27 00:59:53
rj -- as for which aac to include, i would hate to see you choose ahead just because its the latest to throw its hat in the ring to the exclusion of qt -- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2d with cbr. on that basis i would assume it merits inclusion without comment. however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.

brett.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 01:12:19
Quote
rj -- as for which aac to include, i would hate to see you choose ahead just because its the latest to throw its hat in the ring to the exclusion of qt

Actually, I'm leaning more towards QT. First, because Ahead is already featuring a codec in this test, and second because QT fared so well in the AAC test.

Quote
-- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2d with cbr. on that basis i would assume it merits inclusion without comment.


I agree

Quote
however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.


I don't take listening tests. But if someone wants to test Ahead 64 vs. QT 64, the results would be very welcome.

Regards;

Roberto.
Title: Pre-Test thread
Post by: Digga on 2003-08-27 01:14:57
Quote
i would hate to see you choose ahead just because its the latest to throw its hat in the ring to the exclusion of qt -- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2d with cbr. on that basis i would assume it merits inclusion without comment.


Well, I would be interested in Ahead's AAC, BECAUSE it is new and BECAUSE QT was tested and found to be really good. I would like to see how Ahead fares at 64kbps, so you could roughly compare the two (though this is not really valid, given the bitrate difference).
I also think that there is a growing userbase for Ahead's implementation of the codec, so let's give it a try.

Quote
however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw


I thought it wasn't about choosing the best codecs out of the bunch, but more about including some well known ones... or am I getting something wrong here?!

edit: damn, Rjamorim has smuggled a post in before mine...
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 01:36:33
Quote
I thought it wasn't about choosing the best codecs out of the bunch, but more about including some well known ones... or am I getting something wrong here?!

My tests were *always*about*quality*

Other features like price, support, license, popularity, etc.  come far behind.

Quote
edit: damn, Rjamorim has smuggled a post in before mine...


Well, nothing has been decided so far
Title: Pre-Test thread
Post by: Gabriel on 2003-08-27 09:04:37
For mp3-64, I think that blade would be less interesting, as it does not reflect real usage.
Lame or FhG would be more interesting.
Using FhG is an interesting suggestion if we consider that it is claimed to be the best mp3 codec in this bitrate area. (btw I am wondering how much better than Lame it really is)
Title: Pre-Test thread
Post by: tigre on 2003-08-27 14:41:05
Quote
I don't take listening tests. But if someone wants to test Ahead 64 vs. QT 64, the results would be very welcome.

I'm interested. To keep it lean some questions to AAC experts here:

Quicktime 6.3:
- Is it a good idea to resample to 32kHz at 64kbps

Ahead AAC (Using NeroMix 1.4.0.4):
- Should PNS be enabled at 64kbps or not?
- Using VBR is recommendable - correct?
- Any settings/hidden preferences/etc. I could have missed?

Thanks in advance.

If anyone's interested I can upload the encoded mp4 samples.
__________________

What about lame --alt-preset 64 vs. fhg? I'm curious about this too. Is there a better fhg encoder for that bitrate than Cool Edit Pro 2.1's?
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 14:50:54
Quote
Using FhG is an interesting suggestion if we consider that it is claimed to be the best mp3 codec in this bitrate area. (btw I am wondering how much better than Lame it really is)

Well, all I know are claims that FhG is better at 64 because it supports IS, and LAME doesn't.

Anybody caring to perform a quick test Lame vs. FhG?
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 14:58:04
Quote
- Is it a good idea to resample to 32kHz at 64kbps

I don't think it would be. That's definitely preprocessing, and it's a bias in favour of QuickTime.

If we resample for one of the codecs, we must resample for all of them.

Quote
- Should PNS be enabled at 64kbps or not?


No. The only reason LC is being included is because it's popular and has hardware support. If we enable PNS, both reasons become moot.

Quote
- Using VBR is recommendable - correct?


Yes, probably the Tape preset.

Quote
- Any settings/hidden preferences/etc. I could have missed?


Probably not.

Quote
What about lame --alt-preset 64 vs. fhg? I'm curious about this too. Is there a better fhg encoder for that bitrate than Cool Edit Pro 2.1's?


I think, traditionally, CoolEdit and Musicmatch always use the latest FhG versions.

So, both should be OK.

Now, it's worth wondering which codec setting will output the best quality:
"Current - Best Quality" or "Legacy - High Quality (Slow)"

Regards;

Roberto.
Title: Pre-Test thread
Post by: AstralStorm on 2003-08-27 15:22:18
I'll do a test with ABC/HR on a few of the samples from this test.
(FhG HQ vs FhG Current vs Lame -ap 64)
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 15:43:55
Quote
I'll do a test with ABC/HR on a few of the samples from this test.
(FhG HQ vs FhG Current vs Lame -ap 64)

Thanks

I think you'll have to try Quality 1 with FhG in Audition. The lowest preset (10) is announced as 80-95kbps, which is too high for the test.

If even 1 is too high, I might have to go with CBR 64.



BTW, I would like to use that obscuring BAT file you made for the 128 test. Is it OK?

Sorry about not using it on the 128test, but I have a policy of not changing the test unless it's absolutely necessary. Else, some people might believe the results they already submitted were invalid, and that definitely wouldn't be the case.

Regards;

Roberto.
Title: Pre-Test thread
Post by: ErikS on 2003-08-27 15:48:51
Quote
Quote
- Is it a good idea to resample to 32kHz at 64kbps

I don't think it would be. That's definitely preprocessing, and it's a bias in favour of QuickTime.

If we resample for one of the codecs, we must resample for all of them.

Que? I would assume that any good codec would resample internally when it is appropriate. You're not saying that you will force the codecs to use 44.1 kHz, are you? For one it wouldn't reflect real world usage to force codecs to use a specific sample rate. Normally you would leave that decision to the encoder...
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 15:51:50
Quote
Que? I would assume that any good codec would resample internally when it is appropriate. You're not saying that you will force the codecs to use 44.1 kHz, are you? For one it wouldn't reflect real world usage to force codecs to use a specific sample rate. Normally you would leave that decision to the encoder...

Sure, sure. What I said:
"If we resample for one of the codecs..."

So, I won't resample samples before feeding them to their codecs. But if the codecs do that, it's their business.

Besides, I don't think that a codec resampling is preprocessing. It's just another step in the codec compression flow.
Title: Pre-Test thread
Post by: AstralStorm on 2003-08-27 15:55:32
Rjamorim,
you are hereby granted the rights to use my obscured-batch-file-test-method.
Consider it as BSDv2 licensed.

Quality 1 produces too large files, unfortunately.
I have prepared the files and will take the test right now.
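For anyone who wants to check whether an encode is "too large" for a 64kbps test, a minimal sketch of the arithmetic follows. The helper names, the 10% tolerance and the example file sizes are assumptions for illustration only; the test has no such official limit, and container or tag overhead is ignored here.

```python
def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate in kbps (1 kbps = 1000 bit/s) of an encoded file."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def within_target(size_bytes, duration_seconds, target_kbps=64.0, tolerance=0.10):
    """True if the average bitrate is within +/- tolerance of target_kbps.

    The 10% default tolerance is an arbitrary, assumed value.
    """
    rate = average_bitrate_kbps(size_bytes, duration_seconds)
    return abs(rate - target_kbps) <= target_kbps * tolerance


# Hypothetical 30-second clips: 240,000 bytes works out to exactly 64 kbps,
# while 330,000 bytes works out to 88 kbps.
for size in (240_000, 330_000):
    rate = average_bitrate_kbps(size, 30)
    print(f"{size} bytes -> {rate:.1f} kbps, within target: {within_target(size, 30)}")
```

With VBR the per-sample bitrate will wander, so it probably makes more sense to look at the average over the whole sample suite than at any single clip.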
Title: Pre-Test thread
Post by: AstralStorm on 2003-08-27 17:05:17
ABC/HR 0.9b doesn't save the results when I try to setup a new test.
I get a blank results file with the title of the new test!
I had to do the test twice (except Waiting) because of the bug.

Does anybody have a working link to Friedman?

Results:
Code: [Select]
ABC/HR Version 0.9b, 30 August 2002
Testname: Waiting MP3@64k

1R = D:\test\L-Waiting.wav
2L = D:\test\FH-Waiting.wav
3L = D:\test\FC-Waiting.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\test\L-Waiting.wav
1R Rating: 1.5
1R Comment: Major ringing, slight dropouts
---------------------------------------
2L File: D:\test\FH-Waiting.wav
2L Rating: 2.0
2L Comment: Underwater
---------------------------------------
3L File: D:\test\FC-Waiting.wav
3L Rating: 1.7
3L Comment: Bad ringing
---------------------------------------
ABX Results:

ABC/HR Version 0.9b, 30 August 2002
Testname: Big Yellow Taxi MP3@64k

1L = D:\test\FH-bigye.wav
2L = D:\test\FC-bigye.wav
3R = D:\test\L-bigye.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: D:\test\FH-bigye.wav
1L Rating: 3.0
1L Comment: Not bad, except slight underwatery quality and lowpass
---------------------------------------
2L File: D:\test\FC-bigye.wav
2L Rating: 2.5
2L Comment: Less lowpass than 1, but more artifacts
---------------------------------------
3R File: D:\test\L-bigye.wav
3R Rating: 1.5
3R Comment: Dropouts in guitar + tons of other
---------------------------------------
ABX Results:

ABC/HR Version 0.9b, 30 August 2002
Testname: Gone MP3@64k

1R = D:\test\L-gone.wav
2L = D:\test\FC-gone.wav
3L = D:\test\FH-gone.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\test\L-gone.wav
1R Rating: 1.0
1R Comment: Bad dropouts  and ringing
---------------------------------------
2L File: D:\test\FC-gone.wav
2L Rating: 1.4
2L Comment:
---------------------------------------
3L File: D:\test\FH-gone.wav
3L Rating: 1.3
3L Comment: Less ringing than 2, but too much lowpass
---------------------------------------
ABX Results:

ABC/HR Version 0.9b, 30 August 2002
Testname: Polonaise MP3@64k

1R = D:\test\FC-ChopinPolonaiseDMoll.wav
2L = D:\test\FH-ChopinPolonaiseDMoll.wav
3R = D:\test\L-ChopinPolonaiseDMoll.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: D:\test\FC-ChopinPolonaiseDMoll.wav
1R Rating: 2.5
1R Comment: Slightly more ringing than 2
---------------------------------------
2L File: D:\test\FH-ChopinPolonaiseDMoll.wav
2L Rating: 3.0
2L Comment:
---------------------------------------
3R File: D:\test\L-ChopinPolonaiseDMoll.wav
3R Rating: 1.0
3R Comment: Totally destroyed - sounds flangy, tons of dropouts
---------------------------------------
ABX Results:

ABC/HR Version 0.9b, 30 August 2002
Testname: Experiencia MP3@64k

1L = D:\test\FC-experiencia.wav
2R = D:\test\FH-experiencia.wav
3L = D:\test\L-experiencia.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: D:\test\FC-experiencia.wav
1L Rating: 2.5
1L Comment: Slightly less ringing than 3
---------------------------------------
2R File: D:\test\FH-experiencia.wav
2R Rating: 3.5
2R Comment: Sounds very good for 64k
---------------------------------------
3L File: D:\test\L-experiencia.wav
3L Rating: 2.3
3L Comment: Ringing
---------------------------------------
ABX Results:


/EDIT\ Cleanup and modification - results same \EDIT/
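For a quick overview of the five result files above, here is a small Python sketch that pulls the `File:` and `Rating:` lines out of ABC/HR-style logs and averages the ratings per filename prefix. It assumes the prefixes mean L = Lame, FH = FhG HQ and FC = FhG Current, as the file names suggest. The parser is an illustration written for this log layout only; it is not part of ABC/HR or of ff123's analysis tools. The `LOG` text is just the file and rating lines copied from the results.

```python
import re
from collections import defaultdict

# Matches e.g. "1R File: D:\test\L-Waiting.wav" and captures the prefix "L".
FILE_RE = re.compile(r"^\d+[LR] File: .*\\(?P<prefix>[A-Z]+)-[^\\]*\.wav$")
# Matches e.g. "1R Rating: 1.5" and captures the number.
RATING_RE = re.compile(r"^\d+[LR] Rating: (?P<rating>\d+(?:\.\d+)?)$")

LOG = r"""
1R File: D:\test\L-Waiting.wav
1R Rating: 1.5
2L File: D:\test\FH-Waiting.wav
2L Rating: 2.0
3L File: D:\test\FC-Waiting.wav
3L Rating: 1.7
1L File: D:\test\FH-bigye.wav
1L Rating: 3.0
2L File: D:\test\FC-bigye.wav
2L Rating: 2.5
3R File: D:\test\L-bigye.wav
3R Rating: 1.5
1R File: D:\test\L-gone.wav
1R Rating: 1.0
2L File: D:\test\FC-gone.wav
2L Rating: 1.4
3L File: D:\test\FH-gone.wav
3L Rating: 1.3
1R File: D:\test\FC-ChopinPolonaiseDMoll.wav
1R Rating: 2.5
2L File: D:\test\FH-ChopinPolonaiseDMoll.wav
2L Rating: 3.0
3R File: D:\test\L-ChopinPolonaiseDMoll.wav
3R Rating: 1.0
1L File: D:\test\FC-experiencia.wav
1L Rating: 2.5
2R File: D:\test\FH-experiencia.wav
2R Rating: 3.5
3L File: D:\test\L-experiencia.wav
3L Rating: 2.3
"""


def mean_ratings(log_text):
    """Return {filename prefix: mean rating} for ABC/HR-style result text."""
    ratings = defaultdict(list)
    prefix = None
    for line in log_text.splitlines():
        line = line.strip()
        file_match = FILE_RE.match(line)
        if file_match:
            prefix = file_match.group("prefix")
            continue
        rating_match = RATING_RE.match(line)
        if rating_match and prefix is not None:
            ratings[prefix].append(float(rating_match.group("rating")))
            prefix = None  # each File line is paired with one Rating line
    return {p: sum(v) / len(v) for p, v in ratings.items()}


for prefix, mean in sorted(mean_ratings(LOG).items()):
    print(f"{prefix}: {mean:.2f}")
```

Over these five samples the means come out to roughly 2.56 for FH, 2.12 for FC and 1.46 for L. This is one listener on five samples, so it is only a rough indication and not a statistical result.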
Title: Pre-Test thread
Post by: rjamorim on 2003-08-27 17:08:29
Quote
Does anybody have a working link to Friedman?

http://ff123.net/friedman/friedman124.zip (http://ff123.net/friedman/friedman124.zip)

Thanks a lot for the help. I guess it'll be FhG-HQ then.
Title: Pre-Test thread
Post by: AstralStorm on 2003-08-27 17:21:54
Don't forget to enable all additional options (like narrowing of stereo image) and disable CRC generation.

/EDIT\
Additional information about my micro-test:
- all samples were decoded with Foobar2000 0.7RC7 without replaygain to 16bit and dithered
- encoded with LAME 3.90.3 --alt-preset 64 and Adobe Audition 1.0 without CRC and with all stereo options enabled.
\EDIT/
Title: Pre-Test thread
Post by: tigre on 2003-08-27 17:34:15
Quote
Quote
- Is it a good idea to resample to 32kHz at 64kbps

I don't think it would be. That's definitely preprocessing, and it's a bias in favour of QuickTime.

If we resample for one of the codecs, we must resample for all of them.

Quicktime has a switch to choose sampling rate - as lame (22.05kHz sampling rate is integrated in --preset 64). So it'd be fair to choose the best sounding encoding setting.
Title: Pre-Test thread
Post by: tigre on 2003-08-28 01:08:33
Results
Settings used:

<samplename>_CEP_dev.wav
CEP2.1 fhg mp3 encoder
64kbps CBR, HQ, 22050Hz Samplingrate, default lowpass (10349Hz), m/s, intensity stereo, narrowing enabled.

<samplename>_CEP_LP11.wav
CEP2.1 fhg mp3 encoder
64kbps CBR, HQ, 22050Hz Samplingrate, lowpass 11025 Hz, m/s, intensity stereo, narrowing enabled.

<samplename>.wav.wav
lame 3.90.3
--preset 64

<samplename>_qt_44.wav
Quicktime Pro 6.3. AAC
64kbps CBR, Audio Track: "Music"; Stereo; Best Quality; Sample rate 44100Hz

<samplename>_qt_32.wav
Quicktime Pro 6.3. AAC
64kbps CBR, Audio Track: "Music"; Stereo; Best Quality; Sample rate 32000Hz

<samplename>_Nero.wav
NeroMix 1.4.0.4: Ahead AAC
VBR Tape; HQ; LC


Results so far:
Code: [Select]
ABC/HR Version 0.9b, 30 August 2002
Testname: 001 riteofspring 64kbps mp3/AAC

1R = .\Samples\001 riteofspring.wav.wav
2R = .\Samples\001 riteofspring_qt_44.wav
3R = .\Samples\001 riteofspring_CEP_dev.wav
4L = .\Samples\001 - riteofspring_Nero.wav
5L = .\Samples\001 riteofspring_qt_32.wav
6L = .\Samples\001 riteofspring_CEP_LP11.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\001 riteofspring.wav.wav
1R Rating: 2.0
1R Comment: Lowpass sounds too low most of the time
annoying ringing sometimes
watery sound
warbeling of higher frequency background sounds/noise


---------------------------------------
2R File: .\Samples\001 riteofspring_qt_44.wav
2R Rating: 3.8
2R Comment: Lowpass high enough: Lack of sharpness/brighness at a few points
ringing/warbeling hard to notice.
---------------------------------------
3R File: .\Samples\001 riteofspring_CEP_dev.wav
3R Rating: 2.5
3R Comment: Practically identical to 1 in:
"Lowpass sounds too low most of the time
watery sound
warbeling of higher frequency background sounds/noise"
but almost no ringing
the background warbeling is a little bit stronger sometimes.
---------------------------------------
4L File: .\Samples\001 - riteofspring_Nero.wav
4L Rating: 2.8
4L Comment: Lowpass noticable but not annoying (similar to 2)
but much ringing/warbeling/watery sound similar to 3
---------------------------------------
5L File: .\Samples\001 riteofspring_qt_32.wav
5L Rating: 3.8
5L Comment: No difference to 2:
Lowpass high enough: Lack of sharpness/brighness at a few points
ringing/warbeling hard to notice.
---------------------------------------
6L File: .\Samples\001 riteofspring_CEP_LP11.wav
6L Rating: 2.4
6L Comment: Same as 3
"Practically identical to 1 in:
'Lowpass sounds too low most of the time
watery sound
warbeling of higher frequency background sounds/noise'
but almost no ringing
the background warbeling is a little bit stronger sometimes."
but a bit more ringing (far less than 1)

---------------------------------------
ABX Results:
Original vs .\Samples\001 riteofspring.wav.wav
   8 out of 8, pval = 0.004
Original vs .\Samples\001 riteofspring_qt_44.wav
   11 out of 12, pval = 0.003
Original vs .\Samples\001 riteofspring_CEP_dev.wav
   8 out of 8, pval = 0.004
Original vs .\Samples\001 - riteofspring_Nero.wav
   8 out of 8, pval = 0.004
Original vs .\Samples\001 riteofspring_qt_32.wav
   8 out of 8, pval = 0.004
Original vs .\Samples\001 riteofspring_CEP_LP11.wav
   8 out of 8, pval = 0.004
.\Samples\001 riteofspring_CEP_dev.wav vs .\Samples\001 riteofspring_CEP_LP11.wav
   8 out of 8, pval = 0.004

_______________________________
ABC/HR Version 0.9b, 30 August 2002
Testname: 002 bigye 64kbps mp3/AAC

1L = .\Samples\002 bigye.wav.wav
2L = .\Samples\002 bigye_CEP_LP11.wav
3R = .\Samples\002 - bigye_Nero.wav
4R = .\Samples\002 bigye_CEP_dev.wav
5L = .\Samples\002 bigye_qt_44.wav
6R = .\Samples\002 bigye_qt_32.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\002 bigye.wav.wav
1L Rating: 1.8
1L Comment: Most annoying:
1. Ringing
2. Lowpass
3. smeared drums
Other problems don't matter in comparison
---------------------------------------
2L File: .\Samples\002 bigye_CEP_LP11.wav
2L Rating: 1.6
2L Comment: Same as 1:
Most annoying:
1. Ringing (slightly worse than 1)
2. Lowpass
3. smeared drums
Other problems don't matter in comparison
---------------------------------------
3R File: .\Samples\002 - bigye_Nero.wav
3R Rating: 3.8
3R Comment: lowpass hard to notice
a few positions with slight chirping/warbling/smearing on cymbals/hihats
---------------------------------------
4R File: .\Samples\002 bigye_CEP_dev.wav
4R Rating: 2.4
4R Comment: Same as 1 + 2, but less ringing
---------------------------------------
5L File: .\Samples\002 bigye_qt_44.wav
5L Rating: 4.0
5L Comment: same as 3 but hihats sound a bit better
---------------------------------------
6R File: .\Samples\002 bigye_qt_32.wav
6R Rating: 4.2
6R Comment: same as 3, 5 but hihats sound best
---------------------------------------
ABX Results:
Original vs .\Samples\002 bigye.wav.wav
   8 out of 8, pval = 0.004

___________________________________
ABC/HR Version 0.9b, 30 August 2002
Testname: 006 experiencia 64kbps mp3/AAC

1L = .\Samples\006 experiencia_CEP_dev.wav
2R = .\Samples\006 experiencia_qt_44.wav
3L = .\Samples\006 experiencia_qt_32.wav
4L = .\Samples\006 experiencia.wav.wav
5R = .\Samples\006 - experiencia_Nero.wav
6R = .\Samples\006 experiencia_CEP_LP11.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\006 experiencia_CEP_dev.wav
1L Rating: 2.2
1L Comment: Lowpass
Ringing
Smeared transients
---------------------------------------
2R File: .\Samples\006 experiencia_qt_44.wav
2R Rating: 3.8
2R Comment: Smeared percussion; almost no ringing, warbling; lowpass sounds noticable but ok
---------------------------------------
3L File: .\Samples\006 experiencia_qt_32.wav
3L Rating: 3.8
3L Comment: Same as 2:
Smeared percussion; almost no ringing, warbling; lowpass sounds noticable but ok
---------------------------------------
4L File: .\Samples\006 experiencia.wav.wav
4L Rating: 1.8
4L Comment: As 1:
Lowpass
Ringing (much more)
Smeared transients
---------------------------------------
5R File: .\Samples\006 - experiencia_Nero.wav
5R Rating: 3.8
5R Comment: More accurate than 2 & 3 (sharper percussion) but sometimes slight ringing; hard to tell which is better
---------------------------------------
6R File: .\Samples\006 experiencia_CEP_LP11.wav
6R Rating: 1.9
6R Comment: As 4:
Lowpass
Ringing (a bit better)
Smeared transients
---------------------------------------
ABX Results:


So far:
for AAC, Quicktime is better (no noteworthy difference between sampling rates);
for mp3, FHG at default 64kbps settings is better. This could still change, because at least the extreme stereo samples sound extremely bad with FHG's intensity stereo. L8r.
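
The pval figures in the ABX results above are consistent with a one-sided binomial test against pure guessing (probability 0.5 per trial). A minimal sketch, assuming that is how ABC/HR arrives at them (the log itself does not say); `abx_pvalue` is a hypothetical helper name, not part of any tool:

```python
from math import comb


def abx_pvalue(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials` by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more hits; each outcome has probability 1/2**trials.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


print(round(abx_pvalue(8, 8), 3))    # 8 out of 8   -> 0.004
print(round(abx_pvalue(11, 12), 3))  # 11 out of 12 -> 0.003
```

Both values match the log, so the "8 out of 8" runs correspond to 1/256 and the "11 out of 12" run to 13/4096.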
Title: Pre-Test thread
Post by: tigre on 2003-08-28 12:18:59
Next result: The Chopin Piano sample. To me surprising compared to previous ones:
- lame performed far better than FHG
- Ahead AAC was worst of all codecs!
Code: [Select]
ABC/HR Version 0.9b, 30 August 2002
Testname: 003 ChopinPolonaiseDMoll 64kbps mp3/AAC

1L = .\Samples\003 ChopinPolonaiseDMoll_qt_32.wav
2L = .\Samples\003 ChopinPolonaiseDMoll.wav.wav
3R = .\Samples\003 ChopinPolonaiseDMoll_qt_44.wav
4L = .\Samples\003 ChopinPolonaiseDMoll_CEP_dev.wav
5R = .\Samples\003 ChopinPolonaiseDMoll_CEP_LP11.wav
6L = .\Samples\003 - ChopinPolonaiseDMoll_Nero.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\003 ChopinPolonaiseDMoll_qt_32.wav
1L Rating: 3.5
1L Comment: Sounds half-good; lowpass detectable but not annoying; ringing, also pre-echo-like (not that much) most annoying artifact; high frequency noise-like echo reduced on high notes.
---------------------------------------
2L File: .\Samples\003 ChopinPolonaiseDMoll.wav.wav
2L Rating: 4.0
2L Comment: Sounds good (similar to 1); lowpass detectable but not annoying; very small amount of warble on highest notes; high frequency noise-like echo reduced on high notes.
---------------------------------------
3R File: .\Samples\003 ChopinPolonaiseDMoll_qt_44.wav
3R Rating: 4.4
3R Comment: Without focussing on a few seconds everything sounds fine. In ABX situation high frequency warbeling/ringing detectable.
---------------------------------------
4L File: .\Samples\003 ChopinPolonaiseDMoll_CEP_dev.wav
4L Rating: 3.0
4L Comment: Most noticable: Pre-echo on low notes; compared to this other problems (similar to 1, 2) don't matter.
---------------------------------------
5R File: .\Samples\003 ChopinPolonaiseDMoll_CEP_LP11.wav
5R Rating: 2.8
5R Comment: Almost the same as 4 but a little bit more ringing
---------------------------------------
6L File: .\Samples\003 - ChopinPolonaiseDMoll_Nero.wav
6L Rating: 2.0
6L Comment: Annoying: Warbeling (especially on echos: metallic sound); watery sound. Lowpass no problem here in comparison.
---------------------------------------
ABX Results:
Original vs .\Samples\003 ChopinPolonaiseDMoll_qt_44.wav
   8 out of 8, pval = 0.004


New Results:

Code: [Select]
ABC/HR Version 0.9b, 30 August 2002
Testname: 004 Daft_Punk___Da_Funk mp3/AAC

1R = .\Samples\004 Daft_Punk___Da_Funk.wav.wav
2L = .\Samples\004 Daft_Punk___Da_Funk_CEP_LP11.wav
3L = .\Samples\004 Daft_Punk___Da_Funk_qt_32.wav
4L = .\Samples\004 Daft_Punk___Da_Funk_qt_44.wav
5L = .\Samples\004 Daft_Punk___Da_Funk_CEP_dev.wav
6R = .\Samples\004 - Daft_Punk___Da_Funk_Nero.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\004 Daft_Punk___Da_Funk.wav.wav
1R Rating: 1.5
1R Comment: Smeared transients, high frequency cutoff much too low, ringing
---------------------------------------
2L File: .\Samples\004 Daft_Punk___Da_Funk_CEP_LP11.wav
2L Rating: 1.3
2L Comment: Same as 1: Smeared transients, too low cutoff, ringing; but ringing is slightly more annoying
---------------------------------------
3L File: .\Samples\004 Daft_Punk___Da_Funk_qt_32.wav
3L Rating: 3.0
3L Comment: Most annoying: lack of brightness, smeared transients
---------------------------------------
4L File: .\Samples\004 Daft_Punk___Da_Funk_qt_44.wav
4L Rating: 3.0
4L Comment: Same as 3
---------------------------------------
5L File: .\Samples\004 Daft_Punk___Da_Funk_CEP_dev.wav
5L Rating: 1.5
5L Comment: similar to 1 + 2, but almost no ringing. Transients even more smeared. Can't tell if 1 or 3 is better/worse
---------------------------------------
6R File: .\Samples\004 - Daft_Punk___Da_Funk_Nero.wav
6R Rating: 2.3
6R Comment: similar to 3+4, but additional warbeling/watery sound. clearly worse.
---------------------------------------
ABX Results:
__________________________________________

ABC/HR Version 0.9b, 30 August 2002
Testname: 005 Enola_Gay 64kbps mp3/AAC

1L = .\Samples\005 Enola_Gay.wav.wav
2L = .\Samples\005 - Enola_Gay_Nero.wav
3R = .\Samples\005 Enola_Gay_CEP_LP11.wav
4L = .\Samples\005 Enola_Gay_qt_44.wav
5R = .\Samples\005 Enola_Gay_CEP_dev.wav
6R = .\Samples\005 Enola_Gay_qt_32.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\005 Enola_Gay.wav.wav
1L Rating: 2.0
1L Comment: most audible/annoying: lowpass, ringing, warbeling, smeared drums
---------------------------------------
2L File: .\Samples\005 - Enola_Gay_Nero.wav
2L Rating: 3.5
2L Comment: Most noticable: lack of brightness, smeared hihats with slight ringing
---------------------------------------
3R File: .\Samples\005 Enola_Gay_CEP_LP11.wav
3R Rating: 2.0
3R Comment: same as 1
---------------------------------------
4L File: .\Samples\005 Enola_Gay_qt_44.wav
4L Rating: 3.3
4L Comment: same as 2 but less bright, less ringing, more smearing; slightly worse for me but a matter of taste I'd say.
---------------------------------------
5R File: .\Samples\005 Enola_Gay_CEP_dev.wav
5R Rating: 2.2
5R Comment: similar to 1 + 3 but almost no ringing. OTH less bright, slightly smearier. All in all a bit better.
---------------------------------------
6R File: .\Samples\005 Enola_Gay_qt_32.wav
6R Rating: 3.5
6R Comment: similar to 2 and 4. least ringing, same smearing as 4, therefore same rating as 2.
---------------------------------------
ABX Results:
_________________________________________

ABC/HR Version 0.9b, 30 August 2002
Testname: 007 gone 64kbps mp3/AAC

1R = .\Samples\007 - gone_Nero.wav
2L = .\Samples\007 gone_qt_44.wav
3L = .\Samples\007 gone_qt_32.wav
4L = .\Samples\007 gone.wav.wav
5L = .\Samples\007 gone_CEP_dev.wav
6L = .\Samples\007 gone_CEP_LP11.wav

---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\007 - gone_Nero.wav
1R Rating: 1.9
1R Comment: piano part: 2.0 - warbeling, watery, ringing
guitar part: 1.8 - still annoyingly watery + ringing sound
---------------------------------------
2L File: .\Samples\007 gone_qt_44.wav
2L Rating: 2.8
2L Comment: piano part: 3.4 - background noise gone or bumping pre-echo-like
guitar part: 2.2 - ringing, watery sound almost as bad as 1
---------------------------------------
3L File: .\Samples\007 gone_qt_32.wav
3L Rating: 2.8
3L Comment: piano + guitar part: same as 2
---------------------------------------
4L File: .\Samples\007 gone.wav.wav
4L Rating: 2.9
4L Comment: piano part: 3.8 - same as 2,3 but noise completely removed; sounds like lower cutoff but cleaner, less annoying
guitar part: 2.0 - ringing, warbeling (a little better than 2,3) but low cutoff
---------------------------------------
5L File: .\Samples\007 gone_CEP_dev.wav
5L Rating: 3.0
5L Comment: piano part: same as 4
Guitar part: same as 4, ringing slightly less annoying
---------------------------------------
6L File: .\Samples\007 gone_CEP_LP11.wav
6L Rating: 2.3
6L Comment: piano part: 2.5 - cutoff noticable = lack of brightness, background noise; (slightly) ringing transients
guitar part: 1.9 - similar to 5,6; a bit more annoying.
---------------------------------------
ABX Results:

______________________________________

ABC/HR Version 0.9b, 30 August 2002
Testname: 009 mybloodrusts.sample20sec 64kbps mp3/AAC

1L = .\Samples\009 mybloodrusts.sample20sec_qt_32.wav
2L = .\Samples\009 mybloodrusts.sample20sec_qt_44.wav
3L = .\Samples\009 mybloodrusts.sample20sec_CEP_dev.wav
4R = .\Samples\009 mybloodrusts.sample20sec_CEP_LP11.wav
5L = .\Samples\009 mybloodrusts.sample20sec.wav.wav
6R = .\Samples\009 - mybloodrusts.sample20sec_Nero.wav

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\009 mybloodrusts.sample20sec_qt_32.wav
1L Rating: 3.5
1L Comment: 1st part (left ear mono): good, only slight chirping on right channel, probably not noticable without headphones
2nd part: warbeling cymbals; no obvious lowpass problem
---------------------------------------
2L File: .\Samples\009 mybloodrusts.sample20sec_qt_44.wav
2L Rating: 3.0
2L Comment: similar to 1 but both problems more obvious
---------------------------------------
3L File: .\Samples\009 mybloodrusts.sample20sec_CEP_dev.wav
3L Rating: 1.5
3L Comment: 1st part: severe stereo separation problem
2nd part: high pitched drums totally smeared, replaced by warbeling
---------------------------------------
4R File: .\Samples\009 mybloodrusts.sample20sec_CEP_LP11.wav
4R Rating: 1.3
4R Comment: same as 3; warbeling more "chirpy", therefore more audible; OTH less loss of brightness; slightly worse than 3 I'd say.
---------------------------------------
5L File: .\Samples\009 mybloodrusts.sample20sec.wav.wav
5L Rating: 2.5
5L Comment: stereo separation in 1st part best of all samples
warbeling/cutoff/smearing similar to 3,4 but less annoying
---------------------------------------
6R File: .\Samples\009 - mybloodrusts.sample20sec_Nero.wav
6R Rating: 1.0
6R Comment: 1st part: other kind of stereo separation problem than 3,4 but similar annoying
2nd part: most awful chirping/warbeling
---------------------------------------
ABX Results:


edit: wrong result posted for "007 gone" - corrected.
Title: Pre-Test thread
Post by: Ivan Dimkovic on 2003-08-28 13:54:34
Hmm.. for plain AAC (not HE AAC) I think that Ahead's 64 kbps CBR mode is better than the 'Tape' preset. Also, the encoder in the latest Nero6 is better.
Title: Pre-Test thread
Post by: AstralStorm on 2003-08-28 14:45:39
Weird... for me LAME Polonaise sample is just unlistenable - it is flangy^H^H^H^H^H^Hphasey because of the dropouts!
(It is ~61kbps. I'll try --alt-preset CBR to see if it fixes anything)
FhG samples were just somewhat ringy and preechoey.

/EDIT\ For me, --alt-preset cbr 64 is a bit better on this sample. \EDIT/
Title: Pre-Test thread
Post by: tigre on 2003-08-28 17:26:20
Quote
Hmm.. for plain AAC (not HE AAC) I think that Ahead's 64 kbps CBR mode is better than the 'Tape' preset. Also, the encoder in the latest Nero6 is better.

Thanks for the info. If I get this right, this means the Nero Mix demo I've downloaded ~1 week ago doesn't contain the latest Ahead AAC encoder - correct? So ... I'll give Nero 6 a try. (Hopefully I get it uninstalled afterwards without any trouble.)

EDIT: I've just downloaded the latest Nero6 and had a look at the extracted files after starting installation: aacenc32.dll is the same version (2.5.5.1) that I had already in my Ahead/Shared folder (HE AAC encoding also works with NeroMix). Is there a newer one I've missed? - I guess not, so I'll start encoding 64kbps CBR ...

____________________

I won't test mp3 anymore. Enough time spent on it. The decision lame vs. FHG is still hard IMO.
Lame:
- more problems with ringing, warbeling, etc.
- sometimes quite good, sometimes quite annoying
+ much better for extreme stereo (everything on 1 channel)
+ preserves more details on transients

FHG:
- smeared transients
- problems with extreme stereo (Overrepresented in the test samples?)
+ ringing, warbeling not that bad most of the time
+ somehow constant performance over a broad variety of music

I'd probably choose FHG at CEP's 64kbps default setting (22.05kHz sampling rate).
Title: Pre-Test thread
Post by: rjamorim on 2003-08-28 18:15:24
I guess the biggest issue in lame vs. FhG is how sensitive listeners are to stereo imaging. Lame will probably have a considerably better stereo image since it doesn't use IS at all.

Have you tried encoding with FhG with IS disabled?
Title: Pre-Test thread
Post by: tigre on 2003-08-28 18:26:49
Not yet. good idea ...
Title: Pre-Test thread
Post by: askoff on 2003-08-29 11:27:35
rjamorim: Didn't you say that you are looking at the quality of the encoders? So I think PNS should be on when encoding AAC-LC with Nero at this bitrate. Compatibility with portables should not be a consideration in this test.
Title: Pre-Test thread
Post by: tigre on 2003-08-29 11:37:37
Results

Settings used:

<samplename>.wav.wav
lame 3.90.3
--preset 64

<samplename>_CEP_is_narrow.wav
CEP2.1 fhg mp3 encoder
64kbps CBR, HQ, 22050Hz Samplingrate, default lowpass (10349Hz), joint stereo, intensity stereo, narrowing enabled

<samplename>_CEP_narrow.wav
CEP2.1 fhg mp3 encoder
64kbps CBR, HQ, 22050Hz Samplingrate, default lowpass (10349Hz), joint stereo, narrowing enabled, intensity stereo disabled

<samplename>_Nero.wav
NeroMix 1.4.0.4: Ahead AAC (aacenc32.dll v. 2.5.5.1)
VBR Tape; HQ; LC

<samplename>_Nero_cbr.wav
NeroMix 1.4.0.4: Ahead AAC (aacenc32.dll v. 2.5.5.1)
CBR 64kbps; HQ; LC

<samplename>_qt_44.wav
Quicktime Pro 6.3. AAC
64kbps CBR, Audio Track: "Music"; Stereo; Best Quality; Sample rate 44100Hz

Results:

Code: [Select]
---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\001 riteofspring_qt_44.wav
1R Rating: 3.8
1R Comment: small lack of brightness, sharpness in a few places
---------------------------------------
2L File: .\Samples\001 riteofspring_CEP_is_narrow.wav
2L Rating: 2.2
2L Comment: low cutoff, warbeling, watery, ringing, mostly on transients
---------------------------------------
3R File: .\Samples\001 riteofspring.wav.wav
3R Rating: 2.0
3R Comment: similar to 2, ringing sometimes more annoying
---------------------------------------
4L File: .\Samples\001 - riteofspring_Nero.wav
4L Rating: 3.0
4L Comment: warbeling, ringing comparable to 2, 3 but bright as 1, more sharpness
---------------------------------------
5L File: .\Samples\001 riteofspring_CEP_narrow.wav
5L Rating: 2.2
5L Comment: same as 2
---------------------------------------
6L File: .\Samples\001 - riteofspring_Nero_cbr.wav
6L Rating: 3.2
6L Comment: same as 4 but a little less obvious
---------------------------------------

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\002 - bigye_Nero_cbr.wav
1L Rating: 3.0
1L Comment: flanging, ringing, smeared transients (hihats)
---------------------------------------
2R File: .\Samples\002 bigye.wav.wav
2R Rating: 1.5
2R Comment: lowpass, ringing, warbeling
---------------------------------------
3L File: .\Samples\002 bigye_CEP_narrow.wav
3L Rating: 2.0
3L Comment: similar to 2 but less annoying
---------------------------------------
4R File: .\Samples\002 bigye_qt_44.wav
4R Rating: 3.8
4R Comment: similar to 1 but much better (-> ringing), OTOH less sharp
---------------------------------------
5L File: .\Samples\002 - bigye_Nero.wav
5L Rating: 3.6
5L Comment: similar to 1 but less annoying
---------------------------------------
6R File: .\Samples\002 bigye_CEP_is_narrow.wav
6R Rating: 2.0
6R Comment: same as 3
---------------------------------------

---------------------------------------
General Comments:
small annoyances with warbeling/flanging of different kinds with all samples; hard to judge which is worse (except for 2)
---------------------------------------
1R File: .\Samples\003 ChopinPolonaiseDMoll_CEP_narrow.wav
1R Rating: 2.5
1R Comment: Pre-echo, cut-off sometimes too low
---------------------------------------
2R File: .\Samples\003 - ChopinPolonaiseDMoll_Nero.wav
2R Rating: 1.5
2R Comment: warbeling, ringing, watery; metallic echos
---------------------------------------
3L File: .\Samples\003 - ChopinPolonaiseDMoll_Nero_cbr.wav
3L Rating: 3.5
3L Comment: same problems as 2 but much better; sometimes bumping low-pitched noise like sound added (15.8-19)
---------------------------------------
4L File: .\Samples\003 ChopinPolonaiseDMoll_CEP_is_narrow.wav
4L Rating: 2.5
4L Comment: same as 1
---------------------------------------
5R File: .\Samples\003 ChopinPolonaiseDMoll_qt_44.wav
5R Rating: 4.0
5R Comment: acceptable cutoff; not much ringing, pre-echo
---------------------------------------
6R File: .\Samples\003 ChopinPolonaiseDMoll.wav.wav
6R Rating: 3.0
6R Comment: similar to 1,4; less pre-echo
---------------------------------------

---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\004 Daft_Punk___Da_Funk_qt_44.wav
1R Rating: 3.8
1R Comment: lack of brightness, high-pitched drums smeared, some ringing/warbeling problems
---------------------------------------
2R File: .\Samples\004 Daft_Punk___Da_Funk_CEP_is_narrow.wav
2R Rating: 2.0
2R Comment: cutoff, smeared, warbeling, flanging, ringing
---------------------------------------
3R File: .\Samples\004 - Daft_Punk___Da_Funk_Nero.wav
3R Rating: 3.0
3R Comment: similar to 1, more ringing/flanging, highpitched drums + brightness a bit better; all in all worse
---------------------------------------
4L File: .\Samples\004 Daft_Punk___Da_Funk.wav.wav
4L Rating: 1.5
4L Comment: similar to 2, ringing more annoying
---------------------------------------
5L File: .\Samples\004 Daft_Punk___Da_Funk_CEP_narrow.wav
5L Rating: 2.0
5L Comment: same as 2
---------------------------------------
6L File: .\Samples\004 - Daft_Punk___Da_Funk_Nero_cbr.wav
6L Rating: 3.3
6L Comment: similar to 4 but more warbeling/dropouts; audible clipping at 5 sec.
---------------------------------------

---------------------------------------
General Comments:

---------------------------------------
1L File: .\Samples\007 gone_CEP_is_narrow.wav
1L Rating: 3.0
1L Comment: piano: cutoff, pre-echo, but not much annoying
guitar: cutoff, ringing , warbeling
---------------------------------------
2R File: .\Samples\007 gone_qt_44.wav
2R Rating: 2.5
2R Comment: piano: bright, but pumping noise, chirping, pre-echo
guitar: ringing, watery, metallic, brighter but more annoying than 1
---------------------------------------
3L File: .\Samples\007 - gone_Nero_cbr.wav
3L Rating: 2.3
3L Comment: piano: slightly warbeling noise, flanging
guitar: as 2 but more flanging
---------------------------------------
4L File: .\Samples\007 - gone_Nero.wav
4L Rating: 1.5
4L Comment: piano: 2.0 - warbeling, ringing, watery; lowpass
guitar: as 3 but even more flanging, ringing
---------------------------------------
5L File: .\Samples\007 gone.wav.wav
5L Rating: 2.8
5L Comment: same as 1, more ringing in guitar part
---------------------------------------
6R File: .\Samples\007 gone_CEP_narrow.wav
6R Rating: 3.0
6R Comment: same as 1
---------------------------------------

---------------------------------------
General Comments:

---------------------------------------
1R File: .\Samples\009 mybloodrusts.sample20sec_CEP_is_narrow.wav
1R Rating: 1.5
1R Comment: bad stereo problem in 1st part
2nd part: warbeling cymbals
---------------------------------------
2L File: .\Samples\009 mybloodrusts.sample20sec_qt_44.wav
2L Rating: 2.5
2L Comment: slight chirping in 1st part
2nd part strong chirping on cymbals
---------------------------------------
3L File: .\Samples\009 mybloodrusts.sample20sec.wav.wav
3L Rating: 2.5
3L Comment: lowpass, but best stereo reproduction
2nd: cutoff -> loss of details but flanging + ringing not that annoying
---------------------------------------
4R File: .\Samples\009 - mybloodrusts.sample20sec_Nero_cbr.wav
4R Rating: 1.5
4R Comment: worst stereo problem in 1st part
2nd part not as annoying as 6 but similar
---------------------------------------
5R File: .\Samples\009 mybloodrusts.sample20sec_CEP_narrow.wav
5R Rating: 2.4
5R Comment: 1st part: same as 3 but more flanging
2nd part similar to 3
---------------------------------------
6L File: .\Samples\009 - mybloodrusts.sample20sec_Nero.wav
6L Rating: 1.5
6L Comment: 1st part: bumping stereo problem, 2nd awful warbeling, ringing
---------------------------------------


Conclusions:

In the case where FHG with intensity stereo failed badly before, it improves a great deal without IS. Otherwise no general differences are detectable, so FHG without IS seems to be the best choice for 64kbps mp3.

Most of the time using CBR improves the performance of Nero AAC but Quicktime is still better.
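
For anyone who wants to tabulate the logs above rather than read them by eye, here is a rough sketch that pulls the `File:`/`Rating:` pairs out of pasted ABC/HR output and averages them per encoder. The suffix-to-encoder mapping is an assumption based on the settings list of this post (files from the earlier round, such as `_qt_32` or `_CEP_dev`, would land under "unknown"), and the function names are made up for illustration:

```python
import re
from collections import defaultdict

# Assumed mapping from file-name suffix to encoder, following the settings list.
# ".wav" comes last: it only catches lame's double extension (<sample>.wav.wav).
SUFFIXES = [
    ("_CEP_is_narrow", "FhG IS"),
    ("_CEP_narrow", "FhG no IS"),
    ("_Nero_cbr", "Nero CBR"),
    ("_Nero", "Nero VBR"),
    ("_qt_44", "Quicktime"),
    (".wav", "lame"),
]

FILE_RE = re.compile(r"^(\d[LR]) File: (.+)\.wav$")
RATING_RE = re.compile(r"^(\d[LR]) Rating: ([0-9.]+)$")


def encoder_of(stem):
    """Map a file name (final .wav already stripped) to an encoder label."""
    for suffix, name in SUFFIXES:
        if stem.endswith(suffix):
            return name
    return "unknown"


def mean_ratings(log_text):
    """Return {encoder: mean rating} for all File/Rating pairs found in log_text."""
    ratings = defaultdict(list)
    current = {}  # slot label such as "1R" -> encoder of the file in that slot
    for line in log_text.splitlines():
        line = line.strip()
        m = FILE_RE.match(line)
        if m:
            current[m.group(1)] = encoder_of(m.group(2))
            continue
        m = RATING_RE.match(line)
        if m and m.group(1) in current:
            ratings[current.pop(m.group(1))].append(float(m.group(2)))
    return {enc: sum(vals) / len(vals) for enc, vals in ratings.items()}


sample = r"""
1R File: .\Samples\001 riteofspring_qt_44.wav
1R Rating: 3.8
3R File: .\Samples\001 riteofspring.wav.wav
3R Rating: 2.0
6L File: .\Samples\001 - riteofspring_Nero_cbr.wav
6L Rating: 3.2
"""
print(mean_ratings(sample))
```

A plain mean over one listener's ratings is only a convenience for reading the logs; it is not the analysis planned for the public test.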
Title: Pre-Test thread
Post by: rjamorim on 2003-08-29 17:24:55
Quote
Conclusions:

In the case where FHG with intensity stereo failed badly before, it improves a great deal without IS. Otherwise no general differences are detectable, so FHG without IS seems to be the best choice for 64kbps mp3.

Most of the time using CBR improves the performance of Nero AAC but Quicktime is still better.

Great! Thanks a lot for your invaluable help.

So, here's an updated list of codecs to be featured:

-Nero HE AAC
-Ogg Vorbis
-MP3pro
-WMAv9 std
-Apple AAC LC
-Higher anchor: Lame --alt-preset 128
-FhG MP3 64kbps w/o IS

Real is on hold due to reasons out of my league. I'll know if it'll be featured or not by the weekend.
Title: Pre-Test thread
Post by: tigre on 2003-08-30 18:19:38
Quote
Quote
Hmm.. for plain AAC (not HE AAC) I think that Ahead's 64 kbps CBR mode is better than the 'Tape' preset. Also, the encoder in the latest Nero6 is better.

Thanks for the info. If I get this right, this means the Nero Mix demo I've downloaded ~1 week ago doesn't contain the latest Ahead AAC encoder - correct? So ... I'll give Nero 6 a try. (Hopefully I get it uninstalled afterwards without any trouble.)

EDIT: I've just downloaded the latest Nero6 and had a look at the extracted files after starting installation: aacenc32.dll is the same version (2.5.5.1) that I had already in my Ahead/Shared folder (HE AAC encoding also works with NeroMix). Is there a newer one I've missed? - I guess not, so I'll start encoding 64kbps CBR ...

Some news: NeroMediaPlayer, available for download since 08/29/2003, includes a new aacenc32.dll (v. 2.5.5.2). Apart from this problem (http://www.hydrogenaudio.org/forums/index.php?showtopic=12753&st=0&#entry129761), in a quick test it sounded better than v. 2.5.5.1 (I haven't compared it to Quicktime so far). Hopefully I'll be able to repeat the AAC part of the test this weekend ... The improvement, especially with the worst sounding piano sample, was quite big, so maybe we'll get a photo-finish.  B)
Title: Pre-Test thread
Post by: tigre on 2003-08-30 23:41:13
Test with newest Ahead aacenc32.dll v. 2.5.5.2
same settings as before.

Results:

Code: [Select]
---------------------------------------
1R File: .\Samples\003 ChopinPolonaiseDMoll_Nero_vbr_new.wav
1R Rating: 3.0
1R Comment: lower cutoff, ringing and warbeling in some places (e.g. 14-18), bumping echos
---------------------------------------
2R File: .\Samples\003 ChopinPolonaiseDMoll_Nero_cbr_new.wav
2R Rating: 3.5
2R Comment: high tones sound a bit like another instrument, problem with transients
---------------------------------------
3R File: .\Samples\003 ChopinPolonaiseDMoll_qt_44.wav
3R Rating: 3.7
3R Comment: small lack of brightness compared to 2, but less other problems.
---------------------------------------

---------------------------------------
1L File: .\Samples\005 Enola_Gay_qt_44.wav
1L Rating: 3.5
1L Comment: brightness, smeared transients, a little bit of flanging, pre-echo
---------------------------------------
2R File: .\Samples\005 Enola_Gay_Nero_vbr_new.wav
2R Rating: 3.0
2R Comment: less bright than 1, less flanging, a little bit less pre-echo at some places
---------------------------------------
3L File: .\Samples\005 Enola_Gay_Nero_cbr_new.wav
3L Rating: 2.8
3L Comment: same as 3, brighter but serious stereo problem at 1 sec., warbeling
---------------------------------------

---------------------------------------
1L File: .\Samples\006 experiencia_Nero_cbr_new.wav
1L Rating: 3.6
1L Comment: percussion smeared, trumpets suffer from cutoff
---------------------------------------
2L File: .\Samples\006 experiencia_Nero_vbr_new.wav
2L Rating: 2.5
2L Comment: similar to 1; cutoff more noticable, ringing
---------------------------------------
3R File: .\Samples\006 experiencia_qt_44.wav
3R Rating: 3.5
3R Comment: same as 1, trumpets a little bit worse
---------------------------------------

---------------------------------------
1L File: .\Samples\007 gone_Nero_vbr_new.wav
1L Rating: 2.0
1L Comment: 1st part: 2.0 cutoff, warbeling, pre-echo
2nd part: 2.0 warbeling, ringing, flanging, watery
---------------------------------------
2L File: .\Samples\007 gone_Nero_cbr_new.wav
2L Rating: 3.3
2L Comment: 1st part: 3.5 high pitched noise partially removed + bumping, slightly flanging in some places
2nd part: 3.0 ringing, flanging, cymbals smeared
---------------------------------------
3L File: .\Samples\007 gone_qt_44.wav
3L Rating: 3.5
3L Comment: 1st part: 4 high pitched noise partially removed
2nd part: same as 2
---------------------------------------

---------------------------------------
1L File: .\Samples\009 mybloodrusts.sample20sec_Nero_vbr_new.wav
1L Rating: 2.3
1L Comment: 1st part: 2.5 stereo problem
2nd part: 2: loud ringing
---------------------------------------
2R File: .\Samples\009 mybloodrusts.sample20sec_qt_44.wav
2R Rating: 2.5
2R Comment: 1st part: 3.5 small stereo problem (chirping)
2nd part: 1.5 constant chirping, ringing
---------------------------------------
3R File: .\Samples\009 mybloodrusts.sample20sec_Nero_cbr_new.wav
3R Rating: 2.3
3R Comment: 1st part: 1.5 annoying stereo problem
2nd part: 3 guitar+cymbals smeared but no annoying ringing
---------------------------------------

---------------------------------------
1R File: .\Samples\011 Scars_Nero_cbr_new.wav
1R Rating: 3.8
1R Comment: percussion smeared
---------------------------------------
2R File: .\Samples\011 Scars_qt_44.wav
2R Rating: 3.8
2R Comment: same as 1
---------------------------------------
3R File: .\Samples\011 Scars_Nero_vbr_new.wav
3R Rating: 2.5
3R Comment: percussion smeared, lowpass, ringing, warbeling
---------------------------------------

---------------------------------------
1L File: .\Samples\012 Waiting_Nero_vbr_new.wav
1L Rating: 2.0
1L Comment: 1st part: stereo separation OK but annoying ringing
2nd part: smeared, flanging, ringing cymbals
---------------------------------------
2R File: .\Samples\012 Waiting_qt_44.wav
2R Rating: 3.0
2R Comment: 1st part: 3 same as 1 but better
2nd part: similar as 1 but less lowpass
---------------------------------------
3L File: .\Samples\012 Waiting_Nero_cbr_new.wav
3L Rating: 3.2
3L Comment: 1st part: 3.2 same as 2 but slightly better
2nd part: as 2
---------------------------------------


Conclusion:
Ahead aacenc CBR and Quicktime would be about equal if it weren't for the stereo issues. As there are at least two severe problem cases for Ahead aacenc among the samples chosen, Quicktime is the only option.
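
As a rough cross-check of this conclusion, the overall ratings of the seven samples above can be averaged per encoder. The numbers are transcribed from the log; the labels and the helper name `mean_by_encoder` are only illustrative, and a plain mean over one listener's ratings says nothing about significance:

```python
from statistics import mean

# Overall ratings transcribed from the log above, in sample order:
# 003 Chopin, 005 Enola_Gay, 006 experiencia, 007 gone,
# 009 mybloodrusts, 011 Scars, 012 Waiting
ratings = {
    "Nero VBR (2.5.5.2)": [3.0, 3.0, 2.5, 2.0, 2.3, 2.5, 2.0],
    "Nero CBR (2.5.5.2)": [3.5, 2.8, 3.6, 3.3, 2.3, 3.8, 3.2],
    "Quicktime 44.1kHz": [3.7, 3.5, 3.5, 3.5, 2.5, 3.8, 3.0],
}


def mean_by_encoder(table):
    """Return the mean rating per encoder, rounded to two decimals."""
    return {name: round(mean(values), 2) for name, values in table.items()}


for name, avg in mean_by_encoder(ratings).items():
    print(f"{name}: {avg:.2f}")
# Nero VBR (2.5.5.2): 2.47
# Nero CBR (2.5.5.2): 3.21
# Quicktime 44.1kHz: 3.36
```

On these numbers CBR and Quicktime are close, with VBR clearly behind, which fits the impression from the individual comments.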
Title: Pre-Test thread
Post by: rjamorim on 2003-08-31 00:02:45
Great! Thanks for the information, and for taking the time to perform these tests
Title: Pre-Test thread
Post by: bond on 2003-08-31 18:17:24
Quote
So, here's an updated list of codecs to be featured:

-Nero HE AAC
-Ogg Vorbis
-MP3pro
-WMAv9 std
-Apple AAC LC
-Higher anchor: Lame --alt-preset 128
-FhG MP3 64kbps w/o IS

Real is on hold due to reasons out of my league. I'll know if it'll be featured or not by the weekend.

imho i would test:
- Nero HE AAC
- Ogg Vorbis
- MP3pro
- WMAv9 std
- RealCook (hey, its the direct competitor to wma and used very often!)

- anchor: Lame --alt-preset 128

-> 6 codecs, nice amount for testing, interesting results also for "the masses"...
Title: Pre-Test thread
Post by: askoff on 2003-08-31 21:32:57
Quote
imho i would test:
- Nero HE AAC
- Ogg Vorbis
- MP3pro
- WMAv9 std
- RealCook (hey, its the direct competitor to wma and used very often!)

- anchor: Lame --alt-preset 128

-> 6 codecs, nice amount for testing, interesting results also for "the masses"...

That list looks a bit better to me too. I think I would also add Apple AAC LC to it.
Title: Pre-Test thread
Post by: rjamorim on 2003-08-31 21:58:18
I was thinking of creating a poll offering people several options for a codec suite.

But then, I would have to postpone the test for at least a week.

Decisions, decisions... >_<
Title: Pre-Test thread
Post by: askoff on 2003-09-01 09:37:18
Quote
I was thinking of creating a poll offering people several options for a codec suite.

But then, I would have to postpone the test for at least a week.

The poll idea would be great. More happy people==less whining.
I hope we are in no hurry with this test. I remember many stupid Finnish proverbs about doing something in a hurry...
Title: Pre-Test thread
Post by: bond on 2003-09-02 10:52:09
just an info:

according to JohnV vorbis 1.0.1 "is expected in few days actually"
Title: Pre-Test thread
Post by: JohnV on 2003-09-02 10:59:44
Quote
just an info:

according to JohnV vorbis 1.0.1 "is expected in few days actually"

Well.. according to #vorbis irc-channel topic actually..
Title: Pre-Test thread
Post by: rjamorim on 2003-09-02 11:05:24
OK.. So, the million dollar question is: Will it make any difference at my test? Is low bitrate tweaking planned for this release?
Title: Pre-Test thread
Post by: bond on 2003-09-02 11:09:47
no quality tweaks, just bug fixes as far as i know, but i think you can be sure that people will start whining if you dont use it
Title: Pre-Test thread
Post by: JohnV on 2003-09-02 11:18:25
Quote
OK.. So, the million dollar question is: Will it make any difference at my test? Is low bitrate tweaking planned for this release?

Quote
[05:29] <xiphmont> 1.0.1
[05:29] <xiphmont> all the CVS fixes that have been building up, as well as bugs reported in other forums.
[05:29] <xiphmont> ...as it turns out there will be a few tuning fixes there too.

So.. few tuning fixes.. Hard to say if those would make any real difference though.
Title: Pre-Test thread
Post by: rjamorim on 2003-09-02 11:41:58
Quote
but i think you can be sure that people will start whining if you dont use it

Right... :B

OK, test start is postponed for one week. New expected dates are Sept 10th to 21st.
Title: Pre-Test thread
Post by: bond on 2003-09-02 13:03:31
more info about 1.0.1 here (http://wiki.xiph.org/Release101)
Title: Pre-Test thread
Post by: dev0 on 2003-09-02 13:53:40
Quote
Improved handling of quiet signals in low bitrate modes


This could be quite significant for the test, so waiting is the right decision.

dev0
Title: Pre-Test thread
Post by: phong on 2003-09-02 21:26:34
This (http://www.hydrogenaudio.org/forums/index.php?showtopic=7197&hl=) is a relevant thread to the vorbis fix.
Title: Pre-Test thread
Post by: ff123 on 2003-09-05 00:32:22
Quote
ABC/HR 0.9b doesn't save the results when I try to setup a new test.
I get a blank results file with the title of the new test!
I had to do the test twice (except Waiting) because of the bug.

Bug noted and added to the short list.  I will try to release a new version soon which obscures results (not for this test though).

[998 spams/virii in 18 days of vacation, including about a dozen real email].

ff123
Title: Pre-Test thread
Post by: rjamorim on 2003-09-05 00:55:49
Welcome back, master!
Title: Pre-Test thread
Post by: rjamorim on 2003-09-07 21:34:08
OK, the test will start this Wednesday, no matter if Vorbis 1.0.1 is released or not.

So, I'll lay out the general headlines of what will be tested:

Samples: The ones mentioned at the first post in this thread:
http://www.hydrogenaudio.org/show.php/showtopic/12358 (http://www.hydrogenaudio.org/show.php/showtopic/12358)

The samples suite is now frozen and won't change.

The encoders that are planned to be featured are:

HE AAC from Nero 6.0.0.15, Streaming :: Medium, High Quality
Ogg Vorbis 1.0.1 or post-1.0 CVS, -q 0
MP3pro from Adobe Audition 1.0, quality 40, Current codec, allow M/S and IS, allow narrowing, no CRC
Real Audio Gecko/Cook 64kbps from Real Producer 9.0.1
LC AAC from QuickTime 6.3, Best Quality
WMA v9 VBR quality 50
High anchor: Lame 3.90.3 --alt-preset 128
Low anchor: MP3 from Adobe Audition 1.0 (FhG), 64kbps CBR, Current codec, allow M/S, no I/S, allow narrowing.

About resampling: I won't resample anything prior to encoding, but if the encoder resamples by default on that specific setting, I won't force it to use another sample rate. Therefore, for instance, FhG MP3 will end up at 22050 Hz, since that's the default in Audition.


Now, something important that MIGHT happen: I'm talking to a codec developer, and tomorrow I'll receive his reply if they want me to test their codec or not. IMO, that codec is very worth testing, as from what it seems, it might well be the winner, or be among the winners. I can't speak more about it now because I still have no answer from him.

So, if this developer gives me the thumbs up, what codec you guys think should be replaced by it? I'm leaning towards LC AAC or MP3, but it's up to you. Any opinions?

Thanks.

Regards;

Roberto.
Title: Pre-Test thread
Post by: askoff on 2003-09-07 22:08:06
I don't think that MP3 at 64kbps is needed, because it will surely be the loser of all these codecs.
Title: Pre-Test thread
Post by: elmar3rd on 2003-09-07 22:10:09
Quote
IMO, that codec is very worth testing, as from what it seems, it might well be the winner, or be among the winners. I can't speak more about it now because I still have no answer from him.

So, if this developer gives me the thumbs up, what codec you guys think should be replaced by it? I'm leaning towards LC AAC or MP3, but it's up to you. Any opinions?

Is it related to the rumors over mp3 with mpc-psymodel?  Whatever, i would replace one of the anchors, as the comparison of the codecs is much more important than anchors, imho.
Title: Pre-Test thread
Post by: Dibrom on 2003-09-07 22:22:17
Quote
Is it related to the rumors over mp3 with mpc-psymodel?

I very seriously doubt it.  I also doubt that this codec (the mpc/mp3 thing) will ever see the light of day in the capacity that it was originally rumored it might.
Title: Pre-Test thread
Post by: rjamorim on 2003-09-07 22:23:56
Quote
Is it related to the rumors over mp3 with mpc-psymodel?

Nonono. That project is most probably dead.

And, even using MPC psymodel, there is no way MP3 could be "among the winners", like I said.
Title: Pre-Test thread
Post by: tigre on 2003-09-07 22:47:18
Although I have invested some time in it, I'd recommend replacing mp3@64kbps. Of the codecs I've tested @64kbps recently, it has the lowest lowpass, which will be the most obvious difference to notice with the majority of samples. The other codecs tested will have most problems with "real" encoding artifacts (ringing, pre-echo, changed background noise, etc), so this lower anchor probably wouldn't be worth that much anyway.
Title: Pre-Test thread
Post by: brett on 2003-09-08 02:10:08
mp3 = part of the past.

lc aac = part of the future.

(and i thought you agreed that qt's performance in the last two tests (a win and a strong second with the vbr millstone) logically demanded its inclusion in this test. no?)
Title: Pre-Test thread
Post by: Hyrok on 2003-09-08 18:30:52
I would suggest to use the nero profile Portable 50-70kbps HE-AAC HQ instead of Streaming for the test. Average bitrate is always around 64kbps, mostly even lower, and the quality is surely better than 64kbps cbr (at least to my ears)
Title: Pre-Test thread
Post by: rjamorim on 2003-09-08 20:52:17
Quote
I would suggest to use the nero profile Portable 50-70kbps HE-AAC HQ instead of Streaming for the test.

Dude, where the heck did you see a "Portable" profile in Nero? 
Title: Pre-Test thread
Post by: Hyrok on 2003-09-08 21:09:28
Argh, now i got it! Preset "VBR/Stereo - Portable, 50-70 Kb/s (HE-AAC)" is identical to the standard option (in this case "streaming"). Sorry, my mistake^^;
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 02:07:14
Oh, well. The developer I talked about declined to participate in this test. The codec suite and settings will be the ones I mentioned earlier. Test starts next Wednesday (09-10).

Looking forward to your participation.

Roberto.
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 03:06:47
OK, the sample packages are ready, except for Vorbis, which, hopefully, will be released by tomorrow night (I can't wait more than that)

Unfortunately, I can't postpone this test anymore. It would mean the test would start at 09-17 and end at 09-28, and I'm going to travel on that sunday.

So, let's hope Xiph releases Vorbis 1.0.1 soon.

On another issue, could someone please clarify what is the overhead introduced by the Ogg container? I'm trying to find out the container overhead for the formats I'm featuring and so far I came up with this:

-MP3: Containerless
-Real Audio: Does it even use a container?
-WMA: ~6Kbps
-MP4: Negative (!), since the ADTS headers are ripped prior to multiplexing
-Ogg: ?

Thanks for any info;

Roberto.
Title: Pre-Test thread
Post by: danielperez on 2003-09-09 03:24:32
[spam removed by moderation]
Title: Pre-Test thread
Post by: music_man_mpc on 2003-09-09 03:29:32
@danielperez

Do me and the rest of the community a favor by NEVER EVER POSTING HERE AGAIN.  Thanx.       
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 03:29:54
Jesus Jumping Christ 
Title: Pre-Test thread
Post by: Cygnus X1 on 2003-09-09 03:30:26
PLEASE tell me that my eyes are playing tricks on me, and that a few posts before mine is NOT an 80-page pyramid scheme?  My, how times have changed at HA.
Title: Pre-Test thread
Post by: music_man_mpc on 2003-09-09 03:30:37
Quote
Jesus Jumping Christ 

Indeed.
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 03:34:35
why, oh, why in MY test thread? >_<

Now people won't notice the info request I posted at my last post (just before the 80-page spam)
Title: Pre-Test thread
Post by: music_man_mpc on 2003-09-09 03:46:19
Back, not quite on but slightly closer to the topic for a second.  How could the MP4 container size be negative Roberto?  I can see it being negligibly small, but how could it be possible for it to shrink the filesize?
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 03:51:16
Quote
Back, not quite on but slightly closer to the topic for a second.  How could the MP4 container size be negative Roberto?  I can see it being negligibly small, but how could it be possible for it to shrink the filesize?

Because it strips all the headers from the AAC files prior to multiplexing them.

There's a header (either ADIF or ADTS) in every AAC stream with information about sampling rate, bit depth, number of channels, AAC profile, etc.

It removes all these headers (making the file a RAW AAC), multiplexes it in the MP4 container and stores the stream information in the MP4 header. That's why it's slightly smaller.

Edit: Do the test for yourself: Encode a track with Psytel AACenc or FAAC, wrap it in MP4 using MP4ui or MP4creator, and compare file sizes.
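
To put a rough number on the header saving described above, here is a minimal sketch. It assumes ADTS framing (not ADIF), a 7-byte ADTS header per frame (9 bytes when a CRC is present), 1024 samples per AAC frame, and a 44100 Hz LC AAC stream; the function name is made up for illustration. The MP4 container adds its own index tables, so the net saving seen when comparing file sizes will probably be smaller than this figure.

```python
# Rough estimate of the bitrate spent on per-frame ADTS headers.
# Assumptions: ADTS framing, 7-byte header without CRC, 1024 samples per frame.
ADTS_HEADER_BYTES = 7
SAMPLES_PER_FRAME = 1024


def adts_overhead_kbps(sample_rate_hz, header_bytes=ADTS_HEADER_BYTES):
    """Return the bitrate (in kbps) consumed by ADTS frame headers."""
    frames_per_second = sample_rate_hz / SAMPLES_PER_FRAME
    return frames_per_second * header_bytes * 8 / 1000.0


overhead = adts_overhead_kbps(44100)
print(f"ADTS header overhead at 44100 Hz: {overhead:.2f} kbps")
print(f"Share of a 64 kbps stream: {overhead / 64 * 100:.1f}%")
```

Under these assumptions the headers cost roughly 2.4 kbps, which is the part that disappears when the stream is stored as raw AAC inside MP4.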

Regards;

Roberto.
Title: Pre-Test thread
Post by: pseudoacoustic on 2003-09-09 03:59:00
Quote
PLEASE tell me that my eyes are playing tricks on me, and that a few posts before mine is NOT an 80-page pyramid scheme?   My, how times have changed at HA.

It's not even a well-written pyramid scheme 

Quote
PLEASE MAKE SURE you write every name & address ACURETLY. This is critical to YOUR success.


This is just sad 
Title: Pre-Test thread
Post by: Peter on 2003-09-09 06:07:09
*yawn* spam deleted, account banned, have a nice and productive day.
Title: Pre-Test thread
Post by: Ivan Dimkovic on 2003-09-09 08:51:01
Quote
Because it strips all the headers from the AAC files prior to multiplexing them.

There's a header (either ADIF or ADTS) in every AAC stream with information about sampling rate, bit depth, number of channels, AAC profile, etc.

It removes all these headers (making the file a RAW AAC), multiplexes it in the MP4 container and stores the stream information in the MP4 header. That's why it's slightly smaller.

Edit: Do the test for yourself: Encode a track with Psytel AACenc or FAAC, wrap it in MP4 using MP4ui or MP4creator, and compare file sizes.

Hmm - someone should implement converting of ADIF AAC files to MP4 in MP4Creator (if this is not done already)

Encoding with ADIF header (psytel, faac) won't generate frame-headers and still there will be enough information to build the MP4 file, and therefore there would be no undercoding/bit reservoir issues.
Title: Pre-Test thread
Post by: c_haese on 2003-09-09 15:14:02
Quote
OK, the sample packages are ready, except for Vorbis, which, hopefully, will be released by tomorrow night (I can't wait more than that)

Unfortunately, I can't postpone this test anymore. It would mean the test would start at 09-17 and end at 09-28, and I'm going to travel on that sunday.

So, let's hope Xiph releases Vorbis 1.0.1 soon.

On another issue, could someone please clarify what is the overhead introduced by the Ogg container? I'm trying to find out the container overhead for the formats I'm featuring and so far I came up with this:

-MP3: Containerless
-Real Audio: Does it even use a container?
-WMA: ~6Kbps
-MP4: Negative (!), since the ADTS headers are ripped prior to multiplexing
-Ogg: ?

Thanks for any info;

Roberto.

If you have a recent CVS snapshot (i.e. from after September 2), that is 1.0.1 as far as encoding quality is concerned. AFAIK, the 1.0.1 release is currently only being held back due to Win32 build problems.

Regarding the Ogg container overhead, http://www.xiph.org/ogg/vorbis/doc/framing.html (http://www.xiph.org/ogg/vorbis/doc/framing.html) says that the overhead shouldn't be more than 2% of the bandwidth. If this is correct (but I don't know if it is), that would give you 1.3kbps overhead on a 64kbps stream.
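
The arithmetic behind that figure, as a small sketch: it simply treats the 2% from the framing document as an upper bound on the fraction of bandwidth used by the container, which is an assumption rather than a measured value. The function name is hypothetical.

```python
def container_overhead_kbps(stream_kbps, overhead_fraction):
    """Bitrate (in kbps) used by the container, given its share of the stream."""
    return stream_kbps * overhead_fraction


# Assumed upper bound: 2% of bandwidth, applied to a 64 kbps stream.
ogg_upper_bound = container_overhead_kbps(64, 0.02)
print(f"Ogg overhead at 64 kbps: at most about {ogg_upper_bound:.2f} kbps")
```

That gives 1.28 kbps, i.e. the roughly 1.3kbps quoted above.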

Hope this helps,

Carsten Haese
Ogg Traffic Editor, Xiph.org Foundation
Title: Pre-Test thread
Post by: dev0 on 2003-09-09 15:17:14
John33 shall build new Vorbis binaries then...
;)
Title: Pre-Test thread
Post by: rjamorim on 2003-09-09 15:51:36
Quote
Hmm - someone should implement converting of ADIF AAC files to MP4 in MP4Creator (if this is not done already)

Encoding with ADIF header (psytel, faac) won't generate frame-headers and still there will be enough information to build the MP4 file, and therefore there would be no undercoding/bit reservoir issues.

Well, one can always use ADIF2MP4 in that case... (given you know where to find it, of course  )

Quote
If you have a recent CVS snapshot (i.e. from after September 2), that is 1.0.1 as far as encoding quality is concerned. AFAIK, the 1.0.1 release is currently only being held back due to Win32 build problems.

Regarding the Ogg container overhead, http://www.xiph.org/ogg/vorbis/doc/framing.html (http://www.xiph.org/ogg/vorbis/doc/framing.html) says that the overhead shouldn't be more than 2% of the bandwidth. If this is correct (but I don't know if it is), that would give you 1.3kbps overhead on a 64kbps stream.


Interesting. Thanks a lot for the info.

Quote
John33 shall build new Vorbis binaries then...


Well, I believe these ones are already up-to-date then:
http://www.hydrogenaudio.org/forums/index....topic=13019&hl= (http://www.hydrogenaudio.org/forums/index.php?showtopic=13019&hl=)
(Dated September 2...)

Regards;

Roberto.
Title: Pre-Test thread
Post by: rjamorim on 2003-09-10 07:21:12
OK. This discussion is now officially closed. The sample packages are already done and uploaded to a temporary server (where Spoon will grab them and move to the definitive server - thanks Spoon  ) and the batch files are ready as well as the test setup files.

Vorbis 1.0.1 didn't make it, it only made me lose one week. Shows you shouldn't rely on Xiph for deadlines.

One piece of bad news is that the sample packages now add up to 126 MB. Sorry about that; blame it on the formats that force me to distribute them as lossless files because they lack a command-line decoder (WMA, MP3pro and Real Audio). It shouldn't be a hassle for broadband users, but people on dial-up might need to leave the files downloading overnight. The average package size is 10 MB.

Thanks for all the support you guys gave me in this thread. It was hugely valuable.

The test starts tomorrow, sometime in the (Brazilian) afternoon.

Looking forward to your participation.

Best regards;

Roberto.