AAC at 128kbps v2 listening test - FINISHED

Reply #100 – 2004-03-02 01:47:58

Quote

Quote
As a content creator intensely familiar with a variety of media standards including AES, NTSC, ISO, ITU-R/CCIR, etc. I believe MPEG-4 w/AAC (not Quicktime MPEG-4, mind you, but straight MPEG-4) is the superlative format for compressed audiovisual media. However, for critical listening, only uncompressed audio is the way to go.

Since this Slashdot poster wrote "MPEG-4 w/AAC" and "audiovisual media", I'm pretty sure he is in fact referring to (and criticizing) Quicktime's implementation of MPEG-4 video, not audio.

As for his comment regarding uncompressed media, I'm pretty sure it's not "the only way to go" (regardless of how discerning he might be), given the existence of lossless audio compression.

Reply #101 – 2004-03-02 02:11:40

Quote

Quote
(guruboolez @ Mar 1 2004, 06:05 PM)

The current problem doesn't lie in overall performance, but in occasional artifacts. Faac, Real and Compaact still have serious issues.

I agree with you.

Me too

Quote

(Mar 2 2004, 05:56 AM) As you already said, there is no doubt that at 128 kbps the best AAC implementation is superior to the best MP3 implementation. The next multiformat test will confirm what we already know. So I understand your desire to see the Ahead AAC codec, but I'm afraid this might not happen for lack of space. Maybe we can exclude iTunes: I see no point in comparing again two codecs (LAME and iTunes) that, quality-wise, have remained almost the same since the last multiformat test.

I don't agree.

The point of the multi-format test is to compare the best of MP3 with the best of AAC with the best of Vorbis, etc.

Well, that was my understanding.

Why put Ahead/Nero AAC into the test if it lost to iTunes/QT AAC in both previous tests?

I don't see Fhg being used instead of LAME, even though Fhg lost to LAME in previous tests... we've all seen LAME in so many different audio tests after all...

(Apologies to any Ahead/Nero employees/fans)

Reply #102 – 2004-03-02 02:40:52

Quote

Quote
My question is, what is this "straight MPEG-4 AAC" that this guy is talking about?

Excuse my french, but this guy is "plein de merde".

Lol, nice one rjamorim! You might even say, "il est aussi utile qu'un frein à main sur un canoe" ("he's as useful as a handbrake on a canoe"). Ah, the French are so much more creative with insults than us.

Reply #103 – 2004-03-02 02:59:36

The winner of a listening test is, of course, the right one to contend in the multiformat test.

Choosing one of the "losing" codecs to represent a format in a multiformat test sounds insane. That's like getting the 4th-ranked sprinter to run in the Olympics instead of the best, because we all know the 1st-ranked sprinter would just win again, and again, and again...

A stupid comparison, maybe, but it still makes sense somehow (I hope).

Reply #104 – 2004-03-02 03:29:59

i want xing to participate

Reply #105 – 2004-03-02 03:38:29

Quote

i want xing to participate

Yep, just to see if it can pull off another surprise.

Reply #106 – 2004-03-02 03:54:54

Why not TAC? I bet my good friend KM would be very pleased.

Reply #107 – 2004-03-02 08:16:47

Quote

"il est aussi utile qu'un frein à main sur un canoe."

Funny one. (this sounds more like Canadian French than "France French")

Reply #108 – 2004-03-02 13:50:13

Quote

Why put Ahead/Nero AAC into the test if it lost to iTunes/QT AAC in both previous tests?

I also don't get that.

I clearly want to know how the best AAC implementation does compared to WMA9. That's simply the upcoming audio codec "war", and AAC should be represented as well as possible!
Also, this comparison can be shown to consumers to help them decide, quality-wise, whether they should choose iTunes or a WMA music store.

To make it short: QT should be used in any case!
(I also can't imagine a reason that speaks against using QT.)

Reply #109 – 2004-03-02 14:33:02

I would like to ask a few questions about the methodology (I'm a newbie when it comes to audio compression, so my apologies if this is something stupid).

- As far as I can understand there are two sources of uncertainties in such a test:
1) The spread of grades given to a particular music sample because people rate them differently, background noise, etc.
2) The spread in each implementation's efficiency (i.e. bitrate distribution, modulation, etc.) for each music sample.

How are those two dealt with in the final number? Just treat 1) and 2) as statistical errors and "sum and divide by N (or sqrt{N})"?

- I don't want to sound mighty and lofty, but whoever raised the point that the error bars overlap too much to call codec A "the winner" is right. Since people around here seem to believe that codec A really is the winner of the listening test, it makes me think that the error bars are overestimated (this is a common "mistake" when one treats uncertainties as uncorrelated; it has more to do with measurement theory than statistics). If codec A really sounds better than codec B in most cases, then the error bars as they stand don't tell us the whole story.

Like I said before, I'm a complete n00b in audio compression (although I have a little experience in experimental physics), so forgive me if the point I raised is idiotic, although I would like to know why.

Ah, last but not least: thanks to the rarewares admins for the wonderful page, and special kudos to everybody who participated in the test.
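
The point about correlated uncertainties can be illustrated with a small sketch. The ratings below are made up for two hypothetical codecs (they are not data from this test); the only assumption is that both codecs were rated on the same samples, so hard samples pull both scores down together. Treating the two sets of scores as independent gives wide error bars, while looking at the per-sample differences gives a much tighter one.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up per-sample mean ratings for two hypothetical codecs,
# listed in the same sample order (NOT results from this test).
codec_a = [4.6, 3.9, 4.3, 3.2, 4.8, 3.6]
codec_b = [4.3, 3.7, 4.0, 3.0, 4.5, 3.3]


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return stdev(values) / sqrt(len(values))


# Treating the codecs as independent: combine both standard errors.
se_independent = sqrt(standard_error(codec_a) ** 2 + standard_error(codec_b) ** 2)

# Treating them as paired: work on the per-sample differences instead.
diffs = [a - b for a, b in zip(codec_a, codec_b)]
mean_diff = mean(diffs)
se_paired = standard_error(diffs)

print(f"mean difference:         {mean_diff:.3f}")
print(f"SE (independent codecs): {se_independent:.3f}")
print(f"SE (paired differences): {se_paired:.3f}")
```

With these invented numbers the mean difference is smaller than the "independent" standard error (the bars overlap), yet it is many times larger than the paired one, which is the situation described above.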

Reply #110 – 2004-03-02 22:04:29

Quote

- As far as I can understand there are two sources of uncertainties in such a test:
1) The spread of grades given to a particular music sample because people rate them differently, background noise, etc.
2) The spread in each implementation's efficiency (i.e. bitrate distribution, modulation, etc.) for each music sample.

"1)" is a known variable that is dealt with by qualifying the statement of the results: "Codec A had the highest average rating of all the samples tested." To simply say that "Codec A is the best" would be wrong, as the claim isn't qualified.

"2)" is more quantifiable, and there is a demand for resolving this question, as I hear many people ask "Which codec produces the best sound quality at the lowest bitrate, or (synonymously) with the smallest filesizes?"

I've proposed a "composite rating system" in the past, which produced some debate but was never "replaced" with a more effective system. My idea was...

( [average rating] / [actual average bitrate] ) x [target nominal bitrate]

For instance, in this test, if iTunes rated an average of 4.20 and had an actual average bitrate of 128kbps, then its Composite Rating would be...

( 4.20 / 128 ) x 128 = 4.20

Nero had an average rating of 4.04, and (I believe) an actual average bitrate of 141kbps, so it could be said to have a Composite Rating of...

(4.04 / 141 ) x 128 = ~3.67

This is not a perfect "system", I'm sure, but it at least attempts to address the disparity in average bitrates (and in turn, resulting filesizes) of the encoded sample tracks. Encoded filesize (resulting from bitrate) is a matter of great importance to many people who encode music with a limited amount of HD space.

But, of course, how relevant this system would be when used to consider filesizes of encoded music other than the sample tracks is an issue that relates back to "1)".
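
The composite rating arithmetic above can be written as a short sketch. The function and parameter names are invented for illustration; the numbers are the ones quoted in this post (the 141kbps figure for Nero is the poster's own recollection).

```python
def composite_rating(average_rating, actual_bitrate_kbps, target_bitrate_kbps=128):
    """([average rating] / [actual average bitrate]) x [target nominal bitrate]."""
    return average_rating / actual_bitrate_kbps * target_bitrate_kbps


itunes = composite_rating(4.20, 128)
nero = composite_rating(4.04, 141)

print(f"iTunes: {itunes:.2f}")  # iTunes: 4.20
print(f"Nero:   {nero:.2f}")    # Nero:   3.67
```

Note that the formula scales the rating linearly with bitrate, which is an assumption of the proposal rather than something the test itself establishes.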

Reply #111 – 2004-03-02 22:24:05

Quote

( 4.20 / 128 ) x 128 = 4.20

Nero had an average rating of 4.04, and (I believe) an actual average bitrate of 141kbps, so it could be said to have a Composite Rating of...

(4.04 / 141 ) x 128 = ~3.67

Sorry, but this system is nonsense -

1. It is impossible to compare VBR and CBR directly (at least in Nero) because they use completely different algorithms and psychoacoustic parameters - check MP3 tests where FhG VBR scored much worse than FhG CBR, for example - and the bit rate was similar

2. Nero CBR was used in the last AAC 128 test and it scored 4.02, if I remember correctly - it is not really possible to extrapolate results directly - and I am quite confident that the current Nero encoder is way better than the one used in last year's test.

3. Linear scaling of the subjective ratings to "match" bit rate is not supported by any scientific evidence - check the bit-rate vs. SDG distribution used in the making of the PEAQ tool (a database of listening tests) and you will see that the quality vs. bit rate curve is nowhere near linear, at least not for AAC.

Reply #112 – 2004-03-02 22:38:13

Quote

1. It is impossible to compare VBR and CBR directly (at least in Nero) because they use completely different algorithms and psychoacoustic parameters - check MP3 tests where FhG VBR scored much worse than FhG CBR, for example - and the bit rate was similar

These are completely different encoders. Audioactive is based on SlowEnc. Audition is based on FastEnc. Besides, Audioactive underwent in-house tuning.

Good evidence that AudioActive is based on SlowEnc is that, on my PC, it takes 1:40 to encode a 3:20 track at 128kbps. Audition takes 12 seconds.

Quote

and I am quite confident that current Nero encoder is way better than the one used in the last year's test.

That is arguable. iTunes barely changed, according to Apple's AAC developer. And it managed to keep a good margin of superiority compared to Nero.

I guess the explanation for that is that, between tests, more resources seem to have been targeted at low-bitrate and SBR tuning, and not at mid-high bitrate tuning.

Reply #113 – 2004-03-02 22:47:35

Quote

Nero had an average rating of 4.04, and (I believe) an actual average bitrate of 141kbps, so it could be said to have a Composite Rating of...

I'd hope simplified "conclusions" like yours would be exclusive to Slashdot, not HA...
Again you totally forget what VBR is about and what the average bitrate over thousands of tracks was, not to mention the different approaches of CBR and VBR, which are not comparable. Again, the principle in VBR is that you choose a quality level, and the encoder tries to keep it.
Over thousands of tracks, this chosen quality level then approaches some average bitrate; with the tested Nero setting this is about 131kbps.

Reply #114 – 2004-03-02 22:49:42

Quote

That is arguable. iTunes barely changed, according to Apple's AAC developer. And it managed to keep a good margin of superiority compared to Nero.

It's not arguable. To anybody who has done lots of testing, it's totally clear that Nero has improved considerably. The fact that your 12 samples may not show this just shows that 12 samples aren't nearly enough to show the whole picture.
This is even more true for some of the lower-ranking encoders, which in reality have way more trouble with quality than you can conclude from these 12 samples.

Reply #115 – 2004-03-02 22:49:44

Ok - I stand corrected - but anyway, comparing different encoder modes and extrapolating results is a completely wrong way of doing things.

The AAC quality scale is a curve which starts to descend rapidly towards 1.0 somewhere between 80 and 96 kb/s, and above that range the slope is much smaller. I don't have the graph, unfortunately - because it is part of the JAES tests - but you could see that the curve steepness is different for different codecs.

Reply #116 – 2004-03-02 22:51:00

Quote from: gotaserena,Mar 2 2004, 06:33 AM

- As far as I can understand there are two sources of uncertainties in such a test:
1) The spread of grades given to a particular music sample because people rate them differently, background noise, etc.
2) The spread in each implementation's efficiency (i.e. bitrate distribution, modulation, etc.) for each music sample.

How are those two dealt with in the final number? Just treat 1) and 2) as statistical errors and "sum and divide by N (or sqrt{N})"?

- I don't want to sound mighty and lofty, but whoever raised the point that the error bars overlap too much to call codec A "the winner" is right. Since people around here seem to believe that codec A really is the winner of the listening test, it makes me think that the error bars are overestimated (this is a common "mistake" when one treats uncertainties as uncorrelated; it has more to do with measurement theory than statistics). If codec A really sounds better than codec B in most cases, then the error bars as they stand don't tell us the whole story.

After the means for each sample have been calculated, they are fed back into the ANOVA/Fisher LSD program to come up with the final error bars for the codecs over all 12 samples.

So in effect, the samples are assumed to be independent of one another (uncorrelated) and are weighted equally in figuring the final outcome. I imagine this could be a problem, for example if the sample selection were biased towards a certain genre, or if some samples had widely disparate numbers of participants. It just underscores how much the test samples can affect the outcome of the test.
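
A rough sketch of this kind of calculation, assuming a randomized-block ANOVA (codecs as treatments, samples as equally weighted blocks) followed by Fisher's LSD. This is not the actual ANOVA/Fisher LSD program used for the test: the function name and the ratings are invented for illustration, and the critical t value is passed in by hand rather than looked up.

```python
from math import sqrt


def blocked_anova_lsd(scores, t_crit):
    """scores maps codec name -> per-sample mean ratings (same sample order).

    Returns (codec means, Fisher LSD, error degrees of freedom).
    Every sample counts equally, however many listeners rated it.
    """
    codecs = list(scores)
    k = len(codecs)             # number of codecs
    n = len(scores[codecs[0]])  # number of samples (blocks)
    grand = sum(sum(v) for v in scores.values()) / (k * n)
    codec_means = {c: sum(scores[c]) / n for c in codecs}
    sample_means = [sum(scores[c][j] for c in codecs) / k for j in range(n)]

    # Residual left after removing the codec effect and the sample effect.
    ss_error = sum(
        (scores[c][j] - codec_means[c] - sample_means[j] + grand) ** 2
        for c in codecs
        for j in range(n)
    )
    df_error = (k - 1) * (n - 1)
    mse = ss_error / df_error

    # Two codec means further apart than this are considered different.
    lsd = t_crit * sqrt(2 * mse / n)
    return codec_means, lsd, df_error


# Made-up ratings for three hypothetical codecs on four samples.
example_scores = {
    "codec_a": [4.5, 4.0, 4.2, 3.8],
    "codec_b": [4.0, 3.6, 3.9, 3.3],
    "codec_c": [3.0, 2.4, 2.9, 2.2],
}
# 2.447 is the two-sided 95% critical t value for 6 degrees of freedom.
means, lsd, df_error = blocked_anova_lsd(example_scores, t_crit=2.447)
print(means, round(lsd, 3), df_error)
```

Because the sample effect is removed before the error term is estimated, a sample that is hard for every codec does not widen the error bars; only disagreement between codecs from sample to sample does.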

Is this what you were after?

ff123

Reply #117 – 2004-03-02 22:53:50

Quote

This chosen quality level then approaches some bitrate with thousands of tracks, with tested Nero's setting this is about 131kbps.

That's OK, but you can't deny that higher bitrates definitely help encoders.

A good example is Compaact!. It did pretty badly on most samples. But it really shone at Velvet, because it went up to 170kbps. (If I had featured fatboy, it would have used 270kbps!).

In these samples the bitrate stayed at 141. But if the average bitrate over hundreds of samples is 131-132kbps, on another batch of samples the bitrates would maybe be lower. And then quality would maybe be worse. Would it be worth weighting bitrate deviations then? It's pure speculation, but the question remains.

So, IMO both approaches - weighting bitrate deviations or not - have their cons and pros.

Reply #118 – 2004-03-02 22:55:49

Quote

It's not arguable. It's totally clear that Nero has improved considerably for anybody who has done lots of testing. Just because your 12 samples may not show this, just shows that 12 samples isn't nearly enough to show the whole picture.
This is even more true for some of the lower ranking encoders, which in reality have way more trouble with quality than you can conclude from these 12 samples.

Well, my samples come from a wide variety of styles. If you have other test results, using other samples, showing Nero indeed got much better since the last test, you should post them.

Reply #119 – 2004-03-02 22:57:11

Quote

Quote
This chosen quality level then approaches some bitrate with thousands of tracks, with tested Nero's setting this is about 131kbps.

That's OK, but you can't deny that higher bitrates definitely help encoders.

A good example is Compaact!. It did pretty badly on most samples. But it really shone at Velvet, because it went up to 170kbps. (If I had featured fatboy, it would have used 270kbps!).

In these samples the bitrate stayed at 141. But if the average bitrate over hundreds of samples is 131-132kbps, on another batch of samples the bitrates would surely be lower. And then quality would maybe be worse. Would it be worth weighting bitrate deviations then? It's pure speculation, but the question remains.

So, IMO both approaches - weighting bitrate deviations or not - have their cons and pros.

Compaact's high bitrate is due to high overcoding of short blocks. Just because of this, you can't extend this directly to other encoders.
Of course higher bitrate "helps", but the idea is to keep relatively constant quality. Compaact VBR doing very high overcoding of short blocks breaks this principle.

Reply #120 – 2004-03-02 22:57:50

Quote

So, IMO both approaches - weighting bitrate deviations or not - have their cons and pros.

You could theoretically weight bitrate and quality by keeping the same coding mode of the encoder, and knowing, a priori, the codec's performance per bits/sample.

I can tell you that this is nowhere near perfect, and it is still a very tough method: you need a couple of statistically significant tests at, say, 96, 128, 160 and 192 kb/s to know how quality "scales" with the bit rate.

The "scale" of descent, like I said, is not the same for AAC and other codecs, and - most likely - not the same between various implementations of the same algorithm.

What if we slide the bit rate to, say, 192 - so QT would get a 6.3 rating, right?

Wrong.
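
The 6.3 figure comes from applying the same linear scaling in the other direction. A minimal sketch, assuming the usual 1.0 to 5.0 rating scale of these listening tests (the function name and constants are illustrative):

```python
RATING_MIN, RATING_MAX = 1.0, 5.0  # assumed bounds of the rating scale


def linearly_scaled_rating(rating, measured_kbps, new_kbps):
    """Scale a rating in direct proportion to bitrate (the disputed assumption)."""
    return rating * new_kbps / measured_kbps


extrapolated = linearly_scaled_rating(4.20, 128, 192)
is_valid = RATING_MIN <= extrapolated <= RATING_MAX

print(f"extrapolated rating at 192 kb/s: {extrapolated:.1f}")  # 6.3
print(f"fits on the rating scale: {is_valid}")                 # False
```

A result above the top of the scale is one quick way to see that quality cannot scale linearly with bit rate.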

Reply #121 – 2004-03-02 23:00:24

Quote

Of course higher bitrate "helps", but the idea is to keep relatively constant quality. Compaact VBR doing very high overcoding of short blocks breaks this principle.

Well, they try to keep quality at all costs. I think that's the idea behind VBR - screw the bitrate, give desired quality.

And, even Velvet considered, Compaact managed to come really close to an average of 128kbps.

Reply #122 – 2004-03-02 23:03:43

Quote

Well, my samples come from a wide variety of styles. If you have other test results, using other samples, showing Nero indeed got much better since the last test, you should post them.

Not sure if I still have those old Nero encoders anywhere, but there's no doubt that Nero has clearly improved. I've been testing it quite a lot during this time. I think Guru can verify this too, although only for the kind of music he listens to.

Reply #123 – 2004-03-02 23:03:49

Quote

"Scale" of descent, like I said, is not the same for AAC and different codecs, and - most likely, not same between various implementations of the same algorithm

So that warrants allowing a codec to use more bits than allowed by the test setup?

Quote

What if we slide bit rate to, say, 192 - so, QT would get 6.3 rating, right?

Wrong.

What are you trying to say there?

Reply #124 – 2004-03-02 23:05:22

Quote

Well, they try to keep quality at all costs. I think that's the idea behind VBR - screw the bitrate, give desired quality.

And, even Velvet considered, Compaact managed to come really close to an average of 128kbps.

That's just it: Compaact gives Velvet relatively better quality than on average because of overcoding of short blocks. This is more than just "keeping the quality", it's "pushing the quality".

Notice