Topic: 64kbps public listening test

  • rjamorim
  • [*][*][*][*][*]
64kbps public listening test
Hello.

There have been some delays, but here are the long-awaited results of the latest 64kbps group blind listening test.

http://audio.ciara.us/test/64test/results.html

And the final plot:
Each vertical line segment represents the 95% confidence interval (using ANOVA analysis) for each codec.
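
For anyone wondering how an interval like this can be computed, here is a minimal sketch. It assumes every listener rated every codec, treats listeners as blocks in a two-way ANOVA, and uses a normal approximation in place of the exact t quantile. The function name and the ratings are made up for illustration; this is not the test's data, nor necessarily what the analysis tool behind the plot does.

```python
from statistics import NormalDist, mean


def anova_intervals(scores, level=0.95):
    """Approximate confidence interval for each codec mean.

    scores: dict mapping codec name -> list of ratings, one per listener,
            with listeners in the same order for every codec (hypothetical layout).
    """
    codecs = list(scores)
    k = len(codecs)
    n = len(scores[codecs[0]])

    grand = mean([v for c in codecs for v in scores[c]])
    codec_mean = {c: mean(scores[c]) for c in codecs}
    listener_mean = [mean([scores[c][i] for c in codecs]) for i in range(n)]

    # Residual sum of squares after removing codec and listener effects.
    ss_err = sum(
        (scores[c][i] - codec_mean[c] - listener_mean[i] + grand) ** 2
        for c in codecs
        for i in range(n)
    )
    mse = ss_err / ((k - 1) * (n - 1))

    # Normal approximation; an exact analysis would use a t quantile here.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * (mse / n) ** 0.5
    return {c: (codec_mean[c] - half, codec_mean[c] + half) for c in codecs}


if __name__ == "__main__":
    # Made-up ratings on the 1.0-5.0 scale, four listeners, three codecs.
    example = {
        "codec A": [4.0, 4.5, 3.5, 4.0],
        "codec B": [3.0, 3.0, 2.5, 3.5],
        "codec C": [2.0, 2.5, 1.5, 2.0],
    }
    for name, (low, high) in anova_intervals(example).items():
        print(f"{name}: {low:.2f} .. {high:.2f}")
```

Because the error term is pooled across codecs, every codec gets a segment of the same length in this sketch; only the centres differ.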


Note: Lame MP3 is the 128kbps "high anchor" in this test; FhG MP3 is the "low anchor".

Zoomed version without the anchors:


Codec specifications:
  • Ahead/Nero 6.0.0.15 HE AAC VBR profile Streaming :: Medium, high quality
  • Ogg Vorbis post-1.0 CVS -q 0
  • MP3pro (from Adobe Audition 1.0) VBR quality 40, Current Codec, allow M/S and IS, allow narrowing, no CRC
  • Real Audio Gecko (from Real Producer 9.0.1) 64kbps
  • Windows Media Audio v9 VBR quality 50
  • QuickTime 6.3 AAC LC 64kbps, Best Quality
  • Lame MP3 encoder 3.90.3 --alt-preset 128 --scale 1. (high anchor)
  • FhG MP3 encoder (from Adobe Audition 1.0) 64kbps CBR, Current codec, allow M/S, no I/S, allow narrowing, no CRC. (low anchor)
Thanks to everyone that participated and helped!

Best regards;

Roberto.
  • Last Edit: 22 September, 2003, 11:10:45 AM by JohnV
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • bond
  • [*][*][*][*][*]
64kbps public listening test
Reply #1
yummie

hm, first time that i voted most of the codecs worse than the average (especially wma was 1.77 in my case)

seems that my hearing evolves (dunno if this is a good thing)

edit: perhaps an even more zoomed in graph (leaving out lame too which is also at 128kbps) between 4 and 2.5 would be nice
  • Last Edit: 22 September, 2003, 04:37:05 AM by bond
I know, that I know nothing (Socrates)

  • neoufo51
  • [*][*][*][*]
64kbps public listening test
Reply #2
Interesting...

I expected Vorbis to be a little higher, and wow, Lame is really up there.

  • rjamorim
  • [*][*][*][*][*]
64kbps public listening test
Reply #3
Bloody 5:20 AM here.

I'll read criticisms, comments & al. in 7 hours :B

One last thing: if anyone is interested, the results are all available as a single zip here:
http://rarewares.hydrogenaudio.org/rja/comments.zip

Enjoy!

Best regards;

Roberto Amorim.

  • rjamorim
  • [*][*][*][*][*]
64kbps public listening test
Reply #4
BTW: If the server starts with that redirection nonsense: Blame Verloren

  • rjamorim
  • [*][*][*][*][*]
64kbps public listening test
Reply #5
One last thing: People are invited to announce it on Slashdot, Kuro5hin, RAO or anywhere you think it fits. This is the best test I have performed so far and, IMO, the one with the most interesting results. So it deserves (I think) some advertising.

  • Ivan Dimkovic
  • [*][*][*][*][*]
  • Developer
64kbps public listening test
Reply #6
Roberto, can you please move the anchors (low-high) to the right, and/or indicate that LAME is 128 kbps? Otherwise the test graph results might be very misleading.
  • Last Edit: 22 September, 2003, 04:26:25 AM by Ivan Dimkovic

  • Jon Ingram
  • [*][*][*][*]
64kbps public listening test
Reply #7
The group results seem to be:
Lame > He AAC, MP3Pro, Vorbis (with He AAC > Vorbis) > Real, WMA, QT AAC > FhG.

Listed in descending order of mean, with >'s only where they're significant.
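
For anyone wanting to check a single ">" themselves, one possible sketch is a paired sign-flip permutation test on one listener's ratings for two codecs. This is a technique swapped in for illustration, not the analysis used for the group results; the function name and the ratings are made up. It assumes the two lists hold ratings for the same samples in the same order.

```python
from itertools import product


def sign_flip_p_value(a, b):
    """Two-sided p-value for 'codec a and codec b are rated the same'.

    a, b: ratings for the same samples, in the same order (hypothetical data).
    Every pattern of sign flips on the per-sample differences is enumerated,
    so this is only practical for a small number of samples (12 here is fine).
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point noise does not drop exact ties.
        if flipped >= observed - 1e-12:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Made-up ratings for six samples.
    codec_x = [4.0, 3.5, 4.2, 3.8, 4.5, 3.9]
    codec_y = [3.1, 3.0, 3.6, 3.9, 3.7, 3.2]
    p = sign_flip_p_value(codec_x, codec_y)
    print(f"p = {p:.3f}; write '>' only if p is below 0.05")
```

With many codecs there are many pairs to compare, so a real analysis would also have to deal with multiple comparisons; this sketch does not.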

I tested 11 of the 12 samples, so I thought it'd be quite interesting to apply a similar analysis to my results. I get:
Lame > Vorbis, MP3 Pro, He AAC > WMA, Real Audio, QT AAC > FhG.

So my results agree with the group, except that instead of He AAC > Vorbis, I found Vorbis > He AAC (although it wasn't quite at the 5% significance level). This may be influenced by the fact that I didn't test Polonaise, which seems to be a poor sample for Vorbis -- it's quite possible that, with that sample in, I'd be even closer to the group average.

Well, it's good to know that I agree with the majority.

PS. neoufo51:
Lame is 'really up there' because it's at twice the bitrate of everything else! It was in the test to see whether any of the codecs lived up to the hype of some of them (wave at WMA) -- the marketing is often claiming 128kbps performance at 64kbps. The results of this test indicate that none of this generation of codecs are there yet.
  • Last Edit: 22 September, 2003, 04:32:11 AM by Jon Ingram

  • Gabriel
  • [*][*][*][*][*]
  • Developer
64kbps public listening test
Reply #8
Quote
the anchors (low-high) to the right - and/or, indicate that LAME is 128 kbps

Perhaps grayed

  • rjamorim
  • [*][*][*][*][*]
64kbps public listening test
Reply #9
Quote
Roberto, can you please move the anchors (low-high) to the right, and/or indicate that LAME is 128 kbps? Otherwise the test graph results might be very misleading.

Sure. But tomorrow, since I would nearly have to rebuild the spreadsheets.

Or maybe someone wants to do that for me?

http://pessoal.onda.com.br/rjamorim/plots.zip

(never mind the comments, they are from the 128 test)

It might also be interesting to replace "Lame MP3" with "Lame 128"

G'night to you all :B
  • Last Edit: 22 September, 2003, 04:33:50 AM by rjamorim

  • ScorLibran
  • [*][*][*][*][*]
  • Banned
64kbps public listening test
Reply #10
Here are the averaged results of my own tests on just five samples, in case anyone's curious as to what an untrained newbie can and cannot hear...

LAME MP3 (128kbps anchor) - 4.76
LC AAC - 4.66
MP3Pro - 4.64
HE AAC - 4.64
WMA Std - 4.32
Real - 4.06
FhG MP3 - 3.72
Vorbis - 3.64

At least the upper anchor sounded the best to me...otherwise I'd run out tomorrow and get my hearing checked.

The ironic part?  Vorbis is my lossy codec of choice.
  • Last Edit: 22 September, 2003, 04:38:18 AM by ScorLibran

  • guruboolez
  • [*][*][*][*][*]
  • Members (Donating)
64kbps public listening test
Reply #11
I obtained different results, with bad ratings for Vorbis (unfortunately, I left the matrix on another computer). I'm not at ease with Vorbis at this bitrate in a blind test: it sounds too distinctive (hiss, unbalanced tonal range with more treble, weak low-mids, and limited stereo), and it's easy for me to identify the encoder. I'm rating Vorbis, not an unknown encoder, so it isn't blind anymore.

HE-AAC was my favorite: often the best, never the worst. It is sometimes betrayed by a grainy texture, the same as MP3pro's. No noise packets like those heard with the first releases of the encoder.

The low anchor was rarely the worst file I rated: on 8 files, I rated another encoding as the worst one. I prefer an excessively lowpassed sound without artifacts to a richer sound destroyed by flanging. Personal taste.

WMA9 (I hated this encoder) was the best file as often as HE-AAC, but it was the worst three times. So it isn't a reliable encoder, though in some situations it works very well.

LC-AAC was first on two samples (02 and 09) and last on one (11). Vorbis was best on one (04) and last on two (06 and 09).

  • Jon Ingram
  • [*][*][*][*]
64kbps public listening test
Reply #12
Quote
The ironic part?  Vorbis is my lossy codec of choice.

That's not all that odd, really -- if you've listened to Vorbis encoded files a lot, you've probably grown more sensitive to its distinctive features, and so you can pick Vorbis out of a crowd more easily than some other codec which has other problems. When MP3 first appeared, it took quite a while for even people with 'good' hearing to detect the problems inherent in that format, even though the problems would stand out a mile if you could listen today to your 7/8 year old MP3 encodes.

With just five samples the scope for meaningful statistics is reduced, but we can say that you were probably able to distinguish Real, FhG MP3 and Vorbis from the original, and not able to distinguish AAC or MP3 Pro.

It takes time to be sensitised to codec problems, which is why most people can happily listen to encoded tracks which would lead many of us to commit suicide with a pointy stick within 30 seconds...
  • Last Edit: 22 September, 2003, 04:53:17 AM by Jon Ingram

  • bond
  • [*][*][*][*][*]
64kbps public listening test
Reply #13
edit: my zoomed in pic is already on the official presentation page
  • Last Edit: 22 September, 2003, 09:48:45 AM by bond

  • mcbevin
  • [*]
  • Developer
64kbps public listening test
Reply #14
Hey, I think it might be worth emphasizing (or at the very least, mentioning!) on the original post that Lame is at 128 kbps and the others at 64 kbps. That fact took me a while to realise, and I suspect the post will be leaving a lot of people with the impression that Lame at 64 kbps is better than the other codecs at 64 kbps, when this is obviously not the intent!

  • 2Bdecided
  • [*][*][*][*][*]
  • Developer
64kbps public listening test
Reply #15
Quote
I obtained different results, with bad ratings for Vorbis (unfortunately, I left the matrix on another computer). I'm not at ease with Vorbis at this bitrate in a blind test: it sounds too distinctive (hiss, unbalanced tonal range with more treble, weak low-mids, and limited stereo), and it's easy for me to identify the encoder. I'm rating Vorbis, not an unknown encoder, so it isn't blind anymore.

Ditto. Well, I wasn't sure it was vorbis (because I've never used it), but I hated it and soon recognised it in all subsequent samples. It just killed the stereo. This would have been less obvious for non-critical listening over speakers (i.e. not in the sweet spot), but over headphones it was useless!

Quote
The low anchor was rarely the worst file I rated: on 8 files, I rated another encoding as the worst one. I prefer an excessively low-passed sound without artefacts to a richer sound destroyed by flanging. Personal taste.


I'm sure I wrote in one of them "I can hear the low pass, but it's less annoying than what some of the others are doing!". There was nasty temporal smearing that destroyed the "fun" of the music in a way that a low pass doesn't.



Roberto, what do the results look like if you transform each person's scores into rankings, before doing the analysis?
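
To make the question concrete, here is a minimal sketch of such a transform: each listener's scores for one sample are replaced by ranks (1 = best), with tied scores sharing the average rank. The function name is made up, and this only shows the transform itself, not the analysis that would follow.

```python
def to_ranks(ratings):
    """Convert one listener's scores into ranks (1 = best).

    ratings: dict mapping codec name -> score (hypothetical layout).
    Codecs with equal scores share the average of the positions they occupy.
    """
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to cover the whole run of codecs tied with ordered[i].
        j = i
        while j + 1 < len(ordered) and ratings[ordered[j + 1]] == ratings[ordered[i]]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for codec in ordered[i : j + 1]:
            ranks[codec] = average_rank
        i = j + 1
    return ranks


if __name__ == "__main__":
    # The averaged scores ScorLibran posted in Reply #10, used here only to
    # have real numbers; the transform would normally be applied per sample.
    posted = {
        "LAME MP3": 4.76, "LC AAC": 4.66, "MP3Pro": 4.64, "HE AAC": 4.64,
        "WMA Std": 4.32, "Real": 4.06, "FhG MP3": 3.72, "Vorbis": 3.64,
    }
    for codec, rank in to_ranks(posted).items():
        print(f"{codec}: {rank}")
```

Ranking throws away how far apart two codecs were rated, so a listener who uses only the top of the scale counts the same as one who uses all of it.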

Cheers,
David.

  • 2Bdecided
  • [*][*][*][*][*]
  • Developer
64kbps public listening test
Reply #16
btw(!) Thanks for conducting such an excellent, and interesting test!

EDIT: Can hydrogen audio set up a directory of listening test results (with or without samples)? It would be useful to have them all in one place, and backed up for if/when other people's servers go offline.

What do other people think of this idea?

Cheers,
David.

EDIT2: Does this mean that any codec producer who says "sounds as good as 128kbps mp3 at only 64kbps!" can now be taken to court for false advertising?
  • Last Edit: 22 September, 2003, 05:46:59 AM by 2Bdecided

  • guruboolez
  • [*][*][*][*][*]
  • Members (Donating)
64kbps public listening test
Reply #17
Note that Vorbis suffered from an artifact (reverberation or hollow sound) introduced by the latest CVS encoder, which was used in this test. See:
http://www.hydrogenaudio.org/forums/index....pic=7197&st=25&

(I'm not entirely sure that previous encoders didn't have this flaw.)

  • Continuum
  • [*][*][*][*]
64kbps public listening test
Reply #18
Quote
The group results seem to be:
Lame > He AAC, MP3Pro, Vorbis (with He AAC > Vorbis) > Real, WMA, QT AAC > FhG.
[...]
Well, it's good to know that I agree with the majority .

I can't say that for me:
lame > HeAAC > Real > Vorbis, mp3pro, QTaac > WMA > FhG

Like in the c't test I'm a "Real Audio" fan!

I tested only 4 samples though (exactly those with the least participation B)).

  • ScorLibran
  • [*][*][*][*][*]
  • Banned
64kbps public listening test
Reply #19
Quote
With just five samples the scope for meaningful statistics is reduced, but we can say that you were probably able to distinguish Real, FhG MP3 and Vorbis from the original, and not able to distinguish AAC or MP3 Pro.

Very likely.  You'd be amazed (and I'm embarrassed to say) how many 5.0's I noted...about half of the test groups across the samples I tested.  Oh well.  That just means I need more practice listening for artifacts.  (But in the meantime, I save a *lot* of hard disk space with medium-low bitrate music that sounds just fine to my ears.)

Quote
Does this mean that any codec producer who says "sounds as good as 128kbps mp3 at only 64kbps!" can now be taken to court for false advertising?

Well if they say it here, anyway, they'll be flogged with TOS rule #8. 

  • eloj
  • [*][*]
64kbps public listening test
Reply #20
Slashdot is going to be hell if this is posted and the lame 128kbit sample isn't properly labeled in the graph :-O

Also, how many participated? Ah.. between 26 and 43. Not a very large sample then.
  • Last Edit: 22 September, 2003, 06:34:53 AM by eloj

  • teetee
  • [*][*][*][*]
64kbps public listening test
Reply #21
Quote
EDIT: Can hydrogen audio set up a directory of listening test results (with or without samples)? It would be useful to have them all in one place, and backed up for if/when other people's servers go offline.

What do other people think of this idea?

Cheers,
David.

Might the Wiki be a suitable place for this repository?
Check out the foobar2000 Wiki: http://doc.hydrogenaudio.org/wikis/foobar2000/FrontPage
If you can help write it, please sign up and start writing!

  • schnofler
  • [*][*][*]
  • Developer
64kbps public listening test
Reply #22
My results looked like this: Lame, HE AAC > MP3Pro, Vorbis > Real > WMA, QT, FhG. So I pretty much agree with the overall results.
That listening test really was a very interesting experience. I found it especially surprising just how well you can train yourself to hear artifacts. In the beginning I was rarely able to tell more than 3 codecs from the original. But after some hours of continuous testing (I tested all of the 12 samples), I mostly had no problems identifying the encoded samples instantly. I even repeated some of the tests I did early on, because I tended to generally rate everything lower after some training.
As for the ratings, I did rate WMA and QT almost exactly the same overall as FhG. Still, these codecs have quite different artifacts. WMA especially annoyed me because, on most of the samples, I could hear a constant very high frequency ringing added to an otherwise more or less well reproduced sample. Sometimes this made the sample just plain unlistenable (it's most noticeable in Waiting and NewYorkCity). I never rated WMA better than 2.0 because of this.
Generally, my ratings are based on high frequencies. I have pretty good high frequency hearing, so if a codec lowpasses excessively, I find the sound detail-less and dull. On the other hand, things like stereo image are not really important for me (I didn't use headphones).

  • phong
  • [*][*][*][*]
64kbps public listening test
Reply #23
WOW, very interesting!  The results are so different from my own that I think I should have my hearing checked.  Here's the original post I had in the other thread:
Quote
I've been itching to discuss the test all weekend!  Here are my results ranked by average score:
1. Lame MP3
2. WMA Std
3. LC AAC
4. HE AAC
5. Vorbis
6. MP3 Pro
7. FhG MP3
8. Real Audio Gecko

Lame was the winner by a large margin.  Nothing was close to WMA for second place.  Both AAC codecs and Vorbis were actually fairly close.  A different set of samples could see them in a much different order.  MP3 Pro and FhG MP3 were almost tied, the difference was very small.  Real audio was a joke.  Lost by a huge margin, even compared to 64k mp3.

Other than Real, which lost on almost every sample, and Lame which won most samples, every codec had at least one sample where they fell near the bottom of the pile.  Illinois and mybloodrusts turned out to have scores that differed wildly from all the other samples.  Illinois in particular gave the second best rating to FhG MP3 and absolutely killed Vorbis.  WMA fell to pieces on the piano sample (Polonaise) for some reason producing a bunch of nasty hissing noises.

I'd confidently say that the claim that any of these perform as well as mp3 at half the bitrate is a BIG FAT LIE.


Obviously, everyone else is hearing a problem with WMA that I'm not.  I'm going to have to look over the comments to see.  Real Audio, FhG and MP3 Pro had very damaging lowpasses to my ears.  I have no idea how any of them got rated as high as they did.  They sounded horrible to me!
  • Last Edit: 22 September, 2003, 07:09:53 AM by phong
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

  • Garf
  • [*][*][*][*][*]
  • Developer (Donating)
64kbps public listening test
Reply #24
Quote
Also, how many participated? Ah.. between 26 and 43. Not a very large sample then.

I guess everybody assumes someone else will do it for them

The results are statistically valid regardless of the small sample size.
  • Last Edit: 22 September, 2003, 07:13:32 AM by Garf