Hydrogenaudio Forums

Hydrogenaudio Forum => Listening Tests => Topic started by: rjamorim on 22 September, 2003, 04:10:17 AM

Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 04:10:17 AM
Hello.

There have been some delays, but here are the long-awaited results of the latest 64kbps group blind listening test.

http://audio.ciara.us/test/64test/results.html

And the final plot:
Each vertical line segment represents the 95% confidence interval (using ANOVA analysis) for each codec.
(Image: http://sivut.koti.soon.fi/julaak/64_test.png)

Note: Lame MP3 is 128kbps "high anchor" in this test, FhG MP3 is the "low anchor"

Zoomed version without the anchors:
(Image: http://sivut.koti.soon.fi/julaak/64_test_zoom.gif)

Codec specifications:

Thanks to everyone who participated and helped!

Best regards;

Roberto.
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 04:15:06 AM
yummie

hm, first time that i voted most of the codecs worse than the average (especially wma was 1.77 in my case)

seems that my hearing evolves (dunno if this is a good thing)

edit: perhaps an even more zoomed in graph (leaving out lame too which is also at 128kbps) between 4 and 2.5 would be nice
Title: 64kbps public listening test
Post by: neoufo51 on 22 September, 2003, 04:20:09 AM
Interesting...

I expected Vorbis to be a little higher, and wow, Lame is really up there.
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 04:21:39 AM
Bloody 5:20 AM here.

I'll read criticisms, comments & al. in 7 hours :B

One last thing: if anyone is interested, the results are all available as a single zip here:
http://rarewares.hydrogenaudio.org/rja/comments.zip

Enjoy!

Best regards;

Roberto Amorim.
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 04:22:13 AM
BTW: If the server starts with that redirection nonsense: Blame Verloren
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 04:24:35 AM
One last thing: People are invited to announce it on Slashdot, Kuro5hin, RAO or anywhere else they think it fits. This is the best test I have performed so far and, IMO, the one with the most interesting results. So it deserves (I think) some advertising.
Title: 64kbps public listening test
Post by: Ivan Dimkovic on 22 September, 2003, 04:26:09 AM
Roberto,  can you please move the anchors (low-high) to the right - and/or, indicate that LAME is 128 kbps, otherwise the test graph results might be very misleading
Title: 64kbps public listening test
Post by: Jon Ingram on 22 September, 2003, 04:27:46 AM
The group results seem to be:
Lame > He AAC, MP3Pro, Vorbis (with He AAC > Vorbis) > Real, WMA, QT AAC > FhG.

Listed in descending order of mean, with >'s only where they're significant.

I tested 11 of the 12 samples, so I thought it'd be quite interesting to apply a similar analysis to my results. I get:
Lame > Vorbis, MP3 Pro, He AAC > WMA, Real Audio, QT AAC > FhG.

So my results agree with the group, except that instead of He AAC > Vorbis, I found Vorbis > He AAC (although it wasn't quite at the 5% significance level). This may be influenced by the fact that I didn't test Polonaise, which seems to be a poor sample for Vorbis -- it's quite possible that, with that sample in, I'd be even closer to the group average.

Well, it's good to know that I agree with the majority.

PS. neoufo51:
Lame is 'really up there' because it's at twice the bitrate of everything else! It was in the test to see whether any of the codecs lived up to the hype of some of them (wave at WMA) -- the marketing is often claiming 128kbps performance at 64kbps. The results of this test indicate that none of this generation of codecs are there yet.
Title: 64kbps public listening test
Post by: Gabriel on 22 September, 2003, 04:28:17 AM
Quote
the anchors (low-high) to the right - and/or, indicate that LAME is 128 kbps

Perhaps grayed
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 04:32:00 AM
Quote
Roberto,  can you please move the anchors (low-high) to the right - and/or, indicate that LAME is 128 kbps, otherwise the test graph results might be very misleading

Sure. But tomorrow, since I would nearly have to rebuild the spreadsheets.

Or maybe someone wants to do that for me?

http://pessoal.onda.com.br/rjamorim/plots.zip

(nevermind the comments, they are from the 128 test)

It might be also interesting to replace "Lame MP3" with "Lame 128"

G'night to you all :B
Title: 64kbps public listening test
Post by: ScorLibran on 22 September, 2003, 04:35:43 AM
Here are the averaged results of my own tests on just five samples, in case anyone's curious as to what an untrained newbie can and cannot hear...

LAME MP3 (128kbps anchor) - 4.76
LC AAC - 4.66
MP3Pro - 4.64
HE AAC - 4.64
WMA Std - 4.32
Real - 4.06
FhG MP3 - 3.72
Vorbis - 3.64

At least the upper anchor sounded the best to me...otherwise I'd run out tomorrow and get my hearing checked.

The ironic part?  Vorbis is my lossy codec of choice.
Title: 64kbps public listening test
Post by: guruboolez on 22 September, 2003, 04:41:57 AM
I obtained different results, with bad ratings for Vorbis (unfortunately, I forgot the matrix on another computer). I'm not at ease with Vorbis at this bitrate during a blind test: it sounds too particular (hiss, unbalanced tonal range: more treble, poor low-mids, and limited stereo), and it's easy for me to detect the encoder. I'm rating Vorbis, and not an unknown encoder. So it isn't blind anymore.

HE-AAC was my favorite: often the best, never the worst. Sometimes betrayed by a grainy texture, the same as mp3PRO's. No noise packets, as heard with the first releases of the encoder.

The low anchor was rarely the worst file I rated: on 8 files, I rated another encoding as the worst one. I prefer an excessively lowpassed sound without artifacts to a richer sound destroyed by flanging. Personal taste.

WMA9 (I hated this encoder) was the best file as often as HE-AAC. But it was the worst for me three times. So it isn't a reliable encoder, but in some situations it works very well.

LC-AAC was first on two samples (02 and 09) and last on one (11). Vorbis was best on one (04) and last on two (06 and 09).
Title: 64kbps public listening test
Post by: Jon Ingram on 22 September, 2003, 04:49:37 AM
Quote
The ironic part?  Vorbis is my lossy codec of choice.

That's not all that odd, really -- if you've listened to Vorbis encoded files a lot, you've probably grown more sensitive to its distinctive features, and so you can pick Vorbis out of a crowd more easily than some other codec which has other problems. When MP3 first appeared, it took quite a while for even people with 'good' hearing to detect the problems inherent in that format, even though the problems would stand out a mile if you could listen today to your 7/8 year old MP3 encodes.

With just five samples the scope for meaningful statistics is reduced, but we can say that you were probably able to detect Real, FhG MP3 and Vorbis from the original, and not able to detect AAC or MP3 Pro.

It takes time to be sensitised to codec problems, which is why most people can happily listen to encoded tracks which would lead many of us to commit suicide with a pointy stick within 30 seconds...
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 04:56:43 AM
edit: my zoomed in pic is already on the official presentation page
Title: 64kbps public listening test
Post by: mcbevin on 22 September, 2003, 05:29:21 AM
Hey, I think it might be worth emphasizing (or at the very least, mentioning!) on the original post that Lame is at 128 kbps and the others at 64 kbps. That fact took me a while to realise, and I suspect the post will be leaving a lot of people with the impression that Lame at 64 kbps is better than the other codecs at 64 kbps, when this is obviously not the intent!
Title: 64kbps public listening test
Post by: 2Bdecided on 22 September, 2003, 05:43:39 AM
Quote
I obtained different results, with bad ratings for Vorbis (unfortunately, I forgot the matrix on another computer). I'm not at ease with Vorbis at this bitrate during a blind test: it sounds too particular (hiss, unbalanced tonal range: more treble, poor low-mids, and limited stereo), and it's easy for me to detect the encoder. I'm rating Vorbis, and not an unknown encoder. So it isn't blind anymore.

Ditto. Well, I wasn't sure it was vorbis (because I've never used it), but I hated it and soon recognised it in all subsequent samples. It just killed the stereo. This would have been less obvious for non-critical listening over speakers (i.e. not in the sweet spot), but over headphones it was useless!

Quote
Lower anchor was rarely the worse file I rated : on 8 files, I rated other encodings as worst one. I prefer an excessive low-passed sound without artefacts than a richer sound, but destroyed by flanging. Personal taste.


I'm sure I wrote in one of them "I can hear the low pass, but it's less annoying than what some of the others are doing!". There was nasty temporal smearing that destroyed the "fun" of the music in a way that a low pass doesn't.



Roberto, what do the results look like if you transform each person's scores into rankings, before doing the analysis?

Cheers,
David.
Title: 64kbps public listening test
Post by: 2Bdecided on 22 September, 2003, 05:44:16 AM
btw(!) Thanks for conducting such an excellent, and interesting test!

EDIT: Can Hydrogenaudio set up a directory of listening test results (with or without samples)? It would be useful to have them all in one place, and backed up for if/when other people's servers go offline.

What do other people think of this idea?

Cheers,
David.

EDIT2: Does this mean that any codec producer who says "sounds as good as 128kbps mp3 at only 64kbps!" can now be taken to court for false advertising?
Title: 64kbps public listening test
Post by: guruboolez on 22 September, 2003, 05:52:01 AM
Note that Vorbis suffered from an artifact (reverberation or hollow sound) introduced by the latest CVS encoder, which was used in this test. See:
http://www.hydrogenaudio.org/forums/index.php?showtopic=7197&st=25&

(I'm not entirely sure that previous encoders didn't have this flaw.)
Title: 64kbps public listening test
Post by: Continuum on 22 September, 2003, 05:58:42 AM
Quote
The group results seem to be:
Lame > He AAC, MP3Pro, Vorbis (with He AAC > Vorbis) > Real, WMA, QT AAC > FhG.
[...]
Well, it's good to know that I agree with the majority.

I can't say that for me:
lame > HeAAC > Real > Vorbis, mp3pro, QTaac > WMA > FhG

Like in the c't test I'm a "Real Audio" fan!

I tested only 4 samples though (exactly those with the least participation B)).
Title: 64kbps public listening test
Post by: ScorLibran on 22 September, 2003, 06:27:15 AM
Quote
With just five samples the scope for meaningful statistics is reduced, but we can say that you were probably able to detect Real, FhG MP3 and Vorbis from the original, and not able to detect AAC or MP3 Pro.

Very likely.  You'd be amazed (and I'm embarrassed to say) how many 5.0's I noted...about half of the test groups across the samples I tested.  Oh well.  That just means I need more practice listening for artifacts.  (But in the meantime, I save a *lot* of hard disk space with medium-low bitrate music that sounds just fine to my ears.)

Quote
Does this mean that any codec producer who says "sounds as good as 128kbps mp3 at only 64kbps!" can now be taken to court for false advertising?

Well if they say it here, anyway, they'll be flogged with TOS rule #8. 
Title: 64kbps public listening test
Post by: eloj on 22 September, 2003, 06:31:27 AM
Slashdot is going to be hell if this is posted and the lame 128kbit sample isn't properly labeled in the graph :-O

Also, how many participated? Ah.. between 26-43. Not a very large sample then.
Title: 64kbps public listening test
Post by: teetee on 22 September, 2003, 06:55:31 AM
Quote
EDIT: Can Hydrogenaudio set up a directory of listening test results (with or without samples)? It would be useful to have them all in one place, and backed up for if/when other people's servers go offline.

What do other people think of this idea?

Cheers,
David.

Might the Wiki be a suitable place for this repository?
Title: 64kbps public listening test
Post by: schnofler on 22 September, 2003, 07:04:58 AM
My results looked like this: Lame, HE AAC > MP3Pro, Vorbis > Real > WMA, QT, FhG. So I pretty much agree with the overall results.
That listening test really was a very interesting experience. I found it especially surprising just how well you can train yourself to hear artifacts. In the beginning I was rarely able to tell more than 3 codecs from the original. But after some hours of continuous testing (I tested all of the 12 samples), I mostly had no problems identifying the encoded samples instantly. I even repeated some of the tests I did early on, because I tended to generally rate everything lower after some training.
As for the ratings, I did rate WMA and QT almost exactly the same overall as FhG. Still, these codecs have quite different artifacts. Especially WMA really annoyed me because, on most of the samples, I could hear a constant very high frequency ringing to an otherwise more or less well reproduced sample. Sometimes this made the sample just plain unlistenable (it's most noticeable in Waiting and NewYorkCity). I never rated WMA better than 2.0 because of this.
Generally, my ratings are based on high frequencies. I have pretty good high frequency hearing, so if a codec lowpasses excessively, I find the sound detail-less and dull. On the other hand, things like stereo image are not really important for me (I didn't use headphones).
Title: 64kbps public listening test
Post by: phong on 22 September, 2003, 07:08:21 AM
WOW, very interesting!  The results are so far different from my own that I think I should have my hearing checked.  Here's the original post I had in the other thread:
Quote
I've been itching to discuss the test all weekend!  Here are my results ranked by average score:
1. Lame MP3
2. WMA Std
3. LC AAC
4. HE AAC
5. Vorbis
6. MP3 Pro
7. FhG MP3
8. Real Audio Gecko

Lame was the winner by a large margin.  Nothing was close to WMA for second place.  Both AAC codecs and Vorbis were actually fairly close.  A different set of samples could see them in a much different order.  MP3 Pro and FhG MP3 were almost tied, the difference was very small.  Real audio was a joke.  Lost by a huge margin, even compared to 64k mp3.

Other than Real, which lost on almost every sample, and Lame which won most samples, every codec had at least one sample where they fell near the bottom of the pile.  Illinois and mybloodrusts turned out to have scores that differed wildly from all the other samples.  Illinois in particular gave the second best rating to FhG MP3 and absolutely killed Vorbis.  WMA fell to pieces on the piano sample (Polonaise) for some reason producing a bunch of nasty hissing noises.

I'd confidently say that the claim that any of these perform as well as mp3 at half the bitrate is a BIG FAT LIE.


Obviously, everyone else is hearing a problem with WMA that I'm not.  I'm going to have to look over the comments to see.  Real Audio, FhG and MP3 Pro had very damaging lowpasses to my ears.  I have no idea how either got rated as high as they did.  They sounded horrible to me!
Title: 64kbps public listening test
Post by: Garf on 22 September, 2003, 07:12:52 AM
Quote
Also, how many participated? Ah.. between 26-43. Not a very large sample then.

I guess everybody assumes someone else will do it for them

The results are statistically valid regardless of the small sample size.
Title: 64kbps public listening test
Post by: Vietwoojagig on 22 September, 2003, 07:29:17 AM
Will the next test be a 96kbps test, with the same players?
It would be interesting to see if LC can beat HE in that range, or if Lame 128 will be beaten by any codec.
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 07:31:18 AM
Quote
Will the next test be a 96kbps test, with the same players?
It would be interesting to see if LC can beat HE in that range, or if Lame 128 will be beaten by any codec.

he-aac isnt available for 96kbps afaik
Title: 64kbps public listening test
Post by: phong on 22 September, 2003, 07:36:47 AM
Quote
I'm not at ease with Vorbis at this bitrate during a blind test: it sounds too particular (hiss, unbalanced tonal range: more treble, poor low-mids, and limited stereo), and it's easy for me to detect the encoder. I'm rating Vorbis, and not an unknown encoder. So it isn't blind anymore.

I agree 100%.  I often knew right away which one was vorbis, and I struggled to not let that influence my ratings.  Am I rating it too high because I'm an OSS true-believer?  Am I rating it too low because I'm overcompensating?

I also agree that its performance was spotty.  A few samples had very serious problems (for me, those included Illinois, Polonaise, gone).  I hope 1.0.1 fixes these issues and some of the problems exposed in the 128k test.
Title: 64kbps public listening test
Post by: Garf on 22 September, 2003, 07:46:49 AM
Quote
he-aac isnt available for 96kbps afaik

bitrate = 95
channels = 2
samplerate = 44100
codec = AAC+SBR
tool = Nero AAC Codec 2.5.5.3


Nero 6 can do it. Whether it's smart I don't know.
Title: 64kbps public listening test
Post by: Garf on 22 September, 2003, 07:47:44 AM
Quote
I hope 1.0.1 fixes these issues and some of the problems exposed in the 128k test.

The actual quality problems won't be fixed until 1.1 I guess. Most of the major issues I heard have been there since 1.0RC2 or so...
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 08:20:42 AM
Quote
Quote
he-aac isnt available for 96kbps afaik

Nero 6 can do it. Whether it's smart I don't know.


which settings did you use to create such a file?
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 08:42:36 AM
Quote
Quote
Quote
he-aac isnt available for 96kbps afaik

Nero 6 can do it. Whether it's smart I don't know.


which settings did you use to create such a file?

Umm, just select cbr 96kbps and High Efficiency profile..
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 08:48:55 AM
Quote
Umm, just select cbr 96kbps and High Efficiency profile..

hm, i meant the vbr profiles, as i dont think that ahead wanted to make it possible for example to create he-aac files with 448kbps by using the cbr option 

also the presets use lc aac for ~96kbps (vbr and cbr)
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 08:52:28 AM
Quote
Quote
Umm, just select cbr 96kbps and High Efficiency profile..

hm, i meant the vbr profiles, as i dont think that ahead wanted to make it possible for example to create he-aac files with 448kbps by using the cbr option 

Well you can't do that, it changes to LC if you try to create a cbr 112kbps HE file. 96kbps cbr is the highest allowed HE profile setting.
Title: 64kbps public listening test
Post by: listen on 22 September, 2003, 09:00:59 AM
Gosh!  I only listened to one sample (New York City)... but I gave Vorbis a 1.3. Even the muffled lo-fi sounds of QT and WMA sounded nicer to me than the lavish sweeps of distortion and generous, unrestrained servings of noise that were dished out by Vorbis.  I really wasn't expecting that at all.  Still, I suppose I shouldn't judge it on one sample alone...

And Lame really is very good, (even if twice the bit-rate)... it's nice to know that.
Title: 64kbps public listening test
Post by: sthayashi on 22 September, 2003, 09:03:13 AM
Anyone else think it interesting that Ogg Vorbis was beaten by QT in the 128kbps test, but crushed QT in this test?

Of course, HE-AAC beat Ogg Vorbis, as did MP3Pro.  Thus, SBR technology beats out Ogg Vorbis.
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 09:06:25 AM
Quote
Anyone else think it interesting that Ogg Vorbis was beaten by QT in the 128kbps test, but crushed QT in this test?

Not really. It was pretty much expected, since QT doesn't have a High Efficiency profile.
Title: 64kbps public listening test
Post by: Jojo on 22 September, 2003, 09:09:25 AM
Great test! However, I wonder who is doing 64kbps encoding. I believe that most people use at least 128kbps or 96kbps at the very minimum...flash card prices are getting lower and lower...anyway, that is why I'm very glad that a 128kbps test already took place...
I hope that other tests will follow...how about a 160kbps test?

By the way: What is HE-ACC?

Thanks
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 09:18:42 AM
Quote
By the way: What is HE-ACC?

Thanks

HE AAC is one of the profiles of MPEG-4 AAC. It uses SBR (Spectral Band Replication) in order to achieve a high efficiency at low bitrates. Currently there's only one publicly available HE AAC encoder implementation - Nero AAC/AAC-HE encoder.
Check here (http://www.audiocoding.com/wiki/index.php?page=Version+3) for more info about HE AAC.
Title: 64kbps public listening test
Post by: Jon Ingram on 22 September, 2003, 09:23:27 AM
Quote
I hope that other tests will follow...how about a 160kbps test?

Actually, I'd like to see it go even lower... a 32kbps test (with samples containing some music, but mainly speech) would be a fairly good reflection on low-bitrate streaming -- it's around the rate used by the BBC's RealAudio streams, anyhow. Many people found it hard to detect artifacts even at 64kbps, so as we increase the bitrate the likelihood of getting decent statistically valid results crashes through the floor.

Thinking about this test has revived an old idea of mine, which would be to test the samples without the original present -- the users would then mark which sample sounds better, rather than which sounds closest to the original. I've not yet been able to figure out a decent way to analyse the results, though. Not having a scale opens the possibility of a non-transitive chain: i.e. a set of samples X_1,..., X_k where X_1 is preferred to X_2, X_2 to X_3, ..., *and X_k to X_1*. I'd love to see something like that happen in practice.
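
For anyone curious, the non-transitive chain described above is simply a cycle in the "is preferred to" relation, so it can be found with a depth-first search. The following is only a minimal sketch in Python: the function name `find_preference_cycle` and the dictionary layout are assumptions made for illustration, not part of any existing test tool.

```python
from typing import Dict, List, Optional, Set


def find_preference_cycle(prefers: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one chain X_1 > X_2 > ... > X_k > X_1 as a list, or None.

    prefers[a] is the set of samples that sample a was preferred to
    (hypothetical data layout, chosen for this illustration).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(prefers)
    for losers in prefers.values():
        nodes |= losers
    colour = {n: WHITE for n in nodes}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for loser in sorted(prefers.get(node, ())):
            if colour[loser] == GREY:
                # Reached a sample already on the current path: cycle found.
                return path[path.index(loser):] + [loser]
            if colour[loser] == WHITE:
                found = visit(loser)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(nodes):
        if colour[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

Calling it with `{"X1": {"X2"}, "X2": {"X3"}, "X3": {"X1"}}` returns the chain `["X1", "X2", "X3", "X1"]`, while a consistent ordering returns `None`. It only detects the cycle; how to analyse results that contain one is still the open question.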
Title: 64kbps public listening test
Post by: [proxima] on 22 September, 2003, 09:34:01 AM
This is the table of my results:

(Image: http://members.xoom.virgilio.it/fofobella/64kbps.png)

According to the averages, my preferred encoder seems to be HE AAC, though I should say that I'm quite surprised by the result of mp3PRO (or maybe I've overestimated HE AAC).

I've done a sort of ranking of the best, the real contenders at 64 kbps. The first position is colored in green, the second in yellow and the third in red. With this direct comparison mp3PRO is (according to my preferences) often better than HE AAC.

mp3PRO showed a detectable lowpass (16 kHz), but that is not the real problem at 64 kbps; artifacts are more annoying. There is an interesting thing I noticed: with HE AAC the high frequencies seem unnatural, attenuated, as if they were lowpassed. While doing the blind test I attributed this to a lowpass, but later I discovered that the HE AAC files are lowpassed at about 20 kHz (surely inaudible for me). Does FAAD use dithering when decoding? If not, I think this could be explained by the SBR "problem" of which guruboolez gave us an excellent description.

While the two codecs above scored very close quality levels, I can't say the same for Vorbis, which is often behind the other two. I surely agree with guruboolez: Vorbis in this bitrate range is easily detectable because of noise and exaggerated highs (with sharp attacks the result is quite annoying).
We are all waiting for the 1.1 version, which should give better results with this type of artifact.

Finally, I sincerely want to thank Roberto for his effort in organizing this useful test. The number of participants has increased, and this is a clear indication of good organization.
Title: 64kbps public listening test
Post by: amano on 22 September, 2003, 09:34:38 AM
hmm, surprisingly wma was very close to the low anchor FhG on some samples. 

wma was probably the standard edition?
lame was probably abr 128?
Title: 64kbps public listening test
Post by: Atlantis on 22 September, 2003, 09:36:58 AM
Quote
hmm, surprisingly wma was very close to the low anchor FhG on some samples.  

wma was probably the standard edition?
lame was probably abr 128?

* Ahead/Nero 6.0.0.15 HE AAC VBR profile Streaming :: Medium, high quality
* Ogg Vorbis post-1.0 CVS -q 0
* MP3pro (from Adobe Audition 1.0) VBR quality 40, Current Codec, allow M/S and IS, allow narrowing, no CRC
* Real Audio Gecko (from Real Producer 9.0.1) 64kbps
* Windows Media Audio v9 VBR quality 50
* QuickTime 6.3 AAC LC 64kbps, Best Quality
* Lame MP3 encoder 3.90.3 --alt-preset 128 --scale 1 (high anchor)
* FhG MP3 encoder (from Adobe Audition 1.0) 64kbps CBR, Current codec, allow M/S, no I/S, allow narrowing, no CRC (low anchor)
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 09:41:25 AM
Quote
Will the next test be a 96kbps test, with the same player?

Well, definitely not from me. I'm not interested in testing 96kbps.

Anyway, I replaced all the plots. Now Lame MP3 is Lame 128. This should help avoid confusion.

@bond: Thanks, I used your zoomed in version
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 09:44:59 AM
Quote
@bond: Thanks, I used your zoomed in version

great that it was useful
Title: 64kbps public listening test
Post by: Gecko on 22 September, 2003, 09:52:28 AM
At first, let me thank Roberto for his efforts: thank you! It was nice to see that the participation level was high.

For me HE AAC and MP3 Pro came out on top, while Ogg was just mediocre. Forget the rest. The below is only true for my personal ratings and usually the anchors are disregarded, unless mentioned otherwise.

The most surprising result was with sample 06, Illinois. Here HE AAC totally sucked, while Ogg shone (and the others did well too). This is contrary to the average public ranking.

On sample 9, Polonaise, MP3 Pro was worst. MP3 Pro also has the highest standard deviation of all encoders (including anchors).

Sample 07 was also interesting. While all other encoders dipped real low in quality, HE AAC was doing well.

Also worth mentioning is that I didn't rate Lame 128k as best on 6 samples. In 5 of these cases, HE AAC was rated higher than Lame.

WMA std was rated worst 9 out of 12 times. WMA std was also most consistent in quality. It consistently sucked.

On the 4 occasions that FHG was not rated lowest, WMA std was worst.
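
The "highest standard deviation" and "most consistent" remarks above come down to a per-codec mean and sample standard deviation over the 12 sample ratings. A minimal sketch of that bookkeeping, assuming a plain dictionary of ratings; the codec labels and numbers used with it below are made up for illustration and are not my actual scores:

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarise(ratings: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Map each codec to (mean rating, sample standard deviation)."""
    return {codec: (mean(scores), stdev(scores)) for codec, scores in ratings.items()}


def most_consistent(ratings: Dict[str, List[float]]) -> str:
    """Return the codec whose ratings vary least across samples."""
    summary = summarise(ratings)
    return min(summary, key=lambda codec: summary[codec][1])
```

For example, `most_consistent({"codec_a": [2.0, 2.0, 2.0, 2.0], "codec_b": [1.0, 5.0, 3.0, 4.0]})` picks `codec_a`. Note that a low spread says nothing about quality: a codec can be consistently bad.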
Title: 64kbps public listening test
Post by: Garf on 22 September, 2003, 10:01:25 AM
Quote
Great test! However, I wonder who is doing 64kbps encoding.

Me! I use 64-80kbps Vorbis to put albums on a USB RAM stick and carry them home to use in my mom's portable while coding.

Time to switch to HE-AAC for that...
Title: 64kbps public listening test
Post by: Garf on 22 September, 2003, 10:04:07 AM
Quote
Many people found it hard to detect artifacts even at 64kbps, so as we increase the bitrate the likelihood of getting decent statistically valid results crashes through the floor.


You got that right!

Quote
Thinking about this test has revived an old idea of mine, which would be to test the samples without the original present -- the users would then mark which sample sounds better, rather than which sounds closest to the original.


Doesn't work. Why? Well, see your own comment above...
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 10:10:33 AM
Hello.

Menno came up with an idea: Replacing "Lame 128" and "FhG MP3" with "High/Low Anchor" on the plots.

This is to completely avoid confusion, since several people think Lame won, and it wasn't there to win or lose to start with.

Any comments? Suggestions on alternatives?
Title: 64kbps public listening test
Post by: guruboolez on 22 September, 2003, 10:21:39 AM
Quote
Hello.

Menno came up with an idea: Replacing "Lame 128" and "FhG MP3" with "High/Low Anchor" on the plots.

This is to completely avoid confusion, since several people think Lame won, and it wasn't there to win or lose to start with.

Any comments? Suggestions on alternatives?

Yes, it's a good thing. Maybe use a different color for each anchor, in order to avoid subconscious confusion.
Is it possible to reorder the codecs in the plots? I mean: from left to right, the winner (HE-AAC) to the loser (Real?). This may be useful to see on which samples the winner(s) failed. Random positions aren't useful in my opinion.
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 10:43:27 AM
Quote
Is it possible to reorder the codecs in the plots? I mean: from left to right, the winner (HE-AAC) to the loser (Real?). This may be useful to see on which samples the winner(s) failed. Random positions aren't useful in my opinion.

Yes, but that will take longer, since I'll have to redo all the excel plots.

I'll do it later tonight, since I must now leave to Uni.
Title: 64kbps public listening test
Post by: smok3 on 22 September, 2003, 10:43:53 AM
@rjamorim:

tnx for organizing and conducting the test.

flame verloren:
can't open http://audio.ciara.us/test/64test/comments/comments.html

B)
Title: 64kbps public listening test
Post by: AstralStorm on 22 September, 2003, 10:46:08 AM
Thank you rjamorim for making my results 'anon01'. 
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 10:47:06 AM
Quote
flame verloren:
can't open http://audio.ciara.us/test/64test/comments/comments.html

OK, I changed comments.html to index.html.

So please use this link now:
http://audio.ciara.us/test/64test/comments/

Hopefully it'll work :B
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 10:48:29 AM
Quote
Thank you rjamorim for making my results 'anon01'. 

Well, dude, you didn't specify if you wanted me to associate your results with your name...

Now you're "just another number"
Title: 64kbps public listening test
Post by: bond on 22 September, 2003, 10:50:17 AM
good idea calling it low/high anchor (perhaps you should then explain at the top of the page what these are: which encoder, settings and of course bitrate)

Quote
from left to right, the winner (HE-AAC) to the loser (Real?). This may be useful to see on which samples the winner(s) failed. Random positions aren't useful in my opinion.

also nice
but i think it would be good if the order is the same in every plot, so perhaps the codecs should be ordered as they were judged in the overall results (so it is also very easy to see when a codec performed worse and when better)...

and one last optical thing:
if you do the plots again, plz write the codec names horizontally, perhaps use a smaller type size too because now the names are really too big imho (like half the size of the spreadsheet itself)
Title: 64kbps public listening test
Post by: guruboolez on 22 September, 2003, 10:51:38 AM
Glad to know anon01's name. He's one of the rare listeners who mentioned the reverberation in Vorbis encodings.
Title: 64kbps public listening test
Post by: 2Bdecided on 22 September, 2003, 11:18:06 AM
I was sad to be anon05 - I'll have to demand credit for my hard work next time - though I didn't work half as hard as you guys who did them all!

What about processed by rank rather than grade? Is that too much hard work?

Cheers,
David.
Title: 64kbps public listening test
Post by: ff123 on 22 September, 2003, 11:31:26 AM
Quote
I was sad to be anon05 - I'll have to demand credit for my hard work next time - though I didn't work half as hard as you guys who did them all!

What about processed by rank rather than grade? Is that too much hard work?

Cheers,
David.

It should be easy to process by rank -- just run the raw data through the statistical tool again, this time using the friedman analysis.

ff123

Edit:
Both versions of abchr need to have a field for nickname (default nick is anonymous) and a checkbox to indicate whether the listener wants his comments associated with his nickname (just to be doubly sure about privacy); this would be written to a line in the results file.
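
For anyone curious what "processing by rank" involves, here is a minimal sketch, assuming each listener's ratings are converted to within-listener ranks (tied scores share the average rank) and then summarized with the Friedman chi-square statistic. The function names are hypothetical, no tie correction is applied, and this is not the code of the statistical tool mentioned above.

```python
def average_ranks(scores):
    """Rank one listener's scores (1 = lowest); tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores equal to the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(ratings):
    """Friedman chi-square (no tie correction); rows = listeners, columns = codecs."""
    n = len(ratings)       # number of listeners
    k = len(ratings[0])    # number of codecs
    rank_sums = [0.0] * k
    for row in ratings:
        for codec, rank in enumerate(average_ranks(row)):
            rank_sums[codec] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return rank_sums, chi2
```

The statistic would then be compared against a chi-square distribution with k - 1 degrees of freedom; a large value suggests that at least one codec is ranked consistently differently from the others.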
Title: 64kbps public listening test
Post by: c_haese on 22 September, 2003, 12:44:07 PM
Quote
OK, I changed comments.html to index.html.

So please use this link now:
http://audio.ciara.us/test/64test/comments/

Hopefully it'll work :B

It's not working for me. The page comes up, but none of the links on that page seem to work. Oh, and the title of that page says "AAC@128kbps test comments."

-Carsten
Title: 64kbps public listening test
Post by: Niknak on 22 September, 2003, 12:49:09 PM
Can you give us the actual bitrates or filesizes with each codec and for each of the samples? Are there any significant differences?
Title: 64kbps public listening test
Post by: de Mon on 22 September, 2003, 01:42:18 PM
I can't understand why the best version of Vorbis, GT3, wasn't used. Why was a really hissy, untuned version compared with quite old and tuned products when GT3 exists?
Title: 64kbps public listening test
Post by: Continuum on 22 September, 2003, 01:45:17 PM
GT3 is only really tuned for high bitrates (>160 kbit). It would be no better at 64 kbit than the standard version.
Title: 64kbps public listening test
Post by: sony666 on 22 September, 2003, 01:56:50 PM
the mp3PRO result is interesting.. too bad the format never took off due to the licensing nightmare. I see the same fate for AAC coming.. unless there will be a free and superior "LAME aac" encoder, mp3 is gonna stay (yes, except the divx ripping ppl)

thanks roberto e.a. for organizing this
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 02:10:14 PM
Quote
I can't understand why the best version of Vorbis, GT3, wasn't used. Why was a really hissy, untuned version compared with quite old and tuned products when GT3 exists?

The post-1.0 CVS version that was used actually includes some new tweaks for low bitrates and, as Continuum said, GT3 has nothing new for -q0.
Title: 64kbps public listening test
Post by: JohnV on 22 September, 2003, 02:17:46 PM
Quote
the mp3PRO result is interesting.. too bad the format never took off due to the licensing nightmare. I see the same fate for AAC coming.. unless there will be a free and superior "LAME aac" encoder, mp3 is gonna stay (yes, except the divx ripping ppl)

1. mp3Pro is no standard -> AAC/AAC-HE is
2. mp3Pro is completely proprietary ->  there are several different AAC implementations
3. FAAC is getting better and it's free. It might not be state of the art, but it's not as bad as it used to be. I think it will very soon be ready, and partly already is, to challenge at least Lame MP3.

There's still no doubt that mp3 will be around for a looong time..
Title: 64kbps public listening test
Post by: AstralStorm on 22 September, 2003, 02:22:36 PM
I can't read the comments from the site.

If I click any link on that page
after trying 20 times to connect and then getting always redirected
Mozilla says the site exceeds redirection limit.
Title: 64kbps public listening test
Post by: de Mon on 22 September, 2003, 04:54:24 PM
Thank you for your replies, Continuum and JohnV. I didn't know about that.
Title: 64kbps public listening test
Post by: Dologan on 22 September, 2003, 05:00:02 PM
Heh, I find it weird that I usually scored MP3Pro higher than HE-AAC. Those two (apart from Lame) were generally the only ones that needed careful listening to single out, but once I got hold of the artifacts I usually found HE-AAC's to be a little bit more annoying than those of MP3Pro, although there were times when HE-AAC was amazingly difficult to ABX for me, whereas that didn't happen with MP3Pro.
Another deviation of mine from the average is that WMA was usually almost at the level of FhG 64 in terms of annoyance, separated from QT and Real...

My overall: LAME > MP3Pro, HE-AAC > Ogg > Real = QT > WMA > FhG

~Dologan
Title: 64kbps public listening test
Post by: ff123 on 22 September, 2003, 05:42:18 PM
Results have been slashdotted:

http://slashdot.org/article.pl?sid=03/09/22/2053225&mode=thread&tid=141&tid=188

Get your comments and rebuttals in while the thread's still hot.

ff123
Title: 64kbps public listening test
Post by: AstralStorm on 22 September, 2003, 06:03:47 PM
My results:
Code: [Select]
LAME     MP3Pro   HE-AAC   Vorbis   LC-AAC   Real     WMA      FHG
 4.07     3.39     3.13     2.67     2.22     2.05     1.77     1.55

LAME is better than MP3Pro, HE-AAC, Vorbis, LC-AAC, Real, WMA, FHG
MP3Pro is better than Vorbis, LC-AAC, Real, WMA, FHG
HE-AAC is better than LC-AAC, Real, WMA, FHG
Vorbis is better than WMA, FHG
LC-AAC is better than FHG
Title: 64kbps public listening test
Post by: pseudoacoustic on 22 September, 2003, 06:28:52 PM
Great test Roberto. However, I also cannot access the comments page. Would it be too much to ask you to email them to me (the same email I sent you the results with)? Thanks.
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 10:11:17 PM
Quote
Results have been slashdotted:

http://slashdot.org/article.pl?sid=03/09/22/2053225&mode=thread&tid=141&tid=188

Get your comments and rebuttals in while the thread's still hot.

ff123

Quote
"Hydrogenaudio has just wrapped up a listening test of various audio codecs at 64kbps."


That pisses me off. A lot.

Sure, it was "Hydrogenaudio" that stayed up till 5:30 AM sorting results, it was "Hydrogenaudio" that took shit on every issue that surfaced, it was "Hydrogenaudio" that screened 500+ results ditching the ones ranking the reference...

I'm not an asshole, but I think credit should be given where credit is due. Credit wasn't given for ff123's test (http://slashdot.org/articles/02/07/30/0328223.shtml?tid=141) either, but at least it wasn't misrepresented as being conducted by HA.

sigh...

OK, rant off.


@pseudoacoustic: Check the 4th post in this same thread
http://rarewares.hydrogenaudio.org/rja/comments.zip
Title: 64kbps public listening test
Post by: ff123 on 22 September, 2003, 11:17:25 PM
I always find stuff to reply to in the /. threads way too late.  In the current thread there were a couple that got my goat (having to do with the legitimacy of subjective testing and the drawing of the confidence interval bars).

ff123
Title: 64kbps public listening test
Post by: Dologan on 22 September, 2003, 11:20:14 PM
Quote
That pisses me off. A lot.

Sure, it was "Hydrogenaudio" that stayed up till 5:30 AM sorting results, it was "Hydrogenaudio" that took shit on every issue that surfaced, it was "Hydrogenaudio" that screened 500+ results ditching the ones ranking the reference...

I'm not an asshole, but I think credit should be given where credit is due. Credit wasn't given for ff123's test (http://slashdot.org/articles/02/07/30/0328223.shtml?tid=141) either, but at least it wasn't misrepresented as being conducted by HA.

sigh...

OK, rant off.

Don't be sad, Roberto. Who needs Slashdot appreciation, anyway? We all love you here at HA.org, we all know you're da man. 
Title: 64kbps public listening test
Post by: rjamorim on 22 September, 2003, 11:47:19 PM
Quote
I always find stuff to reply to in the /. threads way too late.  In the current thread there were a couple that got my goat (having to do with the legitimacy of subjective testing and the drawing of the confidence interval bars).

Hehe. That's slashdot for you

What are the links to these comments?
Title: 64kbps public listening test
Post by: phong on 23 September, 2003, 12:07:38 AM
I added a big rant to the /. thread, but I don't know why I bother.  People there are idiots.  Probably it's not worth getting slashdotted at all these days.

As with the 128kbps test, I've produced a spreadsheet of the results.  You can download it in OpenOffice (yay!) format here (http://www.phong.org/audio/64kbps_results.sxc).  You can also get it in Excel (boo!) format here (http://www.phong.org/audio/64kbps_results.xls).  I got a little crazier with the meaningless statistics, formatting and color this time around.  It's a convenient way to see how your results compare against the others, or find out who is the biggest codec h8er, or whatever.

Some neat stuff:
The hardest samples for listeners (and easiest for the encoders) were (not at all surprisingly) mybloodrusts (average score of 2.8) and Waiting (2.7).  The opposite were Illinois (3.4) and Polonaise (3.6).  You can also see who were the most and least easily annoyed listeners.  Take a look to see if your results fall in line with the pack or if you're a deviant like me.

[edit]Hoo-haa!  My rant and someone else giving credit to Roberto both got modded up "Insightful".  Maybe there's hope after all?[/edit]
Title: 64kbps public listening test
Post by: ScorLibran on 23 September, 2003, 12:24:19 AM
Quote
(From Slashdot)

The poster offers an interesting interpretation of the results, but only his/her comments support Ogg Vorbis in this case. The numbers tell a completely different story.

The analysis presented leads us to one conclusion: use Lame 128. It's strictly better than all other options. Do not use FhG MP3. Easy.

If you're willing to slip to 4th best encoder, then consider Ogg Vorbis. 4TH BEST. That's hardly the rosey picture painted in the article.

*S*I*G*H*

And my friends ask why I spend most of my online time on Hydrogen Audio.  All of my reasons have just been summed up in a single quote.

[rant=on]
Some people are like defective playdough...easily impressionable, but not always accurately impressionable.  People who have a concept of what's really happening in the world have to be careful what they say to people who will follow anything they *choose* to hear.  Because the "playdoughpeople" in turn can easily become conspiracy theorists, gossipers, self-proclaimed-experts or instant audiophiles.      Oh, if natural selection only had the same effect on reputations as it did on living things.

Internet forums should use color schemes to make knowledge easily identifiable.  Idiots post in red.  Newbies are green.  Regulars are black.  Veterans are blue.  First sixty days as a member is a screening process to determine later what color you will be assigned after green.  Just an idea...
[/rant]
Title: 64kbps public listening test
Post by: JohnV on 23 September, 2003, 12:31:14 AM
ok ok.. everybody knows what kind of place slashdot is.. 
Can we go back to the topic here please..
Title: 64kbps public listening test
Post by: rjamorim on 23 September, 2003, 12:41:16 AM
LOL!
Title: 64kbps public listening test
Post by: rjamorim on 23 September, 2003, 12:50:46 AM
Quote
Internet forums should use color schemes to make knowledge easily identifiable.  Idiots post in red.  Newbies are green.  Regulars are black.  Veterans are blue.  First sixty days as a member is a screening process to determine later what color you will be assigned after green.  Just an idea...

Even though this idea is lovely, it's also very dangerous.

After all, who decides if someone is an idiot or not?

Of course, some people are obviously idiots and deserve to be labeled that way. But some are, maybe... slow.

Based on what I posted in my first 60 days here at HA, I would surely be awarded the red colour. I asked all kinds of clueless questions (hey, I was a newbie), advocated VQF sometimes (it wasn't really dead back then, although it was already in an advanced coma), and befriended a guy some would consider a great troll (I consider him a great friend).

Oh, yes, I also called JohnV an asshole. :B


Edit: BTW, if I had been labeled red, I would probably get pissed and never return (you know, hot latino blood...).
Title: 64kbps public listening test
Post by: phong on 23 September, 2003, 01:10:12 AM
Quote
Some people are like defective playdough...easily impressionable, but not always accurately impressionable.

You just made my quotes page (http://www.phong.org/quotes.shtml).

Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 
Title: 64kbps public listening test
Post by: ScorLibran on 23 September, 2003, 01:34:37 AM
Quote
Quote
Some people are like defective playdough...easily impressionable, but not always accurately impressionable.

You just made my quotes page (http://www.phong.org/quotes.shtml).

Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 

Thanks...that made my day!  (I bookmarked the page...some good stuff there.    )

As for "liking" WMA, I feel the same way, but I'm not worried about basing my determination on how good it sounds.  Even if it were transparent to me at 64kbps, I still wouldn't like it because of so many other reasons (not audio-related)...

1 - It's Micro$oft.
2 - It's not open source, and it doesn't have open specs (except for recent "opening" of their specs to limited corporate audiences...doesn't count IMO).
3 - DRM seems to be becoming "one" with WMA.  I know they're not the same thing, but if M$ had their way...

So how can M$ get me to like WMA?  Open-source it and drop DRM, for starters.

But anyway, I just downloaded the comments.zip file which Roberto was so nice to post in this thread, and I want to look through the comments #1 to educate myself by analyzing what other people heard with samples compared to what I heard, and #2 to look for consistencies with issues like what you're talking about with WMA so I'll know what to listen for next time.  About half (maybe more) of my results were 5.0...I don't want that to happen again (especially on a low-bitrate audio test).
Title: 64kbps public listening test
Post by: ff123 on 23 September, 2003, 01:35:29 AM
Quote
Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 

typical wma artifacting:

1. metallic noises
2. noise pumping (background noise gets louder during transients)
3. a ringing sound in the background

You should be able to read the comments (are the links working yet?) to see where people complained.

ff123
Title: 64kbps public listening test
Post by: [proxima] on 23 September, 2003, 04:48:55 AM
Quote
Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?

WMA showed strange behaviour: sometimes it is very good (Enola Gay sample) but usually the quality is poor. There are cases in which i rated WMA below the 64kbps anchor. ff123 has explained the typical artifacts of WMA to you; i think that you can easily hear these problems with:

Sample02 has a metallic HF content.
Sample05 is ringing at the beginning, while in the loud part cymbals are reduced to a metallic noise.
Sample07 has very annoying background ringing.
Sample10: in the final part, between trumpets, there is a waving background metallic noise.
Sample12 is a killer for WMA. At the beginning (0-13 secs) there is an annoying ringing sound.

Happy listening 
Title: 64kbps public listening test
Post by: joey_m on 23 September, 2003, 10:58:01 AM
Ditto with WMA... I rated it even better than LAME 128    I also rated Vorbis worse than FhG on one sample (Enola Gay, number 3), and in sample 9 (Polonaise), it's the only codec I can easily pick out (by just hitting "X", some sort of "warbling" in the last 10 seconds), the rest just give me a headache for trying so hard...

My personal "ranking", with only 4 samples (I'm still working on the rest, though)

WMA             4.78
MP3 128 kbps    4.65
MP3 Pro         4.55
QT AAC          3.98
HE AAC          3.93
Ogg Vorbis      3.9
Real Audio      2.78
MP3 64 kbps     2.4



Cheers, Joey.
Title: 64kbps public listening test
Post by: vinnie97 on 23 September, 2003, 12:01:48 PM
If you can easily detect upper-range frequencies, then the metallic artifacts of WMA become disturbingly obvious.

Roberto, a bit of a n00b question here, but is there any reason why my results aren't included in samples 1, 2 & 4? Was it an effort to keep things fair & balanced, somehow?
Title: 64kbps public listening test
Post by: rjamorim on 23 September, 2003, 12:51:08 PM
Quote
Roberto, a bit of a n00b question here, but is there any reason why my results aren't included in samples 1, 2 & 4? Was it an effort to keep things fair & balanced, somehow?

hrm... Sorry to say this, but it's because you ranked the reference on these samples.

Like this:
Code: [Select]
6L File: .\Sample04\experiencia.wav
6L Rating: 4.7
6L Comment:


You see, the encoded files have a number before the .wav that identifies which encoder was used there. And if there's no number, it means you gave a ranking to the original instead of the encoded file.

That's why these results had to be removed.

Regards;

Roberto.
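
A minimal sketch of how such a screening check could be automated, assuming the results file contains lines of the form `<label> File: <path>` as in the excerpt above, and assuming encoded files always end in `_<number>.wav` while the reference does not. The function name and regular expressions are hypothetical; this is not the procedure actually used for the test.

```python
import re

# Assumption: encoded files end in "_<digits>.wav"; the reference has no such suffix.
ENCODED_NAME = re.compile(r"_\d+\.wav$", re.IGNORECASE)
# Assumption: rated files appear on lines like "6L File: .\Sample04\experiencia.wav".
FILE_LINE = re.compile(r"^\s*\w+\s+File:\s*(.+?)\s*$")


def rated_references(results_text):
    """Return rated file paths that look like the reference (no encoder number)."""
    offenders = []
    for line in results_text.splitlines():
        match = FILE_LINE.match(line)
        if match and not ENCODED_NAME.search(match.group(1)):
            offenders.append(match.group(1))
    return offenders
```

A results file for which this returns a non-empty list would be a candidate for removal under the rule described above.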
Title: 64kbps public listening test
Post by: [proxima] on 23 September, 2003, 01:07:37 PM
Quote
You see, the encoded files have a number before the .wav that identifies which encoder was used there. And if there's no number, it means you gave a ranking to the original instead of the encoded file.

That's why these results had to be removed.

I made the same mistake with LAME 128 kbps and sample08, but this doesn't necessarily mean that i ranked the original, because there is a successful ABX test attached. So.. there is a possibility that i ranked the file during ABXing.
Obviously i respect your choice, but i would like to know if you've considered this different listener behaviour. Thanks.
Title: 64kbps public listening test
Post by: Volcano on 23 September, 2003, 01:12:35 PM
Just wanted to say... great work as usual, Roberto & Co, and sorry that I didn't participate. (I left it until the very last day, downloaded the samples and then got tied up in other things, so I didn't manage to test... )

@phong: Good post you made there at slashdot.
Title: 64kbps public listening test
Post by: rjamorim on 23 September, 2003, 01:29:03 PM
Quote
I made the same mistake with LAME 128 kbps and sample08, but this doesn't necessarily mean that i ranked the original, because there is a successful ABX test attached. So.. there is a possibility that i ranked the file during ABXing.

Well, the problem is, even if you ABX'd it correctly, I can't know what score to give to the sample.

If I only delete the part of the offending sample, it's unfair, because then it gets 5 points, and you did detect a difference.

So, should I give the encoded sample the score you gave the reference? Should I give a higher score? Or a smaller one?

Any of these ways is highly debatable and, in one way or another, both fair and unfair. So, to avoid doing something that is wrong one way or another, I delete the results file for that sample.

Regards;

Roberto.
Title: 64kbps public listening test
Post by: [proxima] on 23 September, 2003, 02:22:52 PM
Roberto,

these sorts of mistakes clearly indicate that the differences are very subtle.
In these cases the score of the sample is quite often near 5.0, and i think it is scored a little below 5.0 only because it was ABXable. In practice, above a certain level, the score does not necessarily indicate annoyance but only a difference that certainly exists (because it was ABXed).

If the listener has rated the sample in his mind during ABXing, he later has to scroll down the right bar and insert his score. If an error occurs in this last operation, IMHO this should not be considered a void result but only a wrong ABX try. So i think that the right choice is to give the encoded sample the score the listener gave the reference.

Ok, let's stop now... i agree with you that these cases are highly debatable, this is only my way of thinking.
Title: 64kbps public listening test
Post by: Latexxx on 23 September, 2003, 02:54:53 PM
Quote
Ditto with WMA... I rated it even better than LAME 128    I also rated Vorbis worse than FhG on one sample (Enola Gay, number 3), and in sample 9 (Polonaise), it's the only codec I can easily pick out (by just hitting "X", some sort of "warbling" in the last 10 seconds), the rest just give me a headache for trying so hard...

My personal "ranking", with only 4 samples (I'm still working on the rest, though)

WMA             4.78
MP3 128 kbps    4.65
MP3 Pro         4.55
QT AAC          3.98
HE AAC          3.93
Ogg Vorbis      3.9
Real Audio      2.78
MP3 64 kbps     2.4



Cheers, Joey.

It is truly interesting if wma's results are so subjective. No offense, but i really must ask: how can somebody not hear the wma artifacts? According to some of my own tests, they are audible even @ 160 kbps (didn't test higher).
Title: 64kbps public listening test
Post by: rjamorim on 23 September, 2003, 03:12:53 PM
Hello.

I just uploaded the bitrate tables. They are available just below the individual plots.


Also, for those interested in a laugh, I uploaded a "comment highlights" to the server.
http://audio.ciara.us/test/64test/comments/highlights.txt

My personal favourite is Mac's:
Code: [Select]
3R File: .\Sample02\DaFunk_2.wav
3R Rating: 1.5
3R Comment: Uh?  High snares smeared worse than a cheap hooker's makeup.


with an honourable mention to phong:
Code: [Select]
4R File: .\Sample07\mybloodrusts_1.wav
4R Rating: 1.0
4R Comment: It's the evil bee codec.  An evil bee encoded this song.  The beginning is an abomination.  The rest sounds muddy and unclear.


and to Gecko:
Code: [Select]
I would have just liked to pull most of them down to 1 and move on, but I restrained myself. It's like rating green from brown shit. Both stink.


Title: 64kbps public listening test
Post by: phong on 23 September, 2003, 03:28:19 PM
Thanks everyone for the comments on WMA, that's exactly what I was looking for.

Quote
It is truly interesting if wma's results are so subjective. No offense, but i really must ask: how can somebody not hear the wma artifacts? According to some of my own tests, they are audible even @ 160 kbps (didn't test higher).

I definitely heard at least some of them.  They just didn't seem as bad as the artifacts in the other samples.  It may be that my listening environment (loud computer fan and screaming Cheetah HD) nullified some of the WMA type artifacts (i.e. ringing).  I believe (but haven't tried to prove) that my HF hearing is pretty decent.  The lowpass on the majority of the samples was quite annoying (maybe the most annoying problem for me).  The Real audio samples, for example, usually stuck out as the most annoying by far - they usually had one of the lowest cutoffs AND the swooshing artifacts were usually among the worst.  I also found MP3Pro to be no better than FhG mp3, because the lowpass on both was the most annoying aspect.

So I guess it's true that each pair of ears is different.

I found digging through the text files to read comments somewhat annoying, so I wrote an inexcusably ugly perl script to convert all the comments to pretty HTML.  If I want to see what everyone thought of MP3pro on Sample04, I can do it at a glance.  Also, whenever somebody says something like "this one sounds better than #3" it adds a little tag saying what #3 was for that test, and turns it into a hyperlink to the comment for #3.

There are six versions of the file, each sorted differently.  Warning: Each one contains ALL the comments and they're around 300K apiece.  Also, I don't think I've got all the bugs in my script (the inexcusably ugly one), so there are probably a few problems with the pages:

Sorted by codec, then listener, then sample (http://www.phong.org/audio/64k_comments_cls.shtml)
Sorted by codec, then sample, then listener (http://www.phong.org/audio/64k_comments_csl.shtml)
Sorted by listener, then codec, then sample (http://www.phong.org/audio/64k_comments_lcs.shtml)
Sorted by listener, then sample, then codec (http://www.phong.org/audio/64k_comments_lsc.shtml)
Sorted by sample, then codec, then listener (http://www.phong.org/audio/64k_comments_scl.shtml)
Sorted by sample, then listener, then codec (http://www.phong.org/audio/64k_comments_slc.shtml)
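
A minimal sketch of how six differently sorted views like these could be produced, assuming each comment has already been parsed into a record with `codec`, `listener` and `sample` fields. It is written in Python rather than perl, the field and function names are hypothetical, and it is not the script described above.

```python
from itertools import permutations

# Assumption: every parsed comment carries these three keys.
FIELDS = ("codec", "listener", "sample")


def sort_comments(comments, order):
    """Sort comment dicts by the given field order, e.g. ("sample", "codec", "listener")."""
    return sorted(comments, key=lambda c: tuple(c[field] for field in order))


def all_orderings(comments):
    """Map each of the six possible field orders to its sorted list of comments."""
    return {order: sort_comments(comments, order) for order in permutations(FIELDS)}
```

Because the sort key is a tuple, comments are ordered by the first field, ties are broken by the second, and remaining ties by the third, which matches the "X, then Y, then Z" naming of the pages.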
Title: 64kbps public listening test
Post by: spoon on 23 September, 2003, 05:36:32 PM
I am thinking that certain manufacturer claims of 64Kbps to be the same quality as mp3 @128Kbps might be true if you were to include other less able mp3 encoders to compare against...Lame is the best but what about an average mp3 encoder?
Title: 64kbps public listening test
Post by: XXX on 23 September, 2003, 06:00:58 PM
It's always a mistake to scale a chart that naturally starts at 0 (or perhaps 1) to start at 2.5.  The only exception where this is valid is for trends, like the stock market, futures, the price of gold, and so on.  For example, pretend I have 5 tools, rated 1 to 50.  Four score 25 and one scores 30.  If I start the scale at 20, the 25 score looks half as good as the 30 (30 looks 100% better than 25), but the 30 is really only 20% better.

E. R. Tufte's book The Visual Display of Quantitative Information (1983) was the first on the subject in recent times.  It's been followed up with another book or two.  Amazon will have these.  Tufte has a website but you have to search.
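
The arithmetic in the tool example can be checked with a short sketch; the function name is hypothetical and the scores are the ones from the example, not test results.

```python
def apparent_ratio(a, b, baseline):
    """Ratio of the drawn heights of b to a when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)


# Axis starting at 0: the taller bar is 1.2 times the shorter one (20% better).
true_ratio = apparent_ratio(25, 30, 0)

# Axis starting at 20: the same scores are drawn as heights 5 and 10,
# so the taller bar looks twice as tall (100% better).
drawn_ratio = apparent_ratio(25, 30, 20)
```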
Title: 64kbps public listening test
Post by: Dibrom on 23 September, 2003, 06:06:24 PM
Quote
Hello.

I just uploaded the bitrate tables. They are available just below the individual plots.


Also, for those interested in a laugh, I uploaded a "comment highlights" to the server.



I especially liked this one:

Quote
2L Comment: Strangely warbly

BUT.

INTERESTING NOTE:  I would be very hard pushed to ABX on my Wharfdale Speakers,
yet my Denons should up Warble straight off!

sod Dibrom and his "Speakers make no odds cack"


Strangely, I don't remember actually saying anything like this.  Perhaps I'm simply forgetful, or maybe it's just the ubiquitous straw man again

Or maybe it's just difficult to understand the difference between "equipment makes up the smallest (or one of the smallest) factor in hearing artifacts" vs "Speakers make no odds."

Oh well, guess you can't expect everyone to be discriminating in thought, especially the disgruntled
Title: 64kbps public listening test
Post by: Dibrom on 23 September, 2003, 06:09:45 PM
Quote
I am thinking that certain manufacturer claims of 64Kbps to be the same quality as mp3 @128Kbps might be true if you were to include other less able mp3 encoders to compare against...Lame is the best but what about an average mp3 encoder?

I've considered this as well.  Given a more average MP3 encoder (or better yet, given an average MP3 file), I believe some of the codecs here would definitely meet this claim.

Some of the encoders here at 64kbps certainly sound better than some of the less proficient mp3 encoders that I've heard at 128kbps.

I think it would be safer to assert that, rather than "no encoder meets these claims", it would be better to say that "these encoders do not necessarily meet these claims."
Title: 64kbps public listening test
Post by: Jon Ingram on 23 September, 2003, 06:13:46 PM
You mention in the blurb that it's interesting to compare to the results from a year ago -- and you're right, it *is* interesting, particularly when you notice how far AAC has come in the last year compared with WMA: AAC (in one of its two incarnations) jumping from last to first, while WMA stays firmly toward the back of the pack.

The big disappointment for me has been the almost complete lack of tuning on the Vorbis front. Reading through the comments, I'm surprised at how many negative comments there are about Vorbis, particularly as I almost always found Vorbis encodes relatively acceptable to listen to. I must have hearing similar to the Vorbis tuners, so it's good enough for me, but this test indicates that it's obviously not good enough (at this bitrate, anyhow) for many of you discerning people.

Another interesting thing is noticing how the different results cluster together -- it would be very interesting to perform a cluster analysis on the different samples, and get some idea about whether we listeners cluster naturally into categorisable areas.
Title: 64kbps public listening test
Post by: Mac on 24 September, 2003, 05:04:45 AM
Code: [Select]
7R File: .\Sample09\Polonaise_8.wav
7R Rating: 3.0
7R Comment: It all sounds subtly different.  Couldn't tell on a casual listen through.


I noticed on a couple of samples that if you listened through the whole 20 seconds, Vorbis sounded just fine to me, but if you started listening to short 1 second chunks and concentrated, you noticed a staggering difference between it and the original.

Would it be over-optimistic to suggest this was the intention, make a codec that sounds better in every-day use than on short bursts of concentration?

ps. Roberto - 1) Awesome test, badbwoy! 2) Thankyou for 4 mentions on the funnies list
Title: 64kbps public listening test
Post by: ScorLibran on 24 September, 2003, 05:45:06 AM
Quote
I noticed on a couple of samples that if you listened through the whole 20 seconds, Vorbis sounded just fine to me, but if you start listening to short 1 second chunks and concentrated, you noticed a staggering difference between it and the original?

Interesting concept...Subliminal Transparency.  You only *think* it's transparent if you hear it enough.

Wait...my whole collection is in Vorbis....-q 4.25..............Those Vorbis programmers fooled me!!!.......:fingers in ears:.....LALALALALALALALA.....not listening....not thinking about this.....it really IS transparent....all of it!!!......it's not just subliminal....

Especially worrisome considering I rated Vorbis below the low-anchor overall (8th out of 8).   

But seriously, I'm going to train myself to hear artifacts better before I pull out all my CDs and the FLAC encoder.  This test was a *big* wake up call for me.  But it's actually a big difference between 64kbps nominal and 136kbps nominal, so I'm not losing sleep either.
Title: 64kbps public listening test
Post by: webwonk on 24 September, 2003, 11:36:01 AM
Which CODEC was used for the Real test: Cook or ATRC0? Thanks.

webwonk
Title: 64kbps public listening test
Post by: ErikS on 24 September, 2003, 11:40:44 AM
Quote
Which CODEC was used for the Real test: Cook or ATRC0? Thanks.

webwonk

AFAIK Cook.

see http://www.hydrogenaudio.org/forums/index....showtopic=13127 (http://www.hydrogenaudio.org/forums/index.php?showtopic=13127)
Title: 64kbps public listening test
Post by: rjamorim on 24 September, 2003, 11:47:32 AM
Quote
AFAIK Cook.

see http://www.hydrogenaudio.org/forums/index....showtopic=13127 (http://www.hydrogenaudio.org/forums/index.php?showtopic=13127)

Yes, Cook.

AKA Real Audio Gecko.
Title: 64kbps public listening test
Post by: Garf on 24 September, 2003, 12:03:04 PM
Split Vorbis discussion/flamewar

http://www.hydrogenaudio.org/forums/index....=ST&f=1&t=13531 (http://www.hydrogenaudio.org/forums/index.php?act=ST&f=1&t=13531)
Title: 64kbps public listening test
Post by: c_haese on 24 September, 2003, 12:05:10 PM
Quote
Split Vorbis discussion/flamewar

I'd like to point out, for the record, that it was not I who started the flaming.

Thanks.
Title: 64kbps public listening test
Post by: Garf on 24 September, 2003, 12:08:01 PM
Quote
Quote
Split Vorbis discussion/flamewar

I'd like to point out, for the record, that it was not I who started the flaming.

Thanks.

Nobody ever claimed so.
Title: 64kbps public listening test
Post by: JohnV on 24 September, 2003, 12:10:27 PM
Quote
Quote
Split Vorbis discussion/flamewar

I'd like to point out, for the record, that it was not I who started the flaming.

Thanks.

Well.. I don't think there was/is any flamewar; we are discussing strictly on a non-personal level. However, I'm expecting the promised clarification of the patent search issue (meaning documents or such online).
Please continue this discussion here (thread split):
http://www.hydrogenaudio.org/forums/index....showtopic=13531 (http://www.hydrogenaudio.org/forums/index.php?showtopic=13531)
Title: 64kbps public listening test
Post by: phong on 25 September, 2003, 01:54:54 AM
Quote
...I've produced a spreadsheet of the results.  You can download it in OpenOffice (yay!) format here (http://www.phong.org/audio/64kbps_results.sxc).  You can also get it in Excel (boo!) format here (http://www.phong.org/audio/64kbps_results.xls)....

I've updated the spreadsheet - it now has a section on the far right that lets you get a summary of all the results of any one listener and compare to the averages.  Just enter a listener's handle (where it says "garf" in blue right now) and it will show all their scores and the difference between their scores and the averages.  It even highlights lows/highs for each codec in different colors.

ps: I don't mean to pick on Garf.  :-)  His scores were quite average so they produce a nice "typical" chart.
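For anyone who would rather script the same per-listener summary than use the spreadsheet, here is a minimal Python sketch of the calculation described above: a listener's score for each codec next to its difference from the all-listener average. This is not the spreadsheet's actual formula. The handles, codec names and scores are made up for illustration, and `codec_averages` and `listener_summary` are hypothetical helper names.

```python
def codec_averages(scores):
    """Mean score per codec across all listeners."""
    codecs = next(iter(scores.values())).keys()
    return {c: sum(s[c] for s in scores.values()) / len(scores) for c in codecs}

def listener_summary(scores, handle):
    """Map each codec to (listener's score, difference from the average)."""
    averages = codec_averages(scores)
    return {
        c: (scores[handle][c], round(scores[handle][c] - averages[c], 2))
        for c in averages
    }

# HYPOTHETICAL scores: one overall rating per codec for three made-up handles.
scores = {
    "listener_a": {"HE-AAC": 4.0, "Vorbis": 3.0},
    "listener_b": {"HE-AAC": 3.0, "Vorbis": 4.0},
    "listener_c": {"HE-AAC": 3.5, "Vorbis": 3.5},
}

summary = listener_summary(scores, "listener_a")
for codec, (score, diff) in summary.items():
    print(f"{codec}: score {score:.1f}, vs average {diff:+.2f}")
```

A positive difference means the listener rated that codec above the group average, and a negative one means below.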
Title: 64kbps public listening test
Post by: tigre on 25 September, 2003, 01:02:12 PM
Quote
Quote
...I've produced a spreadsheet of the results.  You can download it in OpenOffice (yay!) format here (http://www.phong.org/audio/64kbps_results.sxc).  You can also get it in Excel (boo!) format here (http://www.phong.org/audio/64kbps_results.xls)....

I've updated the spreadsheet - it now has a section on the far right that lets you get a summary of all the results of any one listener and compare to the averages.  Just enter a listener's handle (where it says "garf" in blue right now) and it will show all their scores and the difference between their scores and the averages.  It even highlights lows/highs for each codec in different colors.

ps: I don't mean to pick on Garf.  :-)  His scores were quite average so they produce a nice "typical" chart.

Cool thing to play with - thanks!
Title: 64kbps public listening test
Post by: saratoga on 04 October, 2003, 05:42:28 PM
Any reason the link doesn't work?

I'd really like to take a look at the site.

Quote
http://audio.ciara.us/test/64test/results.html (http://audio.ciara.us/test/64test/results.html)
Title: 64kbps public listening test
Post by: verloren on 04 October, 2003, 06:45:37 PM
Quote
Any reason the link doesn't work?

I'd really like to take a look at the site.

No reason at all - I just tried it and it came up fine.  Perhaps you could post what happens and I'll try to sort it out.

Cheers, Paul
audio.ciara.us sponsor
Title: 64kbps public listening test
Post by: Garf on 04 October, 2003, 06:56:31 PM
Quote
ps: I don't mean to pick on Garf.  :-)  His scores were quite average so they produce a nice "typical" chart.


If I interpret the results correctly, for me HE-AAC is a clear winner with MP3Pro second, followed at some distance by WMA and only then Vorbis, which did only marginally better than RealAudio.

There's a goofy result in that I rated the low anchor up to 5.0 in one test, while giving the high anchor a 3.8.
Title: 64kbps public listening test
Post by: ScorLibran on 04 October, 2003, 09:35:22 PM
Quote
Quote
ps: I don't mean to pick on Garf.  :-)  His scores were quite average so they produce a nice "typical" chart.


If I interpret the results correctly, for me HE-AAC is a clear winner with MP3Pro second, followed at some distance by WMA and only then Vorbis, which did only marginally better than RealAudio.

There's a goofy result in that I rated the low anchor up to 5.0 in one test, while giving the high anchor a 3.8.

I had something similar, rating the low anchor at 5.0 on one sample, though the lowest rating I gave the high anchor was a 4.3.

What seems ironic to me is that I rated Vorbis at this bitrate the lowest overall, slightly under the low anchor.  Ironic, since my entire collection is encoded in Vorbis, although at a bitrate that I tested on many tracks and found to be generally transparent to my ears.  Someone proposed that since I have listened to more Vorbis than any other codec recently, I may be more "tuned" to pick out artifacts with Vorbis.  I would have thought it would be the other way around...automatically tuning out Vorbis artifacts more than those of other codecs since I listen to Vorbis encodings every day.

Interesting psychoacoustic phenomenon either way...
Title: 64kbps public listening test
Post by: webwonk on 06 October, 2003, 10:20:54 AM
Thanks to both ErikS & rjamorim for their replies to my query (was the Real CODEC Cook or ATRC0? It was Cook). Now the follow-up: was there a reason for not including ATRC0 (Real's 66kbps ATRAC3 implementation), besides, of course, the obvious difference between 66 and 64kbps? As they are so close, it would be interesting to see how ATRC0 compares to HE-AAC. Has anyone tried this? Any insight would be most appreciated. Thanks again for a very interesting test and subsequent discussion.

Sincerely,

Webwonk.
Title: 64kbps public listening test
Post by: tigre on 06 October, 2003, 01:13:57 PM
Quote
Was there a reason for not including ATRC0 (Real's 66kbps ATRAC3 implementation), besides, of course, the obvious difference between 66 and 64kbps? As they are so close, it would be interesting to see how ATRC0 compares to HE-AAC. Has anyone tried this? Any insight would be most appreciated.

There is a pre-test thread (http://www.hydrogenaudio.org/show.php/showtopic/12471) with discussion about which codecs to include etc.; you might find some answers there.