
Topic: 64kbps public listening test (Read 61510 times)

64kbps public listening test

Reply #75
Quote
I always find stuff to reply to in the /. threads way too late.  In the current thread there were a couple that got my goat (having to do with the legitimacy of subjective testing and the drawing of the confidence interval bars).

Hehe. That's slashdot for you

What are the links to these comments?

64kbps public listening test

Reply #76
I added a big rant to the /. thread, but I don't know why I bother.  People there are idiots.  Probably it's not worth getting slashdotted at all these days.

As with the 128kbps test, I've produced a spreadsheet of the results.  You can download it in OpenOffice (yay!) format here.  You can also get it in Excel (boo!) format here.  I got a little crazier with the meaningless statistics, formatting and color this time around.  It's a convenient way to see how your results compare against the others, or find out who is the biggest codec h8er, or whatever.

Some neat stuff:
The hardest samples for listeners (and easiest for the encoders) were (not at all surprisingly) mybloodrusts (average score of 2.8) and Waiting (2.7).  At the other end were Illinois (3.4) and Polonaise (3.6).  You can also see who were the most and least easily annoyed listeners.  Take a look to see if your results fall in line with the pack or if you're a deviant like me.

[edit]Hoo-haa!  My rant and someone else giving credit to Roberto both got modded up "Insightful".  Maybe there's hope after all?[/edit]
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

64kbps public listening test

Reply #77
Quote
(From Slashdot)

The poster offers an interesting interpretation of the results, but only his/her comments support Ogg Vorbis in this case. The numbers tell a completely different story.

The analysis presented leads us to one conclusion: use Lame 128. It's strictly better than all other options. Do not use FhG MP3. Easy.

If you're willing to slip to the 4th best encoder, then consider Ogg Vorbis. 4TH BEST. That's hardly the rosy picture painted in the article.

*S*I*G*H*

And my friends ask why I spend most of my online time on Hydrogen Audio.  All of my reasons have just been summed up in a single quote.

[rant=on]
Some people are like defective playdough... easily impressionable, but not always accurately impressionable.  People who have a concept of what's really happening in the world have to be careful what they say to people who will follow anything they *choose* to hear.  Because the "playdoughpeople" in turn can easily become conspiracy theorists, gossipers, self-proclaimed experts or instant audiophiles.  Oh, if only natural selection had the same effect on reputations as it does on living things.

Internet forums should use color schemes to make knowledge easily identifiable.  Idiots post in red.  Newbies are green.  Regulars are black.  Veterans are blue.  First sixty days as a member is a screening process to determine later what color you will be assigned after green.  Just an idea...
[/rant]

64kbps public listening test

Reply #78
OK, OK... everybody knows what kind of place Slashdot is.
Can we get back to the topic here, please?
Juha Laaksonheimo

64kbps public listening test

Reply #79
LOL!

64kbps public listening test

Reply #80
Quote
Internet forums should use color schemes to make knowledge easily identifiable.  Idiots post in red.  Newbies are green.  Regulars are black.  Veterans are blue.  First sixty days as a member is a screening process to determine later what color you will be assigned after green.  Just an idea...

Even though this idea is lovely, it's also very dangerous.

After all, who decides if someone is an idiot or not?

Of course, some people are obviously idiots and deserve to be labeled that way. But some are, maybe... slow.

Based on what I posted in my first 60 days here at HA, I would surely be awarded the red colour. I asked all kinds of clueless questions (hey, I was a newbie), advocated VQF sometimes (it wasn't really dead back then, although it was already in an advanced coma), and befriended a guy some would consider a great troll (I consider him a great friend).

Oh, yes, I also called JohnV an asshole. :B


Edit: BTW, if I had been labeled red, I would probably get pissed and never return (you know, hot latino blood...).

64kbps public listening test

Reply #81
Quote
Some people are like defective playdough...easily impressionable, but not always accurately impressionable.

You just made my quotes page.

Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

64kbps public listening test

Reply #82
Quote
Quote
Some people are like defective playdough...easily impressionable, but not always accurately impressionable.

You just made my quotes page.

Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 

Thanks...that made my day!  (I bookmarked the page...some good stuff there.    )

As for "liking" WMA, I feel the same way, but I'm not worried about basing my determination on how good it sounds.  Even if it were transparent to me at 64kbps, I still wouldn't like it because of so many other reasons (not audio-related)...

1 - It's Micro$oft.
2 - It's not open source, and it doesn't have open specs (except for recent "opening" of their specs to limited corporate audiences...doesn't count IMO).
3 - DRM seems to be becoming "one" with WMA.  I know they're not the same thing, but if M$ had their way...

So how can M$ get me to like WMA?  Open-source it and drop DRM, for starters.

But anyway, I just downloaded the comments.zip file which Roberto was so nice to post in this thread, and I want to look through the comments #1 to educate myself by analyzing what other people heard with samples compared to what I heard, and #2 to look for consistencies with issues like what you're talking about with WMA so I'll know what to listen for next time.  About half (maybe more) of my results were 5.0...I don't want that to happen again (especially on a low-bitrate audio test).

64kbps public listening test

Reply #83
Quote
Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?  It sounded really good to my ears, and I want to figure out what I'm missing because I'm very uncomfortable about liking WMA.  I've gone over a couple samples again, getting similar results as before, but my results are so much different than almost everyone else's that I think my ears are broken. 

typical wma artifacting:

1. metallic noises
2. noise pumping (background noise gets louder during transients)
3. a ringing sound in the background

You should be able to read the comments (are the links working yet?) to see where people complained.

ff123

64kbps public listening test

Reply #84
Quote
Can anyone point me to some spots in any of these samples that really show well the typical problems with WMA?

WMA exhibits strange behaviour: sometimes it is very good (the Enola Gay sample), but usually the quality is poor. There are cases in which I rated WMA below the 64kbps anchor. ff123 has explained the typical artifacts of WMA; I think you can easily hear these problems in:

Sample02 has metallic HF content.
Sample05 rings at the beginning, while in the loud part the cymbals are reduced to a metallic noise.
Sample07 has very annoying background ringing.
Sample10, in the final part: between the trumpets there is a wavering background metallic noise.
Sample12 is a killer for WMA. At the beginning (0-13 secs) there is an annoying ringing sound.

Happy listening 
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1

64kbps public listening test

Reply #85
Ditto with WMA... I rated it even better than LAME 128    I also rated Vorbis worse than FhG on one sample (Enola Gay, number 3), and in sample 9 (Polonaise), it's the only codec I can easily pick out (by just hitting "X", some sort of "warbling" in the last 10 seconds), the rest just give me a headache for trying so hard...

My personal "ranking", with only 4 samples (I'm still working on the rest, though)

WMA                4.78   
MP3 128 kbps   4.65   
MP3 Pro             4.55   
QT AAC            3.98   
HE AAC            3.93   
Ogg Vorbis        3.9   
Real Audio        2.78   
MP3 64 kbps    2.4   



Cheers, Joey.

64kbps public listening test

Reply #86
If you can easily detect upper-range frequencies, then the metallic artifacts of WMA become disturbingly obvious.

Roberto, a bit of a n00b question here, but is there any reason why my results aren't included in samples 1, 2 & 4? Was it an effort to keep things fair & balanced, somehow?

64kbps public listening test

Reply #87
Quote
Roberto, a bit of a n00b question here, but is there any reason why my results aren't included in samples 1, 2 & 4? Was it an effort to keep things fair & balanced, somehow?

hrm... Sorry to say this, but it's because you ranked the reference on these samples.

Like this:
Code:
6L File: .\Sample04\experiencia.wav
6L Rating: 4.7
6L Comment:


You see, the encoded files have a number before the .wav that identifies which encoder was used. If there's no number, it means you gave a ranking to the original instead of the encoded file.

That's why these results had to be removed.
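Based on the naming convention Roberto describes, a check like this could be automated. This is only a sketch, not the script actually used for the test; the function name and regex are mine, and it assumes the "File:" line format shown in the result snippets quoted in this thread:

```python
import re

def rated_reference(result_text):
    """Return the filenames of any reference files that received a rating.

    Encoded files carry an encoder number before ".wav" (e.g. "DaFunk_2.wav");
    the reference has none ("DaFunk.wav"), so a rating on such a file is invalid.
    """
    offenders = []
    for line in result_text.splitlines():
        m = re.search(r'File:\s*(\S+\.wav)', line)
        if m and not re.search(r'_\d+\.wav$', m.group(1)):
            offenders.append(m.group(1))
    return offenders
```

Running it over a results file would list exactly the entries that had to be discarded.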

Regards;

Roberto.

64kbps public listening test

Reply #88
Quote
You see, the encoded files have a number before the .wav that identifies which encoder was used. If there's no number, it means you gave a ranking to the original instead of the encoded file.

That's why these results had to be removed.

I've made the same mistake with LAME 128kbps and sample08, but this doesn't necessarily mean that I ranked the original, because there is a successful ABX test attached. So there is a possibility that I rated the file during ABXing.
Obviously I respect your choice, but I would like to know if you've considered this different listener behaviour. Thanks.
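For context on what a "successful" ABX test means statistically: the usual criterion is the probability of getting that many trials right by pure guessing. A minimal sketch (the trial counts below are illustrative, not taken from this thread):

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of scoring `correct` or more out of `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# e.g. 14 of 16 correct is very unlikely to be luck
print(abx_p_value(14, 16))  # ≈ 0.0021
```

A small p-value like this is what justifies treating the attached ABX log as evidence that a real difference was heard.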
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1

64kbps public listening test

Reply #89
Just wanted to say... great work as usual, Roberto & Co, and sorry that I didn't participate. (I left it until the very last day, downloaded the samples and then got tied up in other things, so I didn't manage to test... )

@phong: Good post you made there at slashdot.

64kbps public listening test

Reply #90
Quote
I've made the same mistake with LAME 128kbps and sample08, but this doesn't necessarily mean that I ranked the original, because there is a successful ABX test attached. So there is a possibility that I rated the file during ABXing.

Well, the problem is, even if you ABX'd it correctly, I can't know what score to give to the sample.

If I only delete the part of the offending sample, it's unfair, because then it gets 5 points, and you did detect a difference.

So, should I give the encoded sample the score you gave the reference? Should I give a higher score? Or a lower one?

Each of these approaches is highly debatable and, in one way or another, both fair and unfair. So, to avoid doing something that is wrong either way, I delete the results file for that sample.

Regards;

Roberto.

64kbps public listening test

Reply #91
Roberto,

these sorts of mistakes clearly indicate that there are very subtle differences.
In these cases the score of the sample is quite often near 5.0, and I think it is scored a little below 5.0 only because it is ABXable. In practice, above certain levels, the score does not necessarily indicate annoyances, only a difference that certainly exists (because it was ABXed).

If the listener has already scored the sample in his mind during ABXing, he later has to scroll down the rating bar and enter his score. If an error occurs in this last step, IMHO it should not be considered a void result, only a botched entry. So I think the right choice is to give the encoded sample the score the listener gave the reference.

OK, let's stop now... I agree with you that these cases are highly debatable; this is only my way of thinking.
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1

64kbps public listening test

Reply #92
Quote
Ditto with WMA... I rated it even better than LAME 128    I also rated Vorbis worse than FhG on one sample (Enola Gay, number 3), and in sample 9 (Polonaise), it's the only codec I can easily pick out (by just hitting "X", some sort of "warbling" in the last 10 seconds), the rest just give me a headache for trying so hard...

My personal "ranking", with only 4 samples (I'm still working on the rest, though)

WMA                 4.78   
MP3 128 kbps   4.65   
MP3 Pro            4.55   
QT AAC            3.98   
HE AAC             3.93   
Ogg Vorbis        3.9   
Real Audio        2.78   
MP3 64 kbps     2.4   



Cheers, Joey.

It is truly interesting that WMA's results are so subjective. No offense, but I really must ask: how can somebody not hear the WMA artifacts? According to some of my own tests, they are even audible @ 160 kbps (I didn't test higher).

64kbps public listening test

Reply #93
Hello.

I just uploaded the bitrate tables. They are available just below the individual plots.


Also, for those interested in a laugh, I uploaded a "comment highlights" file to the server.
http://audio.ciara.us/test/64test/comments.../highlights.txt

My personal favourite is Mac's:
Code:
3R File: .\Sample02\DaFunk_2.wav
3R Rating: 1.5
3R Comment: Uh?  High snares smeared worse than a cheap hooker's makeup.


with an honourable mention to phong:
Code:
4R File: .\Sample07\mybloodrusts_1.wav
4R Rating: 1.0
4R Comment: It's the evil bee codec.  An evil bee encoded this song.  The beginning is an abomination.  The rest sounds muddy and unclear.


and to Gecko:
Code:
I would have just liked to pull most of them down to 1 and move on, but I restrained myself. It's like rating green from brown shit. Both stink.



64kbps public listening test

Reply #94
Thanks everyone for the comments on WMA, that's exactly what I was looking for.

Quote
It is truly interesting if wma's results are so subjective. Not an offense but i really must ask, how can somebody not hear the wma artifacts. According to some of my own tests, they are even hearable @ 160 kbps (didn't test higher).

I definitely heard at least some of them.  They just didn't seem as bad as the artifacts in the other samples.  It may be that my listening environment (loud computer fan and screaming Cheetah HD) nullified some of the WMA-type artifacts (i.e. ringing).  I believe (but haven't tried to prove) that my HF hearing is pretty decent.  The lowpass on the majority of the samples was quite annoying (maybe the most annoying problem for me).  The RealAudio samples, for example, usually stuck out as the most annoying by far - they usually had one of the lowest cutoffs AND the swooshing artifacts were among the worst.  I also found MP3pro to be no better than FhG MP3, because the lowpass on both was the most annoying aspect.

So I guess it's true that each pair of ears is different.

I found digging through the text files to read comments somewhat annoying, so I wrote an inexcusably ugly Perl script to convert all the comments to pretty HTML.  If I want to see what everyone thought of MP3pro on Sample04, I can do it at a glance.  Also, whenever somebody says something like "this one sounds better than #3", it adds a little tag saying what #3 was for that test and turns it into a hyperlink to the comment for #3.

There are six versions of the file, each sorted differently.  Warning: each one contains ALL the comments, and they're around 300K apiece.  Also, I don't think I've got all the bugs out of my script (the inexcusably ugly one), so there are probably a few problems with the pages:

Sorted by codec, then listener, then sample
Sorted by codec, then sample, then listener
Sorted by listener, then codec, then sample
Sorted by listener, then sample, then codec
Sorted by sample, then codec, then listener
Sorted by sample, then listener, then codec
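The multi-key sorting behind those six pages is straightforward to sketch. This is not phong's Perl script, just a minimal Python illustration; the field names and sample records are hypothetical, not taken from the actual comment files:

```python
from itertools import permutations

# Each comment is modelled as one record per (codec, listener, sample).
comments = [
    {"codec": "WMA", "listener": "A", "sample": "Sample02", "text": "metallic"},
    {"codec": "Vorbis", "listener": "B", "sample": "Sample07", "text": "warbly"},
]

def sorted_pages(records, fields=("codec", "listener", "sample")):
    """One sorted copy of the records per ordering of the sort keys (3! = 6 pages)."""
    return {
        order: sorted(records, key=lambda r: [r[f] for f in order])
        for order in permutations(fields)
    }

pages = sorted_pages(comments)  # six orderings, one per generated page
```

Each of the six HTML pages then just renders one of these orderings.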
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

64kbps public listening test

Reply #95
I'm thinking that certain manufacturer claims of 64kbps being the same quality as MP3 @ 128kbps might be true if you were to include other, less capable MP3 encoders in the comparison... LAME is the best, but what about an average MP3 encoder?

64kbps public listening test

Reply #96
It's always a mistake to scale a chart that naturally starts at 0 (or perhaps 1) so that it starts at 2.5.  The only exception where this is valid is for trends, like the stock market, futures, the price of gold, and so on.  For example, pretend I have 5 tools, rated 1 to 50.  Four score 25 and one scores 30.  If I start the scale at 20, the 25 score looks half as good as the 30 (the 30 looks 100% better than the 25), but the 30 is really only 20% better.

Edward Tufte's book _The Visual Display of Quantitative Information_ (1983) was the first on the subject in recent times.  It's been followed up with another book or two.  Amazon will have these.  Tufte has a website, but you have to search.
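The distortion is easy to quantify. A tiny sketch using the 25-vs-30 example above (the function name is mine):

```python
def apparent_ratio(a, b, baseline=0.0):
    """How much longer b's bar looks than a's when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

print(apparent_ratio(25, 30))      # axis at 0:  1.2 -> 30 is 20% better
print(apparent_ratio(25, 30, 20))  # axis at 20: 2.0 -> 30 *looks* 100% better
```

The same effect applies to the listening-test plots: a y-axis starting at 2.5 visually inflates the gaps between codecs.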

64kbps public listening test

Reply #97
Quote
Hello.

I just uploaded the bitrate tables. They are available just below the individual plots.


Also, for those interested in a laugh, I uploaded a "comment highlights" to the server.



I especially liked this one:

Quote
2L Comment: Strangely warbly

BUT.

INTERESTING NOTE:  I would be very hard pushed to ABX on my Wharfdale Speakers,
yet my Denons should up Warble straight off!

sod Dibrom and his "Speakers make no odds cack"


Strangely, I don't remember actually saying anything like this.  Perhaps I'm simply forgetful, or maybe it's just the ubiquitous straw man again

Or maybe it's just difficult to understand the difference between "equipment makes up the smallest (or one of the smallest) factor in hearing artifacts" vs "Speakers make no odds."

Oh well, guess you can't expect everyone to be discriminating in thought, especially the disgruntled

64kbps public listening test

Reply #98
Quote
I am thinking that certain manufacturer claims of 64Kbps to be the same quality as mp3 @128Kbps might be true if you were to include other less able mp3 encoders to compare against...Lame is the best but what about an average mp3 encoder?

I've considered this as well.  Given a more average MP3 encoder (or better yet, given an average MP3 file), I believe some of the codecs here would definitely meet this claim.

Some of the encoders here at 64kbps certainly sound better than some of the less proficient mp3 encoders that I've heard at 128kbps.

I think that, rather than asserting "no encoder meets these claims", it would be safer to say that "these encoders do not necessarily meet these claims."

64kbps public listening test

Reply #99
You mention in the blurb that it's interesting to compare to the results from a year ago -- and you're right, it *is* interesting, particularly when you notice how far AAC and WMA have come on in the last year: AAC (in one of its two incarnations) jumping from last to first, while WMA stays firmly toward the back of the pack.

The big disappointment for me has been the almost complete lack of tuning on the Vorbis front. Reading through the comments, I'm surprised at how many negative comments there are about Vorbis, particularly as I almost always found Vorbis encodes relatively acceptable to listen to. I must have hearing similar to the Vorbis tuners, so it's good enough for me, but this test indicates that it's obviously not good enough (at this bitrate, anyhow) for many of you discerning people.

Another interesting thing is noticing how the different results cluster together -- it would be very interesting to perform a cluster analysis on the different samples, and get some idea about whether we listeners cluster naturally into categorisable areas.
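The cluster analysis suggested above could start as simply as grouping listeners whose score vectors sit close together. A rough sketch with made-up ratings (not real test data; the threshold and grouping rule are arbitrary choices for illustration):

```python
from statistics import mean

# Hypothetical listener -> per-sample ratings (NOT from the actual test).
scores = {
    "A": [2.8, 2.7, 3.4, 3.6],
    "B": [2.9, 2.6, 3.5, 3.5],
    "C": [4.8, 4.6, 4.9, 4.7],
}

def distance(u, v):
    """Mean absolute difference between two listeners' rating vectors."""
    return mean(abs(a - b) for a, b in zip(u, v))

def cluster(data, threshold=0.5):
    """Greedy grouping: join the first cluster whose founder is within threshold."""
    clusters = []
    for name, vec in data.items():
        for group in clusters:
            if distance(vec, data[group[0]]) <= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters

print(cluster(scores))  # A and B group together; C stands apart
```

A real analysis would want a proper method (hierarchical clustering, say), but even this crude pass would show whether listeners fall into natural camps.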