HydrogenAudio

Hydrogenaudio Forum => Listening Tests => Topic started by: rjamorim on 2004-05-24 06:33:37

Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 06:33:37
Hello.

I'd like to announce the results of the Multiformat at 128kbps listening test.

Vorbis aoTuV is tied with Musepack for first place, Lame MP3 is tied with iTunes AAC for second place, WMA Standard is in third place and Atrac3 gets last place.

The results page is here:
http://www.rjamorim.com/test/multiformat128/results.html (http://www.rjamorim.com/test/multiformat128/results.html)

For those in a hurry, here are the zoomed overall results:
(http://www.noveo.net/rjamorim/plot18z.png)

Big thanks to everyone that helped and participated.

Best regards;
Roberto.
Title: Multiformat@128kbps listening test - FINISHED
Post by: magic75 on 2004-05-24 06:45:39
Now that was a surprise... Lame as good as AAC??? Anyone expected that?
Title: Multiformat@128kbps listening test - FINISHED
Post by: ScorLibran on 2004-05-24 06:55:12
Vorbis (aoTuV) and MPC tied for first place.  LAME and iTunes tied for second.  Then WMA-S in third, and ATRAC3 at the back of the pack.

Funny that there was no real consistency this time across music types with the formats tested.  Tends to oppose theories about certain formats excelling with certain types of music.  At least among these samples.
Title: Multiformat@128kbps listening test - FINISHED
Post by: bidz on 2004-05-24 07:01:48
What the!  ... surprised!
Title: Multiformat@128kbps listening test - FINISHED
Post by: QuantumKnot on 2004-05-24 07:01:49
Whoa, look at aoTuV!! 

It is now as good as MPC.  Very good work, Aoyumi.  Vorbis is now back in the spotlight.
Title: Multiformat@128kbps listening test - FINISHED
Post by: harashin on 2004-05-24 07:07:19
I believed Musepack would win the test, especially in this bitrate range (-q4.15).  Anyway, it's a very interesting result; good job, Roberto and all participants.
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-24 07:07:50
Surprisingly, MPC 1.14 (the same version tested last year) is no longer tied with iTunes AAC, but “wins”.
ATRAC3 (minidisc) is obviously a poor encoding solution.
aoTuV is without doubt a great step forward for Vorbis!
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 07:15:45
The codes:

1 - Vorbis aoTuV
2 - Musepack
3 - Lame MP3
4 - iTunes AAC
5 - Atrac3
6 - WMA Std.

The decryption key:
http://www.rjamorim.com/test/multiformat128/comments/multiformat.key
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-24 07:18:30
Very good results by aoTuV. It seems all the others have a new target for 128kbps quality now.
One thing this test shows is that VBR coding (aoTuV, MPC) is definitely the way to go for 128kbps, and with good enough VBR tweaking it's certainly possible to be clearly better than CBR (iTunes).
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 07:21:03
Quote
One thing this test shows is that VBR coding (AoTuV, MPC) is definitely the way to go for 128kbps, and with good enough VBR tweaking it's certainly possible to be clearly better than CBR (iTunes).

Yes. That is also true for Lame. With a very good VBR implementation, it got close to the best AAC implementation at that bitrate.

Let's hope Apple implements VBR in their codec, and Ahead improves their implementation considerably.
Title: Multiformat@128kbps listening test - FINISHED
Post by: ScorLibran on 2004-05-24 07:22:42
Quote
I believed Musepack would win the test, especially in this bitrate range (-q4.15).

I thought so too.

I anticipated a tie between MPC and QT-AAC, then Vorbis in second place, then LAME, then WMA-S and ATRAC at the back.  Vorbis and QT-AAC both surprised me.
Title: Multiformat@128kbps listening test - FINISHED
Post by: harashin on 2004-05-24 07:29:25
My browsers(Firefox, MSIE) don't show test comments correctly. Also, the title of this page (http://www.rjamorim.com/test/multiformat128/comments/comments.html) seems to be wrong.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 07:33:32
Quote
My browsers(Firefox, MSIE) don't show test comments correctly.

It's XML. IE should show something like this:
http://esc17.midphase.com/~calmerc/screenshots/screen-1.jpg (http://esc17.midphase.com/~calmerc/screenshots/screen-1.jpg)

XML is worse for readability but easier to parse. That's why Schnofler switched to XML results in recent versions of ABC/HR Java.

Quote
Also, the title of this page (http://www.rjamorim.com/test/multiformat128/comments/comments.html) seems to be wrong.


Fixed. Thanks for reporting.
Title: Multiformat@128kbps listening test - FINISHED
Post by: harashin on 2004-05-24 07:38:09
Quote
XML is worse for readability but easier to parse. That's why Schnofler switched to XML results in recent versions of ABC/HR Java.

I expected something like a raw *.txt format. Thanks for the clarification.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 07:40:06
Quote
I expected something like a raw *.txt format. Thanks for the clarification.

Schnofler already has a converter from xml -> txt in ABC/HR. But it only works for encrypted results ATM. Hopefully he'll add support for already decrypted results.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Bonzi on 2004-05-24 08:23:14
Wow, what really impresses me is that I don't think there was one sample where the vorbis encoder did poorly.  This is a little shocking after last test.  Excellent work aoTuV!
Title: Multiformat@128kbps listening test - FINISHED
Post by: Der_Iltis on 2004-05-24 08:32:46
Surprise surprise!
I hope this'll give vorbis development a new boost.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Gabriel on 2004-05-24 08:34:44
Oh! Joy!
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 08:38:20
Quote
Oh! Joy!



I'm happy my test is spreading happiness.
Title: Multiformat@128kbps listening test - FINISHED
Post by: bond on 2004-05-24 08:50:26
wow, now that's what i didn't expect

- vorbis aotuv: vorbis is back, and i am proud to have helped finding out what vorbis encoder should be used
- mpc vs aac: funny that mpc was that much better than itunes (with a setting only 0.15 higher than in the last test)
- wma9: lol, worse than mp3! (and i even wonder that it got rated that high, even at 128 it had this metallic sound sometimes) -> go away m$
- atrac3: even worse than wma9 -> go away sony

and if you take this test as a comparison between some online music stores (itunes vs. wma9 based ones vs. sonys new store) itunes clearly comes out as the winner, leaving wma9 behind by far!
Title: Multiformat@128kbps listening test - FINISHED
Post by: FireStarter on 2004-05-24 08:56:23
I see there is a very small margin between mpc and aoTuV. How would aoTuV behave at higher bitrates?
Title: Multiformat@128kbps listening test - FINISHED
Post by: JeanLuc on 2004-05-24 08:59:56
Very interesting results ...

I think it could be an interesting addition to show the bitrate for each encoder in the specific diagrams for each sample ...
Title: Multiformat@128kbps listening test - FINISHED
Post by: dev0 on 2004-05-24 09:00:02
The more I think of it the more impressed I am with the performance of LAME. Very good work Gabriel (and consider changing -V 5 default --athaa-sensitivity).
Title: Multiformat@128kbps listening test - FINISHED
Post by: Raptus on 2004-05-24 09:00:45
How many results were discarded because of ranked refs?
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 09:06:59
Quote
How many results were discarded because of ranked refs?

54

Mind you, I didn't discard results that ranked the reference if the listener also ABXed that sample pair to a p-value of 0.05 or less.
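
The discard rule described here can be sketched in code. This is only an illustration, not the script used for the test: the function names and the way a result is represented are assumptions, and the p-value is computed as a one-sided binomial probability, the usual convention for ABX trials.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def keep_result(ranked_reference: bool, abx_correct: int = 0,
                abx_trials: int = 0, alpha: float = 0.05) -> bool:
    """Hypothetical filter: keep a result unless the reference was ranked
    and the listener did not ABX that sample pair to p <= alpha."""
    if not ranked_reference:
        return True
    if abx_trials == 0:
        # Reference was ranked and no ABX evidence exists for this pair.
        return False
    return abx_p_value(abx_correct, abx_trials) <= alpha
```

Under this sketch, 7 correct out of 8 ABX trials (p of about 0.035) would rescue a result with a ranked reference, while 6 out of 8 (p of about 0.145) would not.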
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-24 09:21:48
Some comments:

1. mpc encoded debussy.wav at too low of a bitrate (98 kbit/s), apparently, because multiple people commented on a distorted sound, and its low rating on this sample (3.53) hurt it in comparison with vorbis.  Note that problem samples are not synonymous with high bitrate!  I would hope Frank could look into what's going on with mpc on this sample.

2. It's not clear that all of the samples which didn't show significant differences (there were 4) would have benefited much from a larger listener sample size.  The Bartok_strings2.wav and OrdinaryWorld.wav samples in particular are pretty evenly rated across the board.

Roberto did a separate analysis omitting these 4 samples and the overall results were very similar to the results with all 18 samples, except that with 18 samples the confidence level increased.  So I'd say they helped out, even if individually they didn't show significant differences between codecs.

3. The absolute rating of iTunes is remarkably stable across the tests it has been featured in (4.39, 4.42, 4.20, and 4.26 in this one), even though the tests are not strictly comparable.

4.  MPC should have been expected (and it did appear) to be slightly better this time around than the last multiformat test since its quality setting was tweaked up slightly (from 4 to 4.15).

5.  Excellent job on AoTuVb2, Aoyumi and everybody else who was involved.  Seeing such a high score in the test shouldn't have been a real big surprise since those virtuoso tuning ears were rating the beta2 version at around 4.0 overall.

6.  Lame is still improving.  Good job Gabriel and [proxima]

ff123

Edit:  After checking, I see that MPC's absolute score went down from 4.51 to 4.47, so comment 4 is not consistent with what actually happened.  But then again, it's not strictly correct to compare scores on one test with scores on another.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Raptus on 2004-05-24 09:25:44
Quote
Quote
How many results were discarded because of ranked refs?

54

Mind you, I didn't discard results that ranked the reference if the listener also ABXed that sample pair to a p-value of 0.05 or less.

Ok.
That's around 15% of the results... And to me it still doesn't feel right to treat them as irrelevant for the stats...

What about all the /.ers? 
Seems they were just interested in wasting bandwidth after all 
Title: Multiformat@128kbps listening test - FINISHED
Post by: Grease on 2004-05-24 09:28:41
I found my chanchan listening test result wrongly classified as a NewYorkCity result.


-Grease
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 09:32:06
Quote
What about all the /.ers?  
Seems they were just interested in wasting bandwidth after all 

More than 500 people downloaded the samples through bittorrent only - not counting HTTP downloads! :B

I won't ever understand these people. 
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-24 09:32:50
Quote
Quote
Quote
How many results were discarded because of ranked refs?

54

Mind you, I didn't discard results that ranked the reference if the listener also ABXed that sample pair to a p-value of 0.05 or less.

Ok.
That's around 15% of the results... And to me it still doesn't feel right to treat them as irrelevant for the stats...

What about all the /.ers? 
Seems they were just interested in wasting bandwidth after all 

Some results with ranked refs are worse than others.  Roberto showed me results from one person whose listening results I wouldn't trust at all, they were so bad (meaning lots of ranked refs).

There is always a question about how these results should be treated, and there are probably multiple ways of handling them.  The fairest and simplest way seems to be to just throw them away if you have enough results that you can afford to do that, which in this case seems to be true.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 09:33:54
Quote
I found my chanchan listening test result wrongly classified as a NewYorkCity result.

No worries, that classification happened while uploading. I'll move it back to the correct folder later.
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-24 09:37:17
Quote
1. mpc encoded debussy.wav at too low of a bitrate (98 kbit/s), apparently, because multiple people commented on a distorted sound, and its low rating on this sample (3.53) hurt it in comparison with vorbis.  Note that problem samples are not synonymous with high bitrate!  I would hope Frank could look into what's going on with mpc on this sample.

The problem seems to be low volume. MPC --radio has some trouble with low-volume samples, especially when there's a slight amount of noise. Debussy.wav is just one example among hundreds of this problem.
The problem is shocking if playback volume is exceptionally high, but is probably less annoying under normal playback conditions (which may explain the relatively good overall rating of the encoding; I expected it to be lower).

Note that standard preset also suffers from this problem, but it's less critical...
Title: Multiformat@128kbps listening test - FINISHED
Post by: amano on 2004-05-24 10:47:55
Wow. That is interesting. LAME with the --athaa-sensitivity switch and aoTuv being that strong.

Thanks to all participants and - of course - to all these great codec developers and especially to Roberto himself!!!
Title: Multiformat@128kbps listening test - FINISHED
Post by: XXX on 2004-05-24 11:12:58
This particular test should be called, "The 128 kbps test for iTunes/WMA, and the low-130 test for AC3 and LAME, and the close-to-160 test for MPC/Vorbious.

Leahy     iTunes   MPC    Vorbis   Lame   WMA    Atrac3
bitrate   128      155    149      133    128    132
Score     4.34     4.41   4.68     4.11   4.37   3.76

I am aware of the rationalization. I am aware of the overall average. But let this be a n"oh?" to those that don't and aren't.
Title: Multiformat@128kbps listening test - FINISHED
Post by: amano on 2004-05-24 11:20:41
AC3??? Vorbious???

Get some sleep.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Lyx on 2004-05-24 11:52:35
maybe it would make sense to rename "iTunes" to "iTunes AAC" in the summary chart, so that people do not mistake the iTunes result with its lousy mp3-encoder?

- Lyx
Title: Multiformat@128kbps listening test - FINISHED
Post by: cuan on 2004-05-24 12:01:34
lame's result is fairly amazing. I was about to begin encoding my cd collection into iTunes aac for an iPod I'm about to purchase. I think I'll just stick with lame now. Its level of quality combined with its compatibility across mp3 players is an unbeatable combination.
Title: Multiformat@128kbps listening test - FINISHED
Post by: bond on 2004-05-24 12:07:28
rjamorim, can you plz make a zoomed "music store codecs only" chart too (aac, wma9, atrac3), i think it would be very interesting and important to have such a chart handy for showing people that when they have to choose where they should buy songs from, not only the prices but also the quality is very important and varies a lot

btw did i already thank you for your great test? thanks a lot!

Quote
maybe it would make sense to rename "iTunes" to "iTunes AAC" in the summary chart, so that people do not mistake the iTunes result with its lousy mp3-encoder?

yepa and maybe add "mp3" to lame too, (and maybe ogg to vorbis) at least in the final chart to exclude all possible misunderstandings
Title: Multiformat@128kbps listening test - FINISHED
Post by: QuantumKnot on 2004-05-24 12:11:18
A big thank you to Roberto for his efforts in conducting this test.  Let's hope that it is not the last too 
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-24 12:28:15
Quote
This particular test should be called, "The 128 kbps test for iTunes/WMA, and the low-130 test for Atrac3 and LAME, and the close-to-160 test for MPC/Vorbis.


Yup, it's hard to compare CBR encoders with VBR encoders.
Everything you do is wrong

Usually all encoders tend to produce files at around 128 kbps on an "average" sound file with the same settings. That's why I think it's ok to compare these codecs with these settings. Many test samples were chosen to be hard-to-encode (weren't they?). VBR encoders use higher bitrates in those complex situations. CBR encoders don't.
Bad Luck for the CBR encoders.

So... you can ask yourself: Is the choice of test samples fair ?
I don't know...

bye,
Sebastian
Title: Multiformat@128kbps listening test - FINISHED
Post by: JeanLuc on 2004-05-24 12:37:17
Quote
This particular test should be called, "The 128 kbps test for iTunes/WMA, and the low-130 test for AC3 and LAME, and the close-to-160 test for MPC/Vorbious.

Leahy     iTunes   MPC    Vorbis   Lame   WMA    Atrac3
bitrate   128      155    149      133    128    132
Score     4.34     4.41   4.68     4.11   4.37   3.76

I am aware of the rationalization. I am aware of the overall average. But let this be a n"oh?" to those that don't and aren't.

That's why I suggested to put the bitrates into the score graphs for each sample ... so everyone can see at which average bitrate the codec's result has been obtained.
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-24 12:39:27
Quote
lame's result is fairly amazing. I was about to begin encoding my cd collection into iTunes aac for an iPod I'm about to purchase. I think I'll just stick with lame now. Its level of quality combined with its compatibility across mp3 players is an unbeatable combination.

I suggest you also do your own tests, concentrating for example on pre-echo etc.  (I'm not saying that either one is better; I have not compared LAME 3.96 -V5 --athaa-sensitivity 1 against iTunes 4.2 on pre-echo).
Remember, however, that these are average results from a group with a restricted number of samples and listeners with different abilities. They show the average quality pretty well, but don't necessarily show some of the details which might be interesting for you.
Also, I think that Lame 3.96 -V5 --athaa-sensitivity 1 has not been tested enough to say it doesn't fail (badly) in certain cases, even pretty often. Imo iTunes 4.2 AAC is safer in this sense.

But, if it's not so big deal, that Lame setting does seem on average pretty good.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Digga on 2004-05-24 13:02:06
Quote
A big thank you to Roberto for his efforts in conducting this test.  Let's hope that it is not the last too 

I second the thanks to Roberto and everyone else involved (including all the testers).

Roberto: come on, be honest, you would really miss all the hick-hack and nag-nag going hand in hand with the tests, wouldn't you
Title: Multiformat@128kbps listening test - FINISHED
Post by: diskvask on 2004-05-24 13:04:39
Quote
Quote
What about all the /.ers?  :rolleyes:
Seems they were just interested in wasting bandwidth after all  :lol:

More than 500 people downloaded the samples through bittorrent only - not counting HTTP downloads! :B

I won't ever understand these people.  :frustrated:

I think a lot of people thought that the test was going to be very easy (me included), "Come on, it's 128kbit! That sounds like crap, everybody knows that.".

...only to find that no major imperfections could be found in the couple of samples tried. Sample 1 looks like it was one of the hardest ones to ABX; a very tough start, especially for someone who had set his mind on the assumption above.

And besides, abx is an exhausting way of testing and it can be very frustrating/unmotivating if you don't get the results you're expecting ;).
Title: Multiformat@128kbps listening test - FINISHED
Post by: Jojo on 2004-05-24 13:07:40
Quote
wow, now that's what i didn't expect

- wma9: lol, worse than mp3! (and i even wonder that it got rated that high, even at 128 it had this metallic sound sometimes) -> go away m$

it's a pity that wma9 Pro wasn't included in the test ...last time it was included it performed quite well
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-24 13:26:21
Quote
Quote
wow, now that's what i didn't expect

- wma9: lol, worse than mp3! (and i even wonder that it got rated that high, even at 128 it had this metallic sound sometimes) -> go away m$

it's a pity that wma9 Pro wasn't included in the test ...last time it was included it performed quite well

The answer to why wma9 pro was not included is here: http://www.hydrogenaudio.org/forums/index.php?showtopic=20301&view=findpost&p=199103
Title: Multiformat@128kbps listening test - FINISHED
Post by: dev0 on 2004-05-24 13:30:39
Quote
This particular test should be called, "The 128 kbps test for iTunes/WMA, and the low-130 test for AC3 and LAME, and the close-to-160 test for MPC/Vorbious.

Leahy     iTunes   MPC    Vorbis   Lame   WMA    Atrac3
bitrate   128      155    149      133    128    132
Score     4.34     4.41   4.68     4.11   4.37   3.76

I am aware of the rationalization. I am aware of the overall average. But let this be a n"oh?" to those that don't and aren't.

Where did you get those numbers from?
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-24 13:35:57
@ Roberto

A big thanks for making this test possible. I hope you reconsider making more tests in the future.

About the test results, I noticed that for some samples there are no confidence intervals on the graphs (bartok_strings, leahy, mahler, ordinary world). Did everybody score exactly the same on these samples, or maybe you just forgot to put the intervals on the graphs?
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-24 13:44:35
Quote
This particular test should be called, "The 128 kbps test for iTunes/WMA, and the low-130 test for AC3 and LAME, and the close-to-160 test for MPC/Vorbious.

Leahy     iTunes   MPC    Vorbis   Lame   WMA    Atrac3
bitrate   128      155    149      133    128    132
Score     4.34     4.41   4.68     4.11   4.37   3.76

I am aware of the rationalization. I am aware of the overall average. But let this be a n"oh?" to those that don't and aren't.

See here how the average bitrates were decided for this test (personally I'm not absolutely sure if it was enough). Obviously those settings in the table close to 128 were used:
http://www.hydrogenaudio.org/forums/index.php?showtopic=21079&view=findpost&p=207203

Also the correct average bitrates for the 18 samples tested are (instead of what you said):
Code: [Select]
iTunes MPC   aoTuV  Lame    WMA  Atrac3
128    136     135   134    128    132
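
To put those averages in perspective, a short calculation shows how far each encoder landed from the nominal 128 kbps target over the 18 samples. The bitrates are the ones in the table above; the helper name is made up for illustration.

```python
TARGET_KBPS = 128

# Average bitrates over the 18 test samples, as listed in the table above.
average_kbps = {
    "iTunes": 128,
    "MPC": 136,
    "aoTuV": 135,
    "Lame": 134,
    "WMA": 128,
    "Atrac3": 132,
}


def percent_over_target(kbps: float, target: float = TARGET_KBPS) -> float:
    """Relative deviation from the target bitrate, in percent."""
    return 100.0 * (kbps - target) / target


for codec, kbps in average_kbps.items():
    print(f"{codec:7s} {kbps:4d} kbps  {percent_over_target(kbps):+.2f}%")
```

The largest overshoot in this table is MPC at 6.25%, well short of the "close-to-160" figure claimed earlier from a single sample.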
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-24 13:54:04
Roberto> what software did you use to obtain the wma9 files? Is it VBR-2 pass 128 kbps? What decoder? I've tried to reproduce the same waveform with different settings, and I wasn't able to do it.
Title: Multiformat@128kbps listening test - FINISHED
Post by: 2Bdecided on 2004-05-24 13:57:05
I'm very sorry I couldn't participate in this test.

However, many thanks to you Roberto - and isn't it nice to have such interesting results? I don't think many people expected this.

To me, these bitrates are very interesting. It's amazing both how good, and how bad, some things can sound at 128kbps.

Also, I bet almost every sample in this test was light years better than most people's typical experience of 128kbps lossy coding!

Cheers,
David.
Title: Multiformat@128kbps listening test - FINISHED
Post by: harashin on 2004-05-24 13:59:23
Quote
Roberto> what software did you use to obtain the wma9 files? Is it VBR-2 pass 128 kbps? What decoder? I've tried to reproduce the same waveform with different settings, and I wasn't able to do it.

I already asked him about this.
http://www.hydrogenaudio.org/forums/index.php?showtopic=21370&view=findpost&p=210584

EDIT:It's certainly Bitrate VBR 128kbps, 44kHz, stereo VBR 1pass.

BTW, a thread is made at Slashdot about the test results.
Vorbis And Musepack Win 128kbps Multiformat Test (http://slashdot.org/article.pl?sid=04/05/24/0623247&mode=thread&tid=141&tid=185&tid=188)
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-24 14:04:05
harashin> I need to try again. I probably made a mistake. Thanks


Other question: did someone post these results on minidisc-dedicated boards? It would be interesting, because a lot of MD users have often said on these boards that atrac3@132 = mp3@192...
Title: Multiformat@128kbps listening test - FINISHED
Post by: kalmark on 2004-05-24 14:18:49
Thanks rjamorim for this test, and all testers for their time and results!

And imagine, the first Hungarian internet music store uses WMA instead of MP3, because it has "better quality"  (And DRM, but they could have used AAC for DRM'd music)

Yeah, and thanks for the Hungarian references in the samples!  (Bartók, Dances Hongroises - or whatever, they're "Magyar táncok" in Hungarian  )
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cygnus X1 on 2004-05-24 15:04:03
Quote
Other question: did someone post these results on minidisc-dedicated boards? It would be interesting, because a lot of MD users have often said on these boards that atrac3@132 = mp3@192...

Actually, I used to hear things to the effect of "LP2 (132kbps) sounds better than a 320kbps MP3!!!" As much as I would love to send an email to the webmaster of the MD Community page or create a thread on one of their forums, I don't think posting the results will have much of an effect. If my past experiences are any indication of the MD community's openness to such data, it will be ignored or disputed via the usual subjective, empirically-dubious arguments one might expect to be prevalent on such boards. For example, when I ABX'ed ATRAC-R with ease a few years ago and posted the results in the most unbiased manner possible, I was either flamed for being "anti-MD" or just outright ignored. The truth hurts, eh? 

Roberto, kudos on another great test!
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-24 15:25:38
Cygnus X1> I've also read that MD > CD, because MD is 292 kbps and CD 176 

This current test is nevertheless different: there's not only one “biased” tester who posted false results, but a whole community, with 18 samples, under ABX conditions. Some opinions might change
Title: Multiformat@128kbps listening test - FINISHED
Post by: PoisonDan on 2004-05-24 16:36:46
Cool, these results clearly show that

WMA 1ST DEATH :[

/me wonders what the Extremetech editors will think when - if - they see these results...
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-24 17:30:30
This just in:

http://slashdot.org/comments.pl?sid=108647&cid=9238730 (http://slashdot.org/comments.pl?sid=108647&cid=9238730)

and

http://slashdot.org/comments.pl?sid=108647&cid=9238686 (http://slashdot.org/comments.pl?sid=108647&cid=9238686)

"actual ranking is Vorbis, iTunes, MPC, Lame, WMA, Atrac3" after "a quick-n-dirty compensation, [using] the average scores times 128 over the average bitrate."

ff123
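
For reference, the "quick-n-dirty compensation" quoted above amounts to the arithmetic below. It is shown only to make the formula concrete, not to endorse it. The inputs are the Leahy scores and bitrates posted earlier in this thread, and the function name is an assumption.

```python
def compensated(score: float, kbps: float, target: float = 128.0) -> float:
    """The quoted formula: score times target bitrate over actual bitrate."""
    return score * target / kbps


# (score, bitrate in kbps) for the Leahy sample, as posted earlier in the thread.
leahy = {
    "iTunes": (4.34, 128),
    "MPC": (4.41, 155),
    "Vorbis": (4.68, 149),
    "Lame": (4.11, 133),
    "WMA": (4.37, 128),
    "Atrac3": (3.76, 132),
}

adjusted = {codec: compensated(score, kbps) for codec, (score, kbps) in leahy.items()}
ranked = sorted(adjusted, key=adjusted.get, reverse=True)

for codec in ranked:
    print(f"{codec:7s} {adjusted[codec]:.2f}")
```

On this one sample the formula moves both CBR encoders to the top simply because they sit at exactly 128 kbps, which gives an idea of how strongly the outcome depends on per-sample bitrate.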
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-24 17:34:20
Quote
This just in:

http://slashdot.org/comments.pl?sid=108647&cid=9238730 (http://slashdot.org/comments.pl?sid=108647&cid=9238730)

and

http://slashdot.org/comments.pl?sid=108647&cid=9238686 (http://slashdot.org/comments.pl?sid=108647&cid=9238686)

"actual ranking is Vorbis, iTunes, MPC, Lame, WMA, Atrac3" after "a quick-n-dirty compensation, [using] the average scores times 128 over the average bitrate."

ff123

It wasn't very hard to guess that these "compensators" would appear, people who don't want to understand the concept of VBR or how the test was conducted. Though I remember something like this on HA last time as well. 
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 18:03:45
Quote
rjamorim, can you plz make a zoomed "music store codecs only" chart too (aac, wma9, atrac3), i think it would be very interesting and important to have such a chart handy for showing people that when they have to choose where they should buy songs from, not only the prices but also the quality is very important and varies a lot

Here it is. I probably won't add this graph to the results page. There are already plenty of graphs there.
http://www.rjamorim.com/test/multiformat128/plot18b.png (http://www.rjamorim.com/test/multiformat128/plot18b.png)

Quote
Quote
maybe it would make sense to rename "iTunes" to "iTunes AAC" in the summary chart, so that people do not mistake the iTunes result with its lousy mp3-encoder?

yepa and maybe add "mp3" to lame too, (and maybe ogg to vorbis) at least in the final chart to exclude all possible misunderstandings


God, no. If people are that uninformed, they shouldn't be even reading those results.

Quote
A big thank you to Roberto for his efforts in conducting this test. Let's hope that it is not the last too


Penultimate

Quote
That's why I suggested to put the bitrates into the score graphs for each sample ... so everyone can see at which average bitrate the codec's result has been obtained.


That will lead people to link bitrates with scores, just as happened at /. - and that is wrong.

Quote
I think a lot of people thought that the test was going to be very easy (me included), "Come on, it's 128kbit! That sounds like crap, everybody knows that.".


No worries, next test will be at 48kbps. Even tone-deaf people and people with crappy $5 speakers (like me  ) will be able to participate.

Quote
it's a pity that wma9 Pro wasn't included in the test  ...last time it was included it performed quite well


It hasn't changed a bit since last test. And I personally believe including WMA Pro in that test was a mistake (my second biggest mistake in test conduction, perhaps). When I included it, I expected microsoft would soon start pushing it with all the might of their marketing department to make it replace WMA Std. Alas, that didn't happen. Microsoft seems to have settled on focusing WMA Pro on DVD players and industry usage, and keeping WMA Std. for consumer usage (portables, online stores, ripping at home...)

Moving on to next post...
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 18:10:43
Quote
About the test results, I noticed that for some samples there are no confidence intervals on the graphs (bartok_strings, leahy, mahler, ordinary world). Did everybody score exactly the same on these samples, or maybe you just forgot to put the intervals on the graphs?

Those samples had too few listeners and/or results were too close to each other. When that happens, friedman.exe doesn't output the LSD (which is essential to build the confidence intervals) and says that results are "not significant" (which in practice means they are tied)
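
As a rough illustration of the "not significant" check, here is a minimal sketch of the textbook Friedman statistic. It is not friedman.exe's actual code: the function names are assumptions, no tie correction is applied, the LSD step is not reproduced, and the chi-square critical value is only an approximation for small listener counts.

```python
def average_ranks(row):
    """Rank one listener's scores (1 = lowest); tied scores share their average rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores equal to the one at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[listener][codec] -> Friedman chi-square statistic (no tie correction)."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for codec, rank in enumerate(average_ranks(row)):
            rank_sums[codec] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Chi-square critical value for 5 degrees of freedom (six codecs) at alpha = 0.05.
CHI2_CRIT_DF5_P05 = 11.07


def is_significant(scores, critical=CHI2_CRIT_DF5_P05):
    """True when the codecs' rank sums differ more than chance would suggest."""
    return friedman_statistic(scores) > critical
```

When every listener rates the codecs about the same, the rank sums come out nearly equal, the statistic stays near zero, and the sample is reported as tied.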

Quote
Other question: did someone post these results on minidisc-dedicated boards? It would be interesting, because a lot of MD users have often said on these boards that atrac3@132 = mp3@192...


MWAHAHA!
Title: Multiformat@128kbps listening test - FINISHED
Post by: kalmark on 2004-05-24 18:13:51
Quote
No worries, next test will be at 48kbps. Even tone-deaf people and people with crappy $5 speakers (like me  ) will be able to participate.

Time for me to give it a try 
Even the "Waiting" sample was un-ABX-able for me, when I tried at higher bitrates, I might be deaf 
Title: Multiformat@128kbps listening test - FINISHED
Post by: Latexxx on 2004-05-24 18:17:38
Quote
Quote
Other question: did someone post these results on minidisc-dedicated boards? It would be interesting, because a lot of MD users have often said on these boards that atrac3@132 = mp3@192...


MWAHAHA!

Somebody has posted the results at http://forums.minidisc.org/viewtopic.php?p=22300 (http://forums.minidisc.org/viewtopic.php?p=22300) Nobody has dared to answer yet. Maybe all the minidisc guys have had a heart attack after reading the results.
Title: Multiformat@128kbps listening test - FINISHED
Post by: bond on 2004-05-24 18:20:01
Quote
Here it is. I probably won't add this graph to the results page. There are already plenty of graphs there.
http://www.rjamorim.com/test/multiformat128/plot18b.png (http://www.rjamorim.com/test/multiformat128/plot18b.png)

why not? i would vote that everyone who thinks about buying songs from a music store which offers wma@128 should be forced to stare at this graph for two hours 

maybe adding lame to this graph would also be good, to prove that probably even the "kazaa music store" will offer better quality at more reasonable prices than wma-based ones
Title: Multiformat@128kbps listening test - FINISHED
Post by: p0wder on 2004-05-24 18:22:33
Quote
Somebody has posted the results at http://forums.minidisc.org/viewtopic.php?p=22300 (http://forums.minidisc.org/viewtopic.php?p=22300) Nobody has dared to answer yet. Maybe all the minidisc guys have had a heart attack after reading the results.

Let's all go over there and flame them.  Mwahahaha!
Title: Multiformat@128kbps listening test - FINISHED
Post by: Latexxx on 2004-05-24 18:27:50
Quote
Quote

Somebody has posted the results at http://forums.minidisc.org/viewtopic.php?p=22300 (http://forums.minidisc.org/viewtopic.php?p=22300) Nobody has dared to answer yet. Maybe all the minidisc guys have had a heart attack after reading the results.

Let's all go over there and flame them.  Mwahahaha!

http://microsoftusernetwork.com/forum/viewtopic.php?p=275 (http://microsoftusernetwork.com/forum/viewtopic.php?p=275)
Title: Multiformat@128kbps listening test - FINISHED
Post by: upNorth on 2004-05-24 18:29:26
Quote
Not very hard to guess that these "compensators" appear who don't want to understand the concept of vbr and how the test was conducted. Though last time I remember something like this on HA also.
I have actually been waiting for the complaining about unfair bitrates to start. I personally find it hard to understand the complaining, after the reasoning behind it has been explained over and over.

Btw: Slashdot seems like a nice place to waste time. Why bother with the facts when you can assume things instead and base the discussion on these assumptions (in this case, why bother reading how the test was performed? Let's just assume how it was done and base the discussion on that). I only visit Slashdot when someone posts a link on this forum, and I don't plan on spending more time there either.

Reading discussions at Slashdot, makes me want to thank the Hydrogenaudio staff for keeping Hydrogenaudio a source of information, instead of a source of speculation, assumption and incorrect information. A big thank you to you all!

Also thanks to Roberto for yet another interesting test. Too bad they are always conducted when I have to prepare for my exams, but I don't really have a suitable listening environment for performing such tests anyway.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 18:40:11
Quote
maybe adding lame to this graph would also be good, to prove that probably even the "kazaa music store" will offer better quality at more reasonable prices than wma-based ones

You won't find many lame -V5 --athaa-sensitivity 1 at Kazaa, I reckon 
Title: Multiformat@128kbps listening test - FINISHED
Post by: ep0ch on 2004-05-24 18:44:41
IMHO the test is meaningless.  If codecs can go over the 128 kbps average then essentially these codecs have cheated. 

Don't get me wrong I've encoded most of my tunes into Vorbis as I feel it's a better codec than others, but I would like to see a fair comparison that fits the title of this listening test!!
Title: Multiformat@128kbps listening test - FINISHED
Post by: ilikedirtthe2nd on 2004-05-24 18:53:00
Quote
IMHO the test is meaningless.  If codecs can go over the 128 kbps average then essentially these codecs have cheated.

oh, come on! not again...
Title: Multiformat@128kbps listening test - FINISHED
Post by: Hyrok on 2004-05-24 18:56:23
Good Test
Just the average bitrate is a little bit too high imo (itunes is ok though)...

*edit* hadn't seen ep0ch's post before...
Title: Multiformat@128kbps listening test - FINISHED
Post by: ep0ch on 2004-05-24 18:57:09
Quote
oh, come on! not again...


Heh, it was asked for  Just my opinion!

I don't see how you can possibly disagree that it would have been a worse test if each sample had been encoded to give an average 128kbps per sample... oh I'll shut up to keep you all happy!!
Title: Multiformat@128kbps listening test - FINISHED
Post by: atici on 2004-05-24 18:59:26
*yawns* Well, the failure of the CBR codecs in this test is down to their developers: they did not write code smart enough to alter bitrate allocation dynamically. VBR is the key issue that requires proper tuning -> the test is fair.

Well, ATRAC has a cool name and that's about all, I guess. It looks like this test will be cited many times in the upcoming years to whoever claims ATRAC@132 sounds better than MP3@320.  Maybe it's a good idea to open a Codec Comparison Forum to post these tests for easy future access.
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-05-24 19:03:36
I really wish people wouldn't post links to slashdot.  It's an incomparable juggernaut of stupidity.  I can't resist the urge to go over and yell at them.  It usually takes me about half an hour to realize that I'm just sticking my finger in a dike.

Quote
IMHO the test is meaningless. If codecs can go over the 128 kbps average then essentially these codecs have cheated.

AHHHHHHH!!  You've got slashdotitus!  Do they make a vaccine for this yet?

*runs off to wash hands*
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 19:03:46
Title: Multiformat@128kbps listening test - FINISHED
Post by: cuan on 2004-05-24 19:06:55
Quote
Also I think that Lame 3.96 -V5 --athaa-sensitivity 1 is not tested enough to say it doesn't fail (badly) in certain cases even pretty often. Imo iTunes 4.2 AAC in this sense is more safe.

I take your points into consideration but tbh i don't think my ears are up to the challenge of telling the two apart. Although im still undecided as to what format i will use.. i can never seem to make up my mind on these things 


thanx to rjamorim for a very informative listening test.


-Brian
Title: Multiformat@128kbps listening test - FINISHED
Post by: rpop on 2004-05-24 19:13:18
Quote
I don't see how you can possibly disagree that it would have been a worse test if each sample had been encoded to give an average 128kbps per sample...

It would have been worse because it wouldn't simulate a real-life situation. In real life, you'd choose one setting, and encode your music with it.. you wouldn't spend time encoding every single song 3 or 4 times until you reach 128kbps average.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Mac on 2004-05-24 19:45:46
My thanks to Roberto for organising such an enlightening test after the string of setbacks, and to aoTuV for improving Vorbis to such an impressive level

Roll on the 48k test!
Title: Multiformat@128kbps listening test - FINISHED
Post by: mobius on 2004-05-24 19:46:21
Quote
Btw: Slashdot seems like a nice place to waste time. Why bother with the facts when you can assume things instead and base the discussion on these assumptions (in this case, why bother reading how the test was performed? Let's just assume how it was done and base the discussion on that). I only visit Slashdot when someone posts a link on this forum, and I don't plan on spending more time there either.


Same here.  I poked my head into the discussion surrounding this particular news item.  Had to get out before my anger welled.  Lots of big mouths just waiting for the chance to open and blah blah blah BLAH.  How totally useless.  The headlines are all that is interesting.
Title: Multiformat@128kbps listening test - FINISHED
Post by: DAvenger on 2004-05-24 20:43:20
Vorbis winning (I know, MPC is very close) a listening test?    That didn't happen for a long time 
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-24 20:45:24
Heh, as is tradition, I would like to thank Roberto and the participants for the test !
Good work, guys !

But I have one question about MPC average bitrate.
It seems (I may be wrong) that for those two samples (Debussy, CouldBeSweet) the bit allocation mechanism or something else fails badly.
So, if we exclude those two samples, the average bitrate of MPC will be about 142 kbit/s for such a setting (close to the average bitrate in the previous test).
I can't say this as a professional, but maybe those two bitrate values should be excluded from the average bitrate calculation ?
As I remember, it can be checked whether these bitrate results are statistically significant enough to include in the calculation...
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-24 21:09:27
Quote
But I have one question about MPC average bitrate.
It seems (I may be wrong) that for those two samples (Debussy, CouldBeSweet) the bit allocation mechanism or something else fails badly.
So, if we exclude those two samples, the average bitrate of MPC will be about 142 kbit/s for such a setting (close to the average bitrate in the previous test).
I can't say this as a professional, but maybe those two bitrate values should be excluded from the average bitrate calculation ?
As I remember, it can be checked whether these bitrate results are statistically significant enough to include in the calculation...

ItCouldBeSweet was purposely inserted into the test to compensate somewhat for the higher average bitrate of the other samples (Debussy was not chosen specifically for bitrate).

I don't think it's a matter of the bit allocation mechanism failing, it's that the samples were chosen such that their average bitrates were generally higher than 128 kbit/s.  The rationale was that having such samples would make defects easier to detect.  This is true for defects like pre-echo, but as we saw from the test, if having a high bitrate helps the VBR codecs, having a very low bitrate can also hurt it.  In other words, problem samples can be found at either end of the bitrate spectrum.

I think the bitrate criticism has some validity, but probably not to the extent that the overall results would have been significantly different if the average bitrates were closer to 128 kbit/s.  It's an oversimplification to assume a linear degradation with average bitrate.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: ddrawley on 2004-05-24 21:23:00
Would I be opening up a can of worms to ask if this indicates that 3.96 should now be the recommended compile?
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-24 21:40:44
Quote
ItCouldBeSweet was purposely inserted into the test to compensate somewhat for the higher average bitrate of the other samples (Debussy was not chosen specifically for bitrate).

Quote
it's that the samples were chosen such that their average bitrates were generally higher...

Oh, I didn't know it...
Thanks for clarification ff123 !
I've got the point, it is a material to think about.
Quote
I think the bitrate criticism has some validity, but probably not to the extent that the overall results would have been significantly different...

Agree completely 
This was not a criticism, really. English is not my native language, as you can see, so sometimes I cannot explain my thoughts clearly, sorry.
But I will try 
My point about the MPC (not Vorbis) bitrate was:
when you compute the average of a column of statistical data, you must exclude values that lie outside the 3-sigma boundaries. Only then is the result statistically valid.
So, in other words, did MPC with the setting used *really* produce an average bitrate of 136 kbit/s ?
It seems not. Those two strange samples skew the statistics a bit, because they were specifically chosen (at least one of them). So users could be confused when the real average bitrate with such a setting turns out to be 142 kbit/s...
BTW, this does not affect the rating calculations, only the bitrate...
This is my IMHO, of course.
Any opinion (and a clarification that I'm wrong too    ) will be greatly appreciated
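
The 3-sigma idea above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_within_sigma` and the use of the population standard deviation are assumptions, not anything the test actually used, and no bitrates from the test are reproduced here.

```python
from statistics import mean, pstdev


def mean_within_sigma(values, k=3.0):
    """Return (mean of kept values, dropped values) after a k-sigma cut.

    Hypothetical helper: values farther than k population standard
    deviations from the plain mean are dropped before averaging.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is nothing to exclude.
        return mu, []
    kept = [v for v in values if abs(v - mu) <= k * sigma]
    dropped = [v for v in values if abs(v - mu) > k * sigma]
    if not kept:
        # Only possible for very small k; fall back to the plain mean.
        return mu, []
    return mean(kept), dropped
```

With only 18 samples, two very low bitrates inflate the standard deviation themselves, so a strict 3-sigma cut may well keep them. It is worth checking what actually gets dropped before quoting a "corrected" average.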
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-24 21:48:18
Quote
My point about the MPC (not Vorbis) bitrate was:
when you compute the average of a column of statistical data, you must exclude values that lie outside the 3-sigma boundaries. Only then is the result statistically valid.
So, in other words, did MPC with the setting used *really* produce an average bitrate of 136 kbit/s ?
It seems not. Those two strange samples skew the statistics a bit, because they were specifically chosen (at least one of them). So users could be confused when the real average bitrate with such a setting turns out to be 142 kbit/s...
BTW, this does not affect the rating calculations, only the bitrate...
This is my IMHO, of course.
Any opinion (and a clarification that I'm wrong too    ) will be greatly appreciated

Yes, I understand your point.

Ideally, you'd like the bitrate distribution to look somewhat like a bell curve with its mean at 128 kbit/s.

The two samples with extremely low bitrate do not compensate very well for the other 16 samples which are generally skewed above 128 kbit/s.

For the 48 kbit/s test, if there are VBR codecs, I think we should strive to have about an equal number of bitrates above and below the average bitrate (which should work out to be 48 kbit/s on average across the sample set).

The rationale about wanting to use "hard" samples does not apply at low bitrates.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: Dologan on 2004-05-24 21:50:27
Quote
Would I be opening up a can of worms to ask if this indicates that 3.96 should now be the recommended compile?

IMHO, this test is pretty good evidence that Lame 3.96 performs better than 3.90.3 at these bitrates, at least. Personally, this is enough to convince me to use 3.96 for mid-bitrates from now on, whatever the recommended version happens to be.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-24 21:50:47
Quote
This is true for defects like pre-echo, but as we saw from the test, if having a high bitrate helps the VBR codecs, having a very low bitrate can also hurt it.

Ehhh. While writing my previous reply I started thinking about it.
Things are not that easy with VBR encodings ... 
Maybe two-pass ABR is the best encoding mode ? 
Title: Multiformat@128kbps listening test - FINISHED
Post by: music_man_mpc on 2004-05-24 21:53:02
Quote
Would I be opening up a can of worms to ask if this indicates that 3.96 should now be the recommended compile?

I think that this test should have little relevance as far as the recommended LAME version goes.  Remember --preset standard is the setting we are really worried about there.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 22:07:42
Quote
Maybe two-pass ABR is the best encoding mode ? 

It is the best for test conductors, for sure :B

Too bad only WMA implements it.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-24 22:11:44
Quote
Ideally, you'd like the bitrate distribution to look somewhat like a bell curve with its mean at 128 kbit/s.
The two samples with extremely low bitrate do not compensate very well for the other 16 samples which are generally skewed above 128 kbit/s.

Yep, that is what I mean.
Anyway, it is great that such tests are performed !
Thanks again !
EDIT:
Quote
It is the best for test conductors, for sure  :B

He-he
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 22:54:16
Quote
Somebody has posted the results at http://forums.minidisc.org/viewtopic.php?p=22300 (http://forums.minidisc.org/viewtopic.php?p=22300) Nobody has dared to answer yet. Maybe all the minidisc guys have had a heart attack after reading the results.

http://forums.minidisc.org/viewtopic.php?p=22321#22321 (http://forums.minidisc.org/viewtopic.php?p=22321#22321)
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-24 22:56:30
Quote
Quote
About the test results, I noticed that for some samples there are no confidence intervals on the graphs (bartok_strings, leahy, mahler, ordinary world). Did everybody score exactly the same on these samples, or maybe you just forgot to put the intervals on the graphs?

Those samples had too few listeners and/or results were too close to each other. When that happens, friedman.exe doesn't output the LSD (which is essential to build the confidence intervals) and says that results are "not significant" (which in practice means they are tied)

Hmmm... If the results are too close to each other then it doesn't make sense to find everything equal. For instance, in the leahy sample Vorbis gets 4.68 and Atrac gets 3.76. If the confidence intervals are so tight there is no way these two are statistically equal. And if there are too few listeners then you cannot make any statistical tests on the samples anyway. BTW how many listeners do you consider too few?

Is there a way you can upload the text files with the individual ranks for each sample tested? It is a real pain to build the tables manually from the xml files and I have to exclude ranked references from the start. I'm asking because maybe I can help with providing statistical results for these samples.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-24 23:04:06
Quote
If the confidence intervals are so tight there is no way these two are statistically equal.

Quite the opposite - the confidence intervals are so broad that they all overlap - so there are no winners and no losers in that sample.

Quote
BTW how many listeners do you consider too few?


To make me happy, I need at least 20 valid results/sample.

Quote
Is there a way you can upload the text files with the individual ranks for each sample tested? It is a real pain to build the tables manually from the xml files and I have to exclude ranked references from the start. I'm asking because maybe I can help with providing statistical results for these samples.


First, download the .rar package containing all the XMLs. Decompress it to an empty folder.

Then, install python and Phong's wonderful Chunky:
http://www.phong.org/chunky/ (http://www.phong.org/chunky/)

In the folder where you decompressed the RAR, run
Code:
python "C:\path\to\chunky" -n --codec-file="C:\path\to\codec\list\codecs.txt" --ratings=results --warn -p 0.05


The codecs.txt should be:
Code:
1, Vorbis
2, MPC
3, Lame
4, iTunes
5, Atrac3
6, WMA


It'll create all result tables (good to be fed to friedman.exe) at the empty folder, and will discard the ranked results that haven't been ABXd to a confidence of 0.05. Chunky is just too wonderful to be true! OMG!
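
For anyone wondering what "ABXd to a confidence of 0.05" means in practice, here is a minimal sketch of the usual one-sided binomial calculation. It assumes each ABX trial is an independent 50/50 guess under the null hypothesis; the function name is hypothetical and this is not Chunky's actual code.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials equally likely ones.
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials
```

Under that assumption, 7 correct out of 8 gives 9/256 (about 0.035), which passes a 0.05 threshold, while 6 out of 8 gives 37/256 (about 0.145) and does not.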

Regards;

Me.
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-05-25 00:04:51
Thanks for the plug Roberto.  :-)

If you have windows you don't need Python installed (the standalone windows binary version should work).  You should also try out the --help option to get some other options.  My personal favorite is the --spreadsheet option to output all the scores in a nice spreadsheet (CSV) format.

I intend to add an option for outputting the listener comments as browseable HTML.

I've tried to make the code fairly accessible, though it's gotten a bit crufty in recent versions (the XML support, for example, isn't as pretty and clean as I would like).  The existing code is "incomplete"; there are some features I'd like to add still, but it does all the heavy lifting already (i.e. parsing result files into useful data structures and filtering out bad results).  So, if you feel like "doing something" with the data and you know Python, feel free to jump in fix features or add bugs.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Rash on 2004-05-25 00:19:33
Thanks a lot Roberto!

I think this test showed us, one more time, that open source is still better than any paid stuff. I don't want to start any Open Source political fight here, as I am not a free software defender most of the time. But, hey come on!

I think what impressed me most in this test was LAME's climb. LAME is doing the impossible with MP3, to improve it even more. Perhaps Gabriel should make a very simple preset that uses this configuration so maybe we can see more and more nice MP3 around.

Anyway, congrats to all for the nice encoders. By the way, where is the winner? Did he see the results already?
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 00:27:25
Quote
By the way, where is the winner? Did he see the results already?

Indeed. Aoyumi should show up to receive a big round of applause. (http://pessoal.onda.com.br/rjamorim/clapping.gif)
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-25 00:33:38
Quote
If you have windows you don't need Python installed (the standalone windows binary version should work).

I get a 404 when I try to download the windows binary.    I'm not on my linux box right now so I'm downloading python for windows (9MB take a long time to download on 56k  )
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-05-25 00:53:07
Ooops, should be fixed now.
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-25 00:54:42
This site apparently interviewed Roberto about the test:

http://p2pnet.net/story/1525 (http://p2pnet.net/story/1525)

They got the contestants wrong, though.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 00:59:20
Quote
This site apparently interviewed Roberto about the test:

http://p2pnet.net/story/1525 (http://p2pnet.net/story/1525)

They got the contestants wrong, though.

Yeah, the site author mailed me earlier today asking for comments, and for that sexy picture.

I'll mail him asking him to correct the competitors list.

Another news site mentioning my test:
http://www.afterdawn.com/news/archive/5257.cfm (http://www.afterdawn.com/news/archive/5257.cfm)
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-25 01:07:28
Quote
Ooops, should be fixed now.

Yes it is. Thanks.
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-25 01:27:37
Nope. It just won't work for me. All I get from chunky is
Code:
Parsing result files...
Traceback (most recent call last):
 File "chunky", line 639, in ?
 File "chunky", line 595, in main
 File "abchr_parser.pyc", line 634, in __init__
 File "abchr_parser.pyc", line 646, in _handleTargets
 File "abchr_parser.pyc", line 697, in __init__
abchr_parser.Error: Sample directory names must end in a number.
But the directory names already end in a number! 
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 01:30:02
Quote
But the directory names already end in a number! 

The folder where you run chunky from (and where the SampleXX folders are) must be otherwise empty.

I.e., no files there, only the 18 folders.
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-25 01:37:56
OK the program executed but all I got were files with:
Quote
%
% !EMPTY!:

Vorbis   MPC   Lame   iTunes   Atrac3   WMA

% Codec averages:
% 0.00   0.00   0.00   0.00   0.00   0.00

huh?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cygnus X1 on 2004-05-25 01:50:10
Quote
Quote
Somebody has posted the results at http://forums.minidisc.org/viewtopic.php?p=22300 (http://forums.minidisc.org/viewtopic.php?p=22300) Nobody has dared to answer yet. Maybe all the minidisc guys have had a heart attack after reading the results.

http://forums.minidisc.org/viewtopic.php?p=22321#22321 (http://forums.minidisc.org/viewtopic.php?p=22321#22321)

Almost 400 page views and not a peep from anybody. I find this sort of response interesting...it's not like we're personally attacking MD. The test simply showed that its performance isn't up to par, fair or not. The exact same thing happened when I used to talk about pre-echo samples with ATRAC Type-R. I have to wonder, though, how many readers of that thread will rush out and buy a 1GB "hi-md" machine once they come out...although ATRAC3plus is technically a different animal (much bigger transform window, etc) than ATRAC3, my expectations aren't very high for it either.

(Edit: I kant sphell)
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 01:57:06
Quote
OK the program executed but all I got were files with:

Oops.

Last detail: the file extension must be .txt :B

So please rename all xmls to txt (ren /s *.xml *.txt)
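
If the `ren` command above is not available in your shell, the same rename can be sketched in Python. This assumes the goal is simply to change every `.xml` under the working folder (including the SampleXX subfolders) to `.txt`; the function name is made up for illustration.

```python
from pathlib import Path


def rename_xml_to_txt(root="."):
    """Rename every *.xml file under root to *.txt and return the new paths."""
    # Collect the matches first so the tree is not modified while it is being walked.
    targets = sorted(Path(root).rglob("*.xml"))
    renamed = []
    for src in targets:
        if not src.is_file():
            continue
        dst = src.with_suffix(".txt")
        if dst.exists():
            # Skip rather than overwrite an existing result file.
            continue
        src.rename(dst)
        renamed.append(dst)
    return renamed
```

Calling `rename_xml_to_txt()` from the folder that holds the sample folders should leave the files ready for chunky, under the assumptions above.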
Title: Multiformat@128kbps listening test - FINISHED
Post by: echo on 2004-05-25 03:06:53
Quote
Oops.

Last detail: the file extension must be .txt :B

So please rename all xmls to txt (ren /s *.xml *.txt)

Got it! Finally!    Thanks a lot Roberto. Here are some character graphs with confidence intervals included. I will make proper graphs tomorrow after I get some sleep if you like (it's already 5:00 in the morning here!  )

Bartok (p=0.851)
Code:
Level       N      Mean     StDev  -----+---------+---------+---------+-
Atrac3     16    4.3125    1.2894   (-----------*-----------)
iTunes     16    4.6438    0.9040           (-----------*-----------)
Lame       16    4.4688    0.8845      (------------*-----------)
MPC        16    4.5438    0.9736        (------------*-----------)
Vorbis     16    4.7375    0.5439             (-----------*------------)
WMA        16    4.4125    1.1621     (-----------*------------)
                                  -----+---------+---------+---------+-
Pooled StDev =   0.9880               4.00      4.40      4.80      5.20


Leahy (p=0.532)
Code:
Level       N      Mean     StDev  -------+---------+---------+---------
Atrac3     12     3.758     1.672   (---------*--------)
iTunes     12     4.242     1.161          (---------*--------)
Lame       12     4.108     1.157        (---------*--------)
MPC        12     4.408     0.955            (---------*---------)
Vorbis     12     4.683     0.876                (---------*---------)
WMA        12     4.367     1.130            (--------*---------)
                                  -------+---------+---------+---------
Pooled StDev =    1.186                 3.50      4.20      4.90


Mahler (p=0.660)
Code:
Level       N      Mean     StDev  ---------+---------+---------+-------
Atrac3     12     3.617     1.777  (----------*---------)
iTunes     12     4.092     1.328         (---------*----------)
Lame       12     4.167     1.170          (----------*---------)
MPC        12     4.517     0.735               (----------*---------)
Vorbis     12     4.292     1.076            (---------*----------)
WMA        12     4.142     1.323          (---------*----------)
                                  ---------+---------+---------+-------
Pooled StDev =    1.274                   3.50      4.20      4.90


Ordinary world (p=0.846)
Code:
Level       N      Mean     StDev  ------+---------+---------+---------+
Atrac3     13    4.4769    0.7939       (------------*------------)
iTunes     13    4.3846    1.0808     (------------*------------)
Lame       13    4.4538    0.8866      (------------*------------)
MPC        13    4.7077    0.6396             (------------*------------)
Vorbis     13    4.7077    0.7065             (------------*------------)
WMA        13    4.3077    1.3775   (------------*------------)
                                  ------+---------+---------+---------+
Pooled StDev =   0.9478                4.00      4.40      4.80      5.20
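
The "Pooled StDev" line in each table above can be reproduced from the per-codec N and StDev columns. A minimal sketch, assuming the tables come from a standard one-way ANOVA summary in which the pooled value is the degrees-of-freedom-weighted root mean of the group variances. The function names are hypothetical, and the t critical value is left as a caller-supplied parameter rather than guessed.

```python
import math


def pooled_stdev(groups):
    """groups: iterable of (n, stdev) pairs, one pair per codec."""
    groups = list(groups)
    # Weight each group's variance by its degrees of freedom (n - 1).
    num = sum((n - 1) * sd ** 2 for n, sd in groups)
    den = sum(n - 1 for n, _ in groups)
    return math.sqrt(num / den)


def ci_half_width(pooled, n, t_crit):
    """Half-width of a confidence interval around one codec's mean."""
    return t_crit * pooled / math.sqrt(n)


# N and StDev columns copied from the Bartok table above.
bartok = [(16, 1.2894), (16, 0.9040), (16, 0.8845),
          (16, 0.9736), (16, 0.5439), (16, 1.1621)]
print(round(pooled_stdev(bartok), 4))
```

The printed value should land very close to the 0.9880 shown in the Bartok table. With a half-width this large relative to the spread of the means, it is easy to see why all six intervals overlap.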
Title: Multiformat@128kbps listening test - FINISHED
Post by: eagleray on 2004-05-25 03:13:37
Good work Roberto.

The results are an upset win for ogg vorbis, and a significant improvement in the venerable Lame MP3 as well.
Title: Multiformat@128kbps listening test - FINISHED
Post by: kwanbis on 2004-05-25 03:29:29
good work roberto ... i will continue to use LAME then till iRiver properly supports Vorbis
Title: Multiformat@128kbps listening test - FINISHED
Post by: gkmeyer on 2004-05-25 03:32:41
Okay, now that it is over, I want to make a few points.

I really couldn't tell the difference between any of these codecs and the wav at this bitrate.  The first thing that will come to many of your minds is equipment, but I am using a decent pair of headphones (Grado SR-60's) and although I am working without a headphone amp on a Thinkpad which uses the Intel 855 AC'97 codec, things sound pretty good.

A few times I thought I heard a difference, but when I tried to abx the set I was not successful.  At one point I deleted my whole results fileset thinking what I would be submitting wouldn't be acceptable.  Although, I reconsidered and figured I would go ahead and submit them with everything scored a five.

I am relatively new to this, and what is interesting to me is that when I use the myriad of training materials that exist, I can successfully hear the artifacts and problems when I am told what to listen for.  I successfully abx these samples 100% of the time.  However, in a blind test, when I don't know what I am listening for, I cannot hear a difference.  I know quality plays a role here, but I am thinking my problem is that I don't have the attention span and attention to detail necessary (which would be consistent with how I react to other stimuli) to be good at this.

I would be very interested in hearing some isolated artifacts on a few of these samples so I can try to hear what I missed.  It's been a learning experience anyway, thanks for letting me participate.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Daijoubu on 2004-05-25 10:08:21
Quote
woow, now thats what i not expected

- vorbis aotuv: vorbis is back, and i am proud to have helped finding out what vorbis encoder should be used
- mpc vs aac: funny that mpc was that better than itunes (with a only 0.15 higher setting than in the last test)
- wma9: lol, worse than mp3! (and i even wonder that it got rated that high, even at 128 it had this metallic sound sometimes) -> go away m$
- atrac3: even worse than wma9 -> go away sony

and if you take this test as a comparison between some online music stores (itunes vs. wma9 based ones vs. sonys new store) itunes clearly comes out as the winner, leaving wma9 behind by far!

ATRAC3plus is their new codec

I do understand that the Sony Connect service currently offers ATRAC3, so stick to other formats for those 99-cent purchases

There are replies:
http://forums.minidisc.org/viewtopic.php?p=22345#22345 (http://forums.minidisc.org/viewtopic.php?p=22345#22345)
http://minidisct.com/forum/showthread.php?threadid=22995 (http://minidisct.com/forum/showthread.php?threadid=22995)

Which implementation of ATRAC3 did this test use?
I only see flac decompressing the wav, where does it originate from?
The hardware and the encoder in SonicStage may lead to different output
Title: Multiformat@128kbps listening test - FINISHED
Post by: Halcyon on 2004-05-25 10:18:50
Daijoubu,

I also had hard time finding artifacts on all samples that I had time to listen to.

However, as in most sensory skills, practise makes perfect, so don't be too worried about your hearing being bad (in an absolute sense).

I've also trained using my own samples, lame/ff123/vorbis/klemm/vqf/MUS420/AES test samples, but clearly I still have a long way to go myself to properly hear even the most obvious of artifacts.

You are also right in assuming that attention plays a part in sensory detection. You will hear/see/taste/feel more if you know what to "look" for.

You can use the user comments to find problem parts in samples:

http://www.rjamorim.com/test/multiformat12...s/comments.html (http://www.rjamorim.com/test/multiformat128/comments/comments.html)

I tried to put what goes wrong, how and at what time in the playback of the sample (didn't do this to even half the samples though) for _my hearing_.

Using those comments from various testers, it is possible to guide your attention to listen for something specific and just repeat a certain part of the sample.

Note, however, that people are sensitive to different artifacts. I went through some of ff123's/Pio's comments and I didn't pay any attention to some of the stuff he heard and found annoying.

Guess, I'll have to train some more

best regards,
halcyon
Title: Multiformat@128kbps listening test - FINISHED
Post by: anishbenji on 2004-05-25 14:33:59
The test was mentioned on the Screensavers last night. Not much was said beyond mentioning that the winners were Ogg Vorbis and Musepack. A link to the results was included in the shownotes (http://www.techtv.com/screensavers/shownotes/story/0,24330,3427263,00.html).
Patrick said that he didn't know how the tests were conducted (e.g. if they were blind etc), and that he was planning to download the test and try it out himself.
Title: Multiformat@128kbps listening test - FINISHED
Post by: earwax on 2004-05-25 15:34:05
Quote
Quote
How many results were discarded because of ranked refs?

54

Mind you that I didn't discard results that ranked the reference but on that sample pair ABXd the samples to a pval of 0.05 or less.

Does that mean that all of the user's rankings for that sample were thrown out or just for the codec(s) where they ranked the reference?

Was there any pattern in those 54 discarded results as to which codecs were misidentified?  If, for example, half of those 54 were thrown out because they ranked the reference vs. MPC that would be somewhat interesting.

I guess I'm just looking for some information that would make me more comfortable that throwing out those 54 didn't distort results in any obvious way.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Aoyumi on 2004-05-25 16:18:37
I did not expect aoTuV to take first place.
This is a delightful miscalculation.

@Acknowledgement
First, I very much appreciate the people who carried out the spontaneous comparison tests and the friends who helped tune Vorbis.

And I am thankful to the people of Xiph.org, including Monty, who created libvorbis (& ogg), which is the code base of aoTuV. Vorbis is a wonderful format!

Finally, I am thankful to all the people involved in this test.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Halcyon on 2004-05-25 16:37:10
Three things perhaps worth considering in the future. I noticed these myself, but I'm not sure others see them as important:

1) ABCHR Java version has, imho, some issues:

- buffer length that is small enough for fast switching can cause a lot of skips/gaps on playback (at least on my system. I think Gabriel may also have mentioned this?)

- Oftentimes, I found that I was about to save a test result from the software with accidental "rankings". That is, I'd sometimes click accidentally on a sample that I knew was the reference, moving its slider from 5.0 to 4.9 (this happens for instance when I switch from the sound card volume mixer window back to ABCHR and accidentally click on the slider for a sample that I didn't mean to rate). Now, the problem is that this change from 5.0 to 4.9 on the UI itself is so small that it almost went unnoticed by me on various occasions. So, I was about to save/return results where I had erroneously/unintentionally ranked the reference/original sample (even though I had ABXed the reference from the test sample). This would have led to discarding of the results (i.e. wasted time for me and loss of data for the test). I wish there was a clear indicator (color or something) that showed when any of the sliders had been moved from the reference position (even if only 0.1 points). Just a UI design issue and minor at that, but it can lead to discarding of perfectly "good" data.

- It is impossible to select the output sound card and/or the output method (DirectSound/WaveOut/ASIO/Kernel). On cards that have broken DirectSound (like RME DIGI 96/8), this makes it hard/useless to use that card. I had to resort to my worse sound card, worse headphone amplifier and worse headphones due to this. Not that it necessarily altered my listening accuracy at all, but it was a bummer not to be able to use gear one is accustomed to. I wonder if there is any way around this limitation?

2) Intro: I've been reading Les Leventhal's AES papers, like "Type 1 and Type 2 Errors in the Statistical Analysis of Listening Tests". Mr Leventhal is a psychologist who understands auditory testing and statistical analysis issues on the subject of significance levels (I recommend: J. Audio Eng. Soc, Vol 34, No 6, 1986 June as a starting point. He has further papers on the issue). While statistical analysis is not a substitute for a carefully thought out research methodology and test setup, it can help to analyse non-ideal settings with higher confidence.

Suggestion: Considering Leventhal's points and the impossibility of making a perfect test: most tests don't even have a research question openly formulated, not to mention an analysis of the testing methodology in reference to the research question, both of which could actually further validate the results AND limit the scope of conclusions which can be drawn from the results.

With these in mind I'd suggest considering the use of fairness coefficients in the significance calculation (especially in tests that have very small audio impairments and a low likelihood of detectability). Neurological research about the diminishing of auditory evoked responses with repeated tests also appears to support this conclusion.

3) For general use, for learning how to listen / train one's hearing and for tests like the last 128 kbps test, could we build some general guidelines on how to conduct listening tests alone? That is, after somebody has offered the samples and the software, how does one actually carry out the listening and ranking, in order to get the best out of it?

This could include issues like volume setting, selecting a good time to test, pros/cons of repeated fast switching, reinforcement of the neutral reference, attitudinal motivation, attention guiding, etc. All these can have a slight and in some cases a dramatic effect on the overall results (not necessarily changing any codec rankings, but enabling testers to find more artifacts). I already know of ff123's fine pages and they could serve as a starting point. We could inject some basic tips there culled from cognitive, neurological and audiological research. And your personal experience of course.

Unfortunately I'm not much of a person to help with issues 1 & 2 any further, but maybe others can consider them for future alterations, if they feel they are important.

In 3 I could perhaps contribute, if others are interested.

Would this be a good Wiki project? Should we start a new thread to discuss this, if there are any interested parties?


regards,
halcyon
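
The Type 2 error concern in point 2 can be illustrated with a minimal sketch. It assumes an ABX-style test of n independent forced-choice trials scored against pure guessing; the function names and the example numbers (16 trials, a listener who is right 60% of the time) are hypothetical and are not taken from Leventhal's paper or from this test. It does not implement the fairness coefficient itself, only the quantity that motivates it.

```python
from math import comb


def binom_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct(n: int, alpha: float = 0.05) -> int:
    """Smallest score whose p-value under guessing (p = 0.5) is <= alpha."""
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    raise ValueError("no passing score exists for this n and alpha")


def type2_error(n: int, true_p: float, alpha: float = 0.05) -> float:
    """Chance that a listener who answers correctly with probability
    true_p still fails to reach the passing score (a missed difference)."""
    return 1.0 - binom_tail(n, min_correct(n, alpha), true_p)


# Hypothetical case: 16 trials, listener hears the difference 60% of the time.
threshold = min_correct(16)
miss_rate = type2_error(16, 0.6)
```

Under these assumptions the passing score is 12 of 16, and the listener fails it about 83% of the time even though the difference is real for them. That is the kind of imbalance between Type 1 and Type 2 risk that matters when impairments are small.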
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-05-25 16:44:53
Quote
Oops.

Last detail: files extension must be .txt :B

So please rename all xmls to txt (ren /s *.xml *.txt)

I think that's my "oops."  I thought I made it accept the .arf and .xml extensions...  I'll fix that and a couple other little things and upload a new version tonight sometime.
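
Until the fixed version is up, here is a minimal sketch of a recursive rename that does not depend on shell behaviour. It assumes the result files sit somewhere under a single directory; the function name and the skip-if-exists policy are my own choices, not part of chunky.

```python
from pathlib import Path


def rename_xml_to_txt(root: str) -> int:
    """Rename every *.xml file under root (recursively) to *.txt.

    Returns the number of files renamed. Files whose .txt name is
    already taken are left alone rather than overwritten.
    """
    renamed = 0
    # Materialise the list first so renaming does not disturb the walk.
    for path in list(Path(root).rglob("*.xml")):
        target = path.with_suffix(".txt")
        if target.exists():
            continue
        path.rename(target)
        renamed += 1
    return renamed


# Usage (hypothetical directory name):
#   rename_xml_to_txt("results")
```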
Title: Multiformat@128kbps listening test - FINISHED
Post by: Mac on 2004-05-25 16:52:49
With issue 1, why not just invoke a dialog box saying "Are you sure you want to change the ranked file for sample x" if you try switching from rating A to B?  That way you could click [Oh crap nooooo] and not lose your previous rating.

I would appreciate some advice on testing dos and don'ts, basic technique combined with what to look for.
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-25 17:03:10
Quote
Three things perhaps worth considering in the future. I noticed these myself, but I'm not sure others see them as important:

1) ABCHR Java version has, imho, some issues:

- buffer length that is small enough for fast switching can cause a lot of skips/gaps on playback (at least on my system. I think Gabriel may also have mentioned this?)

I am slowly working towards implementing encryption in abchr for windows.  The first part of that is to be able to decode xml setup files.  I'm currently figuring out how to use expat/arabica to implement a document object model for xml (yes, I could have used msxml.dll, but I want something that works for all windows users without having to ask them to install updated dlls).

Hopefully being able to use a native windows app on pc/windows systems should take care of the clicking issue.

Quote
- It is impossible to select the output sound card and/or the output method (DirectSound/WaveOut/ASIO/Kernel). On cards that have broken DirectSound (like RME DIGI 96/8), this makes it hard/useless to use that card. I had to resort to my worse sound card, worse headphone amplifier and worse headphones due to this. Not that it necessarily altered my listening accuracy at all, but it was a bummer not to be able to use gear one is accustomed to. I wonder if there is any way around this limitation?


In Java, you are restricted to the Java sound library.  In abchr for Windows, I only implemented waveOut playback, which is probably the most compatible method for existing PCs (plus it was convenient to use the MCI interface).  I don't have plans to implement DirectSound or ASIO playback.

Quote
2) Intro: I've been reading Les Leventhal's AES papers, like "Type 1 and Type 2 Errors in the Statistical Analysis of Listening Tests". Mr Leventhal is a psychologist who understands auditory testing and statistical analysis issues on the subject of significance levels (I recommend: J. Audio Eng. Soc, Vol 34, No 6, 1986 June as a starting point. He has further papers on the issue). While statistical analysis is not a substitute for a carefully thought out research methodology and test setup, it can help to analyse non-ideal settings with higher confidence.


This sounds interesting.  I should note that the method Roberto uses for analyzing the results favors finding differences at the expense of higher type I errors -- i.e., it does not correct for multiple samples.
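
To put a rough number on that trade-off, here is a small sketch of how the chance of at least one false positive grows when several analyses are each run at the same significance level. It assumes the analyses are independent, which is a simplification, and uses 18 as an assumed number of samples; neither the function names nor the Bonferroni comparison come from the actual analysis tool.

```python
def familywise_error(alpha: float, analyses: int) -> float:
    """Chance of at least one false positive across independent analyses,
    each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** analyses


def bonferroni_alpha(alpha: float, analyses: int) -> float:
    """Per-analysis level that keeps the overall rate at or below alpha
    (Bonferroni correction, a conservative choice)."""
    return alpha / analyses


uncorrected = familywise_error(0.05, 18)
per_sample_level = bonferroni_alpha(0.05, 18)
corrected = familywise_error(per_sample_level, 18)
```

Under these assumptions the uncorrected rate comes out near 0.60, while the corrected one stays just under 0.05. The price of the correction is a higher chance of missing real differences, which is the other side of the trade-off described above.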

Currently the biggest remaining criticism I see of Roberto's tests is not statistical.  I think the bitrate criticism should be tackled head on in future tests.  Bitrates over multiple albums and bitrates over the sample set should be about the same, IMO.  So that means choosing samples which might not at first glance appear to be "difficult."

You ask how does the test method affect the test?  Well in this case, we have self-selected listeners and an abc/hr test method.  The self-selection is probably amplifying the differences.  In the general population, I'd bet the vast majority of people would not find the differences this group of listeners has.

The abc/hr and abx test methods are also very sensitive, and certainly not representative of real-world listening.  I think it also has a tendency to over-amplify differences (although those differences are very real).  Bottom line -- these tests are, if anything, too sensitive to represent everyday listening for the general population.

But for the people who actually care, they do a pretty good job of providing information on differentiating codec quality at a very subtle level.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: earwax on 2004-05-25 17:04:58
This test was discussed slightly on the MiniDisc TBoard too -
http://www.minidisct.com/forum/showthread....&threadid=22995 (http://www.minidisct.com/forum/showthread.php?s=&threadid=22995)
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 18:17:35
Quote
Which implementation of ATRAC3 did this test use?

SonicStage2

Quote
I only see flac decompressing the wav, where does it originate from?


Decoding the Atrac3 and encoding to FLAC. There's no other way to distribute the Atrac3 samples.

Quote
Was there any pattern in those 54 discarded results as to which codecs were misidentified? If, for example, half of those 54 were thrown out because they ranked the reference vs. MPC that would be somewhat interesting.


Hrm... you would have to check the output of Chunky with the command line I posted earlier to see what results are being discarded, and then analyze these results one by one.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Toe on 2004-05-25 18:33:59
I gotta say this has done a lot to restore my confidence in Vorbis, and I'm probably not the only one.  Mad props to Aoyumi.

Given that LAME is now nipping at the heels of iTunes at 128k, I do really have to wonder about the AAC encoders that didn't win the last AAC listening test.  I wouldn't be surprised if LAME is now ahead of or at least tied with Nero AAC.  Who wants to test?
Title: Multiformat@128kbps listening test - FINISHED
Post by: ExUser on 2004-05-25 19:05:54
Quote
Vorbis winning (I know, MPC is very close) a listening test?    That didn't happen for a long time 

As I said on #foobar2000 to someone saying the same thing: Learn stats, and post again.

I'm surprised this "claim" hasn't been debunked yet. Vorbis did not win. Statistically speaking, it's more likely Vorbis is better than MPC than the other way around, but you cannot say that Vorbis won with any level of certainty. It'd be like 60% probability Vorbis is better, and 40% probability Musepack is better (I pulled these numbers out of my ass for a visual example). I believe this test was run with a significance level of 95%. Am I correct?
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 19:21:31
Quote
I believe this test was run with a significance level of 95%. Am I correct?

Erm... it's in the results page. Read the second sentence of "How to interpret the plots:"

Now, officially they are tied. But considering Vorbis' score is above MPC's confidence margin, I would say, with some confidence, that Vorbis aoTuV is better than MPC, at this bitrate.
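
The two readings can be separated with a small sketch. It assumes each codec is summarised by a mean score and a symmetric 95% margin; the class, the function names and the numbers are made up for illustration and are not the scores from the results page.

```python
from dataclasses import dataclass


@dataclass
class Score:
    mean: float
    margin: float  # half-width of the confidence interval

    @property
    def low(self) -> float:
        return self.mean - self.margin

    @property
    def high(self) -> float:
        return self.mean + self.margin


def officially_tied(a: Score, b: Score) -> bool:
    """Tied in the strict sense: the two intervals overlap."""
    return a.low <= b.high and b.low <= a.high


def mean_above_interval(a: Score, b: Score) -> bool:
    """Weaker evidence: a's mean lies above b's whole interval."""
    return a.mean > b.high


# Hypothetical scores, NOT the real results.
codec_a = Score(mean=4.50, margin=0.10)
codec_b = Score(mean=4.35, margin=0.10)

tied = officially_tied(codec_a, codec_b)
leaning = mean_above_interval(codec_a, codec_b)
```

With these made-up numbers both checks are true at once: the intervals overlap, so the codecs are tied, yet the first codec's mean sits above the second's interval, which is the weaker "with some confidence" statement.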
Title: Multiformat@128kbps listening test - FINISHED
Post by: JohnV on 2004-05-25 19:53:29
I'd like to see a double test sometime. Meaning, another test after the first one with another set of samples, to see how close the final results are to each other.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 20:02:41
Quote
I'd like to see a double test sometime. Meaning, another test after the first one with another set of samples, to see how close the final results are to each other.

You are invited to conduct it
Title: Multiformat@128kbps listening test - FINISHED
Post by: Daijoubu on 2004-05-25 20:15:54
Quote
Quote
Which implementation of ATRAC3 did this test use?

SonicStage2

Quote
I only see flac decompressing the wav, where does it originate from?


Decoding the Atrac3 and encoding to FLAC. There's no other way to distribute the Atrac3 samples.

Real time recording via internal loopback?
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 20:17:21
Quote
Real time recording via internal loopback?

Total Recorder.
Title: Multiformat@128kbps listening test - FINISHED
Post by: jormartr on 2004-05-25 20:21:13
It seems people outside HA do not understand the language I speak, they interpret 'yes' as 'no' and vice versa, 'better' as 'worse', 'scientific, objective and replicable by yourself' as 'my mother did it and she owns the truth'
Title: Multiformat@128kbps listening test - FINISHED
Post by: Mac on 2004-05-25 21:24:45
Quote
I'd like to see a double test sometime. Meaning, another test after the first one with another set of samples, to see how close the final results are to each other.

Couldn't you just split the current test into two 9-sample tests and pretend one was taken after the other.  Comparing the results of these two 'sub-tests' would in effect be the same, and you've got the benefit of 49'000 different ways to create two sub-tests.  I would imagine from the previous comments that you wouldn't find a great amount of discrepancy, I got the impression that gone are the days when WMA is best for classical and mp3 best for metal (or whatever codec/genre associations there were)
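
The count of possible splits can be checked with a short sketch, assuming 18 samples divided into two halves of 9. The function names are hypothetical; `random_split` is just one way to draw a split for a split-half comparison.

```python
import random
from math import comb


def count_splits(n_samples: int) -> int:
    """Number of ways to choose which half of the samples forms sub-test one."""
    return comb(n_samples, n_samples // 2)


def random_split(samples, seed=None):
    """Draw one random split into two equally sized sub-tests."""
    rng = random.Random(seed)
    first = set(rng.sample(samples, len(samples) // 2))
    half_one = [s for s in samples if s in first]
    half_two = [s for s in samples if s not in first]
    return half_one, half_two


ordered = count_splits(18)   # 48620 choices for the first sub-test
distinct = ordered // 2      # 24310 if swapping the two halves counts as the same split

half_one, half_two = random_split(list(range(1, 19)), seed=1)
```

So the figure is 48,620 ways to pick the first sub-test, or 24,310 distinct pairings when the order of the two halves does not matter.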
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-25 22:13:06
Quote
It seems people outside HA do not understand the language I speak, they interpret 'yes' as 'no' and vice versa, 'better' as 'worse',

Calm down... 
People who WANT to understand, will understand.
People who do not care - will not...
There is an old Russian saying, I will try to translate:
When you argue with a fool - take care, other people could see no difference 
[between]
Title: Multiformat@128kbps listening test - FINISHED
Post by: ExUser on 2004-05-25 22:13:17
Quote
Erm... it's in the results page. Read the second sentence of "How to interpret the plots:"

Now, officially they are tied. But considering Vorbis' score is above MPC's confidence margin, I would say, with some confidence, that Vorbis aoTuV is better than MPC, at this bitrate.

Haha, yeah, figured I could find it out in a few moments, but I didn't really have them when I posted.

You make sense with the confidence margin thing, true, but you're likely going to start confusing the less statistically minded unless you stick pretty hardcore to the 95% confidence interval information. Either that, or you qualify the hell out of any statement that doesn't comply to the 95% interval.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 22:19:47
Quote
You make sense with the confidence margin thing, true, but you're likely going to start confusing the less statistically minded unless you stick pretty hardcore to the 95% confidence interval information.

It's no use. This test is not controlled enough to warrant sticking to the 95% confidence as if it was gospel. Unlike ITU tests, I have no control over participants' listening environment, equipment, training, fatigue, etc. (and that's why ITU tests are damn expensive)

These results are there just to give an idea of how codecs rank. They are not trying to be definitive in what they report. And people should still test for themselves to decide what codec better suits them, and consider other features like availability, hardware support, etc, etc.
Title: Multiformat@128kbps listening test - FINISHED
Post by: kwanbis on 2004-05-25 22:27:03
Quote
I'm surprised this "claim" hasn't been debunked yet. Vorbis did not win.

even if they are both tied (Vorbis and MPC) it means that vorbis (and (possibly) MPC) won.
Title: Multiformat@128kbps listening test - FINISHED
Post by: earwax on 2004-05-25 22:50:00
Quote
Quote
Was there any pattern in those 54 discarded results as to which codecs were misidentified? If, for example, half of those 54 were thrown out because they ranked the reference vs. MPC that would be somewhat interesting.


Hrm... you would have to check the output of Chunky with the command line I posted earlier to see what results are being discarded, and then analyze these results one by one.

Oh, OK, I thought someone may have actually looked at those discarded results already. 

Maybe someone can answer the other part of my question:

Does that mean that all of the users' rankings for that sample were thrown out or just for the codec(s) where they ranked the reference?
Title: Multiformat@128kbps listening test - FINISHED
Post by: yoth. on 2004-05-25 22:58:14
ok, so it looks like the vbr contenders did very well and itunes's cbr held its own.  how safe would it be to assume that using vbr with AAC (for instance the most recent FAAC with FB2K) would be a contender?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Mono on 2004-05-25 23:00:31
Quote
There are replies:
http://forums.minidisc.org/viewtopic.php?p=22345#22345 (http://forums.minidisc.org/viewtopic.php?p=22345#22345)
http://minidisct.com/forum/showthread.php?threadid=22995 (http://minidisct.com/forum/showthread.php?threadid=22995)

This guy's signature is great!
Quote
Best portable setup = 128kbps MP3 (super high quality, > CD!) -> transcoded to the best codec in the world, uber high quality ATRAC3/LP4 (5000% better than SACD) -> NetMD (faster than ur sh*tty firewire) -> N710 (EU version with 1.2mW x2, OH YEAHHHH BABY!) + MDR-E808 (bestest hedfonez in teh world!)
This will shizz on all ur lame iPods! Its sooooo clear dat I can almos feel teh mud flwing dwn da waterfal!

Worst portable setup = CD -> WAV -> (WaveGain @ 87dB) -> iTunes 4.5/QT 6.5.1 encoded 224kbps AAC or ALAC -> 3G iPod + Etymotic ER-4P

I am actually impressed with most responses there, but apparently some believe that the test is not fair because ATRAC was not tested on a preferred hardware DAC. 
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-25 23:18:41
The arguments at the minidisc forums about hardware encoded Atrac3 sounding better than software encoded make no sense. The opposite actually makes more sense. On hardware, you must be worried about real time encoding, voltage consumption and battery consumption. On software, you can go nuts.

So, if Sony cut corners somewhere, it must have been on hardware due to inherent limitations.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cygnus X1 on 2004-05-26 00:18:48
Quote
The arguments at the minidisc forums about hardware encoded Atrac3 sounding better than software encoded make no sense. The opposite actually makes more sense. On hardware, you must be worried about real time encoding, voltage consumption and battery consumption. On software, you can go nuts.

So, if Sony cut corners somewhere, it must have been on hardware due to inherent limitations.


Even worse is the fact that some people claim ATRAC3 sounds "better" decoded through Type-S or their 1-bit digital amps, so the test is therefore invalid 

I don't think that some people understand the point of comparing lossy codecs: it's not to see which one sounds "warm" or "fat" or "has better bass," it's to compare artifacts, with the best codec having the least number of and/or least annoying artifacts. I want to smack people when they claim that while ATRAC3 sounds worse than MP3 on the computer, it will sound better going through their 1-bit digital amp. NOOOO!!!    An artifact is an artifact...a phasey cymbal or dropout will still be there no matter how good your amp or bass boost is. I'm personally surprised that although many people claim to be able to discern the "higher quality" of a 1-bit digital amp on certain players, they apparently aren't able to pick out what are sometimes blatant artifacts. I wonder how much of that can be attributed to marketing?
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-26 01:05:06
Replies gathering at:

http://microsoftusernetwork.com/forum/viewtopic.php?p=275 (http://microsoftusernetwork.com/forum/viewtopic.php?p=275)

where the response from the forum moderator is surprisingly dismissive.  Oh well.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: QuantumKnot on 2004-05-26 01:08:00
Quote
Quote
I'm surprised this "claim" hasn't been debunked yet. Vorbis did not win.

even if they are both tied (Vorbis and MPC) it means that vorbis (and (possibly) MPC) won.

I agree.  To have a winner, you must have a loser(s).  And there are some notable losers in this test (ie. ATRAC3).  Since Vorbis and MPC are statistically tied, they both won over the rest.
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-05-26 02:54:12
I've uploaded chunky-0.8.4 which fixes the filename extension problem.  Also, I've changed the default behavior so that it discards files with ranked references (i.e. -p 0.0 is assumed unless specified otherwise).

You can get it, as usual, at http://www.phong.org/chunky/ (http://www.phong.org/chunky/)

Quote
Does that mean that all of the users' rankings for that sample were thrown out or just for the codec(s) where they ranked the reference?

Yes, the whole result for that sample is thrown out.  To do otherwise would taint the results.  Even if you just guessed without listening, you would get about half of them right - if you just discarded the wrong ones, you'd still have half left with completely invalid ratings.  The only safe route is to toss the whole result file.

On the other hand, it is possible that the reference was ranked inadvertently even if they did hear a difference (if it was very subtle).  In those cases (i.e. when the differences are subtle), it's best to do an ABX test - if you are successful, the ranked reference won't cause it to be discarded.  If you fail the ABX test, then you know you probably didn't hear a difference and you shouldn't rank the sample at all (leave it at 5.0).
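
The rule described above can be sketched as follows. This is not chunky's code, only an illustration under stated assumptions: each codec in a result file is represented by the rating given to the hidden reference plus an optional ABX record of (correct, trials); all names are hypothetical. Setting `max_pval=0.0` mimics the "discard every ranked reference" behaviour, since no ABX run can reach a p-value of zero.

```python
from math import comb
from typing import List, Optional, Tuple

REFERENCE_SCORE = 5.0

# (rating given to the hidden reference, optional ABX record of (correct, trials))
Pair = Tuple[float, Optional[Tuple[int, int]]]


def abx_pvalue(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    return sum(comb(trials, i) for i in range(correct, trials + 1)) / 2**trials


def keep_result(pairs: List[Pair], max_pval: float = 0.05) -> bool:
    """Keep the whole result file only if every ranked reference is
    backed by an ABX run with p-value <= max_pval."""
    for reference_rating, abx in pairs:
        if reference_rating < REFERENCE_SCORE:
            if abx is None:
                return False  # ranked reference, no ABX evidence
            correct, trials = abx
            if abx_pvalue(correct, trials) > max_pval:
                return False  # ABX run too weak to excuse the slip
    return True


# Reference nudged to 4.9 on one codec, but 8/8 on ABX for that pair.
rescued = keep_result([(5.0, None), (4.9, (8, 8))])
# Same slip with no ABX run at all.
discarded = keep_result([(5.0, None), (4.9, None)])
```

The design point is that the decision is made per result file, not per codec, which matches the reasoning that a partly guessed file cannot be trusted in part.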
Title: Multiformat@128kbps listening test - FINISHED
Post by: ExUser on 2004-05-26 03:10:23
Quote
I agree.  To have a winner, you must have a loser(s).  And there are some notable losers in this test (ie. ATRAC3).  Since Vorbis and MPC are statistically tied, they both won over the rest.

I meant that Vorbis didn't win when compared to Musepack. That's all. I didn't mean globally. Sorry for the confusion.
Title: Multiformat@128kbps listening test - FINISHED
Post by: StoneRoses on 2004-05-26 08:13:54
Quote
The arguments at the minidisc forums about hardware encoded Atrac3 sounding better than software encoded make no sense.

How can you jump straight into the conclusion like that?

The hardware ATRAC3 encoder in an MD player may use a different codebase from its software counterpart.

SonicStage may be to ATRAC3 something like what Blade is to MP3. We have to test it.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-26 08:24:04
Quote
How can you jump straight into the conclusion like that?

The hardware ATRAC3 encoder in an MD player may use a different codebase from its software counterpart.

SonicStage may be to ATRAC3 something like what Blade is to MP3. We have to test it.

<sigh>

Have you ever even bothered reading the rest of my post?

Here, let me give you some knowledge. That way, you will think twice before posting next time:

On hardware, a developer must be concerned about constraints like voltage consumption, battery consumption, real time encoding, less precision (no FPU), a fraction of the CPU clocks, etc.

On software, the developer can go nuts since none of those restrictions apply.

In codec development, the usual path is first creating a software implementation (that will also be used later for compliance tests), and then cutting corners and complexity for the hardware version until it reaches the desired performance.

FOR THAT REASON, I claim it's nonsense. I don't claim it's impossible, maybe Sony has some serious voodoo going on there. But it does go against common sense.

Common sense is that they aren't deliberately putting a worse version of Atrac3 "like blade is for MP3" on SonicStage for kicks and giggles.

You're welcome.

Regards;

Roberto.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 10:56:04
Well, I'm not sure about MPC.
Before the test I mentioned that MPC perhaps is only that good because it uses very high bitrates on these problem samples! But if the average bitrate is 128 for the tested quality setting, there should also be a lot of samples with bitrates under 128kbits! Logical, isn't it?
The problem with this test is that most samples had high bitrates and the samples with low bitrates were not ranked as well!

For example you could also modify an mp3 encoder to use very high bitrates (160kbits) on difficult samples and very low ones (80kbits) on normal samples. In this test it would probably be better than the current lame encoder but in practice there would be a lot of songs which would sound very bad!
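
A small sketch of that thought experiment, using the 160 and 80 kbit figures from above. The track lengths and the 6-to-4 mix of difficult and normal tracks are assumptions chosen only so that the collection averages 128; nothing here is measured from a real encoder.

```python
from typing import List, Tuple

Track = Tuple[float, float]  # (duration in seconds, bitrate in kbit/s)


def average_bitrate(tracks: List[Track]) -> float:
    """Duration-weighted average bitrate over a set of tracks."""
    total_seconds = sum(duration for duration, _ in tracks)
    total_kbits = sum(duration * bitrate for duration, bitrate in tracks)
    return total_kbits / total_seconds


# Hypothetical collection: 6 difficult tracks at 160, 4 normal tracks at 80.
collection = [(60.0, 160.0)] * 6 + [(60.0, 80.0)] * 4
# Hypothetical test set made up of difficult samples only.
test_set = [(30.0, 160.0)] * 18

collection_average = average_bitrate(collection)  # 128.0
test_set_average = average_bitrate(test_set)      # 160.0
```

The encoder would look like a "128" encoder over the whole collection while spending 160 on every sample in the test, which is the gap between sample-set bitrate and collection bitrate that the argument is about.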

I hope you understand what I mean. But perhaps my idea is totally wrong!?

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: Kblood on 2004-05-26 11:43:40
Congratulations to Roberto for once more pulling through a tough one. Great work, greatly appreciated.

Regarding the bitrate "issue"...

The encoders in the test were using standard settings, they were not specially tweaked for the test, and you can go ahead and use them with your songs.

So if some of the encoders have a flawed code to choose the bitrate in tough passages of music, well, it's their problem.

I think this test is really useful as an indication of which encoder does a better job with a setting that will end up giving an average 128kbps in a whole bunch of music. And I fail to see what's wrong with the idea.
Title: Multiformat@128kbps listening test - FINISHED
Post by: kalmark on 2004-05-26 11:55:44
Just to throw in my 2 cents, considering the ATRAC and WMA forums' responses:

I think we all more or less knew that there would be such a reaction when we posted these results. I even think some hoped for such a reaction, so they can say that these people make unsupported claims and such.

I myself trust the results, though I won't change my encoding habits: Lame aps for me, as I only have an mp3-capable portable. The people in the ATRAC/WMA forums won't do that either, IMHO, as they paid a lot of money to be able to use the formats they defend now.

And, to be honest, those people who really care about audio quality end up at HA eventually    And those who don't, help the sales of devices that only support lower quality codecs skyrocket, because they only listen to the commercials.

And don't tell me you didn't read the "2 cents" warning

One more on-topic question: is it possible to send these results to portable manufacturers? Would it make sense to start a thread for collecting contact email addresses, so we could mail most portable manufacturers, to give them a hint about what to develop? E.g. Daisy MM (manufacturer of Diva) wrote me in an email that they would consider implementing further codecs, if their licensing fees are fine. So why not give the companies a hint?
Title: Multiformat@128kbps listening test - FINISHED
Post by: StoneRoses on 2004-05-26 11:58:16
Quote
Have you ever even bothered reading the rest of my post?

I did read your post on the minidisc forum (and agree with you in that sense) before posting that.

My point is if they (minidiscers) claim that their MD hardware encodes better, then we should consider their claim. Similar to how we select the best encoder for other codecs in your test.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Digga on 2004-05-26 12:58:18
Quote
My point is if they (minidiscers) claim that their MD hardware encodes better, then we should consider their claim. Similar to how we select the best encoder for other codecs in your test.

consider and then dismiss, if there is no proof for the claim, other than general subjective opinions.
if there is (semi-) scientific proof, it will be gladly accepted.

guess what's gonna happen.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 14:20:29
Oh I forgot to say that I want to thank you nevertheless (see my last post), Roberto!

In the next listening test we should use more songs with lower than average bitrates.

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: 2Bdecided on 2004-05-26 14:49:56
Quote
Quote
My point is if they (minidiscers) claim that their MD hardware encodes better, then we should consider their claim. Similar to how we select the best encoder for other codecs in your test.

consider and then dismiss, if there is no proof for the claim, other than general subjective opinions.
if there is (semi-) scientific proof, it will be gladly accepted.

guess what's gonna happen.

Well, they're the ones with the hardware.

Someone could easily record all the clips to their MD recorder via a digital link (from a non-resampling sound card), and then copy them all back into a PC. Then the three versions of each clip (original, software encoded, hardware encoded) could be the subject of a mini listening test. EDIT: like the lame 3.90.3 vs 3.96 test, not like the present one.

I'm not suggesting Roberto should carry out such a test - I'm just saying it would be easy to prove this one way or the other.

As you said, it probably won't happen. Let's face it, MD users aiming for decent results aren't using this setting anyway, because it adds audible artefacts so often.

Cheers,
David.
Title: Multiformat@128kbps listening test - FINISHED
Post by: StoneRoses on 2004-05-26 15:13:19
Quote
Someone could easily record all the clips to their MD recorder via a digital link (from a non-resampling sound card), and then copy them all back into a PC. Then the three versions of each clip (original, software encoded, hardware encoded) could be the subject of a mini listening test.

If I had Net MD, I would definitely do that test.

People who care about quality probably won't use MD.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Busemann on 2004-05-26 15:25:09
Quote
Let's hope Apple implements VBR in their codec

The implementation in QT 6.5.1 / iTunes 4.5 is VBR see here (http://www.hydrogenaudio.org/forums/index.php?showtopic=21814&)
Title: Multiformat@128kbps listening test - FINISHED
Post by: Garf on 2004-05-26 15:39:21
Quote
Quote
Let's hope Apple implements VBR in their codec

The implementation in QT 6.5.1 / iTunes 4.5 is VBR see here (http://www.hydrogenaudio.org/forums/index.php?showtopic=21814&)

ABR and recognizing silence is not the same as VBR.

If the codec knows it's encoding difficult music, it still can't flex the bitrate higher.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Busemann on 2004-05-26 15:44:49
Quote
If the codec knows it's encoding difficult music, it still can't flex the bitrate higher.

How do you know it's unable to do that?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Garf on 2004-05-26 16:02:13
Quote
Quote
If the codec knows it's encoding difficult music, it still can't flex the bitrate higher.

How do you know it's unable to do that?

Because you can only set bitrates and the produced file has exactly that average bitrate?

It can vary in the song, but it can't vary among songs.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-26 16:29:42
Quote
The implementation in QT 6.5.1 / iTunes 4.5 is VBR

small clarification.
As far as I know there are no CBR AAC encoders.
Two systems are used: VBR and ABR.
VBR is the quality-based mode: the bitrate is adjusted to maintain constant quality (measured by S/N ratio, for example).
ABR is the bitrate-based mode: the average bitrate should match the requested value.
CBR can (unsure) be used with AAC, but it is not defined by the ITU specs.
AAC always uses ABR mode. If the bitrate fluctuations stay within the limits defined by the standard, it is considered CBR.
So, technically, you can consider iTunes encoding as CBR, if the requirements mentioned above are met.
BTW: I remember Ivan Dimkovic explained this somewhere on this forum, but I can't find it, period...
Quote
How do you know its unable to do that?

It seems (from testing, try it) that average bitrate remains the same...
Title: Multiformat@128kbps listening test - FINISHED
Post by: saratoga on 2004-05-26 16:59:38
Quote
As far as I know there are no CBR AAC encoders.


iTunes 4.2 is CBR.  Frames are at a constant bitrate.  Search for Ivan's explanation.  I believe he refers to it directly as CBR actually.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-26 18:06:14
Quote
One more on-topic question: is it possible to send these results to portable manufacturers?

Problem is, unfortunately, codec quality is one of the least concerns of hardware manufacturers. They have much more to worry about first: hardware requirements for decoding, ease of implementation in hardware, licensing fees, user demand...
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-26 18:24:53
Quote
iTunes 4.2 is CBR. Frames are at a constant bitrate.


Uhh. Found it at the end.
Quote
Ivan: AAC is always variable bit rate with following rules:

1. Maximum number of bits per one frame is in range from 0 to 6144, multiplied by the number of channels

See this thread: http://www.hydrogenaudio.org/forums/index....showtopic=8835& (http://www.hydrogenaudio.org/forums/index.php?showtopic=8835&)
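
For scale, the per-frame cap in Ivan's rule can be turned into a peak bitrate. This is only a back-of-the-envelope Python sketch: the 1024 samples per frame is an assumption about AAC long frames that the quote itself does not state, and the function names are made up for illustration.

```python
MAX_BITS_PER_FRAME_PER_CHANNEL = 6144  # from the rule quoted above
SAMPLES_PER_FRAME = 1024               # assumption: AAC long-frame length


def peak_bitrate_kbps(sample_rate_hz, channels):
    """Highest momentary bitrate the per-frame cap would allow."""
    frames_per_second = sample_rate_hz / SAMPLES_PER_FRAME
    bits_per_second = MAX_BITS_PER_FRAME_PER_CHANNEL * channels * frames_per_second
    return bits_per_second / 1000


def mean_bits_per_frame(bitrate_kbps, sample_rate_hz):
    """Average frame size (all channels together) at a given average bitrate."""
    return bitrate_kbps * 1000 * SAMPLES_PER_FRAME / sample_rate_hz


print(peak_bitrate_kbps(44100, 2))      # cap for 44.1 kHz stereo
print(mean_bits_per_frame(128, 44100))  # typical frame at 128 kbps
```

Under those assumptions a 128 kbps stereo stream uses roughly 3000 bits per frame on average against a cap of 12288, so individual frames have plenty of room to move even when the file average is pinned.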
Title: Multiformat@128kbps listening test - FINISHED
Post by: upNorth on 2004-05-26 19:04:32
Quote
Well, I'm not sure about MPC.
Before the test I mentioned that MPC is perhaps only this good because it uses very high bitrates on these problem samples! But if the average bitrate is 128 for the tested quality setting, there should also be a lot of samples with bitrates under 128kbits! Logical, isn't it?
The problem with this test is that most samples had high bitrates, and the samples with low bitrates were not ranked as well!

For example you could also modify an mp3 encoder to use very high bitrates (160kbits) on difficult samples and very low ones (80kbits) on normal samples. In this test it would probably do better than the current lame encoder, but in practice there would be a lot of songs which would sound very bad!

I hope you understand what I mean. But perhaps my idea is totally wrong!?

Big_Berny

You make it sound like MPC is tweaked for winning listening tests. I believe it is tweaked to sound consistent through the entire track. I'm no expert, but it sounds to me like you kind of confuse VBR with ABR.

I'll try to explain my view on this matter. Then some of the experts can correct it 

The average bitrate you see with a certain VBR quality setting is kind of a coincidence. When a lot of encoded material has a certain consistent quality during the whole track, it just happens to average to e.g. 128 kbps. If you take the settings used in this test and encode metal or another demanding genre, you won't see this 128kbps average anymore. The way you explain it, it sounds like it has to sacrifice large parts of the song to be able to boost the bitrate on the hard parts. That's not the idea of VBR (more like ABR but not really that either). When a VBR setting uses only 80 kbps on certain parts, it's because it doesn't need more bits to reach the desired quality. It could have used more if it was needed.

Have you actually heard that the quality in between problem samples is lower? I'm not sure how people ended up providing the samples that they did, for this test. If they listened for problems, then from your reasoning, they wouldn't have provided the problem samples themselves, but rather the low quality parts between problem samples, as that low quality probably would stand out from the rest of the track.

CBR: Variable quality. High quality on easy parts, low quality on hard parts.
ABR: As constant quality as possible within the bitrate limitation. Use bits where they are most useful.
VBR: Constant quality.
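
A toy model of those three definitions, in Python. It assumes each frame has a known "demand" (the kbps it would need to hit the target quality) and that quality is simply min(1, bits / demand). Both are simplifications invented for this sketch, not how any real encoder decides, and the demand numbers are made up.

```python
def cbr(demand, target_kbps):
    """Every frame gets the same bitrate, whatever it needs."""
    return [float(target_kbps) for _ in demand]


def abr(demand, target_kbps):
    """Bits follow demand, rescaled so the file average hits the target."""
    scale = target_kbps * len(demand) / sum(demand)
    return [d * scale for d in demand]


def vbr(demand):
    """Every frame gets exactly what it needs; the average falls where it falls."""
    return [float(d) for d in demand]


def quality(bits, demand):
    """Toy quality score per frame: 1.0 means the frame got all it needed."""
    return [min(1.0, b / d) for b, d in zip(bits, demand)]


def mean(values):
    return sum(values) / len(values)


demand = [80, 100, 160, 200]  # hypothetical frames, easy to hard

for name, bits in [("CBR", cbr(demand, 128)),
                   ("ABR", abr(demand, 128)),
                   ("VBR", vbr(demand))]:
    scores = [round(q, 2) for q in quality(bits, demand)]
    print(name, "average kbps:", round(mean(bits), 1), "quality:", scores)
```

In this model CBR wastes bits on the easy frames and starves the hard ones, ABR spreads a fixed budget evenly in quality terms but still falls short when the material is harder than the budget, and VBR holds quality and lets the average drift away from 128.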

So, from my point of view your idea is totally wrong.  I think it would be possible, and a cynical marketing department might be tempted, but I would say that it isn't very likely the case here. These codecs are used every day, right?

IIRC Nvidia or ATI or both, tried to optimize their graphics drivers to win 3Dmark tests and make their cards look better, but I'm not sure if they sacrificed anything by doing it though.

Now, am I totally wrong? 
Title: Multiformat@128kbps listening test - FINISHED
Post by: ckjnigel on 2004-05-26 19:12:44
I wonder why the test didn't compare files that were the same size, though I know it would be a royal PITA to repeatedly encode to get that.  And then somebody would complain that time to encode should be the equalizing measure ...
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 19:32:42
Quote
You make it sound like MPC is tweaked for winning listening tests.

No! Sorry if it sounded like this! I think that this is one of the most objective audio tests!

Sorry you misunderstood me. It's very difficult for me to explain because I speak German.
I'll try again:
I am only trying to say that MPC is perhaps only so good in this test because we only tested samples with high bitrates! (BTW I know what ABR and VBR are!)
In this test we used quality settings to reach an average bitrate of 128kbits, right? I know that it is not the average bitrate of the samples that should be 128kbits, but the average bitrate of whole music collections with different genres. Right?
If you now look at the bitrate table you'll see that most of the samples encoded with MPC have a bitrate over 128kbits! I don't say that MPC is optimized for the listening test, but its bitrate spread is very wide! And if we only test samples with high bitrates MPC will probably give good results. But there must also be songs with a bitrate under 128kbits, because the average bitrate is approx. 128kbits! Right?
And perhaps these "easy" samples could sound bad because they get a much lower bitrate than with the other codecs at this quality setting!

Does someone understand this theory? It's only a theory without any proof! But if you look at the test results you can see that the sample "debussy", with a very low bitrate, sounded bad with MPC!

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: xmixahlx on 2004-05-26 20:54:05
Big_Berny

you are just saying that musepack has an efficient vbr model... but saying this as if it is wrong or confusing or unfair...

this makes no sense


later
Title: Multiformat@128kbps listening test - FINISHED
Post by: upNorth on 2004-05-26 21:07:04
@Big_Berny:
I see what you mean after looking at the data for the Debussy sample. At first I thought your reasoning behind this low bitrate was for the wrong reason, but now I see that it probably wasn't.

I would also like to see a test like the one you suggest. If musepack or any other codec, uses too few bits in places it should have used more, then I suppose it would be useful to investigate it.

Btw: Explaining what I mean in english, isn't my strength either...

Edit: Added the part about the Debussy sample
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 21:35:57
Quote
Big_Berny

you are just saying that musepack has an efficient vbr model... but saying this as if it is wrong or confusing or unfair...

this makes no sense


later

No, that's not what I want to say. (Very happy that upNorth seems to have understood)!

I want to say that MPC has a very strong VBR mode, with very different bitrates at the same quality setting. In this test there were bitrates from 91kbits to 155kbits. You can call it efficient if you want.
But the problem is this: only two of the samples we tested had a bitrate under 128kbits! And one of them was rated very badly! I just want to say that perhaps the variation of the bitrate is too high, but MPC will nevertheless be rated very well in this test overall because we (almost) only tested samples with bitrates over the average (for this quality setting).
In the next test we should perhaps also test some samples with very low bitrates, because that could be a serious problem of MPC.
If you only test difficult samples on a codec with a high bitrate variation, you'll get good ratings if the codec recognizes that the sample is difficult, because it will give it a high bitrate. But on the other hand it gives very little bitrate to non-problem samples, so that they perhaps end up with too low a bitrate. And then MPC will sound worse than the other codecs, which have a smaller bitrate variation and will spend a higher bitrate on such a sample.


Example: Sample Kraftwerk (high bitrate)
Codecs:    iTunes  MPC     Vorbis  Lame    WMA     Atrac3
Bitrates:  128     152     135     141     127     132
Ratings:   4.30    4.78    4.30    3.32    3.11    2.29

Example: Sample Debussy (low bitrate)
Codecs:    iTunes  MPC     Vorbis  Lame    WMA     Atrac3
Bitrates:  128     98      120     108     129     132
Ratings:   4.67    3.53    4.91    3.75    3.95    4.54

You see now what I mean? Perhaps MPC is like "too efficient"!
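
The two tables can be lined up per codec with a few lines of Python. This is just a sketch that reuses the numbers posted above; the variable and function names are made up, and two samples are of course far too few to prove anything.

```python
# Bitrates (kbps) and ratings copied from the two tables above.
CODECS = ["iTunes", "MPC", "Vorbis", "Lame", "WMA", "Atrac3"]
KRAFTWERK = {"bitrate": [128, 152, 135, 141, 127, 132],
             "rating": [4.30, 4.78, 4.30, 3.32, 3.11, 2.29]}
DEBUSSY = {"bitrate": [128, 98, 120, 108, 129, 132],
           "rating": [4.67, 3.53, 4.91, 3.75, 3.95, 4.54]}


def change(codec):
    """Return (bitrate change, rating change) going from Kraftwerk to Debussy."""
    i = CODECS.index(codec)
    return (DEBUSSY["bitrate"][i] - KRAFTWERK["bitrate"][i],
            round(DEBUSSY["rating"][i] - KRAFTWERK["rating"][i], 2))


for codec in CODECS:
    d_bitrate, d_rating = change(codec)
    print(f"{codec:7s} bitrate {d_bitrate:+4d} kbps   rating {d_rating:+.2f}")
```

MPC is the only codec whose rating falls along with its bitrate (54 kbps less, 1.25 points lower). Lame also drops 33 kbps on Debussy but is rated 0.43 higher there, so a low bitrate alone does not explain the MPC result.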

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-26 21:58:37
To: Big_Berny, upNorth.
Hey, guys, the problem with the mpc bitrate exists and was discussed on page 4 of this thread.
See ff123's comments on this issue and how to avoid it in the future, if possible.
EDIT: grammar.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 22:08:15
@SirGrey: I read it now, thanx.
But I think you only mentioned that it is a problem of the test and not that it could be a problem of the MPC encoder!

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: Gabriel on 2004-05-26 22:25:40
I understand the point of Big_Berny, never thought about that.

He means that perhaps mpc could be failing on easy tracks, and as the test is featuring mostly hard tracks, it is performing well.

It might also be a matter of track loudness. As an example, the current Lame version is likely to fail in vbr when using a low volume track, as it is not estimating loudness before encoding.
Title: Multiformat@128kbps listening test - FINISHED
Post by: mithrandir on 2004-05-26 22:36:24
Perhaps this isn't the right time for me to comment on it but I think it would have been superior to test LAME 3.96 ABR instead of VBR, i.e. "--preset 128" or something similar. At lower bitrates ABR is generally considered more effective than VBR. Theoretically VBR should be best but that's not going to be the case in the real world always.

Of course I have not been keeping track of much that is going on so perhaps I missed something.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Gabriel on 2004-05-26 22:39:57
Quote
Perhaps this isn't the right time for me to comment on it but I think it would have been superior to test LAME 3.96 ABR instead of VBR, i.e. "--preset 128" or something similar. At lower bitrates ABR is generally considered more effective than VBR. Theoretically VBR should be best but that's not going to be the case in the real world always.


This was checked before the test. Based on pre-tests, vbr was chosen.

Quote
At lower bitrates ABR is generally considered more effective than VBR

For bitrates around 128kbps, it is now time to change this consideration.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-26 22:43:59
Quote
Gabriel: I understand the point of Big_Berny, never thought about that.

Me too 
ff123 mentioned that on the page I pointed to... I never thought of it that way before.
Interesting question to think of.
Quote
Big_Berny: But I think you only mentioned that it is a problem of the test and not that it could be a problem of the MPC encoder!

He-he.
That depends on your point of view and methodology.
Someone in that discussion proposed using mpc with a different setting for every song, to be sure the avg bitrate is 128Kbit. So, in this case mpc will have no problems !   
But from another perspective, using just one setting is much more consistent...
BTW: I never used mpc for encoding, except for testing.
As I understand, it is a tweaked layer 2 encoder, so for bitrates less than ~130Kbit it should produce low quality output... (?)
Maybe somebody familiar with mpc could explain its behaviour ?
Quote
Gabriel: ...as it is not estimating loudness before encoding.

Oh. Thing to do for version 4 ? 
Title: Multiformat@128kbps listening test - FINISHED
Post by: Big_Berny on 2004-05-26 22:44:11
Quote
I understand the point of Big_Berny, never thought about that.

He means that perhaps mpc could be failing on easy tracks, and as the test is featuring mostly hard tracks, it is performing well.

Thank you! That's exactly what I mean! You explained most of my thoughts in one simple sentence... 

Big_Berny
Title: Multiformat@128kbps listening test - FINISHED
Post by: upNorth on 2004-05-26 22:58:50
Quote
To: Big_Berny, upNorth.
Hey, guys, the problem with the mpc bitrate exists and was discussed on page 4 of this thread.
See ff123's comments on this issue and how to avoid it in the future, if possible.
EDIT: grammar.

I had read the whole thread already, but forgot about that discussion, sorry.

I still have the feeling that the bitrate is a topic because of the fact that this is a 128kbps test. If the motivation for adding more samples with low bitrate is only, or partly, to make the average bitrate look better (closer to 128kbps), then I would say the results of such a test would be less interesting. As ff123 has said already, some problems arise because of a too low bitrate (as seen with Debussy sample), and as I see it, that is a valid reason for using more such low bitrate samples. Add it because it is another type of problem sample, not because it makes the average closer to 128kbps.

Shouldn't the samples also be picked so that every codec has something to struggle with? Like two samples wma struggles with, two mp3 struggles with and so on? Or maybe that would be too much to ask if as many genres as possible should be covered?

My point of view is that when it comes to bitrate, the only thing that counts is the long time average. Doing some artificial tweaking to make all codecs have the same average bitrate on all these short samples would ruin the value of the test for me. In short, I agree with the way things are done already.

Then a question, or more like checking whether my understanding is right: if a codec were perfect, wouldn't it then, at a specific quality setting, receive the same rating for all samples?

I'm sorry if all of this is old "news" and covered in another thread; in that case I would be grateful if someone could point me to it.
I'm just trying to see if my understanding and way of thinking is right.

Btw: As it takes me a while to write, a lot has happened in the meantime. I see now that Gabriel has picked up on the point Big_Berny made.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-26 23:15:52
Quote
My point of view is that when it comes to bitrate, the only thing that counts is the long time average.

Yep. Of course. And here lies the problem with mpc.
So, to summarize, my IMHO, formed by the last test and this thread:
1. MPC is a VBR encoder (or at least VBR is the mode in which it performs much better).
2. Tests on many different albums showed (see the 128Kbit test discussion thread) that the 4.15 setting produces an average bitrate of ~130Kbit, and that is ok.
3. The samples for the test are selected to make the encoders' job harder.
4. So, the average mpc bitrate rises for the test samples from 130 to 142Kbit.
5. To keep the avg mpc bitrate close to the projected one (as a result, 136Kbit), additional easy sample(s) were chosen (one was selected by chance).
6. MPC failed on these samples.
So, the question: does mpc have a high score because of its quality or because of the samples selected ?
Correct me if I'm wrong somewhere...
BTW: ff123's idea to use an equal number of overbitrated and underbitrated samples could correct the situation. Maybe. That's why I wrote that the error could be in the test setup, not in mpc...
Title: Multiformat@128kbps listening test - FINISHED
Post by: QuantumKnot on 2004-05-27 00:50:12
The Debussy sample is a great sample for Frank Klemm when he does future MPC tuning.  If indeed the bitrate was too low for MPC to give good quality, then that is an issue that needs to be fixed.

From the Vorbis side, I think guruboolez tested the 1.0 encoder on one sample (may have been creaking sample or brahms) at q 0 that produced a low bitrate of 40 kbps or something.  It sounded pretty bad.  Monty took note of this and made some tweaks to produce 1.0.1 which now gives a more realistic bitrate (somewhere close to nominal 64 kbps) and sounds much better now.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Tripwire on 2004-05-27 13:19:40
Quickly contributing some more humor to this thread. Someone posted the results on some Windows tech forum, and someone else found it funny to reply with this:

Quote
I have audible problems with LAME so I use Blade Enc and all's fine now.

http://www.neowin.net/forum/index.php?showtopic=171304 (http://www.neowin.net/forum/index.php?showtopic=171304)
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-05-27 13:33:22
QuantumKnot> Last year I sent two samples: Stockhausen - Stimmung (vocal music) and Liszt - Harmonies Poétiques et Religieuses (piano). Both at very low volume (original, I didn't change it).

With the 1.0.1 encoder, the problem is corrected (or close to it).


Yesterday I played with mpc 1.14 -q4.15. I encoded 10 CDs of piano music. The average bitrate was < 100 kbps (for the complete works of Erik Satie (5h30, digital recording), the bitrate was ~90 kbps). In other words, the average bitrate of the Debussy sample isn't an accident... or it is a very common one!

Low volume is only part of the quality problem. I've encoded a piece of contemporary music, very quiet too, and without background noise (lossless encoding reached 20%): the bitrate was below 80 kbps, and quality was much better, far from the disaster of Debussy.wav. This means that MPC can achieve good reproduction even at a very low bitrate...
Maybe the problem for mpc is low volume + background noise? To be verified...
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-28 17:54:34
Heh... just as the Slashdot hammering on my site starts to subside, I get Slashdotted again.

In Japan (http://slashdot.jp/article.pl?sid=04/05/27/020254)!!! 

I wonder if the SNR there is as big as in slashdot.org...



Edit: BTW, I recommend you don't use Babelfish

Quote
Using part the some overseas, when the domestic technician builds up in the country, when it is called "domestic production" is many, because is.
So, including to the part, when it makes from one, it becomes "the purity domestic".


Your head will explode.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Jaester on 2004-05-28 19:25:22
Roberto,

Can you clarify the sample sizes again? What does your N mean: is it the total number of people who listened to all songs, or the number of people per song? And what is the sample size for the final ANOVA?

Btw, can you make the actual dataset available for other people to analyze?

As a final comment, I think this is too low a sample size.

Jaester
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-28 20:07:05
Quote
Can you clarify the sample sizes again? What does your N mean: is it the total number of people who listened to all songs, or the number of people per song? And what is the sample size for the final ANOVA?

What do you mean by sample sizes?

N is the amount of results I received for that sample minus discarded results.

Quote
Btw, can you make the actual dataset available for other people to analyze?


It's already there.
http://www.rjamorim.com/test/multiformat12...s/comments.html (http://www.rjamorim.com/test/multiformat128/comments/comments.html)

Quote
As a final comment, I think this is too low a sample size.


?
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-28 20:40:23
Hi !

I just switched on my iRiver H120 in shuffle mode. I'm currently hearing a Madonna song I encoded many years ago with a Fraunhofer encoder at 128 kbps. It sounds very cool, probably due to the use of intensity stereo. I guess LAME could benefit from intensity stereo in the 128-ish bitrate area. I really wonder what ranking it would have got...

I'm currently too lazy to search all postings but in case anyone knows: why has LAME been chosen for mp3 encoding in this test ? It lacks intensity stereo support.

I guess the aoTuV beta 2 encoder would sound lousy in the 128-ish area if it would not make use of intensity stereo (or point-stereo, whatever you want to call it)

bye,
SebastianG
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-28 20:57:57
Quote
As a final comment, I think this is too low a sample size.

Typically, a sample size of 30 is recommended to be representative of a population.

However, we already know going into these types of open tests that the group of listeners who'll respond are not going to be representative of the general population.  So if you accept the proposition that this group of listeners represents the smaller population of "listeners who care (LWC)," then all you need are enough samples to produce a significant result.

You can get significant results this way from just one person, as long as he represents the LWC (e.g. Dibrom tuning lame).  However, more listeners are desirable, of course, to average out individual biases (for example, my limited high-frequency ability predisposes me to the sound of *gasp* wma9 standard).  As was shown in the Vorbis and mp3 listening pre-tests, even a handful of listeners listening to multiple samples can produce reliable and accurate results.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-28 21:02:42
Quote
I just switched on my iRiver H120 in shuffle mode. I'm currently hearing a Madonna song I encoded many years ago with a Fraunhofer encoder at 128 kbps. It sounds very cool, probably due to the use of intensity stereo. I guess LAME could benefit from intensity stereo in the 128-ish bitrate area. I really wonder what ranking it would have got...

I'm currently too lazy to search all postings but in case anyone knows: why has LAME been chosen for mp3 encoding in this test ? It lacks intensity stereo support.

I guess the aoTuV beta 2 encoder would sound lousy in the 128-ish area if it would not make use of intensity stereo (or point-stereo, whatever you want to call it)

FhG does not use intensity stereo at 128 kbit/s.  IS is a low bitrate technique, in the same vein as spectral band replication, and isn't meant to produce near transparent encodings.

However, that isn't to say that old FhG encodings can't sound competitive.  Roberto's last mp3 test (designed to find the best mp3 encoder at 128 kbit/s, and which lame won) did not include the super slow FhG encoder, which many people with very good high frequency hearing might like best as their mp3 encoder at 128 kbit/s.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-28 21:07:51
Quote
why has LAME been chosen for mp3 encoding in this test ? It lacks intensity stereo support.

IMHO because:
1. It is the most widely used mp3 encoder.
2. It is, I think, on par with (or maybe even better than) the old mp3enc31.
I'm sure Gabriel could answer this question more accurately.
BTW, do you know that mp3enc31 cost 199$ ? And you can not purchase it now.
I think, most people used it illegally 
Oh, and I think lame USES joint-stereo for 128Kbit bitrate.
And IS (intensity stereo) corrupts stereo image, thus it is not recommended for such a *high* bitrate as 128Kbit.
EDIT: ff123 was faster 
EDIT2:
Quote
Roberto's last mp3 test (designed to find the best mp3 encoder at 128 kbit/s, and which lame won)

Forgot to mention it as 3. Sorry...
Title: Multiformat@128kbps listening test - FINISHED
Post by: eagleray on 2004-05-28 22:49:55
Lame is the most widely used MP3 encoder?  Perhaps around here.  I would have to say most people are using variations on royalty-paying MP3 encoders, probably Fraunhofer, that are included in various all-in-one music solutions.  Because of its so-so legal status none of these programs can incorporate Lame, even if many popular applications work with it.  What about Music Match, it comes on nearly every PC?

Frankly, I am amazed at all the tiny details that seem so fascinating around here.

IMO, Roberto's test is a blockbuster.  Look at the politics.  An unofficial build of the open source ogg vorbis encoder blows everything away.  Two proprietary solutions, one from the hated MS and the other from Sony, a mega copyright holder, make a weak showing.  The highly compatible and easy to find Lame MP3 shows it has a bunch of life left in it.  That is headline news in digital audio compression if there ever was any.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-28 23:23:29
Quote
Lame is the most widely used MP3 encoder? Perhaps around here.

Heh. You are probably right.
Personally, neither I nor my friends have ever used any *box* solutions (except nero, which comes with all my writers), so I simply did not count them. My fault 
Musicmatch, iTunes and so on have a huge audience, really...
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-28 23:50:31
Quote
FhG does not use intensity stereo at 128 kbit/s.  IS is a low bitrate technique, in the same vein as spectral band replication, and isn't meant to produce near transparent encodings.

However, that isn't to say that old FhG encodings can't sound competitive.  Roberto's last mp3 test (designed to find the best mp3 encoder at 128 kbit/s, and which lame won) did not include the super slow FhG encoder, which many people with very good high frequency hearing might like best as their mp3 encoder at 128 kbit/s.

ff123

I've used an old l3enc which DOES make use of IS, even at 192 kbps.

As for "near transparency": Current Vorbis encoders make use of IS at up to -q5.99. They just don't call it Intensity stereo. Monty seems to have a very different (official) point of view regarding this. He talks about diffuse and point images in the specification. Well, it's basically the same as intensity stereo.

(Maybe seeing/interpreting things from a different angle helps avoiding patent issues, I don't know...)

Anyway, I'm surprised that LAME performs so well WITHOUT Intensity Stereo in the 128-ish bitrate area - Same for FAAC.  (no IS AFAIK)
I guess Vorbis will have strong competitors when LAME and FAAC start making use of IS for that kind of bitrates (perhaps PNS for FAAC, too).

time will tell.

bye,
Sebi
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-29 00:05:48
Quote
BTW, do you know that mp3enc31 cost 199$ ? And you can not purchase it now.
I think, most people used it illegally 


Believe it or not. I registered l3enc back in 1997 together with the WinPlay3 software for Win3.11

Quote
Oh, and I think lame USES joint-stereo for 128Kbit bitrate.
And IS (intensity stereo) corrupts stereo image, thus it is not recommended for such a *high* bitrate as 128Kbit.


Yeah, I wasn't talking about joint stereo coding. LAME does M/S coding as one possible Joint-Stereo coding technology but not IS.

How do you define stereo image ?
It is a widely believed fact that we are unable to perceive phase differences of high frequencies, so IS is an appropriate tool, even for near transparency encodings.
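
To make the M/S versus IS distinction concrete, here is a rough Python sketch. It treats the whole signal as a single band and uses made-up function names; real encoders apply intensity stereo per frequency band and typically only at higher frequencies, so this is an illustration of the principle and not how LAME, l3enc or Vorbis implement it.

```python
import math


def ms_encode(left, right):
    """Mid/side: a reversible sum/difference transform."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(x):
    return sum(v * v for v in x)


def intensity_encode(left, right):
    """Intensity stereo for one band: one shared signal plus a gain per channel.
    Toy version; it would divide by zero if left and right cancelled exactly."""
    carrier = [l + r for l, r in zip(left, right)]
    e = energy(carrier)
    gain_left = math.sqrt(energy(left) / e)
    gain_right = math.sqrt(energy(right) / e)
    return carrier, gain_left, gain_right


def intensity_decode(carrier, gain_left, gain_right):
    return [gain_left * c for c in carrier], [gain_right * c for c in carrier]


# Same tone in both channels, right channel shifted by 90 degrees.
N = 64
left = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
right = [math.cos(2 * math.pi * 4 * i / N) for i in range(N)]

ms_left, ms_right = ms_decode(*ms_encode(left, right))
is_left, is_right = intensity_decode(*intensity_encode(left, right))

print("M/S max error:", max(abs(a - b) for a, b in zip(ms_left, left)))
print("IS  max L-R difference:", max(abs(a - b) for a, b in zip(is_left, is_right)))
```

M/S gives back both channels (up to rounding), while the IS version keeps each channel's energy but returns the same waveform on both sides, so the phase offset between left and right is gone. Whether that loss is audible at high frequencies is exactly the point under discussion.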

bye,
Sebi
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-29 02:49:53
Quote
Roberto's last mp3 test (designed to find the best mp3 encoder at 128 kbit/s, and which lame won) did not include the super slow FhG encoder

It did. Audioactive (I.E, slowenc with some tunings done in AudioActive)
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-29 02:51:55
Quote
IMO, Roberto's test is a blockbuster.

Thank you, but that's the reason conducting listening tests is nearly impossible now
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-29 03:08:21
Quote
I've used an old l3enc which DOES make use of IS, even at 192 kbps.

Buoy...

From l3enc documentation (taken from good ol' ReallyRareWares (http://www.rjamorim.com/rrw)):

*****For l3enc 2.0*****
Quote
For bitrates <= 96 kbps, the default is intensity stereo (-mod 1). For
bitrates >= 112 kbps, the default is ms-stereo (-mod 0). For
more details about encoding modes, please refer to section 1.11 'Encoding
Recommendations'


Quote
For coding of stereo files with bitrates <=96 kbps, the use of intensity
stereo is highly recommended. This is also the default configuration of
the encoder. Note, however, that the use of intensity stereo will destroy
information which is needed for sound processing schemes like
Dolby Surround. For bitrates >= 112 kbps, intensity stereo is not used by
default.


*****For l3enc 2.72*****
Quote
For the coding of stereo files with bitrates <=96 kbit/s, the encoder
will use the intensity stereo technique.
Note, however, that the use of intensity stereo may demage information
which is needed for sound processing schemes like Dolby Surround.
For bitrates >= 112 kbit/s, intensity stereo is not used.



Which means that, if you got IS at 192 kbps, you were messing where you shouldn't

Quote
As for "near transparency": Current Vorbis encoders make use of IS at up to -q5.99. They just don't call it Intensity stereo. Monty seems to have a very different (official) point of view regarding this. He talks about diffuse and point images in the specification. Well, it's basically the same as intensity stereo.


I always understood Vorbis' implementation as a variation on M/S stereo, not IS.

After all, it's very well known that IS completely ruins the stereo image. There were some pre tests for my listening tests that came to that conclusion (look for a post by tigre, IIRC)

Quote
Anyway, I'm surprised that LAME performs so well WITHOUT Intensity Stereo in the 128-ish bitrate area - Same for FAAC.  (no IS AFAIK)


Same for MPC and iTunes.

Actually, IS was once available in MPC, and IIRC Andree removed it because it had no place in a codec targeted at high quality

Quote
I guess Vorbis will have strong competitors when LAME and FAAC start making use of IS for that kind of bitrate (perhaps PNS for FAAC, too).


I stand by my point that using IS at bitrates above 96kbps is a very bad idea, except in very specific cases.

Regards;

Roberto.
Title: Multiformat@128kbps listening test - FINISHED
Post by: kwanbis on 2004-05-29 04:52:26
Quote
Because of its so so legal status none of these programs can incorporate Lame, even if many popular applications work with it.

AFAIK you are wrong: you are required to pay a license fee for the right to implement/use an MP3 encoder, so after that you can use LAME LEGALLY if you want.
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2004-05-29 05:39:26
Quote
Quote
Roberto's last mp3 test (designed to find the best mp3 encoder at 128 kbit/s, and which lame won) did not include the super slow FhG encoder

It did. Audioactive (i.e., slowenc with some tuning done in AudioActive)

Audioactive is a different beast from the very slow codec, which is best represented by mp3enc31, or by using fastencc.exe in -hq mode (this version of the very slow codec has a higher lowpass than mp3enc31).

Audioactive/Opticom/"radium" can be grouped together, but not in the same family as mp3enc31/fastencc.exe -hq.

mp3enc31 is recognizable by low frequency glitches.  Ironically, bAdDuDeX (an mp3 connoisseur from long ago), who could hear a 16 kHz lowpass in applaud.wav, loved mp3enc31 despite the glitching and despite its relatively low 14.5 kHz lowpass because it was free from high-frequency ringing.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: ger@co on 2004-05-29 07:54:22
Quote
Frankly, I am amazed at all the tiny details that seem so fascinating around here.

IMO, Roberto's test is a blockbuster.


I agree completely with both of these statements.  They both, in their own way, provide interesting reading.  Roberto's tests "always" provide informative, useful and constructive information, while the former simply serves to amuse and utter the occasional "WTF is that about?"

Later.
Title: Multiformat@128kbps listening test - FINISHED
Post by: harashin on 2004-05-29 08:04:09
Quote
Heh... just as the Slashdot hammering on my site starts to subside, I get Slashdotted again.

In Japan (http://slashdot.jp/article.pl?sid=04/05/27/020254)!!!  

I wonder if the SNR there is as big as in slashdot.org...

It seems to be even worse than the one in .org. I found some guys saying that Sony rules and this test sucks, or that they should adopt Japanese songs for samples. A guy who has his doubts about the test didn't even know that it was a double-blind test.
Title: Multiformat@128kbps listening test - FINISHED
Post by: maikmerten on 2004-05-29 09:34:03
Quote
It is a widely believed fact that we are unable to perceive phase differences of high frequencies, so IS is an appropriate tool, even for near transparency encodings.

The problem with MP3 IS is that it's not possible to restrict IS usage to certain frequencies - you can only switch stereo modes on a block level, not on a frequency one.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-29 10:58:00
Quote
Quote
I've used an old l3enc which DOES make use of IS, even at 192 kbps.

Buoy...

Ahhh!

I don't know what was going wrong....
I just trimmed a frame out of the middle and checked for myself:

Header: FF FB 90 64 = 1111 1111 | 1111 1011 | 1001 0000 | 0110 0100
=>
MPEG 1, Layer 3, 128 kbps, 44100 Hz, Joint-Stereo
mode_extension = 10 => M/S coding: yes  and  IS coding: no.
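
For anyone who wants to repeat that header check, here is a minimal Python sketch. It assumes the standard MPEG-1 Layer III header layout and only decodes the fields mentioned above; the function name `parse_mp3_header` and the dictionary keys are made up for illustration, and this is not a complete or validated parser.

```python
# Sketch of an MPEG-1 Layer III frame header decoder.
# Only the fields discussed in this post are extracted.

# Index 0 means "free format" and index 15 is invalid, so both map to None here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["Stereo", "Joint-Stereo", "Dual channel", "Mono"]


def parse_mp3_header(header: bytes) -> dict:
    """Decode bitrate, sample rate and stereo mode from a 4-byte header."""
    if len(header) != 4:
        raise ValueError("an MP3 frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")
    if word >> 21 != 0x7FF:
        raise ValueError("frame sync not found")
    # Version bits 11 = MPEG 1, layer bits 01 = Layer 3.
    if (word >> 19) & 0b11 != 0b11 or (word >> 17) & 0b11 != 0b01:
        raise ValueError("this sketch only handles MPEG 1, Layer 3")
    mode = (word >> 6) & 0b11
    mode_extension = (word >> 4) & 0b11
    joint = CHANNEL_MODES[mode] == "Joint-Stereo"
    return {
        "bitrate_kbps": BITRATES_KBPS[(word >> 12) & 0xF],
        "sample_rate_hz": SAMPLE_RATES_HZ[(word >> 10) & 0b11],
        "channel_mode": CHANNEL_MODES[mode],
        # In Layer 3 the high bit of mode_extension flags M/S, the low bit IS.
        "ms_stereo": joint and bool(mode_extension & 0b10),
        "intensity_stereo": joint and bool(mode_extension & 0b01),
    }


info = parse_mp3_header(bytes.fromhex("FFFB9064"))
print(info)
```

Feeding it the four bytes quoted above should give the same reading: 128 kbps, 44100 Hz, Joint-Stereo, M/S on, IS off.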

I apologize for that. Might have been fuzzy memory or something.
Also, I wasn't that experienced back in 1997.

But I remember that I fed l3enc with out-of-phase high frequency sines that got cancelled ... 

edit:

As for Vorbis: Trust me.  It's intensity stereo for q<6 (called "point-stereo")
I stick to "IS is very powerful if done right."
Vorbis Stereo Stuff (http://www.xiph.org/ogg/vorbis/doc/stereo.html)
At q<6 and at a certain frequency Vorbis encoders switch from lossless coupling to point-stereo. In point-stereo the angle value will always be zero. Therefore, the (unscaled) MDCT samples will be the same for both channels after inverse square polar mapping. Intensity is controlled by the floor curves.

In Vorbis, decorrelation and intensity stereo are achieved by square-polar-mapping and channel-interleaved vector quantization.
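
To make the "angle is zero, so both channels come out the same" point concrete, here is a small Python sketch of the per-sample inverse coupling step as I understand it from the Vorbis I specification. The function name `inverse_couple` is my own, and the sketch ignores everything else in the decoder (floors, residue decoding, VQ), so treat it as an illustration rather than a reference implementation.

```python
from typing import Tuple


def inverse_couple(m: float, a: float) -> Tuple[float, float]:
    """Undo square polar mapping for one magnitude/angle residue pair.

    Returns the decoupled values for the magnitude channel and the
    angle channel, in that order.
    """
    if m > 0:
        if a > 0:
            new_m, new_a = m, m - a
        else:
            new_a, new_m = m, m + a
    else:
        if a > 0:
            new_m, new_a = m, m + a
        else:
            new_a, new_m = m, m - a
    return new_m, new_a


# Point stereo: the angle is always zero, so both channels get the same value.
for magnitude in (5.0, -3.0, 0.0):
    print(magnitude, inverse_couple(magnitude, 0.0))
```

With a non-zero angle the two outputs differ, which is the lossless-coupling case used below the point-stereo frequency.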

bye,
Sebastian
Title: Multiformat@128kbps listening test - FINISHED
Post by: robert on 2004-05-29 11:15:16
Quote
Quote
It is a widely believed fact that we are unable to perceive phase differences of high frequencies, so IS is an appropriate tool, even for near transparency encodings.

The problem with MP3 IS is that it's not possible to restrict IS usage to certain frequencies - you can only switch stereo modes on a block level, not on a frequency one.

well, in mp3 you can signal the use of IS stereo without using IS at all**.
if IS is used, you will have to use it for the whole frequency range beginning from the last scalefactor band down to some arbitrary but fixed scale factor band. you can use L/R or M/S coding for the lower bands.

---
**) I'm actually not sure if the sfb21--where no scalefactor band exists--wouldn't have to be IS coded with 0 degree direction in this case
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-29 11:31:13
Quote
Quote
It is a widely believed fact that we are unable to perceive phase differences of high frequencies, so IS is an appropriate tool, even for near transparency encodings.

The problem with MP3 IS is that it's not possible to restrict IS usage to certain frequencies - you can only switch stereo modes on a block level, not on a frequency one.


This is a quote from the mp3 specification:
Quote
Intensity Stereo
This mode switch (found in the header: mode_extension) allows switching from 'normal stereo' to intensity stereo. The lower bound of the scalefactor bands decoded in intensity stereo is derived from the "zero_part" of the right channel. Above this bound decoding of intensity stereo is applied using the scalefactors of the right channel as intensity stereo positions. An intensity stereo position of 7 in one scalefactor band indicates that this scalefactor band is NOT decoded as intensity stereo.


I guess this means the encoder can choose some kind of split frequency. Below this frequency L/R or M/S coding is applied and above IS coding is used.

Agree ?

bye,
Sebastian
Title: Multiformat@128kbps listening test - FINISHED
Post by: robert on 2004-05-29 11:51:54
Quote
I guess this means the encoder can choose some kind of split frequency. Below this frequency L/R or M/S coding is applied and above IS coding is used.

Agree ?

yes.

M/S coding is some special case of doing some main axis transformation of the stereo plane and transmitting the rotation angle, the sum and the difference signal. for mid/side coding the rotation angle is fixed and is not transmitted.

IS coding is some simplification where you leave out the difference signal.
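
A rough Python sketch of that distinction, under simplifying assumptions: M/S is shown as a fixed 45 degree rotation that can be undone exactly, and IS is modelled as keeping only the sum signal plus one level per channel for the band. Real MP3 intensity stereo transmits a quantized position per scalefactor band instead of two levels, and all function names here are invented for the example.

```python
import math

SQRT2 = math.sqrt(2.0)


def norm(v):
    """Euclidean length of a band's samples (square root of its energy)."""
    return math.sqrt(sum(x * x for x in v))


def ms_encode(left, right):
    """Rotate L/R by a fixed 45 degrees into mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse rotation; lossless apart from floating point rounding."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right


def is_encode(left, right):
    """Keep the sum signal and one level per channel; drop the difference."""
    carrier = [l + r for l, r in zip(left, right)]
    return carrier, norm(left), norm(right)


def is_decode(carrier, level_left, level_right):
    """Both channels become scaled copies of the same carrier signal."""
    scale = norm(carrier)
    if scale == 0.0:
        return [0.0] * len(carrier), [0.0] * len(carrier)
    left = [c * level_left / scale for c in carrier]
    right = [c * level_right / scale for c in carrier]
    return left, right


left = [1.0, 0.5, -0.25]
right = [0.5, 0.25, 0.75]
print(ms_decode(*ms_encode(left, right)))   # close to the original pair
print(is_decode(*is_encode(left, right)))   # same shape in both channels
```

The IS output keeps each channel's energy but the two channels are now identical up to a gain, which is exactly the phase information the thread is arguing about.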
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-29 11:52:31
Quote
ff123: mp3enc31 is recognizable by low frequency glitches. Ironically, bAdDuDeX (an mp3 connoisseur from long ago), who could hear a 16 kHz lowpass in applaud.wav, loved mp3enc31 despite the glitching and despite its relatively low 14.5 kHz lowpass because it was free from high-frequency ringing.

BTW, it was officially recommended to change its default lowpass to -bw 15995 (the number may be wrong).
And its ability to avoid ringing and keep pre-echo very small was always something all free developers were trying to achieve, as I remember.
Interesting: maybe now, when mp3 seems to be an already mature standard, we could find somebody from FhG and ask how they avoid ringing in their encoder?
If it is not already known, of course...
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-29 12:45:42
I had a strong feeling that I had already seen this discussion about mp3enc.
I found it in the end: Test old Fhg encoder or not (http://www.hydrogenaudio.org/forums/index.php?showtopic=16270&st=50&)
Title: Multiformat@128kbps listening test - FINISHED
Post by: QuantumKnot on 2004-05-30 03:11:49
I can also confirm that the current Vorbis encoders use a mix of lossless stereo (full mag, full ang preserved) and point stereo (zero ang) below q 6.  Point stereo kicks in for components above a certain frequency which is dependent on the quality.  For lower quality values, more point stereo is used, hence the recognisable 'stereo collapse'.  It does not appear to be the optimal way of doing things but considering the quality we get from current Vorbis, it doesn't do a bad job either.  Monty has plans of implementing a better stereo model.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Gabriel on 2004-05-30 18:00:42
Quote
Because of its so so legal status none of these programs can incorporate Lame, even if many popular applications work with it.

The Lame project only provides a technology implementation. It is up to the company wanting to use it to acquire a license for the mp3 patents.
Several companies chose this solution and are using Lame in their products.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Ivan Dimkovic on 2004-05-30 19:39:48
Quote
Anyway, I'm surprised that LAME performs so well WITHOUT Intensity Stereo in the 128-ish bitrate area - Same for FAAC.  (no IS AFAIK)

There is absolutely no need to use IS @128 kb/s  for MP3 or AAC.

And I also disagree with the claims that IS could bring good quality - there are lots of cases with stereo configurations impossible to code properly with IS, because IS saves only ILD information (level difference) and not ITD (time difference) and inter-channel cross correlation.

Equalized and mixed "left" (IS) channel could completely distort the phase information, and you end up with something which is quite different from the original when the coloration of the sound comes into the question.

Applaud is one of the examples that is impossible to code properly with IS.

Smart psychoacoustics would be able to disable IS for such frames, but @128 kb/s there would be no need for lossy bit savings; the same goes for PNS (in AAC), more or less. We did a lot of tests with PNS @128 kb/s and in most cases it is pretty much useless, or degrades the quality.
Title: Multiformat@128kbps listening test - FINISHED
Post by: maikmerten on 2004-05-30 19:53:06
Quote
Quote
I guess this means the encoder can choose some kind of split frequency. Below this frequency L/R or M/S coding is applied and above IS coding is used.

Agree ?

yes.

M/S coding is some special case of doing some main axis transformation of the stereo plane and transmitting the rotation angle, the sum and the difference signal. for mid/side coding the rotation angle is fixed and is not transmitted.

IS coding is some simplification where you leave out the difference signal.

Thanks a lot for the explanations. I want to apologize for making obviously wrong statements about MP3 IS.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SebastianG on 2004-05-30 22:50:59
Quote
And I also disagree with the claims that IS could bring good quality - there are lots of cases with stereo configurations impossible to code properly with IS, because IS saves only ILD information (level difference) and not ITD (time difference) and inter-channel cross correlation.
[...]
Smart psychoacoustics would be able to disable IS for such frames, but @128 kb/s there would be no need for lossy bit savings; the same goes for PNS (in AAC), more or less. We did a lot of tests with PNS @128 kb/s and in most cases it is pretty much useless, or degrades the quality.

Thanks for your reply.

Let's compare ogg to aac. Monty said once, lossless coupling would be like wasting space for 128kbps and 160kbps modes. You guys keep telling me IS is inappropriate for those bitrates. Sure, this mapping is irreversible and only preserves the channels' energy levels - not all their phase relations. But AFAIK phase relations are not that important to us above 10 kHz because the wavelength is already very short. So if an advanced encoder made proper use of this psychoacoustic effect by using IS, this could save some space and allow the use of smaller scalefactors to improve the SNR.

AFAIK IS can be switched on/off for each scalefactor band (AAC). Another cool thing: IS can be done in-phase and out-of-phase. How about the following scheme for scalefactor bands above 10 kHz:

- treat MDCT samples for a scalefactor band as multidimensional vector
- compute the cosine of the angle between L and R by cos_a := \frac{<L,R>}{||L|| ||R||}
- use in-phase IS for cos_a > 0.5
- use out-of-phase IS for cos_a < -0.5

These thresholds (in this case 0.5) could be chosen depending on the quality-preset and frequency area.
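
The decision rule above could be sketched in Python roughly like this. The function name, the returned labels and the handling of a silent channel (falling back to regular L/R or M/S coding) are my own assumptions; the post only specifies the cosine and the two thresholds.

```python
import math


def choose_stereo_mode(left, right, threshold=0.5):
    """Pick a coding mode for one scalefactor band from its MDCT samples.

    left and right are treated as vectors; cos_a is the cosine of the
    angle between them, i.e. <L,R> / (||L|| * ||R||).
    """
    dot = sum(l * r for l, r in zip(left, right))
    norm_l = math.sqrt(sum(l * l for l in left))
    norm_r = math.sqrt(sum(r * r for r in right))
    if norm_l == 0.0 or norm_r == 0.0:
        # Assumption: the angle is undefined when a channel is silent,
        # so this sketch leaves the band to ordinary L/R or M/S coding.
        return "L/R or M/S"
    cos_a = dot / (norm_l * norm_r)
    if cos_a > threshold:
        return "in-phase IS"
    if cos_a < -threshold:
        return "out-of-phase IS"
    return "L/R or M/S"


print(choose_stereo_mode([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(choose_stereo_mode([1.0, 2.0, 3.0], [-1.0, -2.0, -3.0]))
print(choose_stereo_mode([1.0, 0.0], [0.0, 1.0]))
```

A real encoder would presumably make `threshold` depend on the quality preset and the band's frequency, as suggested above.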

Well, I don't know, if Intensity Stereo is or is not appropriate for 128 kbps. But I do know that Vorbis makes use of it and "won" the listening test.

edit: corrected cos_a correlation formula

bye,
Sebastian
Title: Multiformat@128kbps listening test - FINISHED
Post by: enry2k on 2004-05-30 23:19:23
Surprising! How is it possible that Lame mp3 is better than iTunes AAC?
Are previous tests wrong?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Lyx on 2004-05-30 23:56:15
Quote
Surprising! How is it possible that Lame mp3 is better than iTunes AAC?
Are previous tests wrong?

the encoder-version + settings used for lame mp3 during this test for sure weren't the same as in previous large-scale tests.

Lame is not = Lame. In recent versions, lame seems to have made good progress in improving at mid-bitrates. You can see this in the lame 3.90.3 vs. lame 3.96 thread.

- Lyx
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-05-31 01:10:38
Quote
Quote
Surprising! How is it possible that Lame mp3 is better than iTunes AAC?
Are previous tests wrong?

the encoder-version + settings used for lame mp3 during this test for sure weren't the same as in previous large-scale tests.

It's also worth mentioning that Lame is not better in the test. It's officially tied, with a tendency to be a little worse.
Title: Multiformat@128kbps listening test - FINISHED
Post by: neomoe on 2004-05-31 18:27:43
Quote
ok, so it looks like the vbr contenders did very well and itunes's cbr held its own. how safe would it be to assume that using vbr with AAC (for instance the most recent FAAC with FB2K) would be a contender?


okay, it has been proven that iTunes AAC is better than FAAC for instance.
http://www.rjamorim.com/test/aac128test/results.html (http://www.rjamorim.com/test/aac128test/results.html)

but how sure could one say how good iTunes AAC would be if it had VBR implemented?

and, one silly question:

how sure is it how good/bad one encoder would perform at higher bitrates,e.g. 160kbps?
i mean is it right to say that vorbis for instance can reach transparent level at a lower bitrate than MPC or AAC?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Liquid_Predator on 2004-05-31 18:38:49
Quote
okay, it has been proven that iTunes AAC is better than FAAC for instance.
http://www.rjamorim.com/test/aac128test/results.html (http://www.rjamorim.com/test/aac128test/results.html)


This is a rather old test, here are newer test results: http://www.rjamorim.com/test/aac128v2/results.html (http://www.rjamorim.com/test/aac128v2/results.html) but iTunes is still the winner.

Quote
how sure is it how good/bad one encoder would perform at higher bitrates,e.g. 160kbps?
i mean is it right to say that vorbis for instance can reach transparent level at a lower bitrate than MPC or AAC?


You can't extrapolate the results! WMA is generally better than MP3 at 64kb/s, but at 128kb/s MP3 is better.
Title: Multiformat@128kbps listening test - FINISHED
Post by: neomoe on 2004-05-31 18:53:15
Quote
You can't extrapolate the results! WMA is generally better than MP3 at 64kb/s, but at 128kb/s MP3 is better.


okay, thank you, but
Quote
i mean is it right to say that vorbis for instance can reach transparent level at a lower bitrate than MPC or AAC?


i assume we would need a listening test at this high bitrate, right?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Liquid_Predator on 2004-05-31 18:58:47
Quote
i assume we would need a listening test at this high bitrate, right?

Indeed
Title: Multiformat@128kbps listening test - FINISHED
Post by: neomoe on 2004-05-31 19:12:16
even if I took into account harashin's private listening test (http://209.152.181.168/~hydrogen/index.php?showtopic=21916&st=50&hl=) I wouldn't be able to do so?
see, we know aoTuV is very good at 128kbps and at ~200kbps (at least for harashin).
now we should be able to say how Vorbis would perform at around 160kbps shouldn't we?

and what about my other question:
Quote
but how sure could one say how good iTunes AAC would be if it had VBR implemented?
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-05-31 19:30:53
My post will not be very informative 
Quote
but how sure could one say how good iTunes AAC would be if it had VBR implemented?

No one knows.
For example, FhG mp3 encoders at low bitrates tend to be better with CBR than with VBR (search the forum if you wish to have more info), but no one would say (I hope  ) that VBR is poorly implemented there...
VBR has its own problems, as discussed in this thread...
EDIT: grammar
Title: Multiformat@128kbps listening test - FINISHED
Post by: neomoe on 2004-05-31 19:42:23
oh gosh!

trying is superior to studying - isn't it?
(well, I tried to translate a german saying)

okay. now for me it IS like this:

aoTuV is superior to iTunes AAC even at 160kbps and above, and the second personal truth is that iTunes AAC would be better with VBR implemented, presuming that it is decently implemented. harrrharrr!
Title: Multiformat@128kbps listening test - FINISHED
Post by: Jojo on 2004-06-06 18:20:34
I wonder how ATRAC3plus performs...it's a pity that it wasn't included in the test
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-06-06 18:41:26
It's currently not possible to oppose atrac3+ to other encoders at 128 kbps. For a simple reason: there's no 130 kbps mode with current and public atrac3+ encoder. Only low bitrate (48 & 64 kbps) and high bitrate (256 kbps) setting.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Jojo on 2004-06-07 09:38:43
Quote
It's currently not possible to oppose atrac3+ to other encoders at 128 kbps. For a simple reason: there's no 130 kbps mode with current and public atrac3+ encoder. Only low bitrate (48 & 64 kbps) and high bitrate (256 kbps) setting.

interesting...so I wonder what bitrate is used in Sony's music store...I read they use ATRAC3plus...don't tell me they use 64kbps 
Title: Multiformat@128kbps listening test - FINISHED
Post by: Pio2001 on 2004-06-07 20:58:53
Hello, I'm back from holidays...

I just wanted to point out an odd thing that happened to me during this test : contrary to the common way of things, I could only ABX the 7th sample (gone) with speakers, and not with headphones ! (Dynaudio Gemini speakers vs Sennheiser HD600 headphones).


A picture of me ABXing "gone" : listeningtest.jpg (http://perso.numericable.fr/laguill2/pictures/listeningtest.jpg) 
To avoid any background noise, the picture is video-projected on a screen in front of me, and the computer is in the next room. 5 meters mouse, keyboard, SPDIF, and DVI cables.

Mhhh, actually, I must admit that my speaker setting is often referred to by my father as "the biggest headphones I've ever seen".
Title: Multiformat@128kbps listening test - FINISHED
Post by: Lyx on 2004-06-07 21:42:55
Quote
A picture of me ABXing "gone" : listeningtest.jpg (http://perso.numericable.fr/laguill2/pictures/listeningtest.jpg) 
To avoid any background noise, the picture is video-projected on a screen in front of me, and the computer is in the next room. 5 meters mouse, keyboard, SPDIF, and DVI cables.

Mhhh, actually, I must admit that my speaker setting is often referred to by my father as "the biggest headphones I've ever seen".

   
Title: Multiformat@128kbps listening test - FINISHED
Post by: SirGrey on 2004-06-07 21:55:13
Quote
A picture of me ABXing "gone" : listeningtest.jpg 

Nice settings 
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-06-08 01:27:51
Quote
contrary to the common way of things, I could only ABX the 7th sample (gone) with speakers, and not with headphones ! (Dynaudio Gemini speakers vs Sennheiser HD600 headphones).

Impressive!

Can you describe the artifact you can only detect with your speakers, and not your headphones?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Pio2001 on 2004-06-08 02:23:46
From the "user comment" section, matching the samples ID with the filenames, then the filenames with the codecs, I got

Lame  : 5/5

AAC : 4/5 : Ringing on the first guitar note.
ABX 13/16 from 8.3 to 22 s

Musepack : 5/5

Atrac : 4/5 More treble from 17s
ABX 15/16 from 17.17 to 24.21 s

Vorbis  : 5/5

WMA :  3.5/5 : More treble when the guitar comes in
ABX 13/16, from 8 to 26.6 s
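
For reference, the chance of reaching those ABX scores by pure guessing can be computed with a one-sided binomial test. A minimal Python sketch, assuming each trial is an independent 50/50 guess (the function name `abx_p_value` is just for illustration):

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers out of `trials`
    when every answer is an independent coin flip."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


for score in (13, 15):
    print(f"{score}/16: p = {abx_p_value(score, 16):.5f}")
```

Under that assumption, 13/16 comes out at roughly a 1% chance of being a fluke and 15/16 at well below 0.1%.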


When I say "more treble", it is just a subjective impression. It does not necessarily mean that the treble level is higher, but rather that the treble sounds brighter.

Edit : these speakers do not have a linear response. The tweeter is set 1.5 dB louder than the woofer. The crossover frequency is 2 kHz. I usually cancel this with Foobar convolver, but with ABCHR I couldn't.

Edit2 : the fact that they are rather to the sides of the listener than in front of him makes treble even harsher (this is the case with audio sources when they are to the side of the listener). But that's the way I often listen to music. I did not move the speakers especially for the test.
Title: Multiformat@128kbps listening test - FINISHED
Post by: phong on 2004-06-08 13:31:04
I thought it was quite impressive that you look so much like your avatar.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cutter on 2004-10-16 21:46:46
Hello!

I would like to know if the poor results of lame aren't due to its popularity. The more you listen to a format, the easier it is for you to recognize its artefacts, right? Maybe the people who participated were more able to recognize MP3 than other formats. What do you think?
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-10-16 22:20:35
Quote
Hello!

I would like to know if the poor results of lame aren't due to its popularity. The more you listen to a format, the easier it is for you to recognize its artefacts, right? Maybe the people who participated were more able to recognize MP3 than other formats. What do you think?


Well, I wouldn't consider Lame's results poor. It ended up tied with AAC - that is supposed to sound much better!

Now, for your concern: I think the artifacts that happen on lossy music are pretty much the same across all formats. So, if you learn to distinguish pre-echo, smearing or stereo collapse on MP3, you will probably detect these same artifacts in AAC, Vorbis, MPC... if they are there.

That's just a supposition though, maybe MP3's popularity did affect its results in some way...
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-10-17 00:25:57
Quote
The more you listen to a format, the easier it is for you to recognize its artefacts, right?


I don't think so. For example, tons of people have been listening to vorbis for years, and still can't detect anything wrong in stereo image or timbre coarseness...
To recognize artifacts with ease, you probably need to track them. It's an active attitude, as opposed to daily listening, which is passive.

On the other hand, artifacts don't really differ from one encoder to another. mp3, aac, mpc, wma, atrac3... are really close to each other. SBR (mp3pro, he-aac) introduces specific problems in addition to the previous ones; vorbis is also slightly different (see above); hybrid encoders produce noise. But most artifacts (pre-echo, warbling, chirping, metallic sound...) are common to all transform encoders.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cutter on 2004-10-17 01:57:44
Quote
Quote
The more you listen to a format, the easier it is for you to recognize its artefacts, right?


I don't think so. For example, tons of people have been listening to vorbis for years, and still can't detect anything wrong in stereo image or timbre coarseness...
To recognize artifacts with ease, you probably need to track them. It's an active attitude, as opposed to daily listening, which is passive.

On the other hand, artifacts don't really differ from one encoder to another. mp3, aac, mpc, wma, atrac3... are really close to each other. SBR (mp3pro, he-aac) introduces specific problems in addition to the previous ones; vorbis is also slightly different (see above); hybrid encoders produce noise. But most artifacts (pre-echo, warbling, chirping, metallic sound...) are common to all transform encoders.

That's what most of the people who participated in this test did, I guess. We're not talking about average music listeners here, but people who have "trained ears".
Title: Multiformat@128kbps listening test - FINISHED
Post by: guruboolez on 2004-10-17 02:18:35
Quote
We're not talking about average music listeners here, but people who have "trained ears".


Sorry, but when you said "The more you listen to a format, the easier it is for you to recognize its artefacts, right?" I thought you made a general assumption.

To answer this: many results were sent by people who are not trained. Take a look at the overall ratings: wma@128 is "near transparent" according to the test. That can't be true for someone with even a little experience in artifact hunting.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Cutter on 2004-10-23 01:33:45
Ok. Thank you both for your answers.
Title: Multiformat@128kbps listening test - FINISHED
Post by: moi on 2004-10-24 09:01:50
Quote
Quote
Roberto> what software did you use to obtain wma9 files? Is it VBR-2 pass 128 kbps? What decoder? I've tried to reproduce the same waveform with different settings, and I wasn't able to do it.

I already asked him about this.
http://www.hydrogenaudio.org/forums/index....ndpost&p=210584 (http://www.hydrogenaudio.org/forums/index.php?showtopic=21370&view=findpost&p=210584)

EDIT:It's certainly Bitrate VBR 128kbps, 44kHz, stereo VBR 1pass.


I don't get that. From what I have seen, for 1 pass WMA VBR you cannot specify a bit rate at all, only the "quality settings" such as 50, 75, 90, etc.

With two pass WMA VBR you specify an average bit rate.

You state it was WMA one pass 128 kbps VBR. How could that be?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Latexxx on 2004-10-24 09:42:33
Windows Media Encoder allows you to do 1-pass bitrate VBR. It is some kind of ABR.
Title: Multiformat@128kbps listening test - FINISHED
Post by: moi on 2004-10-24 19:14:17
Quote
Windows Media Encoder allows you to do 1-pass bitrate VBR. It is some kind of ABR.


I guess I've only tried WMA VBR using DBPoweramp. For 1 pass VBR, there are no bit rate settings, just the quality settings like 50, 75, 90, etc. With the two pass VBR, you set a target bit rate. (I guess they figure that with two passes they can come closer to a target bit rate, but not with one pass.)

Surprised it's different in WME. It doesn't have the "quality settings"?
Title: Multiformat@128kbps listening test - FINISHED
Post by: Latexxx on 2004-10-24 19:23:29
It has them also.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2004-10-24 19:45:36
I used two-pass VBR (Bitrate VBR)
Title: Multiformat@128kbps listening test - FINISHED
Post by: 1kyle on 2005-01-28 08:50:30
In some cases tests like this are not very subjective -- for example an Opera lover will probably cringe at listening to tracks of "House Music" played with ANY CODEC and probably vice versa. It's almost impossible to find music that everyone likes which rather invalidates some of the test findings.

I've tried the new HI-MD minidisc units from Sony particularly the NH-1 and I'm pretty fussy with my music. The HI-SP (Atrac3 +) format seems to me certainly for music on the move or when wearing some decent cans as good as CD (also CD's have pretty varying quality as well).

For Classical Music which on the whole has a higher dynamic range than most rock type music then MP3's can sound pretty hopeless. Acoustic instruments also tend to sound somewhat "quirky" on MP3's as well whereas the more "electronic sound" of dance music tends to hide some of the more obvious problems with MP3's  especially at the lower bit rates.

My main problem with ATRAC3 + is some of the really STUPID DRM problems which make copying and distributing YOUR OWN MUSIC a real pain.

-K
Title: Multiformat@128kbps listening test - FINISHED
Post by: Gabriel on 2005-01-28 08:54:48
You are right, placebo effect is way more suggestive, and often works quite well.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Kblood on 2005-01-28 14:11:14
Quote
You are right, placebo effect is way more suggestive, and often works quite well.

Title: Multiformat@128kbps listening test - FINISHED
Post by: Pio2001 on 2005-01-30 03:50:27
Quote
Acoustic instruments also tend to sound somewhat "quirky" on MP3's as well whereas the more "electronic sound" of dance music tends to hide some of the more obvious problems with MP3's  especially at the lower bit rates.


My experience is the exact opposite (at high bitrates at least). Guruboolez' harpsichord and orchestral samples sound transparent to me, while the electronic boxes of Amnesia, Fsol, Autechre, Spahm, Astral, Transwave etc. sound ugly to me once encoded.
Title: Multiformat@128kbps listening test - FINISHED
Post by: jmitch on 2005-02-15 14:16:52
Anyone who is basing their ideas on this listening test is taking a bit of a risk. This test is very good and I commend rjamorim for taking the time to conduct it; however, I don't believe there were nearly enough testers to produce accurate data, and therefore to reach any valid conclusions. I think this test should be redone and spread much more widely over the internet audio boards, not just this one. Then we could formulate some accurate conclusions. In my opinion, there is just not enough data to do that.
Title: Multiformat@128kbps listening test - FINISHED
Post by: nyarlathotep on 2005-02-15 14:29:08
Quote
I think this test should be redone and spread much more widely over the internet audio boards, not just this one.

Yes, I think it's right to say that not enough people participated.

Still, IIRC there were announcements made on other boards about this test. I personally made one there (on a popular French board, though not as specialized in audio coding as HA is):
http://forum.hardware.fr/forum2.php?config...sh=0&subcat=131 (http://forum.hardware.fr/forum2.php?config=hardwarefr.inc&post=66496&cat=3&cache=&sondage=0&owntopic=0&p=1&trash=0&subcat=131)

To tell the truth, here is what I think: not so many people really want to spend time testing different samples. Not to mention those who don't know what an ABX test is and claim that everything is just like "night and day" or so.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2005-02-15 14:29:48
Formal listening tests conducted by the ITU and EBU sometimes use as few as 9-10 listeners. Trained listeners, of course, but still, that's quite few compared to the number of people who participated in some of the samples of this test...
Title: Multiformat@128kbps listening test - FINISHED
Post by: Busemann on 2005-02-15 14:33:48
Quote
Formal listening tests conducted by the ITU and EBU sometimes use as few as 9-10 listeners. Trained listeners, of course, but still, that's quite few compared to the number of people who participated in some of the samples of this test...


They also use reference systems, though. The downside to these internet tests is the wide array of equipment, which means the transparency threshold is often quite low. But then again, it reflects the real world nicely.
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2005-02-15 15:02:45
Quote
Anyone who is basing their ideas on this listening test is taking a bit of a risk. This test is very good and I commend rjamorim for taking the time to conduct it; however, I don't believe there were nearly enough testers to produce accurate data, and therefore to reach any valid conclusions. I think this test should be redone and spread much more widely over the internet audio boards, not just this one. Then we could formulate some accurate conclusions. In my opinion, there is just not enough data to do that.


I think the conclusions reached were quite valid and accurate -- for the group of people who participated and for the samples listened to.  That was the whole point of doing a statistical analysis.

If one wants to generalize to a larger group of people or a different set of samples, yes there is a bit of a risk, but the results are probably not far off the mark.  A different sample set would probably get you the most different results.  And of course trying to apply group results to a particular individual is quite a bit more risky.  I would say that the variations are bigger from individual to individual than from one group to another.

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: jmitch on 2005-02-16 13:31:08
Yes, there are many factors and variables that ruin the validity of the test. One, which you named, is the audio equipment being used to do the testing. Most users have shit audio equipment, therefore their results are pretty poor and inaccurate. Secondly, many people, like you said, don't even know what the hell ABXing is, so you can tell from that that they don't know much about audio. Their ears and/or listening skills probably suck. This would dramatically alter the results of the test.

Anyways, the test is better than no test. It gives us a reasonable idea, but not accurate enough, in my opinion, to really make any conclusive judgements.

I would be interested in gathering a group of good listeners that have quality equipment. I think we should have enough here. I myself have Etymotic Research ER-4s, which are basically the best you can get as far as equipment goes.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2005-02-16 13:39:02
Quote
I would be interested in gathering a group of good listeners that have quality equipment.


And, by doing that, you would conduct a test that would only have meaning to people with good hearing and quality equipment.

By accepting everyone and all equipment on my test, I got much closer to the average user than if I only targeted it at golden ears with headphones that cost more than 100 dollars.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Jojo on 2005-02-16 13:41:51
no audiophile will encode in 128kbps anyway...so that was a real world test with real people that use that bitrate...nothing wrong with it and well done
Title: Multiformat@128kbps listening test - FINISHED
Post by: ff123 on 2005-02-16 15:11:16
Quote
Yes, there are many factors and variables that ruin the validity of the test. One, which you named, is the audio equipment being used to do the testing. Most users have shit audio equipment, therefore their results are pretty poor and inaccurate. Secondly, many people, like you said, don't even know what the hell ABXing is, so you can tell from that that they don't know much about audio. Their ears and/or listening skills probably suck. This would dramatically alter the results of the test.

Anyways, the test is better than no test. It gives us a reasonable idea, but not accurate enough, in my opinion, to really make any conclusive judgements.

I would be interested in gathering a group of good listeners that have quality equipment. I think we should have enough here. I myself have Etymotic Research ER-4s, which are basically the best you can get as far as equipment goes.


You keep saying "invalid" and "inaccurate."  But in what way?  As in, the test would have produced a different order of rankings, or another winner or loser?  I don't think so.  The effect of having different setups, in my opinion, is to add random variability to the results, so that the uncertainty is greater.  But I don't think it would add a bias, i.e., change the order of the rankings by much, if at all.
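
To illustrate the difference between random variability and bias, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the codec names, the "true" scores and the noise levels are hypothetical and are not data from this test. It models the differences between listeners' setups as zero-mean noise added to a fixed underlying score.

```python
import random
import statistics

# Hypothetical underlying scores; illustrative only, not results from the test.
TRUE_SCORES = {"codec_a": 4.5, "codec_b": 4.0, "codec_c": 3.5}


def simulate(true_scores, setup_noise_sd, n_listeners, seed=0):
    """Return {codec: (mean, standard_error)} for simulated listeners.

    Each simulated rating is the true score plus zero-mean Gaussian noise
    standing in for differences in equipment and hearing. Ratings are not
    clipped to any scale, to keep the sketch short.
    """
    rng = random.Random(seed)
    results = {}
    for codec, true_score in true_scores.items():
        ratings = [true_score + rng.gauss(0.0, setup_noise_sd)
                   for _ in range(n_listeners)]
        mean = statistics.fmean(ratings)
        # Standard error of the mean: sample standard deviation / sqrt(n).
        sem = statistics.stdev(ratings) / n_listeners ** 0.5
        results[codec] = (mean, sem)
    return results


def ranking(results):
    """Codecs ordered from highest to lowest mean score."""
    return sorted(results, key=lambda codec: results[codec][0], reverse=True)


uniform_setups = simulate(TRUE_SCORES, setup_noise_sd=0.2, n_listeners=2000)
mixed_setups = simulate(TRUE_SCORES, setup_noise_sd=1.0, n_listeners=2000)
print(ranking(uniform_setups), ranking(mixed_setups))
```

With this many simulated listeners both runs rank the codecs in the same order; the noisier run simply reports a larger standard error, i.e. wider error bars. A real bias would correspond to noise whose mean is not zero for one codec, which this sketch deliberately leaves out.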

ff123
Title: Multiformat@128kbps listening test - FINISHED
Post by: nyarlathotep on 2005-02-16 15:19:21
Quote
Secondly, many people, like you said, don't even know what the hell ABXing is, so you can tell by that that they don't know much about audio. Their ears, and/or listening skills probably suck. This would dramatically alter the results of the test.

I just meant that most of these people will not bother testing, not that they will do the test WITHOUT knowing what an ABX test is.
Title: Multiformat@128kbps listening test - FINISHED
Post by: stephanV on 2005-02-16 15:56:23
Quote
This test is very good and I commend rjamorim for taking the time to conduct it; however, I don't believe there were nearly enough testers to produce accurate data, and therefore to reach any valid conclusions.


Quote
I would be interested in gathering a group of good listeners that have quality equipment. I think we should have enough here. I myself have Etymotic Research ER-4s, which are basically the best you can get as far as equipment goes.


First you say there are not enough participants, then you want to conduct a test with a very selective group... 

The test does not claim to be more than it is; it is not the definitive answer to which codec is "best". However, it does give a good indication of what is good and bad (or perhaps I should say "not so good"?) for the average user. There is no more inaccuracy in the test than the error bars in the graphs suggest.

A test which proves codec A does better than codec B on some $10,000 piece of equipment is completely useless for most people.
Title: Multiformat@128kbps listening test - FINISHED
Post by: Busemann on 2005-02-16 15:59:16
Quote
Quote
I would be interested in gathering a group of good listeners that have quality equipment.


And, by doing that, you would conduct a test that would only have meaning to people with good hearing and quality equipment.


It would have meaning for people with poor equipment as well; the quality headroom will be bigger with the winning codec. Even if the current equipment isn't good enough to reveal flaws at 128kbps, I bet most people would always want to encode with the best format at that bit-rate, rated by people where the equipment/hearing isn't the bottleneck.

When hifi mags test speakers, they tend to use the best possible cables, amplifiers and most trained ears. That doesn't make the test useless to people with average hearing/equipment
Title: Multiformat@128kbps listening test - FINISHED
Post by: jmitch on 2005-02-17 13:44:12
Quote
A test which proves codec A does better than codec B on some $10,000 piece of equipment is completely useless for most people.


This is exactly my point. Crappy equipment and poor ears don't provide any accurate data. Yes, it may be real world to the majority of listeners. But, nevertheless, it does not prove anything substantial. For all we know these users were guessing. I would trust a small group of users with good ears and good equipment over a large group of listeners with poor ears and crappy equipment.
Title: Multiformat@128kbps listening test - FINISHED
Post by: rjamorim on 2005-02-17 14:02:33
Quote
For all we know these users were guessing.


Do you even know how ABC/HR testing works?
Title: Multiformat@128kbps listening test - FINISHED
Post by: upNorth on 2005-02-17 14:40:15
Quote
For all we know these users were guessing.

Nice one...
Title: Multiformat@128kbps listening test - FINISHED
Post by: stephanV on 2005-02-17 15:20:02
Quote
Quote
A test which proves codec A does better than codec B on some $10,000 piece of equipment is completely useless for most people.


This is exactly my point. Crappy equipment and poor ears don't provide any accurate data. Yes, it may be real world to the majority of listeners. But, nevertheless, it does not prove anything substantial. For all we know these users were guessing.


Please read up on how the test was performed. You cannot make any valid conclusion from it if you do not understand how to interpret the data.
Title: Multiformat@128kbps listening test - FINISHED
Post by: SoNiX on 2005-02-17 16:38:24
Quote
Quote
For all we know these users were guessing.


Do you even know how ABC/HR testing works?



@rjamorim: As I see, you couldn't resist the opportunity. 

@jmitch: Well, that's what a listening test's supposed to be, isn't it? Having such good audio equipment is not that relevant.


SoNiX 
Title: Multiformat@128kbps listening test - FINISHED
Post by: Pio2001 on 2005-02-18 12:09:54
Quote
Crappy equipment, and poor ears don't provide and accurate data.


The accuracy of the answers is given in the test results. It is 95 %.

Explanations : http://www.hydrogenaudio.org/forums/index.php?showtopic=16295& (http://www.hydrogenaudio.org/forums/index.php?showtopic=16295&)
The ABC/HR method : http://ff123.net/abchr/abchr.html (http://ff123.net/abchr/abchr.html)
The Anova analysis (which gives the 95 % above) : http://www.psychstat.smsu.edu/introbook/sbk27.htm (http://www.psychstat.smsu.edu/introbook/sbk27.htm)
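
As a rough companion to the links above, the chance that a listener passes an ABX run by pure guessing can be worked out from the binomial distribution. The sketch below is a generic calculation, not the analysis used for this test (that was the Anova linked above); the function name `abx_p_value` and the example trial counts are made up for illustration.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials when every answer is a 50/50 guess."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more successes and divide by
    # the 2**trials equally likely outcomes of pure guessing.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# 12 right out of 16 gives about 0.038, i.e. below the usual 0.05 threshold.
print(round(abx_p_value(12, 16), 3))
# 8 right out of 8 gives about 0.0039.
print(round(abx_p_value(8, 8), 4))
```

A result below 0.05 corresponds to the 95 % confidence level mentioned above: a listener who was only guessing would score that well less than one time in twenty.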
Title: Multiformat@128kbps listening test - FINISHED
Post by: DonP on 2005-02-18 13:22:24
Quote
For all we know these users were guessing. I would trust a small group of users with good ears and good equipment over a large group of listeners with poor ears and crappy equipment.


Just to satisfy yourself, why don't you take the test with your gear?  All the samples and ABX software are still there.  When you are done, just for yuks, go slumming and borrow someone's "crappy" sub-$100 phones and see if that makes a difference (not in how good things sound, just in how they affect your ability to ABX a codec).

Let us know how it went.

edit: fixed quote markers
Title: Multiformat@128kbps listening test - FINISHED
Post by: Halcyon on 2005-02-21 16:47:04
Quote
Crappy equipment and poor ears don't provide any accurate data.


I participated in this test myself (when it was run). I used:

- RME Digi 96/8 PAD (professional level 96 kHz/24-bit sound card with 1:1 bit accuracy and very nice analog measurements)
- High quality minimum capacitance shielded interconnects
- Meier Audio Pre head (very high quality solid state headphone amp)
- Sennheiser HD600, AKG K271s, Ultrasone HFI-650 and Etymotics ER4p/s headphones (Ultrasones and etymotics getting most of the listening time)
- A quiet room with a silenced computer

My ears have been tested to be flat to 11 kHz (with less than average attenuation after that up to 14kHz, the maximum that the test equipment at national hearing clinic was able to test).

I have spent c. 4 years trying to get into lossy audio, perhaps 3 of them with slowly increasing listening acuity. I've gone through several training sessions with the AES "Perceptual Audio encoders - what to listen for" CD, as well as many example samples from here, mpeg development archives and previous listening tests. I've also purchased and gone through the Moulton labs "Golden Ears" hearing training CD set. In addition, I regularly audition new hifi and high end gear (and also write about hifi for a national publication). I think my hearing (both as an instrument and as a skill) is better than average.

While I'm far from being a "golden ear", I can invalidate the above argument by saying that neither my equipment nor my hearing is crap.

My results didn't differ significantly from the statistical averages in this test.

While neither my hearing nor my equipment is "best in class", I think both are clearly better than those of the average population. It is debatable how good they are, but surely not crap.

As such, I don't think the test can be invalidated by only referring to "crappy equipment and poor listeners".

Had I magnificently surpassed every other listener in this test by picking out artifacts others couldn't hear, I could _perhaps_ be willing to entertain the possibility of the argument being right.

But alas, I wasn't even among the best listeners in the test. Surely equipment at least wasn't a limiting factor in my case.

I must say that I was also a wee bit surprised that spotting artifacts in a 128kbps ABR test was so difficult. I knew it was going to be difficult, but it was even more so than I had initially imagined.

friendly regards,
halcyon

PS I really should not even have needed to reply with this defense, as ad hominem type attacks don't really need refutation. I think arguments should be evaluated based on the evidence available (and the logic of the reasoning), not on the basis of who makes the argument, UNLESS there is strong proof to show that the author is not to be trusted (which in this case is non-existent). Conjecture is not enough. Arguments need evidence, not prejudice, as their support.
Title: Multiformat@128kbps listening test - FINISHED
Post by: krabapple on 2005-03-12 04:33:06
Quote
When hifi mags test speakers, they tend to use the best possible cables, amplifiers and most trained ears. That doesn't make the test useless to people with average hearing/equipment


Tell me you're kidding.  Hifi mags may *claim* that the ancillary equipment they use is the 'best possible' (though it's almost never been subjected to a controlled listening test) and that their listeners are 'trained' (no proof of that either...but I guess it's how guys like Robert Harley can hear the directionality of the crystalline structure of cables)... but that's no reason to believe what they claim. 

And it's always funny when they caution that the listener who isn't using a $10,000 amp and $100/ft cabling might not hear the amazing microdynamics they hear...thereby covering their asses.

FWIW, Floyd Toole, Sean Olive, et al, who are doing controlled comparisons of speakers  using trained listeners at their facility at Harman/JBL, seem rather more credible to me than any 'hi fi' mag in this area