HydrogenAudio

Hydrogenaudio Forum => Validated News => Topic started by: Sebastian Mares on 2006-01-14 21:01:25

Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:01:25
The much awaited results of the Public, Multiformat Listening Test @ 128 kbps are ready.

Here is the results page: http://www.maresweb.de/listening-tests/mf-128-1/results.htm

(http://www.maresweb.de/listening-tests/mf-128-1/resultsz.png)

Edit 1: A description of the mentioned Nero problem can be found here: http://www.maresweb.de/nero-problem

Edit 2: For people who want to decrypt their results, here are the encoder IDs:

1 = iTunes
2 = LAME
3 = Nero
4 = Shine
5 = AoTuV
6 = WMA Professional
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: minisu on 2006-01-14 21:05:48
Great job!

*goes decrypting*
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-14 21:11:49
Just a quick info regarding the Nero bug - unfortunately it was found too late, as the encoder was given away a couple of days before the test (*** too short ***) and it was completely "rushed in" to be ready for the test, even a few months before the complete release. In the whole "rush" process, the bug was overlooked in the internal Nero Digital Audio QA as well as by the external people testing the codecs (Guru and a few others).

This bug affects quality in an unpredictable way (as it does not allocate bits according to the psychoacoustics but drains the bit reservoir) - but in general we believe that the quality difference would not be significant (in fact it is my belief that it would be better without the bug, as the extra bits would be allocated according to the psychoacoustic model) - however, Sebastian decided to exclude the Nero codec from the test, which is IMO unfortunate, but I can understand his decision.

Fortunately, the bug has been fixed (thanks to Guruboolez and his very good hearing) - and this kind of behavior will be included in pre-test screening of the codecs to look for such obvious bugs.

I am disappointed that this bug was found too late, but hopefully for the next tests we won't be having such problems.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: skelly831 on 2006-01-14 21:18:51
Wow! Awesome results, it's interesting to see how close iTunes and AoTuV are in the overall rating while being somewhat disparate in the bitrate table.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: ff123 on 2006-01-14 21:21:17
How about making a hyperlink to the page describing the nero bug?  Good explanation of what went wrong.

ff123
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:22:18
By the way, results that were invalid (didn't meet ABX minimums) were not uploaded. Since I posted the encryption key, you can decrypt the results yourself if you are wondering why your result is not counted.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-14 21:32:59
Great job to everyone who participated in the test - especially to the conductor!

The first immediate lesson of this test is that the main encoders at ~132...135 kbps are very, very good. Even LAME at -V5 is close to transparency. I hope that -V2/--preset standard will progressively cease to be the automatic recommendation when new members are asking for good quality MP3 encodings: the first step to excellence is below the historical presets!

Developers must be celebrated: Apple's and Nero's developers for leading AAC to the best places; Aoyumi for having resurrected Vorbis; Microsoft's developers for -at last- offering a very good and free encoding tool; and of course the whole LAME team who are making MP3 better and better even if people are sometimes not realizing it or believing that MP3 couldn't be improved!

Last, people could compare the collective results to my individual test (http://www.hydrogenaudio.org/forums/index.php?showtopic=38792). I concluded that aoTuV and iTunes were superior, with both Nero and LAME slightly lower. The group results arrive at exactly the same order, only with less significance and higher ratings. It looks like my hearing, and thus my listening tests, may be quite representative of the group's.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:37:01
By the way, Francis noticed another problem. The sample "Yello" submitted by Alex B was not lossless. By mistake, Alex B uploaded a file transcoded from a high bitrate MP3.

Guru noticed that the reference file he was listening to sounded too much like an encoded file. After looking at the spectral view of the track with CE, he noticed what he calls "an adaptive lowpass with some spectral 'holes'" which is typical of lossy encoders. I contacted Alex B about the problem and received the following mail today:

Quote
Hi again!

Here's what I found. It is a long story, but explains how accidents can happen when dealing with a large number of test files, even if you try to [be] careful.

I checked the reference and compared it with my ripped CD archive file, which is in Monkey's Audio disc image file & cue format and noticed that there really is a difference.

However, at first I couldn't understand why. It could not be the WMA 2-pass version I made for testing the bitrates. WMA 128 kbps has a lower lowpass. I have also a Musepack Q8 version on my home audio server, but that does not have such a lowpass.

Then I remembered that I cut the sample from the 25 files bitrate test archive I have stored. I have the separated track already there so it was faster than converting the big disc image ape file.

I selected these 25 files originally for testing LAME 3.97 VBR bitrates in September. When I made the different VBR sets I had the original cue files loaded in foobar and I converted each VBR set from the same playlist. Later when I cleaned the resulting 500 MP3 files I converted the 25 original cue tracks to separate lossless tracks for future use.

It appears that I have somehow accidentally loaded a high bitrate MP3 file instead of the cue to a foobar playlist and converted it to Monkey's Audio. The other 24 Monkey's Audio tracks seem to be fine, only this one is unfortunately different.

When I made this particular sample for testing WMA 2-pass in November I converted that ape file to wave and cut it. By looking at the lowpass I believe the MP3 file was encoded at -V0. There is no way to tell if it was VBR new or old.

I am sorry about the mistake.

I think this sample will accidentally show that these new encoders are not bad transcoders, at least when the source is this kind of loud track. It is a different thing if you like to publish this information. It could be only meaningless clutter. It was never exactly mentioned how the source files were obtained. The test compared these reference files and the encoded versions.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-14 21:38:12
The URL in the image is incorrect (www.maresweb.de/nero-problem.txt) -- might as well make the redirect at /nero-problem, too, anyways...

In any case, It's a shame I misunderstood the testing scenario : I did not know I needed to ABX the codecs before rating them :-/

All my results are invalid :-(

However, it's surprising to learn that all popular codecs around 128 kbps are of high enough quality to be practically imperceptible to everyone.

Great!

edit : de / net, whatever.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Lyx on 2006-01-14 21:38:56
Pointing out an old issue with listening-test presentations: neither the plot, nor the detailed results page, mention clearly that "iTunes" means "iTunes AAC Encoder". Taking into account that results of ha.org listening tests are often posted elsewhere without all the necessary info, people may once more mistake iTunes (good) AAC performance with its (bad) MP3 performance.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:41:54
Quote
The URL in the image is incorrect (www.maresweb.net/nero-problem.txt) -- might as well make the redirect at /nero-problem, too, anyways...

maresweb.de is the correct domain. .net and .org are only redirecting to the .de domain. Also, I tested the URL from the picture and Apache redirects the users to the correct file. No problem I guess.

Quote
In any case, It's a shame I misunderstood the testing scenario : I did not know I needed to ABX the codecs before rating them :-/

All my results are invalid :-(

However, it's surprising to learn that all popular codecs around 128 kbps are of high enough quality to be practically imperceptible to everyone.

Great!

No, no, no... You don't have to ABX all files. ABX logs are only required when you ranked a reference.
IIRC, I used a large part of your results.

And BTW, this was a funny result:

Code: [Select]
 ABC/HR for Java, Version 0.5b, 06 december 2005 
 Testname: DontLetMeBeMisunderstood
 
 Tester: 
 
 1R = Sample05\DontLetMeBeMisunderstood_1.wav
 2L = Sample05\DontLetMeBeMisunderstood_2.wav
 3L = Sample05\DontLetMeBeMisunderstood_3.wav
 4L = Sample05\DontLetMeBeMisunderstood_6.wav
 5R = Sample05\DontLetMeBeMisunderstood_4.wav
 6R = Sample05\DontLetMeBeMisunderstood_5.wav
 
 ---------------------------------------
 General Comments: Focus on 4.16 - 6.46
 ---------------------------------------
 1L File: Sample05\DontLetMeBeMisunderstood.wav
 1L Rating: 4.8
 1L Comment: 
 ---------------------------------------
 5L File: Sample05\DontLetMeBeMisunderstood.wav
 5L Rating: 1.0
 5L Comment: 
 ---------------------------------------
 
 ABX Results:
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-14 21:42:58
Also a question / comment.  Is iTunes' encoder actually in iTunes, or is it in QuickTime?  What versions were used?  Maybe you should list those on the results page (versions of each encoder..)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-14 21:45:13
Quote
Code: [Select]
Testname: DontLetMeBeMisunderstood 
Tester:  

 1R = Sample05\DontLetMeBeMisunderstood_1.wav
 2L = Sample05\DontLetMeBeMisunderstood_2.wav
 3L = Sample05\DontLetMeBeMisunderstood_3.wav
 4L = Sample05\DontLetMeBeMisunderstood_6.wav
 5R = Sample05\DontLetMeBeMisunderstood_4.wav
 6R = Sample05\DontLetMeBeMisunderstood_5.wav
  
 ---------------------------------------
 General Comments: Focus on 4.16 - 6.46
 ---------------------------------------
 1L File: Sample05\DontLetMeBeMisunderstood.wav
 1L Rating: 4.8
 1L Comment:  
 ---------------------------------------
 5L File: Sample05\DontLetMeBeMisunderstood.wav
 5L Rating: 1.0
 5L Comment:  
 ---------------------------------------
Wuups.. Thanks for hiding my tester name
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: minisu on 2006-01-14 21:45:24
Ok, so this is one of my decrypted results... http://www.maresweb.de/listening-tests/mf-128-1/miscellaneous/results/Sample02/anon12-result02.txt

How do I know which encoder matches which sample number?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-14 21:49:44
More comments : http://www.maresweb.de/listening-tests/mf-128-1/ Maybe you should update this page to point to the results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:50:27
Quote
Also a question / comment.  Is iTunes' encoder actually in iTunes, or is it in QuickTime?  What versions were used?  Maybe you should list those on the results page (versions of each encoder..)

Well, they're on the presentation page which I think is enough.

Quote
Quote
Code: [Select]
Testname: DontLetMeBeMisunderstood 
Tester: 

 1R = Sample05\DontLetMeBeMisunderstood_1.wav
 2L = Sample05\DontLetMeBeMisunderstood_2.wav
 3L = Sample05\DontLetMeBeMisunderstood_3.wav
 4L = Sample05\DontLetMeBeMisunderstood_6.wav
 5R = Sample05\DontLetMeBeMisunderstood_4.wav
 6R = Sample05\DontLetMeBeMisunderstood_5.wav
 
 ---------------------------------------
 General Comments: Focus on 4.16 - 6.46
 ---------------------------------------
 1L File: Sample05\DontLetMeBeMisunderstood.wav
 1L Rating: 4.8
 1L Comment: 
 ---------------------------------------
 5L File: Sample05\DontLetMeBeMisunderstood.wav
 5L Rating: 1.0
 5L Comment: 
 ---------------------------------------
Wuups.. Thanks for hiding my tester name

I didn't say it was your result. Or to be more precise, it wasn't your result. It was from an anonymous user.

Quote
Ok, so this is one of my decrypted results... http://www.maresweb.de/listening-tests/mf-128-1/miscellaneous/results/Sample02/anon12-result02.txt

How do I know which encoder matches which sample number?

1 = iTunes
2 = LAME
3 = Nero
4 = Shine
5 = AoTuV
6 = WMA Professional
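
For anyone decoding several result files, the ID list can be applied mechanically. The following is only a minimal Python sketch, and it rests on an assumption that is not confirmed in this thread: that the `_N` suffix of each file name in a decrypted ABC/HR result is the encoder ID. The names `ENCODERS`, `MAPPING_LINE` and `slot_to_encoder` are illustrative and not part of ABC/HR.

```python
import re

# Encoder IDs as posted in this thread.
ENCODERS = {
    1: "iTunes",
    2: "LAME",
    3: "Nero",
    4: "Shine",
    5: "AoTuV",
    6: "WMA Professional",
}

# Matches mapping lines such as "4L = Sample05\DontLetMeBeMisunderstood_6.wav".
# Assumption: the number just before ".wav" is the encoder ID.
MAPPING_LINE = re.compile(r"^\s*(\d+)[LR]\s*=\s*.*_(\d+)\.wav\s*$")


def slot_to_encoder(result_text):
    """Return {slider number: encoder name} for one decrypted result file."""
    mapping = {}
    for line in result_text.splitlines():
        match = MAPPING_LINE.match(line)
        if match:
            slot = int(match.group(1))
            encoder_id = int(match.group(2))
            mapping[slot] = ENCODERS.get(encoder_id, "unknown")
    return mapping


# Three mapping lines taken from the result quoted earlier in the thread.
example = r"""
 1R = Sample05\DontLetMeBeMisunderstood_1.wav
 4L = Sample05\DontLetMeBeMisunderstood_6.wav
 5R = Sample05\DontLetMeBeMisunderstood_4.wav
"""
decoded = slot_to_encoder(example)
for slot, name in sorted(decoded.items()):
    print(slot, name)
```

If the suffix assumption holds, slider 4 in that file would be WMA Professional and slider 5 would be Shine.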
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 21:51:51
Quote
More comments : http://www.maresweb.de/listening-tests/mf-128-1/ Maybe you should update this page to point to the results.


If you go one level higher, you come to the listening tests page which lists all tests as pairs of presentation page and results page. AFAIK, Roberto does it the same way, too. But no problem, I can edit that page when the FTP server works again.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: fpi on 2006-01-14 22:26:33
Quote
maresweb.de is the correct domain. .net and .org are only redirecting to the .de domain. Also, I tested the URL from the picture and Apache redirects the users to the correct file. No problem I guess.

You missed the .de on the graph.
Can you also add more complete info of the encoder in the graph? e.g.: AoTuv -> Vorbis AoTuV 4.51, Nero -> AAC Nero 3.1.0.2, etc... I prefer first the format, then vendor and version. Many sites link only to the image and can give a confusing idea of which encoder was used. Also on that image should be a link to the full explanation of the results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-14 22:29:45
First of all great test. Thanks, Sebastian.

However, it's sad what happened with the Nero encoder.
Maybe it would be a fair idea to cut 2/3 of the encoded samples and see how much bigger Nero's samples are. However, that's not 100% correct because of the VBR distribution. I've already learned that a bigger or smaller size is not always an indicator of quality. However, if the distribution is 150-130 (140) kbps, it is a good idea to check it. 10 kbit/s isn't an issue: it is 7% of the total bitrate. It was admitted that iTunes had an extra 10 kbit/s of real bitrate; that's OK because of VBR.

However, I think it was the right decision to keep us informed about this issue.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 22:30:41
Ah, damn, you're right. Now I see what you guys mean. OK, going to PSP the graphs again... >_<
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-14 22:31:46
Quote
Quote
Also a question / comment.  Is iTunes' encoder actually in iTunes, or is it in QuickTime?  What versions were used?  Maybe you should list those on the results page (versions of each encoder..)

Well, they're on the presentation page which I think is enough.

Actually, nowhere is that page linked on here... Also, If you wish, I can set up a nice 'design' image to show the results;  Could you post the numbers (ranges), PM them to me or email them?  I'll also include the necessary links and info in the image.  And I can give you a PDF format of it, if you like.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-14 22:45:09
Quote
I've already learned that the bigger/smaller size is not always the indicator  of quality  

That's right. But in this case, a careful listening reveals (revealed in my case) that this bug has an audible impact, leading to a brutal drop in perceived quality. The impact on quality varies from nothing to considerable.

Quote
However if distribution 150-130 (140) kbps is a good idea to check it. 10 kbit/s isn't issue 7% of total bitrate.  It was admited that Itunes had 10 kbit/s extra of real bitrate  that's ok because of VBR.

Even if the bitrate stays within the tolerance of 10%, there's still the problem of unrepresentativity (the same one that led to the decision not to use WMA Std 2-pass with short samples). The tested samples' content and quality are different from what a user would get.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: JeanLuc on 2006-01-14 22:46:30
Damn ... looking at the bitrate table and regarding the overall ranking, iTunes AAC seems so effective.

It is good to see, though, that users can today choose between different formats at comparable bitrates without having to ask about a possible sacrifice in quality.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-14 22:49:47
Quote
there's still the problem of unrepresentativity (the same one which decided to not use WMA Std 2-pass with short samples). The tested samples content and quality is different from what a user would get.

Yes, I was afraid of that.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 22:56:04
Quote
Quote
maresweb.de is the correct domain. .net and .org are only redirecting to the .de domain. Also, I tested the URL from the picture and Apache redirects the users to the correct file. No problem I guess.

You missed the .de on the graph.
Can you also add more complete info of the encoder in the graph? e.g.: AoTuv -> Vorbis AoTuV 4.51, Nero -> AAC Nero 3.1.0.2, etc... I prefer first the format, then vendor and version. Many sites link only to the image and can give a confusing idea of which encoder was used. Also on that image should be a link to the full explanation of the results.


Well, if someone posts the image, he should also link to the results page.
Adding the full encoder version / information is useless IMHO - it's stated already on the presentation page (which can be accessed if you are on the results page, that is supposed to be posted together with the plot).

Quote
Quote
Quote
Also a question / comment.  Is iTunes' encoder actually in iTunes, or is it in QuickTime?  What versions were used?  Maybe you should list those on the results page (versions of each encoder..)

Well, they're on the presentation page which I think is enough.

Actually, nowhere is that page linked on here... Also, If you wish, I can set up a nice 'design' image to show the results;  Could you post the numbers (ranges), PM them to me or email them?  I'll also include the necessary links and info in the image.  And I can give you a PDF format of it, if you like.


Sent! Thanks.

BTW, I only changed iTunes and Nero to iTunes AAC and Nero AAC on the final plots. The sample plots still have iTunes only. I hope it's not such a big deal since those images are almost never posted alone.

That's it guys - a warm bed is waiting for me.

BTW... Gambit is screwed if his neighbors read his comments.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-14 23:13:16
could a ranking based on rating and bitrate be made? also as somebody already pointed, it would be good to include the version of the encoders used on the graphs, as many people would be linking to them.

edit:by the way, great job guys.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: mdmuir on 2006-01-14 23:13:30
When are we reaching the point that further tests in the future become superfluous? Just from looking at the results of this test, not one codec is significantly better than another one. Can we safely say "stick to lame for universal compatibility or take your pick for whatever your hardware device will support"?

Choice for compatibility:

1. lame

2. apple aac/nero aac

3. vorbis

4. wma pro - good codec, nothing but computer software plays it
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-14 23:15:23
Quote
When are we reaching the point that further tests in the future become superfluous? Just from looking at the results of this test, not one codec is significantly better than another one. Can we safely say "stick to lame for universal compatibility or take your pick for whatever your hardware device will support"?

i pretty much agree ... only thing is the gapless option on some of them ...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-14 23:18:13
Quote
Quote
Can you also add more complete info of the encoder in the graph? e.g.: AoTuv -> Vorbis AoTuV 4.51, Nero -> AAC Nero 3.1.0.2, etc... I prefer first the format, then vendor and version. Many sites link only to the image and can give a confusing idea of which encoder was used. Also on that image should be a link to the full explanation of the results.


Well, if someone posts the image, he should also post to results page.
Adding the full encoder version / information is useless IMHO - it's stated already on the presentation page (which can be accessed if you are on the results page, that is supposed to be posted together with the plot).

i agree with fpi, i would add encoder, version, and probably command line. Even if i link to your page, people most of the time just look at the graph.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-14 23:23:10
Quote
When are we reaching the point that further tests in the future become superfluous?

At this bitrate, further tests are indeed questionable. The quality of the tested encoders is apparently too high for most listeners at ~130 kbps - at least for those interested in participating in such tests. The 192 kbps syndrome has now reached the 128 kbps area: it's beyond most listeners' abilities, including those of HA.org members. At this stage, all people who can't differentiate MP3 from Vorbis or AAC and are interested in these formats should try to lower the bitrate (I guess that it's already the case for many of them).
It's maybe the last 128 kbps multiformat collective test organized here. The next "mid/high" collective test should maybe lower its ambitions and be limited to 100...112 kbps.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-14 23:26:21
Quote
i agree with fpi, i would add encoder, version, and probably command line. Even if i link to your page, people most of the time just look at the graph.


You know, space is limited. Writing "VBR/Stereo - Streaming, 100-120 kbps [LC AAC]" in the graph is pretty much overkill. Anyways, I will see what I can do tomorrow.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Halcyon on 2006-01-14 23:35:19
Thanks for the results and work everyone!

I could only offer 6 results, not only because I fell ill, but because trying to spot the differences (assumed anchor excluded) was really hard work!

It also showed me how some samples truly are more useful (at least for me).

The louder, more compressed and "busy" samples (like metal, rap and pop) were not nearly as useful for me as the acoustic and classical tracks. Not that they were easy either!

The c. 128 kbps level has come a long way in the past few years. I'm almost afraid to think when we'll reach similar performance at 96 kbps. Not that I'm complaining, it's all for the good.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-14 23:54:37
Quote
You know, space is limited. Writing "VBR/Stereo - Streaming, 100-120 kbps [LC AAC]" in the graph is pretty much overkill. Anyways, I will see what I can do tomorrow.

you can do better than this paint shop pro hack for sure, but it's the idea:

(http://img253.imageshack.us/img253/5935/results1lp.png)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: vinnie97 on 2006-01-15 00:00:29
Vorbis' place at the top correlates with it being the highest bitrate.

Good results from everyone all around.  Thanks for the test.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 01:26:12
Quote
And BTW, this was a funny result:

Code: [Select]
 ABC/HR for Java, Version 0.5b, 06 december 2005 
 Testname: DontLetMeBeMisunderstood
 
 Tester: 
 
 1R = Sample05\DontLetMeBeMisunderstood_1.wav
 2L = Sample05\DontLetMeBeMisunderstood_2.wav
 3L = Sample05\DontLetMeBeMisunderstood_3.wav
 4L = Sample05\DontLetMeBeMisunderstood_6.wav
 5R = Sample05\DontLetMeBeMisunderstood_4.wav
 6R = Sample05\DontLetMeBeMisunderstood_5.wav
 
 ---------------------------------------
 General Comments: Focus on 4.16 - 6.46
 ---------------------------------------
 1L File: Sample05\DontLetMeBeMisunderstood.wav
 1L Rating: 4.8
 1L Comment: 
 ---------------------------------------
 5L File: Sample05\DontLetMeBeMisunderstood.wav
 5L Rating: 1.0
 5L Comment: 
 ---------------------------------------
 
 ABX Results:
I'm the culprit here

I started by listening to all encoders to locate the low anchor.
Apparently I pulled the wrong slider on sample 5, although I had it nailed.

Next time I'd better ABX the samples to lock the reference slider, in order to eliminate this kind of stupid "pressing the wrong button" mistake.

I hope you are able to use my other test results, even though I'm positive that I have rated the reference on several occasions.

Anyway great work Sebastian.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: richard123 on 2006-01-15 01:50:20
Do these results help answer the question: Are these encoders transparent to most of those participating in the tests?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: westgroveg on 2006-01-15 02:13:43
Does the results page state what settings were used for each encoder?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Serge Smirnoff on 2006-01-15 02:24:32
SoundExpert preliminary results on the same contenders are here (http://www.soundexpert.info/coders128.jsp). Alternative testing will end on 22 Jan. Details and discussion are in this thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=40561).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 03:22:44
Quote
By the way, results that were invalid (didn't meet ABX minimums) were not uploaded. Since I posted the encryption key, you can decrypt the results yourself if you are wondering why your result is not counted.
Sebastian,

Could you please elaborate on the criteria used for invalidating a result file?

I'm also interested in knowing how many result files have been discarded and for what reasons.

Previously, discussions have taken place on criteria for discarding result files (http://www.hydrogenaudio.org/forums/index.php?showtopic=18474&view=findpost&p=189044).

However I seem to have missed any discussions on the subject prior to this listening test.

How would the consolidated result look if you had used results where the reference was rated slightly below 5.0?

In the AAC @ 128 kbps listening test only using "clean" files did not change the outcome of the test. Can the same be said for this test?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Pio2001 on 2006-01-15 04:47:35
Thank you for the results. I was Anon08.
Here are my individual results. Every mark different from 5.0 has been validated with a successful ABX test.

Code: [Select]
Sample  AoTuV AAC-Itunes AAC-Nero Lame Shine WMApro
01      5.0   4.5        5.0      2.0  2.0   5.0  
03      5.0   5.0        5.0      2.0  2.0   5.0
04      5.0   5.0        5.0      5.0  2.0   5.0
05      5.0   5.0        3.0      5.0  1.0   5.0
06      5.0   5.0        5.0      5.0  1.0   5.0
07      5.0   5.0        5.0      5.0  3.0   5.0
08      5.0   4.0        4.0      3.0  1.5   4.5
10      5.0   5.0        5.0      5.0  2.0   5.0


I can tell that I dislike MP3
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: plonk420 on 2006-01-15 07:27:04
i only got thru 10 of the tracks.... but i could only ABX the diff on maybe 1 or 2 of the samples out of the 4 or 5 X 10 tracks i tried.....!! i blame my noisy DLP projector >_> (where i have my nice speakers setup) ;-\
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Raptus on 2006-01-15 07:43:49
Pio, have you automated somehow the process of compiling this personal result table of yours?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Raptus on 2006-01-15 08:35:56
As I stated in the other topic, I was not very happy with the sample selection. Anyway I managed to discriminate more than average:
Code: [Select]
Sample iTunes LAME Nero Shine AoTuv WMA Pro
1      3,2    3,2  4    1     5     3,5
2      5      4,5  4    1     5     4,5
3      5      2,2  5    1,5   4     3,5
4      4      3,2  3,5  1     5     2,8
5      5      4    5    1,5   4     3,5
6      4,4    2,8  4    1,5   5     3
7      5      4    4,5  2,5   5     4,5
8      5      3,5  4    1     5     3,5
9      5      4    3,5  1     5     4
10     5      4    4    1,5   5     3,5
11     3,5    4    4    1,5   5     3,5
12     4,5    3,5  4    1,5   4,5   3,5
13     5      3,5  5    1     5     4
14     5      4    4,5  1     5     4
15     4,5    3,5  4    1     5     3,5
16     5      4    4    1     5     4
17     5      3    4    1     5     3,5
18     5      4    5    1     5     5

AVG    4,67   3,61 4,22 1,25  4,86  3,74

Congrats to the conductor and everyone who participated

EDIT: Corrected a number
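
The AVG row of a personal table like this one can be rechecked with a few lines of code. What follows is a minimal Python sketch rather than the method actually used for the post: the ratings are copied from the table above with their decimal commas, and `COLUMNS`, `ROWS` and `column_averages` are illustrative names.

```python
# Column order as in the table above.
COLUMNS = ["iTunes", "LAME", "Nero", "Shine", "AoTuv", "WMA Pro"]

# Ratings copied from the table; the first field is the sample number.
ROWS = """\
1      3,2    3,2  4    1     5     3,5
2      5      4,5  4    1     5     4,5
3      5      2,2  5    1,5   4     3,5
4      4      3,2  3,5  1     5     2,8
5      5      4    5    1,5   4     3,5
6      4,4    2,8  4    1,5   5     3
7      5      4    4,5  2,5   5     4,5
8      5      3,5  4    1     5     3,5
9      5      4    3,5  1     5     4
10     5      4    4    1,5   5     3,5
11     3,5    4    4    1,5   5     3,5
12     4,5    3,5  4    1,5   4,5   3,5
13     5      3,5  5    1     5     4
14     5      4    4,5  1     5     4
15     4,5    3,5  4    1     5     3,5
16     5      4    4    1     5     4
17     5      3    4    1     5     3,5
18     5      4    5    1     5     5
"""


def column_averages(rows_text, columns):
    """Average each rating column, accepting decimal commas."""
    sums = [0.0] * len(columns)
    count = 0
    for line in rows_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ratings = [float(field.replace(",", ".")) for field in fields[1:]]
        if len(ratings) != len(columns):
            raise ValueError(f"unexpected number of ratings in: {line!r}")
        for index, rating in enumerate(ratings):
            sums[index] += rating
        count += 1
    return {name: total / count for name, total in zip(columns, sums)}


averages = column_averages(ROWS, COLUMNS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the computed values agree with the AVG row as posted.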
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 09:00:54
Quote
Does the results page state what settings were used for each encoder?


No, but the presentation page does.

Quote
Quote
By the way, results that were invalid (didn't meet ABX minimums) were not uploaded. Since I posted the encryption key, you can decrypt the results yourself if you are wondering why your result is not counted.
Sebastian,

Could you please elaborate on the criteria used for invalidating a result file?


If the results contain ranked references and no ABX logs, the results are invalid.
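
The rule stated above can be written down as a small check. This is only a Python sketch under stated assumptions, not the screening that was actually performed: it assumes a ranked reference appears in a decrypted result as a "File:" entry whose name lacks the `_N` suffix (as in the "funny result" quoted earlier), and it treats any non-blank text after the "ABX Results:" header as an ABX log. The names `has_ranked_reference`, `has_abx_logs` and `is_valid` are hypothetical.

```python
import re

# Assumption: rated entries look like "1L File: Sample05\Name.wav".
RATED_FILE = re.compile(r"^\s*\d+[LR] File:\s*(.+?)\s*$")
# Assumption: encoded files end in "_<number>.wav"; the reference does not.
ENCODED_NAME = re.compile(r"_\d+\.wav$")


def has_ranked_reference(result_text):
    """True if any rated entry points at the unsuffixed reference file."""
    for line in result_text.splitlines():
        match = RATED_FILE.match(line)
        if match and not ENCODED_NAME.search(match.group(1)):
            return True
    return False


def has_abx_logs(result_text):
    """True if anything non-blank follows the "ABX Results:" header."""
    _, header, tail = result_text.partition("ABX Results:")
    return bool(header) and bool(tail.strip())


def is_valid(result_text):
    """Invalid only when a reference was ranked and no ABX log is present."""
    return not has_ranked_reference(result_text) or has_abx_logs(result_text)


# Excerpt of the "funny result" quoted earlier in the thread.
funny_result = r"""
 1L File: Sample05\DontLetMeBeMisunderstood.wav
 1L Rating: 4.8
 5L File: Sample05\DontLetMeBeMisunderstood.wav
 5L Rating: 1.0

 ABX Results:
"""
print(is_valid(funny_result))
```

The sketch does not check whether an ABX log belongs to the ranked sample or whether it was successful; the thread does not say how those cases were handled.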
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 09:06:51
Quote
If the results contain ranked references and no ABX logs, the results are invalid.
[a href="index.php?act=findpost&pid=357238"][{POST_SNAPBACK}][/a]
Well that is new to me. I thought that slightly ranking a reference would result in a 5.0 rating, not invalidating the result file for that sample.

BTW: Where was this mentioned?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Garf on 2006-01-15 09:28:23
On the total results: how much % of the gradings gave a "transparent" mark, if we exclude shine?

The same question but adding the graded references as 5.0 for the codec in question?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 09:31:20
Is there a way to see your own results, including results for discarded result files?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 09:58:14
Quote
On the total results: how much % of the gradings gave a "transparent" mark, if we exclude shine?

The same question but adding the graded references as 5.0 for the codec in question?
[a href="index.php?act=findpost&pid=357241"][{POST_SNAPBACK}][/a]


Quote
Is there a way to see your own results, including results for discarded result files?
[a href="index.php?act=findpost&pid=357242"][{POST_SNAPBACK}][/a]


Sorry, I didn't understand either of you.

Anyways, I edited all graphs to show the encoder and its version. Additionally, I included a link to the full results page in the overall rankings and the zoomed plot. The encoder settings can be seen on the presentation page (which as I said can be reached from the results page).

Edit: sehested, do you want to see the results you submitted or what? If you still have the encrypted files, you can decrypt them using ABC/HR and the encryption key I linked to on the results page.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 10:10:53
Quote
sehested, do you want to see the results you submitted or what? If you still have the encrypted files, you can decrypt them using ABC/HR and the encryption key I linked to on the results page.
[a href="index.php?act=findpost&pid=357246"][{POST_SNAPBACK}][/a]
Thanks, I will check it right away.

The other question I have is about the number of result files that were discarded.

Can you give any numbers?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 10:29:16
Quote
Quote
sehested, do you want to see the results you submitted or what? If you still have the encrypted files, you can decrypt them using ABC/HR and the encryption key I linked to on the results page.
[a href="index.php?act=findpost&pid=357246"][{POST_SNAPBACK}][/a]
Thanks, I will check it right away.

The other question I have is about the number of result files that were discarded.

Can you give any numbers?
[a href="index.php?act=findpost&pid=357249"][{POST_SNAPBACK}][/a]


I am going to check ASAP.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Pio2001 on 2006-01-15 10:53:15
Quote
Pio, have you automated somehow the process of compiling this personal result table of yours?
[a href="index.php?act=findpost&pid=357231"][{POST_SNAPBACK}][/a]


No, I typed it manually using the results published in the rar file, and the tables of encoder IDs.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 11:25:36
I received a total of 467 results. Since only 403 are valid, I had to delete 64 results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 11:45:26
BTW, could someone post the results to Doom9?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: halb27 on 2006-01-15 11:48:27
Wonderful results!
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 12:00:22
It may be interesting to observe how the performance of the various formats has evolved across the collective tests already performed. A rigorous comparison wouldn't make sense (samples and listeners aren't the same), but some trends may appear.

Summer 2003:

) and for more and more people. When HA.org was founded in 2001, such quality at this bitrate was only a dream; four years later it has become our reality. I take my hat off to all developers
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 12:15:16
Managed to decode my own results:

Code: [Select]
                               iTunes   LAME    Nero    Shine   AoTuV  WMA Pro
01 BigYellow                    4.9     4.9     5.0     1.0     5.0     5.0
02 bodyheat                     5.0     5.0     5.0     1.0     5.1x    5.2x
03 Carbonelli                   5.7x    3.0     4.7     4.0     5.9x    4.9     Focus on 0.00 - 5.55
04 Coladito                     5.5x    5.2x    4.9     1.0     4.7     5.4x    Focus on wosh - wosh 5.63 - 10.88
05 DontLetMeBeMisunderstood     5.2x    5.0     5.0     9.0y    5.0     5.0     Focus on 4.16 - 6.46
06 yello                        5.0     5.0     5.0     1.0     5.0     5.1x
07 Elizabeth                    5.0     5.0     5.0     5.0     5.0     5.0     Focus 0.00 - 3.71
08 eric_clapton                 5.2x    5.1x    4.5     1.0     5.8x    6.0x    Focussed on 9.96 - 14.69
09 ReunionBlues                 5.7x    4.1     4.0     1.0     5.3x    5.2x    Focussed on 24.26 - 27.06
10 LesJoursHeureux              5.2x    4.4     4.6     2.0     5.8x    4.0     Focus on 0.10 - 5.18
11 macabre                      4.9     4.9     5.1x    1.0     4.9     5.0
12 MysteriousTimes              5.0     5.0     5.0     3.0     5.0     5.0     Focus on 0.00 - 3.13
13 ravel                        4.6     5.2x    4.8     1.0     5.4x    5.2x    Focus on 4.08 - 7.93
14 School                       4.7     5.2x    4.6     2.1     4.3     4.0     Focused on 8.18 - 13.49
15 Senor                        5.0     5.0     5.0     1.0     5.0     5.0
16 SongForGuy                   4.8     2.0     5.0     5.3x    3.0     5.4x    Focus 2.20 - 4.97
17 TheDraperyFalls              4.8     5.0     4.9     1.0     5.2x    4.8     Focus on 17.21 - 26.31
18 WhiteAmerica                 5.2x    4.2     4.6     1.0     5.6x    4.8     Focus 0.00 - 7.74

Average                         4.93    4.58    4.81    1.84    4.83    4.86
Ranked reference                 7x      4x      1x      1x      8x      7x

x = Ranked reference. The value shown in the above table is 10 minus the rating given to the reference, to indicate the rank given to the reference.
y = Pulled the wrong slider on the low anchor
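
To read the x-marked cells back, the legend can be inverted: the rating actually given to the reference is 10 minus the value shown. A small Python sketch, where parse_cell and reference_rating are illustrative names and not part of ABC/HR:

```python
def parse_cell(cell):
    """Split a table cell such as '5.7x' into (value, marker); marker is '' if absent."""
    marker = cell[-1] if cell[-1] in "xy" else ""
    value = float(cell[:-1] if marker else cell)
    return value, marker


def reference_rating(cell):
    """For an x-marked cell, return the rating given to the reference (10 - shown value).

    Returns None for cells where the reference was not ranked.
    """
    value, marker = parse_cell(cell)
    if marker != "x":
        return None
    return round(10 - value, 1)


print(reference_rating("5.7x"))  # 4.3: the reference itself was rated 4.3
print(reference_rating("4.9"))   # None: the encoded file was rated, not the reference
```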

Only managed to avoid ranking the reference for all encoders in samples 1, 7, 12, and 15. 

Funny observation: The ranking of references doesn't seem to be random, as the encoders that most often have ranked references are the top codecs of this test.

I know my results wouldn't change anything with respect to the overall result of the multiformat listening test. However, next time around please specify the criteria for discarding result files.

I could then have used a different approach in my testing that would not result in ranked references. 

Any suggestions for improving my approach to ABC/HR are welcome.

Edit: Added average. Removed direct question for Sebastian about which of my results were used.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 12:47:59
What was the mail address you used for testing (you can send me a PM if you don't want it to be public)?

Never mind, found it - decrypting results...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: SirGrey on 2006-01-15 12:48:41
Thanks for working on the test, Sebastian.
As for me, I can hardly ABX most modern codecs at this bitrate against the original, and I cannot ABX one from another.
So it really seems that it is time to lower the bitrate a bit in such public tests
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 12:55:28
sehested, you are anon10.

I used your results for BigYellow, Elizabeth, Mysterious Times and Senor.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 13:00:13
By the way... People who did not enter their (nick)name in ABC/HR (like sehested) are going to have anonXX in front of their result.
Please don't ask me who has which number since I don't know it by heart.

One more thing - IIRC, a tester entered his name for all results except one. The respective result is also marked as anonXX since the tester might've had a reason for not disclosing his name for that single result.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-15 13:18:47
Quote
sehested, you are anon10.

I used your results for BigYellow, Elizabeth, Mysterious Times and Senor.
[a href="index.php?act=findpost&pid=357280"][{POST_SNAPBACK}][/a]

Thanks! I better include my name with the test results next time around.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 13:52:44
thx again for this interesting test, sebastian!

btw can you create a final zoomed in plot without the anchor and without the nero results plz, it's nicer to point people to (most newbies will probably not read or understand the whole nero explanation)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: minisu on 2006-01-15 14:05:17
Quote
thx again for this interesting test, sebastian!

btw can you create a final zoomed in plot without the anchor and without the nero results plz, it's nicer to point people to (most newbies will probably not read or understand the whole nero explanation)
[a href="index.php?act=findpost&pid=357289"][{POST_SNAPBACK}][/a]

Wouldn't that mislead beginners even more, since the point is that none of the tested encoders is proven to be better than another?

(I don't mind if you do, just wanted to point out that there might be a risk...)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 14:12:08
Quote
thx again for this interesting test, sebastian!

btw can you create a final zoomed in plot without the anchor and without the nero results plz, it's nicer to point people to (most newbies will probably not read or understand the whole nero explanation)
[a href="index.php?act=findpost&pid=357289"][{POST_SNAPBACK}][/a]


The final zoomed plot does not contain the anchor and I am not going to remove Nero since I see no point in doing so. If people want to read about the Nero problem, they can follow the link from the plot.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 14:13:50
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable

sebastian wrote
Quote
Because of the mentioned problems (unfairness, no real-life relevance...) and after discussing the issue with Francis, Roberto Amorim (rjamorim on Hydrogenaudio Forums) and Darryl Miyaguchi (ff123 on Hydrogenaudio Forums) thoroughly, I decided, against Ivan's and Juha's suggestion, to exclude Nero from the test.
because of this exclusion i think there should be also a final plot provided that doesnt mention nero

edit:

Quote
Quote
thx again for this interesting test, sebastian!

btw can you create a final zoomed in plot without the anchor and without the nero results plz, it's nicer to point people to (most newbies will probably not read or understand the whole nero explanation)
[a href="index.php?act=findpost&pid=357289"][{POST_SNAPBACK}][/a]


The final zoomed plot does not contain the anchor and I am not going to remove Nero since I see no point in doing so. If people want to read about the Nero problem, they can follow the link from the plot.
[a href="index.php?act=findpost&pid=357291"][{POST_SNAPBACK}][/a]
so the exclusion of nero from the test is not reason enough to provide a final plot without the excluded nero?

well ok, if this makes sense for you... 
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Garf on 2006-01-15 14:29:38
Quote
Quote
On the total results: how much % of the gradings gave a "transparent" mark, if we exclude shine?

The same question but adding the graded references as 5.0 for the codec in question?
[a href="index.php?act=findpost&pid=357241"][{POST_SNAPBACK}][/a]


Quote
Is there a way to see your own results, including results for discarded result files?
[a href="index.php?act=findpost&pid=357242"][{POST_SNAPBACK}][/a]


Sorry, I didn't understand either of you.
[a href="index.php?act=findpost&pid=357246"][{POST_SNAPBACK}][/a]


Let me rephrase: How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?
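
The count asked for here is simple once the grades are available as numbers. A minimal Python sketch follows; transparent_share is an illustrative name, and the short grade list in the example is invented toy input, not data from the test.

```python
def transparent_share(grades, transparent=5.0):
    """Return (count, fraction) of grades equal to the 'transparent' mark."""
    grades = list(grades)
    if not grades:
        raise ValueError("no grades given")
    count = sum(1 for g in grades if g == transparent)
    return count, count / len(grades)


# Total number of grades in the estimate above: 403 valid results x 5 codecs.
TOTAL_GRADES = 403 * 5  # 2015

# Toy input only, to show the call; not real test data.
count, fraction = transparent_share([5.0, 4.2, 5.0, 3.5])
print(count, f"{fraction:.0%}")  # 2 50%
```

For the second question, the ranked references would simply be appended to the grade list as 5.0 before calling the function.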
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: yulyo! on 2006-01-15 14:35:50
When can we see the final version of Nero's new AAC encoder? (5)
I am curious how Nero would compete if there were no bug
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 14:40:57
Quote
I am curious how Nero would compete if there were no bug
[a href="index.php?act=findpost&pid=357298"][{POST_SNAPBACK}][/a]

nero always performs better in a not yet available version ™

/cynicism
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 14:42:46
Quote
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]

I believe that Nero's overall result (and only the overall one) is purely indicative. People have participated in this test, and it would be frustrating not to see any indication of the quality of the disqualified encoder. Of course, nobody should claim that Nero is as good as encoder x or y according to this test: the tested samples give a wrong and probably overrated picture of the real performance of Nero Digital AAC. That's why its results are shown in red, outside the main area, and without any confidence interval bar.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 14:46:19
Quote
Quote
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]

I believe that Nero's overall result (and only the overall one) is purely indicative. People have participated in this test, and it would be frustrating not to see any indication of the quality of the disqualified encoder. Of course, nobody should claim that Nero is as good as encoder x or y according to this test: the tested samples give a wrong and probably overrated picture of the real performance of Nero Digital AAC. That's why its results are shown in red, outside the main area, and without any confidence interval bar.
[a href="index.php?act=findpost&pid=357303"][{POST_SNAPBACK}][/a]

yeah, thats why i meant there should be both, a final plot with nero (as currently available) and one without, that can be thrown on newbies without making them think nero performed as its shown on the plot (even if its red and with the link to the explanation)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 14:51:40
Quote
well ok, if this makes sense for you... 
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]


You can always create the plot yourself. 

Quote
Let me rephrase: How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?
[a href="index.php?act=findpost&pid=357297"][{POST_SNAPBACK}][/a]


Geez, no idea. That would take too much time - time that I don't have right now. I could send you all results if you really want to do it yourself.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 14:55:52
Quote
Quote
well ok, if this makes sense for you... 
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]


You can always create the plot yourself. 

you wish we play:
removed

everyone feel free to link to it
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 14:57:49
Quote
Quote
Quote
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]

I believe that Nero's overall result (and only the overall one) is purely indicative. People have participated in this test, and it would be frustrating not to see any indication of the quality of the disqualified encoder. Of course, nobody should claim that Nero is as good as encoder x or y according to this test: the tested samples give a wrong and probably overrated picture of the real performance of Nero Digital AAC. That's why its results are shown in red, outside the main area, and without any confidence interval bar.
[a href="index.php?act=findpost&pid=357303"][{POST_SNAPBACK}][/a]

yeah, thats why i meant there should be both, a final plot with nero (as currently available) and one without, that can be thrown on newbies without making them think nero performed as its shown on the plot (even if its red and with the link to the explanation)
[a href="index.php?act=findpost&pid=357305"][{POST_SNAPBACK}][/a]




Damn, you were faster!
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 15:00:37
Quote
Quote
Quote
Quote
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable
[a href="index.php?act=findpost&pid=357292"][{POST_SNAPBACK}][/a]

I believe that Nero's overall result (and only the overall one) is purely indicative. People have participated in this test, and it would be frustrating not to see any indication of the quality of the disqualified encoder. Of course, nobody should claim that Nero is as good as encoder x or y according to this test: the tested samples give a wrong and probably overrated picture of the real performance of Nero Digital AAC. That's why its results are shown in red, outside the main area, and without any confidence interval bar.
[a href="index.php?act=findpost&pid=357303"][{POST_SNAPBACK}][/a]

yeah, thats why i meant there should be both, a final plot with nero (as currently available) and one without, that can be thrown on newbies without making them think nero performed as its shown on the plot (even if its red and with the link to the explanation)
[a href="index.php?act=findpost&pid=357305"][{POST_SNAPBACK}][/a]




Damn, you were faster!
[a href="index.php?act=findpost&pid=357309"][{POST_SNAPBACK}][/a]

lol yeah happy now 

though i would be even more happy if it would be shown on the results page too
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 15:09:25
OK, results page now contains 3 final graphs: non-zoomed with Nero, zoomed with Nero and zoomed without Nero.

BTW, could you please remove your plot, bond? I would like people to link to my graphs in case I change something.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: yulyo! on 2006-01-15 15:11:53
Yes bond, it seems you're right.
Every time some test is released, Nero has some problems: "it was the old old version", "the new one has a bug", "but the one we will release will be the best". Maybe I should switch to Ogg.
I think i will.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 15:13:49
Quote
OK, results page now contains 3 final graphs: non-zoomed with Nero, zoomed with Nero and zoomed without Nero.

thx, i really think its less confusing to not bug newbies with the nero issues, by being able to show the plot without nero

Quote
BTW, could you please remove your plot, bond? I would like people to link to my graphs in case I change something.
[a href="index.php?act=findpost&pid=357316"][{POST_SNAPBACK}][/a]


removed
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-15 15:56:25
I too would like to thank everyone who contributed to this test. I tried all 18 samples and, like many others, I found that I could clearly differentiate only the low anchor. I forgot to add my name to the results, but I could ABX only a few occasional samples besides the low anchor.

The result is very interesting, and in my opinion it shows that the encoding format can now be chosen freely for reasons other than audio quality. MP3, Vorbis, AAC and WMA Pro are good enough for almost everyone.


In regard to the mistake with the yello sample I started a new thread here: http://www.hydrogenaudio.org/forums/index....showtopic=40625 (http://www.hydrogenaudio.org/forums/index.php?showtopic=40625). That thread has a small listening test with a lossless version of the same sample.

I would like to add that originally I didn't offer the sample for a listening test. I cut it only for evaluating WMA 2-pass behavior before the test. It made WMA standard 2-pass internally use high bitrates when it was combined with an almost silent piano part. For a listening test I would have selected a slightly different sample from the same album. Actually, I offered to make a new sample, but time was limited and Sebastian preferred to use this one. So it ended up in the test.

Edit: a small fix
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-15 16:03:23
Quote
Yes bond, it seems you're right.
Every time some test is released, Nero has some problems: "it was the old old version", "the new one has a bug", "but the one we will release will be the best". Maybe I should switch to Ogg.
I think i will.
[a href="index.php?act=findpost&pid=357317"][{POST_SNAPBACK}][/a]


Maybe somebody will move to Ogg. But many people will stay with MP3/AAC.
Some guys have already mentioned compatibility. AAC and Ogg are close in this test. IMHO AAC is the best tradeoff quality/compatibility.

Let's see globally.
1. CT has a good HE-AAC v1 and v2 codec but not LC-AAC
2. iTunes has the best LC-AAC but no HE-AAC (neither SBR nor PS)
3. Nero has the second-place LC-AAC and the first available VBR HE-AAC, both v1 and v2.

It's difficult to develop two profiles at the same time. But Nero is in a great position.

I think cynicism like "Nero sucks" is not a good idea. A little respect is always welcome.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 16:27:46
Quote
AAC is the best tradeoff quality/compatibility.
[a href="index.php?act=findpost&pid=357332"][{POST_SNAPBACK}][/a]

That's interesting  Could I get the list of compatible AAC players? When I was interested in changing my own player (I finally repaired it), I found several Vorbis players but only a few AAC ones, and all of them were made by one company. There are now products like phones or game devices compatible with AAC, but even so the list is rather small compared to Vorbis players. And the price of such devices is often prohibitive.

Here is the list of compatible Vorbis players:
http://wiki.xiph.org/index.php/PortablePlayers (http://wiki.xiph.org/index.php/PortablePlayers)
http://wiki.xiph.org/index.php/StaticPlayers (http://wiki.xiph.org/index.php/StaticPlayers)

Is there any way to get the (longer) list of AAC players? I'm interested to get one. Thanks
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: moozooh on 2006-01-15 16:30:55
Quote
Maybe somebody will move to Ogg. But many people will stay with MP3/AAC.
Some guys have already mentioned compatibility. AAC and Ogg are close in this test. IMHO AAC is the best tradeoff quality/compatibility.[a href="index.php?act=findpost&pid=357332"][{POST_SNAPBACK}][/a]

Not sure about that one. How many different hardware players support AAC aside from iPod? How many players support HE-AAC? HE-AAC v2?

Quote
Let's see globally.
…[a href="index.php?act=findpost&pid=357332"][{POST_SNAPBACK}][/a]

Well yeah, but none of them does all of that better than the others. Nero shows an effort, but there are numerous bugs and promises like the ones yulyo mentioned.
So it's all a matter of personal preference or something like that. At least with Vorbis, there are no hard choices: aoTuV b4.51 for the win. Everything else is inferior.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-15 16:33:55
Quote
Quote
AAC is the best tradeoff quality/compatibility.
[a href="index.php?act=findpost&pid=357332"][{POST_SNAPBACK}][/a]

That's interesting  Could I get the list of compatible AAC players?


yes, that's interesting. I said IMHO: it holds for me. For HD-DVD or/and blu ray  players AAC will be widely supported. As mp3 is supported now. AAC MPEG-4 is the next standard from MPEG like mp3 (mpeg 1 layer 3) is now. http://www.blu-raydisc.com/assets/download..._0305-12955.pdf (http://www.blu-raydisc.com/assets/downloadablefile/2b_bdrom_audiovisualapplication_0305-12955.pdf)

Absolutely all HD-DVD, Blu-ray and mobile devices will support AAC.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-15 16:36:45
Quote
How many players support HE-AAC? HE-AAC v2?


I didn't say anything about HE-AAC device support. Could you point me to where I did?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 16:43:54
For HD-DVD or/and blu ray  players AAC will be widely supported.
> Will?! So AAC isn't more compatible but rather will be more compatible. I didn't see AAC as the audio part of these new formats. It would be nice to see this come true


Quote
As mp3 is supported now. AAC MPEG-4 is the next standard from MPEG like mp3 (mpeg 1 layer 3) is now.

LC-AAC has been an MPEG-4 standard for years, but hasn't really reached the market (except for Apple's products). Most manufacturers still support 1/ MP3, then 2/ WMA, then 3/ Vorbis instead of AAC. Currently at least. That's why I wonder about the claim that AAC is more compatible. So if I understand correctly, AAC is more compatible with the future whereas Vorbis or WMA standard are more compatible with the present
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: yulyo! on 2006-01-15 16:46:47
IgorC: "I think cynicism like "Nero sucks" is not a good idea. A little respect is always welcome."
Igor, I didn't say anything like this in my post. I just said that there are a lot of problems with Nero AAC.
Short question: is iTunes encoding MP4 too, or just M4A?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 16:51:09
I just finished downloading the Blu-Ray PDF. Supported audio formats are:
- "LPCM as well as Dolby® Digital - Dolby Digital Plus, Dolby Lossless, DTS Digital Surround and DTS-HD audio formats".

Did I miss something ?

PS: CTRL + F "AAC" shows no results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 16:51:15
Quote
For HD-DVD or/and blu ray  players AAC will be widely supported.
> Will?! So AAC isn't more compatible but rather will be more compatible. I didn't see AAC as the audio part of these new formats. It would be nice to see this come true

aac isnt part of the normal audio formats in hddvd, its only part of the "rom-zone" whatever this means. i dont even know if support is mandatory and i dunno how things are handled on bluray

even if it would be mandatory it still wouldnt necessarily mean that your hddvd player can play your aac mp4 files

Quote
IgorC: "I think cynicism like "Nero sucks" is not a good idea. A little respect is always welcome."
Igor, I didn't say anything like this in my post. I just said that there are a lot of problems with Nero AAC.
Short question: is iTunes encoding MP4 too, or just M4A?
[a href="index.php?act=findpost&pid=357342"][{POST_SNAPBACK}][/a]

m4a is the same as mp4
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: minisu on 2006-01-15 16:55:24
I don't like WMA, and not many players support WMA Pro IIRC. LAME performed well in this test but I tend to rate MP3 poorer than other codecs (maybe because with MP3 you know what you're looking for).

This leaves me with Vorbis and AAC. Which format drains less battery on a portable player? (kind of OT...)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: moozooh on 2006-01-15 16:58:10
Quote
I didn't say anything about HE-AAC device support. Could you point me to where I did?
[a href="index.php?act=findpost&pid=357339"][{POST_SNAPBACK}][/a]

You said AAC, not LC-AAC. How do I know it doesn't include all the profiles (especially considering HE/HE v2 is the most beneficial profile for AAC)?

AFAIR, HE is a part of the standard, and if you imply that AAC will be supported by all the aforementioned devices only because of that (which is fairly doubtful in real world), why not support the other part of that standard? (I could be wrong about it, though; discard the previous sentence if I am.)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 16:58:36
Quote
aac isnt part of the normal audio formats in hddvd, its only part of the "rom-zone" whatever this means. i dont even know if support is mandatory and i dunno how things are handled on bluray
[a href="index.php?act=findpost&pid=357344"][{POST_SNAPBACK}][/a]

So if AAC is currently supported by fewer audio devices than Vorbis and if it isn't a part of big multimedia projects such as SACD, HD-DVD or BRD, it's more compatible with what?  With hope? Dreams?  I have some trouble explaining some claims to myself.
I really like AAC, as well as Vorbis, but I must say that I'm disappointed by the industry attitude and lack of interest for AAC.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Busemann on 2006-01-15 16:59:48
Quote
That's interesting  Could I get the list of compatible AAC players?


The iPod family is AAC compatible of course, so in terms of market share AAC is a lot more widespread than OGG Vorbis. Then there's the emerging cell phone market which also excludes OGG in favor of AAC (Phones from Motorola, Sony Ericsson & Nokia are all AAC compatible). Then there's the PSP, which seems to support everything but OGG.. The list of OGG Vorbis players is fairly long, but most of them I haven't even heard of. I would say AAC is much much more future proof than OGG, simply because of its big user base and company support, excluding all the cheap Asian stuff.

This is not a bash of OGG. I think it's a great format, but I wouldn't encode my music with it if I were into portable DAP's.

Just my 2¢
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: moozooh on 2006-01-15 17:07:01
Quote
I would say AAC is much much more future proof than OGG, simply because of its big user base and company support, excluding all the cheap Asian stuff
[a href="index.php?act=findpost&pid=357348"][{POST_SNAPBACK}][/a]

I wouldn't say that, if only because the market needs something absolutely royalty-free, and that is the strongest point of Vorbis. Why pay for a standardised format when you can pay nothing at all and have the same quality? You can't discard that so easily.

Also, where did you read that AAC has a big user base?

And another one: those "cheap Asian stuff" companies use the same chips as iPods, if not better. Almost all of the chips are produced in Asia.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bond on 2006-01-15 17:07:46
well, it still shouldn't be forgotten that the ipod simply dominates the market for handheld music players

there can be thousands of different players handling wma9 and it still wouldn't mean much if a single player handling aac had a market share of 90%
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 17:07:51
Quote
Quote
That's interesting. Could I get the list of AAC-compatible players?


The iPod family is AAC compatible of course, so in terms of market share AAC is a lot more widespread than OGG Vorbis.

Of course, but it's just a market share. The iPod is one single family. If someone wants a:
- true UMS device
- or a very small device
- or a device working with AA or AAA cells
- or a gapless device
- or a non-MP3-stuttering device
- or a very cheap device
- or a longer-battery life device
etc... this person has little chance of finding something compatible with AAC, simply because there's mainly one company producing AAC-compatible players, and this company isn't interested in supporting any of the listed features.

And even if most manufacturers are unknown, there are companies like Samsung, iAudio, iRiver, LG, Iomega, MPIO, Neuros, Rio, TEAC whose players are widely available. My parents live in a small village of 3000 souls, and even there the supermarket has Vorbis-compatible players!
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-15 17:09:01
Quote
So if AAC is currently supported by fewer audio devices than Vorbis, and if it isn't part of a big multimedia project such as SACD, HD-DVD or BRD, it's more compatible with what? With hope? Dreams? I have some trouble explaining certain claims to myself.
I really like AAC, as well as Vorbis, but I must say that I'm disappointed by the industry's attitude and lack of interest in AAC.


That's why I said IMHO. Somebody who has an audio player that supports only mp3 will never use aac or ogg.
Btw, can you also provide online music shops which support OGG?  AAC? MP3?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: moozooh on 2006-01-15 17:11:14
Quote
That's why I said IMHO.

IMHO, an IMHO must be backed up by something that has relevance in the real world.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 17:14:18
Quote
Quote
So if AAC is currently supported by fewer audio devices than Vorbis, and if it isn't part of a big multimedia project such as SACD, HD-DVD or BRD, it's more compatible with what? With hope? Dreams? I have some trouble explaining certain claims to myself.
I really like AAC, as well as Vorbis, but I must say that I'm disappointed by the industry's attitude and lack of interest in AAC.


That's why I said IMHO.

True. But I supposed that your opinion was based on objective facts. You can hardly say that AAC offers more compatibility, even as an opinion, if the current situation shows the opposite.

Quote
Btw, can you also provide online music shops which support OGG?  AAC? MP3?

That's indeed a good point. Neither MP3 nor Vorbis is compatible with online music stores. The iTunes Music Store is giving a big advantage to AAC here; unfortunately for AAC, Apple's closed attitude is making WMA Standard stronger; this format is now supported by most online shops. And there are more and more companies making WMA PlaysForSure a marketing argument.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Busemann on 2006-01-15 17:21:06
Quote
And another one: those "cheap Asian stuff" companies use the same chips as iPods, if not better. Almost all of the chips are produced in Asia.

Of course, but there's more to a DAP than its chips. Most components in the tech industry are standardized.
Quote
Of course, but it's just a market share. The iPod is one single family. If someone wants a:
- true UMS device
- or a very small device
- or a device working with AA or AAA cells
- or a gapless device
- or a non-MP3-stuttering device
- or a very cheap device
- or a longer-battery life device
etc... this person has little chance of finding something compatible with AAC, simply because there's mainly one company producing AAC-compatible players, and this company isn't interested in supporting any of the listed features.


First off, you should really take a second look at the iPod line as many of your requests are now implemented, such as small size, good battery, low price, etc. The lame-vbr stuttering doesn't even affect newer models.

Anyways, if you want total freedom you should just stick to mp3. When your AA-powered $20 supermarket ogg-player breaks, it sure is a lot better to have a library of universally compatible mp3s than Oggs.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 17:28:04
Quote
First off, you should really take a second look at the iPod line as many of your requests are now implemented, such as small size, good battery, low price, etc. The lame-vbr stuttering doesn't even affect newer models.

I don't want to start an iPod flame war (I like them and I even bought one once - which never worked...). But several flash players have a battery which lasts more than 40 hours; there are ultra-small flash players, and the Shuffle is twice as long; Apple's prices are not ultra-expensive but they're not cheap; the stutter problem affects the Nano, which is the latest model.
Quote
Anyways, if you want total freedom you should just stick to mp3. When your AA-powered $20 supermarket ogg-player breaks, it sure is a lot better to have a library of universally compatible mp3s than Oggs.

Everybody knows that. The fact is that a consumer looking for an alternative format has more choice among Vorbis-compatible players than among AAC ones. That's why I wouldn't say that "AAC is the best tradeoff quality/compatibility". It can't be seriously defended by objective arguments.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Busemann on 2006-01-15 17:30:05
Quote
The fact is that a consumer looking for an alternative format has more choice among Vorbis-compatible players than among AAC ones. That's why I wouldn't say that "AAC is the best tradeoff quality/compatibility". It can't be seriously defended by objective arguments.


Versus OGG Vorbis it definitely can. There might be a gazillion OGG players, but that doesn't matter as long as the market leaders don't support it (only one out of the top 20 mp3 players on Amazon.com supports OGG). There are more players supporting WMA than AAC too, but there too you're limited to the crappy players fighting for the 15% or so.

When OGG excludes the iPod, cell phones and the PSP, there's hardly "more choice", don't you agree?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: JohnV on 2006-01-15 17:31:01
Quote
Quote
your statement doesnt make any sense for me. what does the fact that the encoders are on par have to do with that the nero results are not really comparable

I believe that Nero's overall result (and only the overall one) is purely indicative. People have participated in this test, and it would be frustrating not to see any indication of the quality of the disqualified encoder. Of course, nobody should claim that Nero is as good as encoder x or y according to this test: the tested samples give a wrong and probably overrated picture of the real performance of Nero Digital AAC. That's why the results are shown in red, outside the main area, and without any confidence interval bar.

Well.. it's hard to say what effect the bug has exactly.
The fact is that at about half of the track positions the quality may be worse than it could be, because the bit reservoir isn't flexible enough.
People base their ratings on what they hear differing from the original, so it can just as well be speculated that the overall score could be better with a properly working reservoir.

If it's speculated that people only listen to the beginning, and the overcoding makes a huge quality improvement (which it shouldn't, because we are near the transparency level), then imo the Nero score is overrated. But the samples have the bug, which causes both over- and undercoding (reservoir induced), and about 50% of the track lengths are undercoded. And again, people rate based on the problems they hear.. Well, this is just my opinion; others may come to a different conclusion.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-15 17:31:09
Quote
Quote
That's why I wouldn't say that "AAC is the best tradeoff quality/compatibility". It can't be seriously defended by objective arguments.


Versus OGG Vorbis it definitely can.

Well, I give up
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 17:41:19
ganymed just pointed me to a German article about my listening test: http://www.mpex.net/news/archiv/00612.html (http://www.mpex.net/news/archiv/00612.html)
Just in case anyone is interested.

BTW: I am half Romanian - therefore, "unter deutscher Leitung von Sebastian Mares" is not 100% true.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-15 17:43:08
One remark regarding the bit reservoir drain - IMO it could only make the quality worse, because the bit reservoir is always drained, and not according to the psychoacoustic demands.

If you overcode something which is already "near transparent", the effects are very, very subtle; if you undercode something which needs more bits, you end up with an obvious artifact. As is well known from psychoacoustic theory, the average bit demand of a normal music signal is just about 128 kbps, maybe a bit more - so even without a big bit reservoir the effects would be small, and with overcoding not much is improved.

However, this causes frames that need more bits (200+ kbps) to actually receive far fewer bits than they should, making the encoding quite suboptimal.

But I would rather skip discussion of this; we had a pretty long conversation (Sebastian, Guru, Juha, Roberto and myself), and I'd certainly like to avoid repeating it, as there are a lot of tasks I need to do anyway and this would take a lot of time.

What I think is way better is actually testing the bugfixed mode - this encoder will appear soon (during the next two weeks, I hope), along with the bugfixed 128 kbps mode. @Guru, I am sure you will be able to test it and give some remarks, provided that you find some time.

Once again - I am very sorry that the bug was found too late (and I am personally disappointed as well, because the true quality of the new AAC encoder couldn't be tested properly) - we will invest all our resources to make sure such a thing does not happen again.
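
To make the over/undercoding argument above concrete, here is a rough toy model of a bit reservoir. It is only a sketch: the per-frame budget, the reservoir capacity, the frame demands and the function name `encode` are all made-up assumptions for illustration, and none of it is taken from Nero's encoder or any other real implementation.

```python
# Toy model of a bit reservoir. All numbers and names are illustrative
# assumptions; this is not Nero's (or any real encoder's) code.

NOMINAL = 128      # bits granted per frame (stands in for 128 kbps)
CAPACITY = 300     # assumed maximum size of the reservoir


def encode(demands, drain_always):
    """Return the bits each frame receives.

    demands      -- bits each frame would need to sound transparent
    drain_always -- True mimics the behaviour described in the post: the
                    reservoir is spent at once instead of being kept for
                    frames that really need it
    """
    reservoir = 0
    granted = []
    for need in demands:
        available = NOMINAL + reservoir
        if drain_always:
            spend = available                 # overcode, keep nothing back
        else:
            spend = min(need, available)      # follow the demand
        granted.append(spend)
        reservoir = min(CAPACITY, available - spend)
    return granted


# Four easy frames followed by one hard frame that wants 200+ bits.
demands = [100, 100, 100, 100, 220]
print(encode(demands, drain_always=False))   # [100, 100, 100, 100, 220]
print(encode(demands, drain_always=True))    # [128, 128, 128, 128, 128]
```

In this toy run the always-draining variant gives the easy frames 28 bits more than they need, which buys little, and leaves the hard frame 92 bits short, which is the kind of undercoding the post describes.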
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Garf on 2006-01-15 17:48:25
Quote
I could send you all results if you really want to do it yourself.


Are they in an easy to process form? (Like an Excel sheet... or something similar)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 18:01:49
Quote
Quote
I could send you all results if you really want to do it yourself.


Are they in an easy to process form? (Like an Excel sheet... or something similar)


No, as encrypted ABC/HR results.  That's why I said it would take too much time. I only have the valid results sorted and as TXT files.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Hyrok on 2006-01-15 19:45:58
Interesting results. Somehow it proves LAME got even better (there's really no need for version 3.90.3 anymore), the new vbr model is stable and for most people 128 kbps MP3 is transparent. The magic number until now was 192 kbps. I think the truth is somewhere in between. I'll stick with LAME -V 3 --vbr-new. It's about 150-160 kbps.

Thanks to all people who participated in this listening test!
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: yulyo! on 2006-01-15 19:48:19
Bond: M4A is the same as Mp4.
I know, Bond. But I want Mr.Q to see my encoded files as mp4, not m4a.

Off topic: Sebastian, when I saw your name, I thought it looked like a Romanian name.
 
I'm from Romania too.
A good day.(night)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: smok3 on 2006-01-15 21:30:35
iam still not getting the ipod, mkay!?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Jojo on 2006-01-15 22:37:37
I really like the idea of having the settings used included in the graph. As you said, people will post those graphs all over the place and therefore it is better to state what settings achieved those results. It helps educate people and will hopefully stop them from using those strange command lines (especially true for lame).

Many people will not follow the link (they have to type the address by hand). And even if they do, the settings are hard to find and there is a lot of reading. People that use those strange command lines will simply upgrade to the encoder tested and keep using their crappy command lines and wonder why it sounds like crap...many might not even know that there is more than one setting one can use.

Please consider it.
Quote
(http://img253.imageshack.us/img253/5935/results1lp.png) (http://imageshack.us)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-15 22:43:08
Sorry, no.
You can do it yourself and state that it's your plot - I have no problem with that. I am not responsible for the way people post that image and what they do and don't do. I doubt that image is going to appear on a forum all of a sudden and without any comments - people who post the plot can also add a link below or above. The encoder version is more than enough in the graphic.
Discussion about this is over.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-15 23:12:38
Quote
Sorry, no.
You can do it yourself and state that it's your plot - I have no problem with that. I am not responsible for the way people post that image and what they do and don't do. I doubt that image is going to appear on a forum all of a sudden and without any comments - people who post the plot can also add a link below or above. The encoder version is more than enough in the graphic.
Discussion about this is over.

ok then, can you post the xls/openoffice/whatever you used?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: rjamorim on 2006-01-15 23:30:04
Very big congratulations to Sebastian for managing to produce interesting results after all the hardships he had to go through to discuss his test and later conduct it.

And here's to your upcoming tests! May the next test (64kbps?) be as successful as (but less stressful than!) this one. I know for sure I can hardly wait for it...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: danchr on 2006-01-16 00:47:53
Quote
Quote
Quote
That's why I wouldn't say that "AAC is the best tradeoff quality/compatibility". It can't be seriously defended by objective arguments.


Versus OGG Vorbis it definitely can.

Well, I give up

I was about to say that on the Mac side of the fence, it is not so, and that there is bad support for it. However, I decided to check if things still are the way they used to be. They aren't.

There's a Xiph component (http://www.xiph.org/quicktime/) available that allows you to play Ogg Vorbis files in QuickTime (and thus iTunes). Thanks to QuickTime 7, it even supports tagging, and in iTunes too. So Ogg Vorbis is a viable alternative, if you don't mind the lack of iPod-compatibility.

Ogg Vorbis reminds me of Apple in the mid-nineties: You really don't expect it to survive, but it does; quite well, even.

(By the way, this test is a very good argument against the people claiming that the 128kbps AAC used in the iTunes Music Store isn't good enough quality. It's very good quality, and probably more than good enough for the vast majority of listeners.)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Caroliano on 2006-01-16 01:11:59
How can I decrypt the results with the key.key file?

In the rush I forgot to send my results for sample 12.... I tested it and even found differences between the samples.

Also, I think I forgot to add my name to most of my tests. I'm anon26.

If you wanted something to comment on, you could have talked about Shine's performance. It was not bad in "song for guy" and a few others.

In the Senor sample I rated vorbis as the worst (4.0, but still) among the encoders where I could hear a difference (excluding the low anchor). The only two other people who rated vorbis put it first or tied with the first ones. Not even guruboolez rated vorbis in this sample!?!?
I think everyone's hearing is different....

Maybe an equipment difference? I have an ECS on-board sound card and some crap pc-speakers that came with the computer. I can't even plug in headphones here because of the electrical noise. I think it is the average home user's equipment, but very far from what HA's people probably have...

I'm not a native English speaker, nor have I read any guide to audio artifacts, so I don't know anything about the nomenclature for them, but I think it may be the pre-echo thing plus a boost in the source of the echo. This occurs around the 15.0s point of the sample. My comment at the time:
"3R Comment: make some drums more harsh/difuse"

This was and is clear to me. And I even thought of vorbis at the time, because of that harsh sound. I think that I don't like the way vorbis sounds. Many times I prefer a 32kbps AACv2 to a 64kbps vorbis aoTuV 4 (I haven't tested the 4.5b yet).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-16 01:30:40
Quote
(By the way, this test is a very good argument against the people claiming that the 128kbps AAC used in the iTunes Music Store isn't good enough quality. It's very good quality, and probably more than good enough for the vast majority of listeners.)

This test tested iTunes AAC in the new VBR mode. I think iTunes Music Store still offers standard AAC "CBR" files. So that argument is not valid. Or have they changed the format recently?

Actually, even the standard iTunes AAC is not strictly CBR. The bitrate varies, but not as much as in VBR mode, and the average is always 128 kbps. iTunes 128 kbps VBR allows more bitrate variation and higher average bitrates when needed. It does not allow a lower average than 128 kbps. So it is most likely better than the older standard mode, or at least as good when the files are not difficult to encode, but that has not been directly tested.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-16 03:57:37
Thanks to everyone with ears who participated and Sebastian for organising the whole listening test.  The results are quite interesting and I'm always pleased to see Ogg Vorbis at the front.  It is particularly interesting to see the scores so consistently high, which makes me wonder whether the samples used were perhaps too easy (i.e. there wasn't a codec-killer like 'Waiting')?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Nayru on 2006-01-16 06:43:36
Vorbis did not finish 'at the front'.  The difference between AoTuV and iTunes AAC is within the margin of error and is not significant.  Moreover, the slightly higher score needs to be weighed against the fact that AoTuV used a slightly higher bitrate than iTunes.

The fact that all the codecs scored highly is unsurprising given the high bitrate.

There are three conclusions that can be drawn from this test:

1) AoTuV, iTunes AAC, and WMA Pro produce encoded files of roughly equivalent quality at bitrates near 140kbps.

2) Lame was very close to the others on most samples, but there were two samples where lame scored significantly lower.

3) With all of these encoders, some encoded files can be distinguished from the original.  The samples were not "too easy", as there were some samples which none of the codecs could encode transparently at the settings used.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Serge Smirnoff on 2006-01-16 07:05:02
If my memory serves me, Ivan promised to publish results of PEAQ analyses of encoded sound samples from the listening test. May we hope to see them?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: shadowking on 2006-01-16 07:09:50
The scores are interesting and they reflect an increase in quality over time. But is it so hard to find problems?  I have many problem samples from my collection - at least for mp3, and it's on 'normal' music.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-16 08:44:36
Quote
Vorbis did not finish 'at the front'.  The difference between AoTuV and iTunes AAC is within the margin of error and is not significant.  Moreover, the slightly higher score needs to be weighed against the fact that AoTuV used a slightly higher bitrate than iTunes.


From the pre-test discussions:
Quote
Quote
Quote
Updated bitrate table with Vorbis -q 4.25: http://maresweb.de/bitrates2.htm (http://maresweb.de/bitrates2.htm)

I think that it would be more fair to use a lower quality setting for Vorbis.

Since the bit rate control of Vorbis is very flexible it seems wrong to select a setting that will give it the highest average bitrate of all encoders. People could challenge such a decision as favoritism towards Vorbis.

At -q 4 Vorbis has an average bitrate similar to that of iTunes and Nero and that would in my opinion be the right setting.




According to this post:

Quote
I updated my bitrate table with Vorbis -q 4.20 and gathered all previous results in the same table:

bitrates_public2.xls (http://kotisivu.mtv3.fi/alexb/ha/bitrates_public2.xls)


Vorbis -q 4 reached an average of exactly 128 kbps on Alex B's side. On the other hand, -q 4.2 and -q 4.25 didn't produce too high bitrates (around 134 kbps which is OK). So, what do the experts think - should -q 4 be used or -q 4.2(5)?...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: fpi on 2006-01-16 09:20:29
Quote
I really like AAC, as well as Vorbis, but I must say that I'm disappointed by the industry's attitude and lack of interest in AAC.


Note that AAC has license fees:
http://www.vialicensing.com/products/mpeg4...ense.terms.html (http://www.vialicensing.com/products/mpeg4aac/license.terms.html)

Vorbis is free of license fees, and there are also open source, BSD-licensed floating point encoder and decoder implementations, and a fixed point decoder. I also prefer Vorbis because it is the only lossy format we linux users can use without infringing any patent and using only free software. We should all thank xiph.org and Aoyumi for their free work in creating a great patent-free codec.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-16 09:47:19
Quote
Thanks to everyone with ears who participated and Sebastian for organising the whole listening test.  The results are quite interesting and I'm always pleased to see Ogg Vorbis at the front.   It is particularly interesting to see the scores so consistently high, which makes me wonder whether the samples used were perhaps too easy (i.e. there wasn't a codec-killer like 'Waiting')?

I think the tested samples are average or above average in degree of difficulty.

I have thought about so-called killer samples too. However, I think it would be very difficult to gather a good killer sample collection that would be fair to different codecs in a multi-format test. Usually these killers are especially bad for one particular codec.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-16 10:20:24
Quote
Vorbis did not finish 'at the front'.  The difference between AoTuV and iTunes AAC is within the margin of error and is not significant.  Moreover, the slightly higher score needs to be weighed against the fact that AoTuV used a slightly higher bitrate than iTunes.

The fact that all the codecs scored highly is unsurprising given the high bitrate.

As sehested quoted, all codecs produced similar bitrates in preliminary testing with a large number of various files. Besides my tests, guruboolez tested the bitrates with a big collection of classical music. Some others posted their bitrate findings too.

The Nero encoder used was changed after I added it to the bitrate table, and the tested version also produced about 134 kbps with my test files and in Sebastian's personal testing. (I'll update the table now, even though Nero was disqualified from the test.)

Each codec has different methods for keeping the quality level constant. Usage of high momentary bitrates is a completely acceptable method. I suppose it would have been good to include some samples that would have produced very small average bitrates, but because iTunes has a hard-coded 128 kbps low limit and Nero was used in ABR mode I think that would not have been fair to the three other encoders. Also, samples of that kind tend to be even easier for the encoders, so possibly the quality differences would have been indistinguishable.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-16 10:35:59
Quote
I suppose it would have been good to include some samples that would have produced very small average bitrates, but because iTunes has a hard-coded 128 kbps low limit and Nero was used in ABR mode I think that would not have been fair to the three other encoders.

I wouldn't say that's unfair. "Low bitrate" moments are a true part of real-world encoding, and it wouldn't be unfair to test them. A complete test should in reality include such samples. But with only 18 samples, it's impossible to represent every encoding situation.
iTunes (bitrate floor at 128 kbps) and Nero Digital (ABR) both have limited efficiency. As an example, if you encode monophonic albums, albums including mono tracks, or low-volume musical compositions, both AAC encoders will systematically waste a large amount of bitrate. I recently tested this by encoding some Jazz oldies: LAME's bitrate (-V5 --vbr...athaa...) was around 85 kbps whereas iTunes was at 128 and Nero Digital at 130. Same for a very recent complete set of Beethoven's sonatas, recorded in stereo in the last few years: ~100 kbps for lame and ~130 for AAC. The limited efficiency is the flip side of the developers' choice.
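
To put the bitrates quoted above into proportion, here is a small sketch that computes how much more bitrate the floor-limited/ABR encoders spent relative to LAME. The figures are the approximate ones from the post; the helper name `extra_percent` is just an illustrative choice.

```python
# Approximate average bitrates (kbps) as quoted in the post above.
cases = {
    "jazz oldies": {"LAME": 85, "iTunes": 128, "Nero": 130},
    "Beethoven sonatas": {"LAME": 100, "AAC": 130},
}


def extra_percent(reference, other):
    """How much larger `other` is than `reference`, in percent."""
    return 100.0 * (other - reference) / reference


for name, rates in cases.items():
    lame = rates["LAME"]
    for encoder, kbps in rates.items():
        if encoder != "LAME":
            extra = extra_percent(lame, kbps)
            print(f"{name}: {encoder} uses about {extra:.0f}% more than LAME")
```

With these numbers the extra bitrate comes to roughly 51% and 53% for the jazz material and 30% for the sonatas.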
But on the other hand, such a limitation may have a positive effect on quality. Not for mono, but for low-volume moments. LAME tends to produce ringing, which becomes easier to hear at a higher playback volume and sometimes really irritating after ReplayGain/MP3gain. That's worrying, and I appreciate the limitation of both iTunes and Nero, which is a safeguard against psychoacoustic failure or an optimistic choice.

Now what matters is how these encoders would react at low bitrates (usually corresponding to low-volume tracks/moments). I recently performed a listening test with 150 classical music tracks, including several samples corresponding to this situation. My results are available on the forum. From my experience, Vorbis has no problem handling this situation; the bitrate doesn't sink to an ultra-low value (rarely less than 100 kbps) and there's no compromise in high frequencies (no ringing). Unfortunately, I can't say the same for LAME. It has real problems here, and low bitrate may correspond to low quality. Hence the use of --athaa-sensitivity to reduce this annoyance (which raises the bitrate in such situations).

Quote
Also, samples of that kind tend to be even easier for the encoders, so possibly the quality differences would have been indistinguishable.

Not necessarily true. The Debussy.wav sample revealed in 2004 that such samples could lead to obvious artefacts for some encoders (LAME, Musepack) but not others.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-16 10:54:02
Of course you are correct, guruboolez. My in-depth opinion is quite similar. Perhaps I simplified my answer a bit too much.

Actually, in the pretest discussion I tried to find some "low bitrate" human voice samples that would have been difficult for the encoders too, but with my limited experience in this I wasn't very successful. I posted a couple of jazz and opera voice samples (one of them was mono), but no one was impressed by them.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-16 11:26:10
Quote
If my memory serves me, Ivan promised to publish results of PEAQ analyses of encoded sound samples from the listening test. May we hope to see them?


Yes,

Here they are (XLS file) - the files were processed with the reference commercial implementation of the Advanced PEAQ analysis (ITU-R BS.1387 / Advanced PEAQ) with automatic delay and gain compensation (pretty much default options)

So - Advanced PEAQ rated encoders like this:

iTunes: 4.46
Nero: 4.49
LAME: 4.33
Vorbis: 4.45
WMA: 4.44
Shine: 2.48

Here are raw results, starting from the last sample, down to the first:

Code:
            iTunes      Nero        LAME        Vorbis      WMA         Shine
Sample 18   -0.291379   -0.22438    -0.551136   -0.410874   -0.464171   -2.89086
Sample 17   -0.485661   -0.625404   -0.558785   -0.477524   -0.517271   -3.32223
Sample 16   -0.562372   -0.474273   -0.82978    -0.430512   -0.62892    -1.97767
Sample 15   -0.536917   -0.397598   -0.684565   -0.604991   -0.757444   -3.09543
Sample 14   -0.55296    -0.601416   -0.639036   -0.59632    -0.557496   -2.68244
Sample 13   -0.523003   -0.533149   -0.504188   -0.646469   -0.395212   -2.82543
Sample 12   -0.584267   -0.626797   -0.710563   -0.615299   -0.559485   -1.73998
Sample 11   -0.539979   -0.455684   -0.604449   -0.470218   -0.518138   -1.71468
Sample 10   -0.414742   -0.469117   -0.701445   -0.31328    -0.710211   -2.08605
Sample 09   -0.698941   -0.720779   -0.729232   -0.743117   -0.515512   -3.49285
Sample 08   -0.595305   -0.66342    -0.615311   -0.57183    -0.566153   -3.1158
Sample 07   -0.430192   -0.422099   -0.765232   -0.668022   -0.545455   -2.10136
Sample 06   -0.763366   -0.405315   -0.626802   -0.467821   -0.594331   -1.00229
Sample 05   -0.648616   -0.682204   -0.512671   -0.625941   -0.458714   -3.0605
Sample 04   -0.573599   -0.545224   -0.625719   -0.605796   -0.493474   -3.22439
Sample 03   -0.454201   -0.418986   -1.21393    -0.531274   -0.833721   -1.35914
Sample 02   -0.497593   -0.431844   -0.524538   -0.578111   -0.453721   -2.72985
Sample 01   -0.583718   -0.501613   -0.654598   -0.565685   -0.567475   -2.96624
ANOVA       -0.54       -0.51       -0.67       -0.55       -0.56       -2.52
Worst Item  -0.763366   -0.720779   -1.21393    -0.743117   -0.833721   -3.49285
Best Item   -0.291379   -0.22438    -0.504188   -0.31328    -0.395212   -1.00229
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-16 11:35:55
Quote
Quote
Sorry, no.
You can do it yourself and state that it's your plot - I have no problem with that. I am not responsible for the way people post that image and what they do and don't do. I doubt that image is going to appear on a forum out of the sudden and without any comments - people who post the plot can also create a link below or above. The encoder version is more than enough in the graphics.
Discussion about this is over.

ok then, can you post the xls/openoffice/whatever you used?


http://maresweb.de/listening-tests/mf-128-1/pandts.xls (http://maresweb.de/listening-tests/mf-128-1/pandts.xls)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-16 11:40:05
Ivan> The PEAQ result for LAME on sample 16 "SongForGuy" is unusually low. The listening test result for this sample doesn't show anything unusual:
http://www.maresweb.de/listening-tests/mf-128-1/image019.png (http://www.maresweb.de/listening-tests/mf-128-1/image019.png)

Do you have an idea about this big difference?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-16 11:40:09
Quote
Very big congratulations to Sebastian for managing to produce interesting results after all the hardships he had to go through to discuss his test and later conduct it.

And here's to your upcoming tests! May the next test (64kbps?) be as successful as (but less stressful than!) this one. I know for sure I can hardly wait for it...


Ivan is doing some pre-tests to decide which Nero encoder to use (AFAIK). I would love to conduct the next low-bitrate test (maybe with Shade[ST] since he's interested in running a test, too).

Quote
How can I decrypt the results with the key.key file?


ABC/HR has an option like Tools --> Process results or something (sorry, the school PC I am currently using does not have Java installed).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Gabriel on 2006-01-16 11:42:46
Quote
So - Advanced PEAQ rated encoders like this:

As you have access to both standard and advanced PEAQ, does the advanced version really provide more real life correlation?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-16 11:43:20
I think the PEAQ analysis is not reliable since it is based on psy-models like most lossy encoders. If an encoder has the same (or at least very similar) psy-model as the tool used for testing, the PEAQ analysis is going to rate the respective codec as excellent, even though the psy-model of the tested encoder might not achieve good results. Of course, PEAQ uses a very highly tuned model, but still, it is not as reliable as real-world testing with human beings.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Gabriel on 2006-01-16 11:49:45
Quote
If an encoder has the same (or at least very similar) psy-model as the tool used for testing, the PEAQ analysis is going to rate the respective codec as excellent, even though the psy-model of the tested encoder might not achieve good results.

Of course, and it seems clear that Ivan is using advanced PEAQ often, based on the PEAQ results of Nero.
However, even with the same psymodel, in a bitrate-limited scenario, the bit allocation part will also impact the results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-16 12:00:02
Quote
Gabriel
As you have access to both standard and advanced PEAQ, does the advanced version really provide more real life correlation?


I would have to re-do the test with the Basic PEAQ, but today I have absolutely no time for that - as Basic PEAQ is publicly available (AFsp package / PQEvalAudio)  someone could also do it - if not, I'll try over the week.

As for the Advanced PEAQ and correlation - I cannot comment on this, as I have not performed many correlation tests, but judging from the ITU-R BS.1387 papers at AES conferences, they did achieve higher correlation with real-world listening test data using Advanced PEAQ than with the Basic (FFT) model.  Correlation was one of the factors in forming the "basic" and "advanced" PEAQ out of the so-called "toolbox" (set of algorithmic tools).

Quote
Sebastian Mares
I think the PEAQ analysis is not reliable since it is based on psy-models like most lossy encoders. If an encoder has the same (or at least very similar) psy-model as the tool used for testing, the PEAQ analysis is going to rate the respective codec as excellent, even though the psy-model of the tested encoder might not achieve good results. Of course, PEAQ uses a very highly tuned model, but still, it is not as reliable as real-world testing with human beings.


Hmm - I think the purpose of PEAQ is not to replace subjective listening tests, but to assist developers / engineers in situations where subjective tests would be too slow (even impossible) and/or too expensive - for example, where a lot of tests need to be made in a short time for many tools, doing subjective tests for each setting/tool/combination would be impossible.

Therefore PEAQ can only be "more or less" correlated with the real-world data - by no means can it replace real human beings and their judgment, especially a lot of highly trained listeners in a properly done test.

As for the psymodel - as Gabriel said, there is much more in the codec that generates noise than just the psymodel's estimation - bit allocation over time and frequency, quantization properties, stereo coding, lossless (noiseless) coding step, etc...  So, tweaking the psymodel just to match PEAQ values probably won't lead to the best ODG result.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-16 12:13:59
Quote
Ivan> The PEAQ result for LAME on sample 16 "SongForGuy" is unusually low. The listening test result for this sample doesn't show anything unusual:
http://www.maresweb.de/listening-tests/mf-128-1/image019.png (http://www.maresweb.de/listening-tests/mf-128-1/image019.png)

Do you have an idea about this big difference?


Hmm, no idea at all. But I also have a few samples exhibiting similar behavior, but for AAC.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: loophole on 2006-01-16 12:17:14
Just quickly going back to AAC players - besides the ones already mentioned there is this Panasonic one (and an older model): http://www.panasonic-europe.com/news_read.aspx?id=2091 (http://www.panasonic-europe.com/news_read.aspx?id=2091), the Sony Walkman W800 (and K750), and the PSP. If Sony are allowing AAC playback on their phones and PSP, I wouldn't be surprised to see support on other stuff like their network walkmans or Hi-MD later on. I think the Hi-MD devices already play back MP3 natively (not that you'd ever want to buy one).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-16 12:54:33
Quote
http://maresweb.de/listening-tests/mf-128-1/pandts.xls (http://maresweb.de/listening-tests/mf-128-1/pandts.xls)

thanks
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Alex B on 2006-01-16 13:16:50
Quote
Ivan> The PEAQ result for LAME on sample 16 "SongForGuy" is unusually low. The listening test result for this sample doesn't show anything unusual:
http://www.maresweb.de/listening-tests/mf-128-1/image019.png (http://www.maresweb.de/listening-tests/mf-128-1/image019.png)

Do you have an idea about this big difference?
At first, I thought that sample 3 was similarly odd too:

Quote
iTunes Nero LAME Vorbis WMA Shine

Sample 03 -0.454201 -0.418986 -1.21393 -0.531274 -0.833721 -1.35914
but then I checked my test report and found out that I too ranked LAME and Shine similarly, and worse than the others (which I couldn't differentiate):

Code: [Select]
ABC/HR for Java, Version 0.5b, 13 January 2006
Testname: Carbonelli

Tester:

1L = Sample03\Carbonelli_3.wav
2L = Sample03\Carbonelli_1.wav
3R = Sample03\Carbonelli_2.wav
4L = Sample03\Carbonelli_6.wav
5L = Sample03\Carbonelli_4.wav
6L = Sample03\Carbonelli_5.wav

---------------------------------------
General Comments:
---------------------------------------
3R File: Sample03\Carbonelli_2.wav
3R Rating: 3.8
3R Comment: Artifact in the very first long note
---------------------------------------
5L File: Sample03\Carbonelli_4.wav
5L Rating: 3.8
5L Comment: Sounds a little "unstable". Probably the low anchor.

Carbonelli is the only sample where low anchor is not obvious.
---------------------------------------

ABX Results:
Original vs Sample03\Carbonelli_4.wav
    11 out of 12, pval = 0.0030
Original vs Sample03\Carbonelli_2.wav
    17 out of 17, pval < 0.001


---- Detailed ABX results ----
Original vs Sample03\Carbonelli_4.wav
Playback Range: 14.227 to 15.868
    3:12:42 AM f 0/1 pval = 1.0
    3:12:45 AM p 1/2 pval = 0.75
    3:12:48 AM p 2/3 pval = 0.5
    3:12:51 AM p 3/4 pval = 0.312
    3:12:54 AM p 4/5 pval = 0.187
    3:12:57 AM p 5/6 pval = 0.109
    3:12:59 AM p 6/7 pval = 0.062
    3:13:02 AM p 7/8 pval = 0.035
    3:13:05 AM p 8/9 pval = 0.019
    3:13:08 AM p 9/10 pval = 0.01
    3:13:11 AM p 10/11 pval = 0.0050
    3:13:15 AM p 11/12 pval = 0.0030

Original vs Sample03\Carbonelli_2.wav
Playback Range: 00.000 to 01.329
    3:20:40 AM p 1/1 pval = 0.5
    3:20:48 AM p 2/2 pval = 0.25
    3:20:51 AM p 3/3 pval = 0.125
    3:20:54 AM p 4/4 pval = 0.062
    3:21:05 AM p 5/5 pval = 0.031
    3:21:32 AM p 6/6 pval = 0.015
    3:21:35 AM p 7/7 pval = 0.0070
    3:21:37 AM p 8/8 pval = 0.0030
    3:21:42 AM p 9/9 pval = 0.0010
    3:21:46 AM p 10/10 pval < 0.001
    3:21:54 AM p 11/11 pval < 0.001
    3:21:57 AM p 12/12 pval < 0.001
    3:22:01 AM p 13/13 pval < 0.001
    3:22:03 AM p 14/14 pval < 0.001
    3:22:06 AM p 15/15 pval < 0.001
    3:22:09 AM p 16/16 pval < 0.001
    3:22:12 AM p 17/17 pval < 0.001

As a comparison, guruboolez's results were these:

Code: [Select]
ABC/HR for Java, Version 0.5b, 07 décembre 2005
Testname: Carbonelli

Tester: guruboolez

1L = Sample03\Carbonelli_5.wav
2L = Sample03\Carbonelli_1.wav
3L = Sample03\Carbonelli_2.wav
4L = Sample03\Carbonelli_6.wav
5R = Sample03\Carbonelli_4.wav
6R = Sample03\Carbonelli_3.wav

---------------------------------------
General Comments:
---------------------------------------
1L File: Sample03\Carbonelli_5.wav
1L Rating: 4.8
1L Comment: very very small ringing (I'm surprised myself by the ABX score I get)
---------------------------------------
2L File: Sample03\Carbonelli_1.wav
2L Rating: 3.8
2L Comment: little ringing
---------------------------------------
3L File: Sample03\Carbonelli_2.wav
3L Rating: 2.0
3L Comment: tremolo effect; very minor kind of warbling also
---------------------------------------
4L File: Sample03\Carbonelli_6.wav
4L Rating: 4.3
4L Comment: minor distortion (unsure -> need ABXing)
---------------------------------------
5R File: Sample03\Carbonelli_4.wav
5R Rating: 1.3
5R Comment: severe ringing
---------------------------------------
6R File: Sample03\Carbonelli_3.wav
6R Rating: 4.3
6R Comment: same kind of problem than 4L
---------------------------------------

ABX Results:
Original vs Sample03\Carbonelli_6.wav
    8 out of 8, pval = 0.0030
Original vs Sample03\Carbonelli_5.wav
    7 out of 8, pval = 0.035
Original vs Sample03\Carbonelli_3.wav
    7 out of 8, pval = 0.035


---- Detailed ABX results ----
Original vs Sample03\Carbonelli_6.wav
Playback Range: 00.000 to 02.695
    8:15:57 PM p 1/1 pval = 0.5
    8:16:01 PM p 2/2 pval = 0.25
    8:16:07 PM p 3/3 pval = 0.125
    8:16:09 PM p 4/4 pval = 0.062
    8:16:12 PM p 5/5 pval = 0.031
    8:16:16 PM p 6/6 pval = 0.015
    8:16:21 PM p 7/7 pval = 0.0070
    8:16:25 PM p 8/8 pval = 0.0030

Original vs Sample03\Carbonelli_5.wav
Playback Range: 00.000 to 02.695
    8:18:32 PM p 1/1 pval = 0.5
    8:18:36 PM p 2/2 pval = 0.25
    8:18:41 PM p 3/3 pval = 0.125
    8:18:51 PM p 4/4 pval = 0.062
    8:18:55 PM p 5/5 pval = 0.031
    8:18:59 PM p 6/6 pval = 0.015
    8:19:03 PM p 7/7 pval = 0.0070
    8:19:16 PM f 7/8 pval = 0.035

Original vs Sample03\Carbonelli_3.wav
Playback Range: 00.000 to 02.695
    8:17:05 PM p 1/1 pval = 0.5
    8:17:09 PM p 2/2 pval = 0.25
    8:17:26 PM p 3/3 pval = 0.125
    8:17:31 PM p 4/4 pval = 0.062
    8:17:34 PM p 5/5 pval = 0.031
    8:17:38 PM f 5/6 pval = 0.109
    8:17:45 PM p 6/7 pval = 0.062
    8:17:49 PM p 7/8 pval = 0.035
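
The pval figures in the two logs above look like one-sided binomial probabilities: the chance of getting at least that many trials right by pure guessing (p = 0.5 per trial). That reading is an assumption based on the logged numbers, not taken from the ABC/HR source, and the function name below is made up for illustration. A minimal sketch:

```python
from math import comb


def abx_pvalue(correct, trials):
    """Chance of at least `correct` right answers in `trials` coin-flip guesses."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Scores taken from the ABX logs above.
for correct, trials in [(11, 12), (7, 8), (5, 6), (8, 8)]:
    print(f"{correct}/{trials}: {abx_pvalue(correct, trials):.4f}")
```

This prints 0.0032, 0.0352, 0.1094 and 0.0039, where the logs show 0.0030, 0.035, 0.109 and 0.0030, so the tool seems to cut the value off after three decimals rather than round it (again, a guess from the output).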

Here's a summary:

Codec: PEAQ ODG (~rating on the 5.0 scale) _ Alex B _ guruboolez

iTunes: -0.45 (~4.5) _ 5.0 _ 3.8
LAME: -1.2 (~3.8) _ 3.8 _ 2.0
Nero: -0.42 (~4.6) _ 5.0 _ 4.3
Shine: -1.36 (~3.6) _ 3.8 _ 1.3
Vorbis: -0.53 (~4.5) _ 5.0 _ 4.8
WMA: -0.83 (~4.2) _ 5.0 _ 4.3

All three testers found LAME and Shine clearly worse than the others with this sample. Advanced PEAQ found WMA Pro slightly worse than the AAC codecs or Vorbis. Guru found that iTunes had a bit more problems than Nero or WMA and ranked Vorbis to be the best.
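
To put a rough number on how well PEAQ agrees with the two listeners on this one sample, a plain Pearson correlation over the six codecs in the summary above can be computed. This is only a sketch: six points from a single sample prove nothing in general, and the helper name is made up for illustration.

```python
from math import sqrt

# Values from the summary above (sample 3, Carbonelli), in the order
# iTunes, LAME, Nero, Shine, Vorbis, WMA.
PEAQ_ODG = [-0.45, -1.2, -0.42, -1.36, -0.53, -0.83]
ALEX_B = [5.0, 3.8, 5.0, 3.8, 5.0, 5.0]
GURUBOOLEZ = [3.8, 2.0, 4.3, 1.3, 4.8, 4.3]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print("PEAQ vs Alex B:    ", round(pearson(PEAQ_ODG, ALEX_B), 2))
print("PEAQ vs guruboolez:", round(pearson(PEAQ_ODG, GURUBOOLEZ), 2))
```

Both coefficients come out at roughly 0.9, mostly because all three sources put LAME and Shine well below the rest.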

EDIT

The overall results for this sample:

(http://www.maresweb.de/listening-tests/mf-128-1/image006.png)

Artist: Giovanni Stefano Carbonelli
Title: Sonata Settima In La Minore
Genre: Baroque Chamber Music
Submitted by: guruboolez

Nero score: 4.90

iTunes, AoTuV and WMA Professional are tied for first place. LAME and Shine are tied for last place.

EDIT 2: added PEAQ results on the 0.0 - 5.0 scale
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-16 13:50:04
Quote
Ivan> The PEAQ result for LAME on sample 16 "SongForGuy" is unusually low. The listening test result for this sample doesn't show anything unusual:
http://www.maresweb.de/listening-tests/mf-128-1/image019.png (http://www.maresweb.de/listening-tests/mf-128-1/image019.png)

Do you have an idea about this big difference?
Several individuals have mistaken LAME for the low anchor on this sample. Could be by coincidence or... LAME has an issue with SongForGuy.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: CoRoNe on 2006-01-16 13:51:44
Perhaps this question has been asked many times, but I'd like to know what it is that makes this aoTuVb4.51 version so extremely good compared to the libvorbis 1.12 version? I even hear "the resurrection of Vorbis" is due to aoTuV!
If the aoTuVb4.51 has a score of 4,79 in this test, what average score would the libvorbis 1.12 have compared to the aoTuV and all the others!??
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Ivan Dimkovic on 2006-01-16 13:55:09
One note about Advanced PEAQ and the ratings - its neural network was trained on a large database of listening tests up to 1997.

In addition, PEAQ's tool set does not take stereo effects into account very much, and that might be the reason for the difference between the real rating and the PEAQ estimate for, say, Vorbis - which uses lossy stereo above (if I am not mistaken) 10 kHz.

It could be that PEAQ's model treats the distortion introduced by lossy stereo as much more relevant than it is to most human listeners, because HF stereo masking is not taken into account.

It would be interesting if someone started a project to extend PEAQ with more elaborate binaural hearing models, and to train it on more recent listening tests.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-16 17:03:43
Quote
Quote
Let me rephrase: How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?


Geez, no idea. That would take too much time - time that I don't have right now. I could send you all results if you really want to do it yourself.

@Sebastian: I'd like to have a go at it.  Just tell me how to obtain the files.

Edit: Speliing
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-16 18:58:31
Quote
Quote
Quote
Let me rephrase: How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?


Geez, no idea. That would take too much time - time that I don't have right now. I could send you all results if you really want to do it yourself.

@Sebastian: I'd like to have a go at it.  Just tell me how to obtain the files.

Edit: Speliing


Damn, while re-sorting all files for you now, I noticed a mistake caused by an improper naming of a result. Sample 8 (eric_clapton) has a result called "anon16-result08" which is in fact for Elizabeth. "anon16-result07" is identical to "anon16-result08". I am going to re-screen the results for sample 8 and then re-do the final plots. Sorry folks! I am sure all other results are in order.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-16 19:18:42
The plot for sample 8 was updated - only minor changes, nothing that would change the overall results.

Still working on the final plots now. The only difference is the changed Tukey's HSD value (change of 0.001) and the lower ranking of Shine only.

Edit: All problems solved now. As stated, only the non-zoomed plot was affected (and there, only Shine is affected - it lost 0.01 points).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-17 00:13:46
Quote
Vorbis did not finish 'at the front'.  The difference between AoTuV and iTunes AAC is within the margin of error and is not significant.


I didn't say it was alone at the front, did I?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-17 01:17:32
Quote
Edit: All problems solved now. As stated, only the non-zoomed plot was affected (and there, only Shine is affected - it lost 0.01 points).

updated xls?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Caroliano on 2006-01-17 02:58:38
Quote
ABC/HR has an option like Tools --> Process results or something (sorry, the school PC I am currently using does not have Java installed).

Thanks, it worked. It should be explained and linked somewhere, because it is difficult to find with the search.

The decoded sample 12 result that I didn't send:
Code: [Select]
ABC/HR for Java, Version 0.5b, 13 Janeiro 2006
Testname: MysteriousTimes

Tester: Caroliano

1R = Sample12\MysteriousTimes_5.wav
2R = Sample12\MysteriousTimes_6.wav
3L = Sample12\MysteriousTimes_2.wav
4R = Sample12\MysteriousTimes_3.wav
5L = Sample12\MysteriousTimes_4.wav
6R = Sample12\MysteriousTimes_1.wav

---------------------------------------
General Comments:
---------------------------------------
1R File: Sample12\MysteriousTimes_5.wav
1R Rating: 4.0
1R Comment:
---------------------------------------
2R File: Sample12\MysteriousTimes_6.wav
2R Rating: 4.2
2R Comment:
---------------------------------------
3L File: Sample12\MysteriousTimes_2.wav
3L Rating: 4.2
3L Comment:
---------------------------------------
5L File: Sample12\MysteriousTimes_4.wav
5L Rating: 2.5
5L Comment: low-anchor
---------------------------------------

ABX Results:

Again I rated Vorbis as the worst sounding. I must be sensitive to Vorbis artifacts...

PS: I was anon26, but it's OK to display my name. I just forgot to put my name in them...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-17 07:00:52
Quote
Quote
Edit: All problems solved now. As stated, only the non-zoomed plot was affected (and there, only Shine is affected - it lost 0.01 points).

updated xls?


Everything - plots, the results page, the results RAR...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-17 18:05:25
Quote
How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?

@Sebastian: Thanks for the test results. 

I managed to process the results and produce this table:
Code: [Select]
Ranked refs      24     14     18      6     19     25
5.0's           304    260    299     36    313    302
5.0's %          75%    65%    74%     9%    78%    75%
4.0 and above   361    334    358     60    375    355
4.0 and above %  90%    83%    89%    15%    93%    88%
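
The percentage rows can be checked against the count rows, assuming each column is divided by the 403 valid results mentioned in the quoted question. That denominator is an assumption, and the table itself does not say which column belongs to which codec, so the sketch below keeps the columns unnamed.

```python
VALID_RESULTS = 403  # assumed denominator, from the quoted question

# Count rows copied from the table above, column order unchanged.
FIVES = [304, 260, 299, 36, 313, 302]
FOUR_AND_ABOVE = [361, 334, 358, 60, 375, 355]


def as_percent(counts, total=VALID_RESULTS):
    """Share of `total` for each count, rounded to whole percent."""
    return [round(100 * count / total) for count in counts]


print(as_percent(FIVES))
print(as_percent(FOUR_AND_ABOVE))
```

This prints [75, 65, 74, 9, 78, 75] and [90, 83, 89, 15, 93, 88], the same figures as the two percentage rows.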
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-17 18:24:46
Quote from: Sebastian Mares,Jan 17 2006, 07:00 AM

updated xls?


Everything - plots, the results page, the results RAR... [/quote]
i mean, do you have an updated xls to download?
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: HotshotGG on 2006-01-17 19:55:33
Quote
Perhaps this question has been asked many times, but I'd like to know what it is that makes this aoTuVb4.51 version so extremely good compared to the libvorbis 1.12 version? I even hear "the resurrection of Vorbis" is due to aoTuV!
If the aoTuVb4.51 has a score of 4,79 in this test, what average score would the libvorbis 1.12 have compared to the aoTuV and all the others!??


Aoyumi does a terrific job tweaking the Noise Normalization code and bitrate allocation scheme.  This is not just for the community, but it's also a Xiph bounty, don't forget.  In terms of streaming, a lot of people perceptually prefer Noise Normalization, which sounds more natural than SBR or PNS.  I think both the AAC and Vorbis psychoacoustic models are unique and different from a technical perspective, but what can be seen is that they are more perceptually advanced today than what we would have seen 5 years ago. These listening tests are a clear example of this. You are never going to achieve 100% transparency in any case, but a lot of samples give you a great indication of the performance of the encoder.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-17 20:11:27
Quote
i mean, do you have an updated xls to download?


I wonder how you always manage to screw the quotes...

Anyways, the RAR and the XLS were updated and have the same names, so just redownload the files.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Lyx on 2006-01-17 20:14:04
Quote
I even hear "the resurrection of Vorbis" is due to aoTuV!


Besides promoting hardware support, Xiph did almost nothing since v1.00. Neither bugfixes nor improvements. The reason for Vorbis' "resurrection" imho is two 3rd party devs: primarily "Aoyumi", and secondarily QuantumKnot.

- Lyx
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-17 20:14:40
Quote
Quote
How many of the grades given were 5.0? And how much if you re-add the ranked references as meaning that codec got a 5.0 for that sample?

So, there's 403 valid test results times 5 codecs (Shine doesn't count), or about 2015 grades. How many of those are 5.0, i.e. perfectly transparent?

@Sebastian: Thanks for the test results. 

I managed to process the results and produce this table:
Code: [Select]
Ranked refs      24     14     18      6     19     25
5.0's           304    260    299     36    313    302
5.0's %          75%    65%    74%     9%    78%    75%
4.0 and above   361    334    358     60    375    355
4.0 and above %  90%    83%    89%    15%    93%    88%



Since one of the results was invalid (for sample 8 as you can read in my previous posts and on the updated results page), the new number of valid results is 402 - your XLS (which you sent to me via e-mail) shows the old number 403. This is because Shade[ST] sent two different results for the same sample. Both results were valid (didn't contain ranked references IIRC), but only one of them was used.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kwanbis on 2006-01-17 20:37:42
Quote
I wonder how you always manage to screw the quotes...

i wonder that too 
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: HotshotGG on 2006-01-17 20:42:23
Quote
Besides promoting hardware support, Xiph did almost nothing since v1.00. Neither bugfixes nor improvements. The reason for Vorbis' "resurrection" imho is two 3rd party devs: primarily "Aoyumi", and secondarily QuantumKnot.


Yeah, because they were assigned projects from now to 2015 (that's an exaggeration). Besides, it's a community project anyway.  In that time I think I have seen all but two of the bounties completed: 5.1 and bitrate peeling. The reason is that those have more to do with the scope of the low-level libraries and the encoder than with other things. Our good friend John33 did a good job fixing the channel mapping though, so it's halfway there.  Anyway, ignore my rantings and continue on with the listening test discussion.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Gecko on 2006-01-17 22:22:31
I'm having some trouble with the comments I see on the results page.
On top it says: "One codec can be said to be better than another with 95% confidence if the bottom of its segment is at or above the top of the competing codec's line segment." In other words, they don't overlap. With which I agree, but right on the first plot it says: "iTunes is not as good as AoTuV", but the confidence intervals do overlap. What is correct?
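
The rule quoted from the results page can be written down as a simple check on the two line segments. The bounds below are hypothetical numbers picked only to illustrate the two cases - they are not read off the actual plots - and the function names are made up as well.

```python
def is_better(a, b):
    """a and b are (bottom, top) segment bounds for two codecs.

    Following the quoted rule: a is better than b with 95% confidence
    if the bottom of a's segment is at or above the top of b's segment.
    """
    return a[0] >= b[1]


def segments_overlap(a, b):
    """True when neither codec can be called better than the other."""
    return not is_better(a, b) and not is_better(b, a)


# Hypothetical bounds, for illustration only.
codec_x = (4.60, 4.90)
codec_y = (4.50, 4.80)
low_anchor = (2.00, 2.40)

print(is_better(codec_x, codec_y))         # segments overlap
print(segments_overlap(codec_x, codec_y))
print(is_better(codec_x, low_anchor))      # clearly separated
```

With these made-up bounds the first pair overlaps, so by the quoted rule neither codec could be called better, while both are clearly better than the low anchor.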
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-17 23:47:28
Quote
Quote
I even hear "the resurrection of Vorbis" is due to aoTuV!


Besides promoting hardware support, Xiph did almost nothing since v1.00. Neither bugfixes nor improvements. The reason for Vorbis' "resurrection" imho is two 3rd party devs: primarily "Aoyumi", and secondarily QuantumKnot.

- Lyx


We should mention Nyaochi too and his "modest tuning".
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-17 23:50:47
Quote
Quote
Perhaps this question has been asked many times, but I'd like to know what it is that makes this aoTuVb4.51 version so extremely good compared to the libvorbis 1.12 version? I even hear "the resurrection of Vorbis" is due to aoTuV!
If the aoTuVb4.51 has a score of 4,79 in this test, what average score would the libvorbis 1.12 have compared to the aoTuV and all the others!??


Aoyumi does a terrific job tweaking the Noise Normalization code and bitrate allocation scheme.  This is not just for the community, but it's also a Xiph bounty, don't forget.  In terms of streaming, a lot of people perceptually prefer Noise Normalization, which sounds more natural than SBR or PNS.  I think both the AAC and Vorbis psychoacoustic models are unique and different from a technical perspective, but what can be seen is that they are more perceptually advanced today than what we would have seen 5 years ago. These listening tests are a clear example of this. You are never going to achieve 100% transparency in any case, but a lot of samples give you a great indication of the performance of the encoder.


I haven't been following developments in aoTuV lately, but has Aoyumi improved the block switching algorithm to reduce smearing on microattacks?  If not, then I might have a look at it again if I have time.  But if it's been fixed, then I won't have to worry.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: HotshotGG on 2006-01-18 00:01:21
Quote
I haven't been following developments in aoTuV lately, but has Aoyumi improved the block switching algorithm to reduce smearing on microattacks? If not, then I might have a look at it again if I have time. But if it's been fixed, then I won't have to worry.


He mentioned once in the past that he was considering adjusting the masking threshold for long blocks, but he couldn't touch it for "other" reasons. Most of his tunings now go into Noise Normalization and bitrate allocation. As you know, he made a few significant changes in the past to the psychoacoustic model to deal with the "HF boost" issue.  He made some simple additions to the code that somehow adjust the MDCT in conjunction with the psymodel? That part I really don't understand too much, seeing that trying to figure out how the transform interacts with everything else is confusing. It is a bit of a learning experience reading through it though. The AoTuV Beta 3 tunings were also merged into the latest libvorbis, I believe.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-18 00:10:05
Quote
Quote
I haven't been following developments in aoTuV lately, but has Aoyumi improved the block switching algorithm to reduce smearing on microattacks? If not, then I might have a look at it again if I have time. But if it's been fixed, then I won't have to worry.


He mentioned once in the past that he was considering adjusting the masking threshold for long blocks, but he couldn't touch it for "other" reasons. Most of his tunings now go into Noise Normalization and bitrate allocation. As you know, he made a few significant changes in the past to the psychoacoustic model to deal with the "HF boost" issue.  He made some simple additions to the code that somehow adjust the MDCT in conjunction with the psymodel? That part I really don't understand too much, seeing that trying to figure out how the transform interacts with everything else is confusing. It is a bit of a learning experience reading through it though.


Cool, thanks.  Lately I've thought of an idea that may make block switching more accurate (hopefully it works better than my last attempt at fixing it, which only solved two problem samples  ) and hope it doesn't infringe on any patents, as Monty warned in the source.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: HotshotGG on 2006-01-18 00:19:59
Quote
Cool, thanks. Lately I've thought of an idea that may make block switching more accurate (hopefully it works better than my last attempt at fixing it, which only solved two problem samples), and I hope it doesn't infringe on any patents, as Monty warned in the source.


He was only using threshold-by-band masking for some reason (why not experiment though?). Have you experimented with wavelet filterbanks or anything of that nature? I know you mentioned 9/7 biorthogonal experiments, etc. If I had more coding experience I would try to help out, but I have only written simple data algorithms, and a lot of the structures in here are immense.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-18 00:33:01
Quote
Quote
Cool, thanks. Lately I've thought of an idea that may make block switching more accurate (hopefully it works better than my last attempt at fixing it, which only solved two problem samples), and I hope it doesn't infringe on any patents, as Monty warned in the source.


He was only using threshold-by-band masking for some reason (why not experiment though?). Have you experimented with wavelet filterbanks or anything of that nature? I know you mentioned 9/7 biorthogonal experiments, etc.


Initially I had a play with wavelets, but I then moved onto other techniques.  The 9/7 biorthogonal wavelets seemed useless for audio since they were more suited to image coding, where the signal is predominantly smooth.

Oops, we're going OT here.  Sorry.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: HotshotGG on 2006-01-18 00:40:16
Quote
Initially I had a play with wavelets, but I then moved onto other techniques. The 9/7 biorthogonal wavelets seemed useless for audio since they were more suited to image coding, where the signal is predominantly smooth.


Yeah, you don't find too many publications or research papers on wavelets with regard to audio processing. Their basis functions are more suited to image and video coding. They also look 10x smoother than DCT-based implementations; I was impressed. I am sure there are some filterbanks that would work, though.  Okedoke, back to testing.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: IgorC on 2006-01-18 00:52:21
Quote
Initially I had a play with wavelets, but I then moved onto other techniques.  The 9/7 biorthogonal wavelets seemed useless for audio since they were more suited to image coding, where the signal is predominantly smooth.

Sorry for offtopic
Even for image coding, wavelets have yet to deliver results.  Take Snow, the wavelet-based video codec, for example. It was a promising codec built on new technology.  However, its development was and still is too slow. It is hard to say why. Maybe the devs are in no hurry with it, or wavelets are not powerful enough.  Meanwhile, x264 (with its great RDO) was and still is developing very fast.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-18 01:47:44
Quote
Quote
Initially I had a play with wavelets, but I then moved onto other techniques.  The 9/7 biorthogonal wavelets seemed useless for audio since they were more suited to image coding, where the signal is predominantly smooth.

Sorry for offtopic
Even for image coding, wavelets have yet to deliver results.  Take Snow, the wavelet-based video codec, for example. It was a promising codec built on new technology.  However, its development was and still is too slow. It is hard to say why. Maybe the devs are in no hurry with it, or wavelets are not powerful enough.  Meanwhile, x264 (with its great RDO) was and still is developing very fast.


Well, there is that extra dimension in video.  Plus there is often a gap between research and actual implementation I guess.

But for image coding, wavelets are definitely the best.  They outperform the best block transform coders by a mile, plus it's been shown mathematically why they do better too (for images at least).
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: nyaochi on 2006-01-18 03:00:14
First of all, thanks to Sebastian and the participants for conducting and contributing to this listening test. I couldn't contribute to this test, but it was interesting to see that many encoders are reaching near-transparent quality at 128 kbps.

Quote
We should mention Nyaochi too and his "modest tuning".

QuantumKnot (and guruboolez too), thanks for mentioning it although it's not so much alive in aoTuV's code.  QKTune's HF noise compensation (or the similar technique) still plays an important role in aoTuV.

Quote
Cool, thanks.  Lately I've thought of an idea that may make block switching more accurate (hopefully it works better than my last attempt at fixing it, which only solved two problem samples), and I hope it doesn't infringe on any patents, as Monty warned in the source.

I'd love to listen to it.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: nyaochi on 2006-01-18 03:34:22
As the table sehested generated in post #149 shows, in 60%-80% of the evaluation trials in the test the listener could not distinguish the compressed sample from the original. However, I found a Japanese blog that extracts guruboolez's listening results from the whole result set and analyzes them:
http://anonymousriver.hp.infoseek.co.jp/#20060115-1-guruboolez
Although this blog is written in Japanese, you will easily find a score table and the results of various statistical analyses, including Friedman.
Parametric Tukey's HSD: http://anonymousriver.hp.infoseek.co.jp/128kbps_200512-200601/guruboolez_PT.txt
Blocked ANOVA / Fisher's LSD: http://anonymousriver.hp.infoseek.co.jp/128kbps_200512-200601/guruboolez_BA.txt
Non-parametric Tukey's HSD: http://anonymousriver.hp.infoseek.co.jp/128kbps_200512-200601/guruboolez_NT.txt
Friedman / Nonparametric Fisher's LSD: http://anonymousriver.hp.infoseek.co.jp/128kbps_200512-200601/guruboolez_F.txt

guruboolez rarely rated the compressed samples 5.0. His scoring also differs from the overall result, especially for LAME, Nero, and WMA Pro. To avoid giving a false impression: I don't mean to disrespect the overall result of this test. But it was interesting to see how brilliantly he listens to samples.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Shade[ST] on 2006-01-18 03:46:37
Quote
As the table sehested generated in post #149 shows, in 60%-80% of the evaluation trials in the test the listener could not distinguish the compressed sample from the original. However, I found a Japanese blog that extracts guruboolez's listening results from the whole result set and analyzes them [...]
guruboolez rarely rated the compressed samples 5.0. His scoring also differs from the overall result, especially for LAME, Nero, and WMA Pro. To avoid giving a false impression: I don't mean to disrespect the overall result of this test. But it was interesting to see how brilliantly he listens to samples.


I always knew Francis was gifted ;-)

Continue comme ça!
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: QuantumKnot on 2006-01-18 04:57:54
Quote
I'd love to listen to it.


That's assuming I will find time to experiment with it.  If you are interested, we could discuss the idea I had.  I'll send you an e-mail when the time comes.

Oh I better stay on topic....hmm....looks like someone beat me to compiling guru's results.  I was also interested in his results.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Sebastian Mares on 2006-01-18 06:15:16
Quote
I'm having some trouble with the comments I see on the results page.
On top it says: "One codec can be said to be better than another with 95% confidence if the bottom of its segment is at or above the top of the competing codec's line segment." In other words, they don't overlap. With which I agree, but right on the first plot it says: "iTunes is not as good as AoTuV", but the confidence intervals do overlap. What is correct?


Well, what I wanted to say is that they are tied, but the difference between iTunes and AoTuV is bigger than the difference between AoTuV and WMA Professional for example.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: ckjnigel on 2006-01-18 08:40:35
[As a help in following discussions about Ogg Vorbis throughout HydrogenAudio, I'd like to know which of the posters read Japanese and communicate fairly regularly with aoyumi.  I'm guessing that QuantumKnot and nyaochi do ... ]

I think this is a great test (special kudos to Sebastian!), but the moral seems to be that quality is high and differences between the best codecs are now quite small at these rates.  At least, that's so for short music samples ...
What I still wonder about is whether there is an additional quality standard of exhaustion in prolonged listening, and whether this type of test necessarily relates to that.  We know that extended listening to music with distortion, especially intermodulation, will somehow just wear down and exhaust the listener. The way the listener copes is to stop listening, not necessarily even aware that the distortion frayed his nerves. (Some will recall the puzzlement about early CDs which were claimed to have imperceptible distortion, but nonetheless sounded appalling!)
The endurance test I'm thinking of might require subjects to wear the best, most comfortable headphones listening to music encoded with these same codecs until the last subject yanks off the headphones screaming for mercy.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Aoyumi on 2006-01-18 12:39:06
Thank you to Sebastian, to everyone involved, and to the participants in the test.
I am especially interested in the individual results.

Quote
I haven't been following developments in aoTuV lately, but has Aoyumi improved the block switching algorithm to reduce smearing on microattacks? If not, then I might have a look at it again if I have time. But if it's been fixed, then I won't have to worry.
I have not changed block switching algorithm. I am looking forward to your research. 

Quote
He made a few significant changes in the past as you know to psychoacoustics model to deal with the "HF boost" issue.
This portion will change again in the next version.

Quote
The AoTuV Beta 3 tunings were also merged into latest Libvorbis I believe
The aoTuV beta3 is not merged into libvorbis. Please check the source code.  The official libvorbis has not included any change that affects encoding quality since 1.1.


EDIT: TYPO&Addition
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: sehested on 2006-01-18 15:48:55
What if...

Most samples were rated 5.0 and some testers even ranked the references. A few of the more experienced testers were able to avoid using 5.0 at all.

The criterion for valid results is currently that no references are ranked. However, this causes some testers to be very conservative and to rate 5.0 when in doubt.

What if the ranked references were used?
What if only results that had no 5.0 rankings were used?
How would the overall results then look?

In the second line in the table below I have converted references ranked 4.0 and above to 5.0 ratings and included these "invalid" results.

In the third line I have reduced the number of test results to only include results without 5.0 rankings.

Code: [Select]
                   iTunes    LAME     Nero    Shine    AuTuV   WMA pro
Official result    4.74     4.60     4.68     2.35     4.79     4.70     (402)
Ranked references  4.74     4.60     4.70     2.38     4.78     4.72     (464)
No 5.0 ratings     3.90     3.74     3.57     1.51     3.91     3.69      (54)


ANOVA analysis for "No 5.0 ratings":
Code: [Select]
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 54
Critical significance:  0.05
Significance of data: 0.00E+00 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total              323         368.45
Testers (blocks)    53          57.63
Codecs eval'd        5         233.42   46.68   159.82  0.00E+00
Error              265          77.40    0.29
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.205

Means:

AuTuV    iTunes   LAME     WMA-pro  Nero     Shine    
 3.91     3.90     3.74     3.69     3.57     1.51  

---------------------------- p-value Matrix ---------------------------

        iTunes   LAME     WMA-pro  Nero     Shine    
AuTuV    0.943    0.099    0.042*   0.001*   0.000*  
iTunes            0.114    0.049*   0.002*   0.000*  
LAME                       0.696    0.110    0.000*  
WMA-pro                             0.227    0.000*  
Nero                                         0.000*  
-----------------------------------------------------------------------

AuTuV is better than WMA-pro, Nero, Shine
iTunes is better than WMA-pro, Nero, Shine
LAME is better than Shine
WMA-pro is better than Shine
Nero is better than Shine


Edit: Added ANOVA analysis
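
For anyone who wants to check the arithmetic in the ANOVA output above, here is a minimal Python sketch. It recomputes the mean squares, the F statistic and Fisher's LSD from the posted sums of squares, then lists the pairs whose mean difference exceeds the LSD. The critical t value is hard-coded as an approximation (an assumption, not taken from the program output), and the means are the rounded values from the table, so a borderline pair could flip with unrounded data.

```python
import math

# Values copied from the ANOVA table above (54 listeners, 6 codecs).
ss_codecs, df_codecs = 233.42, 5
ss_error, df_error = 77.40, 265
n_listeners = 54

ms_codecs = ss_codecs / df_codecs   # mean square for codecs
ms_error = ss_error / df_error      # mean square error
f_stat = ms_codecs / ms_error

# Assumption: two-sided 5% critical t value for 265 degrees of freedom,
# hard-coded as roughly 1.969 instead of being computed.
T_CRIT = 1.969
lsd = T_CRIT * math.sqrt(2 * ms_error / n_listeners)

# Rounded means from the output, in descending order (labels as posted).
means = {"AuTuV": 3.91, "iTunes": 3.90, "LAME": 3.74,
         "WMA-pro": 3.69, "Nero": 3.57, "Shine": 1.51}

names = list(means)
better = [(a, b)
          for i, a in enumerate(names)
          for b in names[i + 1:]
          if means[a] - means[b] > lsd]

print(f"F = {f_stat:.2f}, LSD = {lsd:.3f}")
for a, b in better:
    print(f"{a} is better than {b}")
```

With the rounded inputs this should reproduce the nine "is better than" pairs listed in the output, and an LSD that rounds to the posted 0.205.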
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: pepoluan on 2006-01-19 16:42:26
Okay links to this thread and the complete results have been placed in the wiki page, which you can see here (http://wiki.hydrogenaudio.org/index.php?title=Listening_Tests).

Fill the page, guys! (and gals!)
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: guruboolez on 2006-01-23 13:52:26
Quote
I haven't been following developments in aoTuV lately, but has Aoyumi improved the block switching algorithm to reduce smearing on microattacks?  If not, then I might have a look at it again if I have time.  But if it's been fixed, then I won't have to worry.

There's still headroom for progress, I'd say. During my last listening evaluation (http://www.hydrogenaudio.org/forums/index.php?showtopic=38792), Vorbis was still unsharp and noisy on micro-attack/short-impulse samples. Most often the bitrate doesn't go really high on such samples (Vorbis tends to inflate the bitrate when needed - but not here).
Try with this sample (http://gurusamples.free.fr/samples/A01_etching.wv) for a good start
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: kurtnoise on 2006-02-16 07:38:47
Some statistical analysis (http://forum.doom9.org/showthread.php?t=107393) by AMTuring about this listening test...


@Guru:: very nice your avatar btw...
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Garf on 2006-02-16 09:04:32
Quote
Some statistical analysis (http://forum.doom9.org/showthread.php?t=107393) by AMTuring about this listening test...

Oh please, can we keep the crackpot science out of this forum? This person wasn't banned here for no reason. Just juggling around scientific words doesn't magically make anything you say sensible, let alone correct.

Quote
It is well known that some songs are more difficult to encode than others, and they result in lower quality encoded files regardless of the encoder used. So the assumption of equal means amongst experiments is violated.

Bzzzt. This was a VBR test. Meaning, although the average was 128kbps (or slightly more), the codecs could spend as many bits as necessary to keep all clips at a constant quality. This means you cannot immediately assume the means aren't equal; in fact it should be the opposite.

So, what happens if we actually look at the data? (Note that he provides many graphs to 'illustrate' his points, except for the ones where, well, the data doesn't support his claims.) The variance of the means of the samples is much less than the difference between the codecs themselves (excluding Shine, which is CBR).

In other words, VBR works. I would have thought that that was "well known" by now.

Quote
The following table shows the Tukey HSD applied to the ranks.

Say what? You cannot apply plain Tukey HSD to rank scores, it's a parametric test. Now, I'm willing to argue that we shouldn't use parametric analysis (because the top end of the results clips at 5.0, and you can see this by observing that the lower rated the codec, the higher the variance). However, if anything parametric analysis gives stronger results. If you use rank scores, let's actually use the rank score version of Tukey HSD to analyze the results:

Code: [Select]
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Nonparametric Tukey HSD analysis

Number of listeners: 18
Critical significance:  0.05
Nonparametric Tukey's HSD:  25.894

Ranksums:

Vorbis  iTunes  WMA      Nero    LAME   
 73.00    62.50    49.50    48.50    36.50 

-------------------------- Difference Matrix --------------------------

        iTunes  WMA      Nero    LAME   
Vorbis    10.500  23.500  24.500  36.500*
iTunes            13.000  14.000  26.000*
WMA                          1.000  13.000 
Nero                                12.000 
-----------------------------------------------------------------------

Vorbis is better than LAME
iTunes is better than LAME

Gee, where did those "extra" conclusions go?
Let's compare this to the means with parametric Tukey HSD:

Code: [Select]
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 18
Critical significance:  0.05
Tukey's HSD:  0.110

Means:

Vorbis  iTunes  WMA      Nero    LAME   
  4.79    4.74    4.70    4.68    4.60 

-------------------------- Difference Matrix --------------------------

        iTunes  WMA      Nero    LAME   
Vorbis    0.049    0.090    0.106    0.193*
iTunes              0.041    0.056    0.143*
WMA                          0.016    0.103 
Nero                                  0.087 
-----------------------------------------------------------------------

Vorbis is better than LAME
iTunes is better than LAME

Coincidence? Hardly. If you derive the rank scores from the means, how can you expect a different conclusion? How would you expect throwing away information to increase the significance? It won't, unless you use a completely wrong analysis method.
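
The rank-sum comparison in the nonparametric output above is easy to re-check by hand. A minimal Python sketch, using only the rank sums and the critical difference as posted: it first confirms the rank sums are consistent with 18 blocks each ranking 5 codecs, then flags every pair whose rank-sum difference exceeds the critical value. It does not recompute the critical value itself.

```python
# Rank sums and critical difference copied from the output above.
ranksums = {"Vorbis": 73.0, "iTunes": 62.5, "WMA": 49.5,
            "Nero": 48.5, "LAME": 36.5}
CRITICAL_DIFF = 25.894
N_BLOCKS = 18  # labelled "listeners" in the program output

# Each block hands out ranks 1..5 once, so the rank sums must add up
# to N_BLOCKS * (1 + 2 + 3 + 4 + 5).
expected_total = N_BLOCKS * sum(range(1, len(ranksums) + 1))
total = sum(ranksums.values())

names = list(ranksums)  # already in descending order
significant = [(a, b)
               for i, a in enumerate(names)
               for b in names[i + 1:]
               if ranksums[a] - ranksums[b] > CRITICAL_DIFF]

print(f"rank-sum total {total} (expected {expected_total})")
for a, b in significant:
    print(f"{a} is better than {b}")
```

Only two differences clear the threshold, which matches the two conclusions printed by the program.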
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: bug80 on 2006-02-16 10:00:41
Maybe it is a good idea to post your reply on Doom9 also, Garf.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: ff123 on 2006-02-16 13:56:02
I posted a reply on doom9, suggesting he use bootstrap resampling if he's hard up to analyze the data some more.

ff123
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: detokaal on 2006-02-16 16:45:55
Quote
Quote
Some statistical analysis (http://forum.doom9.org/showthread.php?t=107393) by AMTuring about this listening test...

Bzzzt.


I like this 
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: torok on 2006-03-22 18:35:07
Did anyone else notice that when you normalize for bitrate, you get this:

iTunes = 4.83
AoTuV = 4.55
lame = 4.5
WMA = 4.84

Which looks like it makes iTunes and WMA tied for first and the rest in second. Now, I'm not sure if it's reasonable to normalize like that (it's assuming that the quality of a codec scales directly 1:1 with bitrate), but it's an interesting thought.
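
The post does not say how the normalization was done. A minimal Python sketch, assuming it was a plain linear rescale to 128 kbps (score * 128 / average bitrate): under that assumption, the official means posted earlier in the thread and the normalized scores above imply an average bitrate for each codec. These implied bitrates are back-calculated, not measured, and the function names are made up for illustration.

```python
TARGET_KBPS = 128.0

# Official mean scores posted earlier in the thread.
raw = {"iTunes": 4.74, "AoTuV": 4.79, "LAME": 4.60, "WMA": 4.70}
# Normalized scores from the post above.
normalized = {"iTunes": 4.83, "AoTuV": 4.55, "LAME": 4.5, "WMA": 4.84}


def normalize(score: float, kbps: float, target: float = TARGET_KBPS) -> float:
    """Assumed linear normalization: scale the score by target/actual bitrate."""
    return score * target / kbps


def implied_bitrate(raw_score: float, norm_score: float,
                    target: float = TARGET_KBPS) -> float:
    """Invert normalize() to get the bitrate the two scores would imply."""
    return target * raw_score / norm_score


implied = {codec: implied_bitrate(raw[codec], normalized[codec])
           for codec in raw}

for codec, kbps in implied.items():
    print(f"{codec}: implied average of about {kbps:.1f} kbps")
```

The linear model is the weak point: a codec that is already near 5.0 cannot gain much from extra bits, so scores and bitrate are unlikely to scale 1:1 in practice.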
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Busemann on 2006-03-22 18:37:37
Quote
Did anyone else notice that when you normalize for bitrate, you get this:

iTunes = 4.83
AoTuV = 4.55
lame = 4.5
WMA = 4.84

Which looks like it makes iTunes and WMA tied for first and the rest in second. Now, I'm not sure if it's reasonable to normalize like that (it's assuming that the quality of a codec scales directly 1:1 with bitrate), but it's an interesting thought.


The only fair way to test this would be to use CBR.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: Mardel on 2011-09-24 22:51:45
Hi!

I looked at the 128 kbps listening test on your site, and I can't accept that aoTuV Vorbis is worse than AAC.
Then I realized that the Vorbis or AAC q settings are bad. First, Vorbis uses a comma and not a dot as the decimal separator.
Second, the Ogg file is approx. 0,5 - 0,9 MB smaller than the AAC file.
Then I looked at the average bitrates in foobar2000: the Ogg's bitrate is 20 kbit lower than the AAC's.

e.g.: 158 kbps vs 138 kbps

I encoded the same track and another track with the Vorbis -q 4,99 setting, and the bitrate difference was 1-2 kbit and the file size difference was 0-0,1 MB.

Overall, your tests don't show which codec is better if the codec settings are wrong.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: [JAZ] on 2011-09-24 23:36:22
The vorbis commandline uses comma or dot depending on the localization of the OS where it is running.
As you've found, if the setting being used was wrong, the bitrate would have been much different.
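
For scripts that pass a quality value to an encoder, one way to sidestep the separator question is to normalize the number before use. A minimal Python sketch of a hypothetical helper (parse_quality is made up for illustration and is not part of any Vorbis tool): it accepts either a comma or a dot as the decimal separator.

```python
def parse_quality(text: str) -> float:
    """Hypothetical helper: read a quality value written as '4,99' or '4.99'."""
    # Treat a comma as a decimal separator, then parse as a normal float.
    return float(text.strip().replace(",", "."))


print(parse_quality("4,99"))
print(parse_quality("4.99"))
```

This only illustrates the idea that both spellings denote the same setting; it says nothing about how the encoder itself parses its command line.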
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: db1989 on 2011-09-25 11:30:55
Second, the Ogg file is approx. 0,5 - 0,9 MB smaller than the AAC file.
Then I looked at the average bitrates in foobar2000: the Ogg's bitrate is 20 kbit lower than the AAC's.

e.g.: 158 kbps vs 138 kbps

Overall, your tests don't show which codec is better if the codec settings are wrong.
Given that many modern codecs perform better, or only, in VBR or ABR modes, listening tests reporting a bitrate such as this one are generally based on the principle of tuning the codec settings to obtain that bitrate, or as close as possible to it, as the mean bitrate over the set of audio files being tested. Thus, variation between different encodes of one file is to be expected and does not invalidate the methodology (inasmuch as there doesn’t seem to be another solution; this is as close to fair as is possible).
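
The tuning principle described above can be sketched in a few lines of Python. Everything in the table is hypothetical: the setting names and per-file bitrates are invented for illustration and are not taken from any real encoder or from this test. The point is only that the setting is chosen by its mean bitrate over the whole sample set, while individual files may land well above or below the target.

```python
from statistics import mean

TARGET_KBPS = 128

# Hypothetical per-file bitrates (kbps) for three made-up quality settings.
bitrates_by_setting = {
    "setting_a": [101, 118, 122, 109],
    "setting_b": [112, 131, 140, 127],
    "setting_c": [125, 146, 158, 139],
}


def pick_setting(table: dict, target: float) -> str:
    """Return the setting whose mean bitrate over all files is closest to target."""
    return min(table, key=lambda s: abs(mean(table[s]) - target))


best = pick_setting(bitrates_by_setting, TARGET_KBPS)
files = bitrates_by_setting[best]
print(f"{best}: mean {mean(files)} kbps, per-file range {min(files)}-{max(files)} kbps")
```

In this made-up example the chosen setting averages close to the target even though single files differ from it by more than 10 kbps, which is the kind of per-file spread the original complaint was about.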
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: C.R.Helmrich on 2011-09-25 13:06:24
I looked at the 128 kbps listening test on your site, and I can't accept that aoTuV Vorbis is worse than AAC.

Sorry, but which test are we talking about? The test discussed in this thread is MP3 only.

Edit: Thanks lvqcl! Must be http://soundexpert.org/encoders-128-kbps (http://soundexpert.org/encoders-128-kbps) then, which is from mid 2006.

Chris
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: lvqcl on 2011-09-25 13:13:07
Sorry, but which test are we talking about? The test discussed in this thread is MP3 only.


IIRC that post was moved from http://www.hydrogenaudio.org/forums/index.php?showtopic=77708 to this thread.
Title: Multiformat Listening Test @ 128 kbps - FINISHED
Post by: db1989 on 2011-09-25 13:31:33
Thanks for pointing out my silly mistake; the irony is not lost on me…  That’s what I get for barely reading! Re(re)located to the correct thread.