Topic: Pre-Test thread (Read 42094 times)
  • tigre
  • [*][*][*][*][*]
Pre-Test thread
Reply #50
Quote
Quote
*I think that Guru's idea of 2 sets is interesting but unfortunately probably bad in our case. I am afraid that only experienced listeners would pick the second group. As we know that those listeners are using lower ranking (as demonstrated in the 128kbps test), ranking of both groups would probably not be comparable.


Good point.

A solution could be to "normalize" the rankings of each listener by calculating his/her overall average ranking for the 1st (smaller) group of samples. Then all rankings (including those for the 2nd (extended) group of samples) should be multiplied by a factor chosen so that the 1st-group average becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)
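To make the normalization idea concrete, here is a minimal sketch of how it might be computed for one listener. The function name, the dictionary layout and the sample names are assumptions made for illustration only; nothing here comes from ff123's tools.

```python
from statistics import mean


def normalize_listener(core, extended, target=3.0):
    """Scale one listener's ratings so the core-group average equals `target`.

    `core` and `extended` are hypothetical dicts mapping a sample name to
    that listener's rating. The same factor is applied to both groups.
    """
    core_mean = mean(core.values())
    if core_mean <= 0:
        raise ValueError("core-group average must be positive")
    factor = target / core_mean
    scaled_core = {name: score * factor for name, score in core.items()}
    scaled_extended = {name: score * factor for name, score in extended.items()}
    return scaled_core, scaled_extended, factor


# A harsh listener whose core-group average is 1.5 gets a factor of 2.
core = {"sample_a": 1.0, "sample_b": 2.0}
extended = {"sample_c": 1.5}
scaled_core, scaled_extended, factor = normalize_listener(core, extended)
print(factor, scaled_core, scaled_extended)
```

One thing the sketch makes visible: a plain multiplication can push a generous listener's scores below, or a harsh listener's scores above, the ends of the original rating scale, so the scaled numbers are no longer directly comparable to raw ones.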
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

  • phong
  • [*][*][*][*]
Pre-Test thread
Reply #51
Another problem with using lame as an anchor is that it might not serve its function.  A lower anchor should rank last for each sample.  Disregarding the blade anchor, lame did not do that in the 128kbps test.  If a lower anchor is to be used (and I tend to think that would be a good idea, but I am not a statistics expert), I would be in favor of using blade again.  Unlike a simple lowpass, it produces a spectrum of different kinds of artifacts, which I think is a more realistic baseline to work from.
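The "rank last for each sample" criterion can be checked mechanically once per-sample scores exist. The following is only a sketch: the nested-dict layout, the codec labels and the numbers are made up for illustration and are not results from any test.

```python
def samples_where_anchor_not_last(scores, anchor):
    """Return the samples on which `anchor` is not strictly the lowest-rated codec.

    `scores` is a hypothetical dict: sample name -> {codec name -> mean score}.
    """
    failures = []
    for sample, by_codec in scores.items():
        anchor_score = by_codec[anchor]
        others = [score for codec, score in by_codec.items() if codec != anchor]
        # A tie counts as a failure: the anchor must be strictly worst.
        if any(score <= anchor_score for score in others):
            failures.append(sample)
    return failures


# Illustrative numbers only.
scores = {
    "sample_a": {"low_anchor": 1.2, "codec_x": 3.1, "codec_y": 2.8},
    "sample_b": {"low_anchor": 2.5, "codec_x": 2.0, "codec_y": 3.4},
}
print(samples_where_anchor_not_last(scores, "low_anchor"))
```

An empty list would mean the candidate anchor did its job on every sample.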

Also, since there haven't been any responses about my Linux ABC/HR clone, I assume there's no interest.  Anyone who's interested can let me know, otherwise I'll probably not devote as much time to it as I would otherwise.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

  • elmar3rd
  • [*][*][*]
Pre-Test thread
Reply #52
In statistics, an anchor can be a middle value, not the highest and not the lowest.
I think that in a listening test an anchor acts as a weighting for each listener's results, to make them more comparable. It prevents the situation where some listeners only use the 4-5 range while others use the full range.
It is also useful to check that a participant submits serious results.

Therefore, Blade at 64 kbps would do it.
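Both uses of an anchor mentioned above (seeing how much of the scale a listener uses, and checking that a result looks serious) could be sketched roughly as follows. The function name, the codec labels and the ratings are assumptions for illustration; any actual cut-off for rejecting a result would have to be decided separately.

```python
def screen_listener(ratings, high_anchor, low_anchor):
    """Return (anchors_in_order, range_used) for one listener on one sample.

    `ratings` is a hypothetical dict mapping a codec name to the listener's score.
    """
    # A serious result should rate the low anchor below the high anchor.
    anchors_in_order = ratings[low_anchor] < ratings[high_anchor]
    # Width of the part of the scale this listener actually used.
    range_used = max(ratings.values()) - min(ratings.values())
    return anchors_in_order, range_used


# Illustrative ratings only.
ratings = {"high_anchor": 4.5, "low_anchor": 1.5, "codec_x": 3.0}
print(screen_listener(ratings, "high_anchor", "low_anchor"))
```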

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #53
I think we're making a mess out of this test...


OK, people, please give me suggestions. If you want a lower anchor, and lame 128 shouldn't be an anchor, which would be the codecs featured, in your opinion, including anchors?

Remember we're limited to 8 codecs.
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #54
Quote
A solution could be to "normalize" the rankings of each listener by calculating his/her overall average ranking for the 1st (smaller) group of samples. Then all rankings (including those for the 2nd (extended) group of samples) should be multiplied by a factor chosen so that the 1st-group average becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)

My fear is that such a messy way of calculation would open lots of possibilities for critics and the like to flame my test.

Not to mention that calculating the resulting scores would be nightmarish. It would be very hard, and human errors might creep in: I would no longer be using only ff123's tools for the calculation, so I would have to do several calculations myself.
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #55
Quote
EDIT: And what about PNS for Nero? In a quick test I heard that it is a useful option at low bitrates.

Ok. Just talked with Ivan, he said PNS is automagically disabled when you use HE AAC. Probably because it actually decreases quality, I guess.
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • tigre
  • [*][*][*][*][*]
Pre-Test thread
Reply #56
Quote
Well, as already explained somewhere, the Anchor isn't there only to protect rankings, but also to put things into perspective across the entire sample suite.


Probably it does more good than bad to define fixed rankings for anchors but to put things into perspective it would be good IMO to suggest at least a range for the ranking of the anchors.

Taking this into account the codecs tested should be

higher anchor:
1. lame --preset 128; suggested ranking around "4" (around could mean e.g. +/-1)

lower anchor:
2. lame --preset 64; suggested ranking around "1" - "2" (I don't know how realistic this suggested ranking is as I haven't tested --preset 64 much so far.)
OR:
2. something transcoded, e.g. WMA9@64kbps -> MP3Pro@64kbps (could have some educational value)

3.Ahead HE-AAC

4.Ogg Vorbis

5.MP3pro

6.WMAV9

7.Real Audio Cook

8.AAC? ATRAC3Plus? WMA8? ...? I'd say AAC because of hardware support.
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

  • phong
  • [*][*][*][*]
Pre-Test thread
Reply #57
Ok, these are my personal feelings:

I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

I don't care if there is or is not a lower anchor.  I guess I don't fully understand the significance of multiple anchors, or their importance.  I'm also perfectly happy doing 8 codecs.  With the 128 test, that would have been exhausting.  No codec is that close to transparent at 64k so it will be much easier and less fatiguing to do more samples.  If there is going to be a lower anchor, I would prefer blade.  I think lame at 64k would be surprisingly competitive.

As far as WMA vs. WMA pro, you're screwed any way you do it.  If you use WMA only, people will complain that you didn't include the best version.  If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware.  Including both seems like a waste.

Sony's claiming that Atrac3 sounds really good and are marketing the crap out of it, which is a claim that should be tested.  However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.  I'd say the same is true of Real Audio, but it's actually got some popularity somehow.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #58
Quote
I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

Right, so let's try to make things easier: This is definitely in:

-Lame --ap 128
-Ahead HE AAC Streaming :: Medium
-Vorbis -q 0 (or -q 0.2?)
-Adobe Audition MP3pro quality 40

This is up for discussion:

-Real Audio Cook/Gecko 64kbps
-WMA (std or pro? CBR or VBR?)
-Bottom anchor (lowpass? blade? lame?)

This is probably out, but can also be discussed:

-Atrac3plus
-QuickTime AAC LC

I think a good compromise between those who want to test as much as possible and participants who don't want to spend too much time taking the test is to take what's definitely in plus what's up for discussion, and leave out LC AAC and Atrac3+.

That would also make the statistical calculation of the results much easier and less prone to criticism. There would be no more odd packages with a different sample in each of them, and no special packages for those who want to test more.

Quote
As far as WMA vs. WMA pro, you're screwed any way you do it. If you use WMA only, people will complain that you didn't include the best version. If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware. Including both seems like a waste.


True. I think another point is that WMA std was already tested in ff123's test. Yes, it was v8, but I'm not very confident that v9 has improved much. Does anyone care to try? If it's nearly the same as v8, I might as well go with Pro.

Quote
Sony's claiming that Atrac3 sounds really good and are marketing the crap out of it, which is a claim that should be tested. However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.


Haha. Right.

I created a special VirtualPC Win98 partition to install SonicStage, in case someone really wants it. But I'm inclined to let it alone.

Regards;

Roberto.
  • Last Edit: 25 August, 2003, 03:49:43 PM by rjamorim
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • Gabriel
  • [*][*][*][*][*]
  • Developer
Pre-Test thread
Reply #59
Real: I think that it is not needed
wma: I think that v9 std should be included, as it is marketed for portable devices
atrac3plus: do not think that we need it
aac-lc: I think that it should be included, as it can be decoded by portable devices.

Even if some codecs (wma and aac) were already tested in previous tests, I think that they should be included.

If a higher anchor is included (mp3-128), I think that a lower anchor should be included. Otherwise, there is a risk that most of the rankings would be in the bottom range. With a lower anchor, the interesting competitors will probably be more balanced.

As a lower anchor, I think that lame 64 would be good: mp3 is still used for streaming, and it is something that is really used (Blade64 and lowpass are probably not used that much...)
If the exact version number and parameters are mentioned, this lower anchor is still reproducible.

So my choice would be:
lame 128 (higher anchor)
lame 64 (lower anchor)
vorbis
mp3pro
he-aac
aac
wma

  • Digga
  • [*][*][*][*][*]
Pre-Test thread
Reply #60
Quote
If a higher anchor is included (mp3-128), I think that a lower anchor should be included. Otherwise, there is a risk that most of the rankings would be in the bottom range.

That thought sounds wise to my ears. IMO, either put one lower and one higher anchor in, or make explicitly clear that the results may be a little down the ladder.

I would also like to see wma included, as it is one of the two main advertised formats in portable music (...so MS would say it's as good at 32kbps as mp3 at 64...). Let's see if they're right :-)
Nothing but a Heartache - Since I found my Baby ;)

  • Digga
  • [*][*][*][*][*]
Pre-Test thread
Reply #61
uups, double post...
  • Last Edit: 26 August, 2003, 04:52:41 AM by Digga
Nothing but a Heartache - Since I found my Baby ;)

Pre-Test thread
Reply #62
Roberto, given the MS claims about WMA9 (not WMA8), and its current semi-support (most existing hardware devices support the capabilities of the WMA8 bit-stream, which is more constrained), and increasing industry support, it has to be included. 

Also, the MS claims about quality are VBR/ABR based, not CBR, and most of the other codec configurations are VBR/ABR configurations.  I suggest ABR (VBR 2-pass) with the standard codec (the pro codec will only realize 64k VBR through quality based configuration which would be more difficult to constrain than ABR).

By the way, thanks for doing all this, Roberto.  Whilst many folk will criticise, few will bother doing anything worth criticising.

Doug
  • Last Edit: 26 August, 2003, 06:20:48 AM by LadFromDownUnder

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #63
OK, so let's try this:

-HE AAC
-Vorbis
-MP3pro
-WMA Std
-AAC-LC
-Real Audio
-Lame 128 as high anchor
-Lame/Blade 64 as bottom anchor

Any criticism?
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • Digga
  • [*][*][*][*][*]
Pre-Test thread
Reply #64
That combination looks really good. This way

- you can look at the differences between he-aac and lc-aac at the given bitrate
- compare mp3 and wma in detail
- see how vorbis holds up against all of them and how it is possibly beaten by aac...
- and in the end get an impression of whether mp3(pro!!) really still is the medium of choice in a low bitrate scenario

The anchors are well chosen and well set, as almost everybody knows how mp3 'should' sound.

I'll say, let's do it that way. Though the codec combination looks good to me, I really have no big clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).
Nothing but a Heartache - Since I found my Baby ;)

  • elmar3rd
  • [*][*][*]
Pre-Test thread
Reply #65
I agree.

Of course, 8 codecs will take some time (time to listen, time to relax the ears during the test).
But if someone's time is short, it's better for that person to reduce the number of tested samples than to reduce the number of codecs for the whole test.

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #66
Quote
I really have no big clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).

What treatments? Preprocessing?

Preprocessing IST VERBOTTEN!

Quote
But if someone's time is short, it's better for that person to reduce the number of tested samples than to reduce the number of codecs for the whole test.


True. Besides, the test will last for 11 days and two whole weekends. I reckon people will need less time to listen to this test's 8 samples than the 128kbps test's 6 samples.

Anyway, I'm still looking for criticism before I make that the official sample suite.

(Also, I'll wait some time for a reply I'm expecting from a codec developer)

Thanks for all the suggestions and criticism.

Best regards;

Roberto.
  • Last Edit: 26 August, 2003, 02:57:41 PM by rjamorim
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #67
BTW, some more questions that need to be answered:

-Which AAC LC codec will we use, Apple (ABR 64kbps) or Ahead (VBR, Radio/Tape (I don't remember))?
-Blade or Lame at 64kbps for the bottom anchor? Or maybe even FhG?

IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (bad?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade have Intensity Stereo coding.

Regards;

Roberto.
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • music_man_mpc
  • [*][*][*][*][*]
  • Members (Donating)
Pre-Test thread
Reply #68
Quote
IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (bad?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade have Intensity Stereo coding.

I'm interested too.  I say go with FhG for the bottom anchor.
gentoo ~amd64 + layman | ncmpcpp/mpd | wavpack + vorbis + lame

  • phong
  • [*][*][*][*]
Pre-Test thread
Reply #69
The list looks good from where I'm standing.  I'll also agree that featuring 64k mp3 at its best is going to provide the most interesting results.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

  • n68
  • [*][*][*][*][*]
  • Banned
Pre-Test thread
Reply #70
Ciao...

read @ the portal.
sounds familiar.: *general phong*

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #71
Quote
read @ the portal.
sounds familiar.: *general phong*

Too late. You replied, and now it's gone. 
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • phong
  • [*][*][*][*]
Pre-Test thread
Reply #72
Huh?
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

  • brett
  • [*]
Pre-Test thread
Reply #73
rj -- as for which aac to include, i would hate to see you choose ahead just because it's the latest to throw its hat in the ring, to the exclusion of qt -- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2nd with cbr. on that basis i would assume it merits inclusion without comment. however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.

brett.

  • rjamorim
  • [*][*][*][*][*]
Pre-Test thread
Reply #74
Quote
rj -- as for which aac to include, i would hate to see you choose ahead just because its the latest to throw its hat in the ring to the exclusion of qt

Actually, I'm leaning more towards QT. First, because Ahead is already featuring a codec in this test, and second because QT fared so well in the AAC test.

Quote
-- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2d with cbr. on that basis i would assume it merits inclusion without comment.


I agree

Quote
however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.


I don't take listening tests. But if someone wants to test Ahead 64 vs. QT 64, the results would be very welcome.

Regards;

Roberto.
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org