Multiformat @ 128kbps - test discussion

Reply #50 – 2004-04-02 18:41:48

Quote

I think that we should lower the bitrate of MPC compared to the last test! In the last test MPC got a bitrate of 146.1 and that's a big difference to 128 I think. I think that the average bitrate of all test samples should be between 125 and 130!

I assume the tests are done on problem samples. So if the bitrate is inflated for these samples it shouldn't matter, as long as the majority of music is around the correct bitrate for the setting.

Reply #51 – 2004-04-02 18:42:23

Quote

Quote

Quote
Edit: The text says 'big tie for first place', but I assume that was written before the error margins were corrected.

Aree!?! When was this update done? And how could there be such a big change? Was the analysis that messed up before?

http://www.hydrogenaudio.org/forums/index....ndpost&p=190675

Edit, more here:
http://www.hydrogenaudio.org/forums/index....ndpost&p=190827

First I've heard about this.

That many mistakes and corrections should have been posted as a separate thread, rather than being buried in a couple of long threads.

I can understand making the mistake. That happens. I'm just surprised it wasn't a thread all by itself so everybody would hear about it.

Reply #52 – 2004-04-02 18:55:06

The "how to choose quality settings for VBR codecs" topic has been discussed countless times in almost every thread related to rjamorim's tests (pretest, announcement, results threads).

The posts by Big_Berny and FatBoyFin can be regarded as summaries of the 2 standpoints on this question.

After all these discussions, rjamorim's decision on how to choose settings for VBR codecs certainly hasn't been made by flipping a coin. So to anyone who wants to add something related to this: please read the old threads and make sure that you have something new to say before posting.

Reply #53 – 2004-04-03 05:12:02

As long as the same criteria for choosing quality settings are applied consistently to all VBR codecs, then that should be ok, I think.

Reply #54 – 2004-04-03 12:09:12

Quote

As long as the same criteria for choosing quality settings are applied consistently to all VBR codecs, then that should be ok, I think.

The problem is that not all codecs are VBR! For example, the iTunes AAC codec is CBR, AFAIK.

Big_Berny

Reply #55 – 2004-04-03 13:53:28

That's why the average quality ratings in the test results must be represented along with average efficiency to accommodate variances in average bitrate. That way all factors are accounted for to make the results more meaningful. For instance, it won't be so surprising when a format gets the highest quality rating if it's also known that it averaged a 14% higher bitrate than the test target. (One of many factors, of course, that determine sound quality in an encoding format.)
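To make the bitrate factor concrete, here is a minimal sketch of reporting a codec's average bitrate as a deviation from the 128 kbps target. The function name is an illustrative assumption; the only figure taken from this thread is the 146.1 kbps MPC average quoted earlier for the previous test.

```python
TARGET_KBPS = 128.0  # nominal target bitrate of the test


def bitrate_deviation_percent(average_kbps, target_kbps=TARGET_KBPS):
    """Return how far an average bitrate sits above (+) or below (-) the target, in percent."""
    return (average_kbps - target_kbps) / target_kbps * 100.0


# 146.1 kbps is the MPC average quoted earlier in the thread for the previous test.
mpc_deviation = bitrate_deviation_percent(146.1)
print(f"MPC: {mpc_deviation:+.1f}% relative to {TARGET_KBPS:.0f} kbps")
```

Listing a figure like this next to each quality rating would let readers weigh a high score against the extra bits spent to reach it.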

Reply #56 – 2004-04-03 13:59:49

If a codec's used setting averages 128kbps over a wide range of music, yet uses a higher bitrate on these samples because it understands they are tough, it is _more_ efficient not less, even though it averages a higher bitrate in the test.

Reply #57 – 2004-04-03 14:03:06

Ok, I changed my mind! I think too that it should be the average bitrate over multiple albums and not the test samples. But I think it would also be interesting to compare a song where MPC gets a bitrate of 100 and the others a higher one!

Big_Berny
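As a rough illustration of the "average over multiple albums" idea, the sketch below computes the bitrate of a whole collection from file sizes and durations, so that long tracks count for more than short clips. The track list is entirely hypothetical, and 1 kbps is taken as 1000 bits per second.

```python
def track_bitrate_kbps(size_bytes, duration_s):
    """Average bitrate of one encoded file, in kbps (1 kbps = 1000 bit/s)."""
    return size_bytes * 8 / duration_s / 1000.0


def collection_bitrate_kbps(tracks):
    """Duration-weighted average bitrate: total bits divided by total playing time."""
    total_bits = sum(size_bytes * 8 for size_bytes, _ in tracks)
    total_seconds = sum(duration_s for _, duration_s in tracks)
    return total_bits / total_seconds / 1000.0


# Hypothetical (size in bytes, duration in seconds) pairs, not real test data.
tracks = [
    (4_800_000, 300.0),  # a full song encoded at 128 kbps
    (3_000_000, 240.0),  # an easy song where the VBR codec drops to 100 kbps
    (547_875, 30.0),     # a short, hard clip encoded at 146.1 kbps
]

per_track = [track_bitrate_kbps(size, dur) for size, dur in tracks]
simple_mean = sum(per_track) / len(per_track)

print(f"simple mean of per-track bitrates: {simple_mean:.1f} kbps")
print(f"duration-weighted collection bitrate: {collection_bitrate_kbps(tracks):.1f} kbps")
```

The two figures differ because the simple mean gives the short, hard clip the same weight as a full-length song, which is one reason an average over test samples alone can overstate a VBR setting's typical bitrate.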

Reply #58 – 2004-04-03 17:43:02

Just a small note that the Vorbis listening test, which will determine the best encoder Vorbis has to offer for this multiformat test, has started.

Reply #59 – 2004-04-03 23:35:03

@guruboolez: Thanks for the tips on using SonicStage

About featuring more than 12 samples: Even though this test MIGHT be a popular one, we must remember we're testing each codec at its best: the best AAC encoder (iTunes), the best MP3 encoder (Lame), the best Vorbis branch... so, it'll still be a quite hard one. Probably the hardest I have conducted to this date.

For that reason, I'm not very confident about using several samples. It would be perfectly fine for the 48kbps test that is coming next, but I'm not sure it'll be a good idea for this one.

Regards;

Roberto.

Reply #60 – 2004-04-04 01:29:20

Quote

About featuring more than 12 samples: Even though this test MIGHT be a popular one, we must remember we're testing each codec at its best: the best AAC encoder (iTunes), the best MP3 encoder (Lame), the best Vorbis branch... so, it'll still be a quite hard one. Probably the hardest I have conducted to this date.

For that reason, I'm not very confident about using several samples. It would be perfectly fine for the 48kbps test that is coming next, but I'm not sure it'll be a good idea for this one.

It's a matter of how best the listeners can be distributed to give the best results. I agree that it's a tough choice given the number of listeners who participated in the past. I wouldn't expect much more than 30 per sample if there are 12 samples, even on a very popular test. Maybe 25 is more realistic and if you're pessimistic maybe less than 20 per sample.

If you're pessimistic, it's better to keep the number of samples at 12.

ff123

Reply #61 – 2004-04-04 03:07:32

Quote

If a codec's used setting averages 128kbps over a wide range of music, yet uses a higher bitrate on these samples because it understands they are tough, it is _more_ efficient not less, even though it averages a higher bitrate in the test.

I may have used the wrong term, but what I meant by "efficiency" is compression rate. An encoder using more bits on problem samples is, to me, the encoder using bits wisely to maintain good sound quality, at the expense of efficiency.

Reply #62 – 2004-04-04 04:01:50

HERE'S MORE FUEL TO THE DEBATE :-)

Isn't incorporating a totally proprietary format such as Atrac3 into the listening test, kind of like bringing RealAudio into the fold? Yikes!!

I really like the idea of utilizing a grandfather codec to demonstrate how years of research have hopefully improved upon THE old original standard from Fraunhofer - however my twist on this would be to use the most "mature" example of the FhG codec available from "AudioActive".

This codec had such extensive fine tuning done at 128kbps that it would make an excellent competitor to the newest codecs available. It would truly be interesting to see just how far everyone else has come by comparison. The final Mp3 codec from Fraunhofer was the benchmark that everyone looked to -- as it had so much research and development in its favor.

In Roberto's Mp3 Tests at 128kbps - this is the codec that gave everybody quite a run for the money - beating out all others more than once in the test! Even though work on the codec has ceased - it may still make for some interesting competition five years later.

FOR THE 6TH CODEC SAMPLE - I VOTE FOR FHG PRO (AudioActive)

Reply #63 – 2004-04-04 04:15:12

Quote

Isn't incorporating a totally proprietary format such as Atrac3 into the listening test, kind of like bringing RealAudio into the fold? Yikes!!

There are several considerations to determine what formats to include in the listening test, and one of them is the general popularity of each format, not whether they are proprietary, in my opinion. I've heard of lots of people that use ATRAC3 (mostly MiniDisc users).

The results of this test will be meaningful to more people if the tested formats are ones they actually use, have used in the past, are considering, or at least have heard of.

In my opinion, FhG Pro belongs in an MP3-specific test, as it has already had a showing in one.

Reply #64 – 2004-04-04 06:55:36

Quote (nite @ Apr 3 2004, 10:01 PM)

Isn't incorporating a totally proprietary format such as Atrac3 into the listening test, kind of like bringing RealAudio into the fold? Yikes!!

Quote (ScorLibran)

In my opinion, FhG Pro belongs in an MP3-specific test, as it has already had a showing in one.

You may be totally right that FhG Pro has already had its day. The forerunner to AAC may not hold much relevance except as an anchor in the test.

I would love to see Sony's ATRAC3 held up to the quality standards of the other codecs which are definitely part of the new testing.

Am I correct in believing that ATRAC3 was recently abandoned by RealPlayer as their proprietary encoding format? I thought they had made a move to AAC. I have long been a Sony fan - but.... they do have a history of beating a dead horse......remember "Betamax".

No matter...If the jury is still out?? Go ahead and put ATRAC3 to the challenge!

Reply #65 – 2004-04-04 07:18:07

Quote

HERE'S MORE FUEL TO THE DEBATE :-)

Isn't incorporating a totally proprietary format such as Atrac3 into the listening test, kind of like bringing RealAudio into the fold? Yikes!!

I really like the idea of utilizing a grandfather codec to demonstrate how years of research have hopefully improved upon THE old original standard from Fraunhofer - however my twist on this would be to use the most "mature" example of the FhG codec available from "AudioActive".

This codec had such extensive fine tuning done at 128kbps that it would make an excellent competitor to the newest codecs available. It would truly be interesting to see just how far everyone else has come by comparison. The final Mp3 codec from Fraunhofer was the benchmark that everyone looked to -- as it had so much research and development in its favor.

In Roberto's Mp3 Tests at 128kbps - this is the codec that gave everybody quite a run for the money - beating out all others more than once in the test! Even though work on the codec has ceased - it may still make for some interesting competition five years later.

FOR THE 6TH CODEC SAMPLE - I VOTE FOR FHG PRO (AudioActive)

Erm, FhG is every bit as proprietary as RealAudio. What are you talking about? The best formats will be included, period. Whether they are proprietary or open source or whatever is completely irrelevant in this case.

Reply #66 – 2004-04-04 07:24:20

Quote

Am I correct in believing that ATRAC3 was recently abandoned by RealPlayer as their proprietary encoding format? I thought they had made a move to AAC. I have long been a Sony fan - but.... they do have a history of beating a dead horse......remember "Betamax".

Atrac3 will be featured at Sony's online music store, that should be launched soon.

Actually, that's the feature of this test that is thrilling me the most: It'll be a big shootout among online music stores: Sony's (Atrac3), iTMS (Apple AAC) and almost everything else (WMA std). I hope the Vorbis, MPC and MP3 enthusiasts forgive me, but what is really interesting me in this test is to see whether AAC, WMA or Atrac3 will win. That will point to which music store offers the files of highest quality (not considering the Real music store, unfortunately, since it's using a different bitrate range).

Reply #67 – 2004-04-04 07:38:07

I agree that it'll be of great interest to see these music store formats tested side-by-side.

Maybe you should rename it to the "Online Music Store Format Listening Test".

Reply #68 – 2004-04-04 09:24:40

IMO more than 12 samples would be a good thing, no matter how many people participate. Reasons:

In the tests before, listening closely to some of the samples for ABXing had been a torture for me, because I don't like the music. I'd like to test as many samples as possible, but I'd prefer to have a choice and only test the music I like.
Additionally, I believe that for music you like (i.e. instruments you're used to) it's easier to spot artifacts, especially the more subtle ones.
This is just an assumption, but the overall results (especially the size of error bars) depend on the number of results and the distribution of rankings, as far as I understand. If you give people the possibility to choose the samples they feel comfortable with for testing, you'll get more results and the error bars become smaller.

Reply #69 – 2004-04-04 10:34:42

Quote

It'll be a big shootout among online music stores: Sony's (Atrac3), iTMS (Apple AAC) and almost everything else (WMA std).

great that you like my arguments

Reply #70 – 2004-04-04 12:17:12

Quote

It makes no sense whatsoever to include WMA Pro. Since MusePack is in both tests and also unchanged since, you can compare the results with the previous test and see how WMA Pro compares.

(And if you wonder, it makes more sense to have MusePack as reference since it won the last test)

This comparison would be somewhat possible only if the samples would remain the same between the two tests, otherwise every sense of comparison is lost, since WMA9 Pro was tested in CBR mode (two pass vbr) and MPC is a highly adaptive intrinsic VBR encoder. Assuming the case, that both were tested in VBR mode, the comparison between tests with different samples, would still be difficult (although not impossible, taking into account some error margins) since the capabilities and the number of listeners between the tests would vary and the listening conditions for the same listeners would surely be different (the claim that VBR encoders offer constant quality over a wide range of music, should also be taken with a grain of salt).

Concerning now the number of samples, I think that the question that arises is: "How many listeners per sample are needed to create accurate and statistically valid results?". If the answer to that question is "around 15 listeners" then I don't think that increasing the number of the samples would help all that much (although it would be more than helpful under different testing conditions), taking into consideration the fact, that the distribution of the listeners among the different samples won't be ideal, to the effect that some samples would be evaluated from too few listeners, making results for that specific sample, statistically invalid. On the other hand, if the number of participants in this test is expected to be quite high (highly doubtful, since the best modern encoders are being tested, making the test very difficult), increasing the number of samples makes perfect sense. So IMHO, 12 carefully chosen samples, representative of each genre, should be enough.

What do you think about using the same samples that were used in the AAC test? Something like that would help comparisons (with a margin of error of course) between encoders that will not be directly compared (e.g. Nero AAC vs. MPC, Lame vs. FAAC, Vorbis vs. FAAC and so forth).

Just my 0.02 euros

Kind Regards;

-George.

Reply #71 – 2004-04-04 13:35:59

Quote

Just a small note that the Vorbis listening test, which will determine the best encoder Vorbis has to offer for this multiformat test, has started.

I suggest everyone join the Vorbis listening test.

It really makes a difference!!!
I have now tested 4 samples, and on 3 the tunings were significantly better!

Reply #72 – 2004-04-04 14:40:30

Quote

Concerning now the number of samples, I think that the question that arises is: "How many listeners per sample are needed to create accurate and statistically valid results?". If the answer to that question is "around 15 listeners" then I don't think that increasing the number of the samples would help all that much (although it would be more than helpful under different testing conditions), taking into consideration the fact, that the distribution of the listeners among the different samples won't be ideal, to the effect that some samples would be evaluated from too few listeners, making results for that specific sample, statistically invalid.

IMO it's much more important to get a meaningful overall result than meaningful results for every single sample. I might be wrong, but in my understanding a low number of listeners for a single sample just leads to bigger error bars. If all error bars overlap, the result won't be meaningful for that sample but still statistically valid. When calculating the total result (average rankings + error bars), 18 samples with 20 results on average should be as good as 12 samples with 30 results on average. I believe that there will be more total results (i.e. 18*20 or 12*30 in the example) submitted if people can choose from more samples.

Please, someone knowledgeable correct me if my assumptions about the way results are calculated are wrong.
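The "18*20 versus 12*30" reasoning can be sketched with the textbook standard error of a mean. This is only an illustration under a strong assumption, namely that every submitted result is an independent rating with the same spread; it is not the analysis method actually used for these tests, and the standard deviation below is a made-up placeholder.

```python
import math


def standard_error(sd, n_results):
    """Standard error of a mean computed from n_results independent ratings."""
    return sd / math.sqrt(n_results)


ASSUMED_SD = 1.0  # hypothetical spread of individual ratings, not measured data

for n_samples, listeners_per_sample in [(18, 20), (12, 30)]:
    n_results = n_samples * listeners_per_sample
    # 1.96 is the usual normal-approximation multiplier for a 95% interval.
    half_width = 1.96 * standard_error(ASSUMED_SD, n_results)
    print(f"{n_samples} samples x {listeners_per_sample} listeners = "
          f"{n_results} results, error bar half-width ~ {half_width:.3f}")
```

Under that assumption both layouts give the same overall error bar because the total number of results is identical; the layout only starts to matter once ratings on the same sample are correlated.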

Reply #73 – 2004-04-04 16:32:14

Quote

IMO it's much more important to get a meaningful overall result than meaningful results for every single sample.

This is the main reason why I'd like to see more samples. One can think of the number of samples in the test as analogous to the number of listeners per sample. I think it's probably better to have 12 listeners per sample and 24 samples than 24 listeners and 12 samples. That way the particular weaknesses/strengths of each codec can be better explored.

It would also be possible to carefully choose overlapping subsets of samples to answer criticisms of previous tests. As an example:

1. average bitrate of each codec in subsample chosen to come as close as possible to 128 kbps. Both samples which produce high bitrates and low bitrates in VBR codecs included.

2. classical genre emphasized more.

3. problem samples used.

ff123
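The preference above for 24 samples with 12 listeners over 12 samples with 24 listeners can be illustrated with a simplified two-level variance model: a codec's rating varies from sample to sample, and listeners add further noise on each sample. The model and both variance figures are assumptions made for illustration, not numbers from any of these tests and not necessarily how the results are analysed.

```python
def overall_mean_variance(n_samples, listeners_per_sample,
                          var_between_samples, var_within_sample):
    """Variance of the overall mean rating when samples are treated as random
    draws from "all music" and listeners add independent noise per sample."""
    return (var_between_samples / n_samples
            + var_within_sample / (n_samples * listeners_per_sample))


# Hypothetical variance components (rating points squared).
VAR_BETWEEN_SAMPLES = 0.5  # how much a codec's true score differs across samples
VAR_WITHIN_SAMPLE = 1.0    # listener-to-listener noise on a single sample

more_samples = overall_mean_variance(24, 12, VAR_BETWEEN_SAMPLES, VAR_WITHIN_SAMPLE)
more_listeners = overall_mean_variance(12, 24, VAR_BETWEEN_SAMPLES, VAR_WITHIN_SAMPLE)

print(f"24 samples x 12 listeners: variance of overall mean = {more_samples:.4f}")
print(f"12 samples x 24 listeners: variance of overall mean = {more_listeners:.4f}")
```

Both layouts collect 288 results, so the listener-noise term is identical; only the between-sample term shrinks with more samples, which is why the first layout comes out ahead whenever codecs behave differently on different music.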

Reply #74 – 2004-04-04 18:34:56

Well, if you guys say more samples will be better...

I will go for 18 samples then. The 12 samples from the AAC test will remain pretty much the same, and the remaining 6 samples could be:

-2 classical ones
-2 problem samples
-2 normal samples of styles not featured among the 12 from last test.

What do you think?

I considered adding some voice-only sample (from a movie, maybe) since that would be interesting for people doing DVD rips. But I guess that would be too transparent at 128kbps...

Regards;

Roberto.
