
Grand Test Audio: 9 codecs, 10 samples, 31 presets

Reply #25
Oh, sorry, I didn't realize it was already too late :/
Hmm, too bad...
Well, yeah, that kind of comment would certainly help.
Don't be so hard on yourself, though; such a comment will ease your sin...


Reply #26
Quote
There is no mystery in my choices - there was not enough knowledge to make the right decisions.

That's really strange. The form is perfect: blind test, ABC/HR, APE'd files. Good and rare formats/codecs (MPC, PsyTEL). Good samples (from ff123...). The material could still be improved.
That's too bad.

Have a peaceful night


Reply #27
Next time, it's going to be much better.
I must say it's really nice to see this kind of test done by a big site like this. It's certainly a step ahead! Great work, respect. (thumb icon, I miss it)


Reply #28
Quote
Quote
There is no mystery in my choices - there was not enough knowledge to make the right decisions.

That's really strange. The form is perfect: blind test, ABC/HR, APE'd files. Good and rare formats/codecs (MPC, PsyTEL). Good samples (from ff123...). The material could still be improved.
That's too bad.

Have a peaceful night

The worl.. hrrr. is not...hrrrr.. perfhhrrect. As me... hrrrr ...not

But I will do hrrrr.. my best! ..hrr. r..r.r



It's good that at least the form is perfect... or near perfect. There are possibilities to grow. Always. Hrr!


Reply #29
Quote
Yes, we're ashamed of this, but all the test files were prepared a long time ago (when we created the site), so we just didn't have time to add an --alt-preset variant to the test base. One day we'll run another test, I hope, and this preset will be in it.

Long ago? It couldn't have been before the release of Ogg Vorbis 1.0, and --r3mix was long dead before that.


Whatever happened to the rule-of-thumb of no more than 6 encodings per sample for any one listening test?


Reply #30
It's good to see Radzishevsky et al. finally coming to their senses and trying to set up a DBT. Those godawful websound codec comparisons with flashy graphs and sonograms have got to go :)

One thing I'd be concerned about is the insecure format of the test results. You simply cannot know if a person purposely alters them, thus basing the test purely on the assumption that your participants are honest.


Reply #31
Quote
It's good to see Radzishevsky et al. finally coming to their senses and trying to set up a DBT. Those godawful websound codec comparisons with flashy graphs and sonograms have got to go

One thing I'd be concerned about is the insecure format of the test results. You simply cannot know if a person purposely alters them, thus basing the test purely on the assumption that your participants are honest.

We'll run both a blind test and a graphics test; the two together are better than either one alone.

Of course, someone could alter the results, but he would have to understand HOW codecs work to do so. And why would he bother, if his report is just one part of everyone else's work? I see no point in such alterations.


Reply #32
Quote
Quote
Yes, we're ashamed of this, but all the test files were prepared a long time ago (when we created the site), so we just didn't have time to add an --alt-preset variant to the test base. One day we'll run another test, I hope, and this preset will be in it.

Long ago? It couldn't have been before the release of Ogg Vorbis 1.0, and --r3mix was long dead before that.


Whatever happened to the rule-of-thumb of no more than 6 encodings per sample for any one listening test?

Where can I read about this 6-codecs rule?


Reply #33
Zombiek:

Quote
Quote
Dibrom wrote:
Er.... Now I'm a little worried about this part. What's the point in performing a blind listening test if you're also going to be making use of sonograms and frequency analysis and stuff like that?


That's the point! IMHO the best way to test codecs is to COMPARE blind-test results against pure mathematical analysis!
Imagine a page with graphs, tables, sonograms, and lots of numbers, summarized not only by the author's opinion but also by users' comments and their ratings.


Wait a minute. Are you now talking about frequency graphs or graphs that illustrate users' ratings of codecs?

Forget the frequency graphs, *please*. You're going to start a new wave of misinformation if you don't. The look of a frequency graph has nothing to do with the actual sound of the codec. (As Dibrom said, there's plenty of information on the forum on this subject.)

Dominic


Reply #34
The ultimate purpose of a codec is to remove as much frequency content as possible with the smallest possible audible effect.
So an encoding whose graph shows completely messed-up frequencies, yet is rated transparent in a blind test, must come from a very good encoder. From this point of view, I really wouldn't know what to make of a frequency analysis, whether it looks good or bad.


Reply #35
Quote
Where can I read about this 6-codecs rule?

ff123 will know the details, as he has mentioned this a couple of times before. Those posts were quite a while ago and should be somewhere in the forum; I'm not sure how I would find them with a search query, though...

Consider this: to properly rate the encodings, the listener should also compare different encodings against each other to decide which one he prefers, and rate accordingly. This should be done for every possible pair of encodings, which means 15 pairs for 6 encodings; going beyond that increases the number of comparisons required even further (not exactly exponentially, but definitely faster than linearly). This is actually required for accurate rating, and optimally each pair should be ABXed too: if you cannot ABX that A is better than B, then you must rate A and B the same. Otherwise, going through a long list of encodings, a listener's idea of "annoying" may vary, or a person may decide that "good" is 3.0 and later decide that "good" is 4.0.
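The growth described above is easy to see with a quick sketch (`pairwise_comparisons` is a hypothetical helper for illustration, not part of any actual test software):

```python
from math import comb

def pairwise_comparisons(n_encodings: int) -> int:
    """Number of unordered pairs among n encodings: n * (n - 1) / 2."""
    return comb(n_encodings, 2)

# 6 encodings already require 15 pairwise comparisons,
# and the count grows quadratically from there.
for n in (4, 6, 8, 10):
    print(n, pairwise_comparisons(n))  # 4 -> 6, 6 -> 15, 8 -> 28, 10 -> 45
```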

I believe ff123 also mentioned that more samples dilute the significance of the statistical tests. Oh well, let's see what he says if he comments here soon.


Reply #36
Quote
So a graph showing completely messed up frequencies, but rated as transparent in blind test must come from a very good encoder.

Well, "completely messed up frequencies" is a bit of an exaggeration if you compare the situation to real encoders.

Anyway, the biggest problem with graphs is that people don't really understand that conclusions about audible quality can't be based on them.
Maybe the worst example is the sine-sweep test; the second worst could be the simple FFT frequency-response graph.
The sonogram/spectral view is the most interesting, but even it can't show what is really audible and what is not.
Graphs showing the difference signal (overall quantization noise) can again be useful, especially in tweaking, but conclusions can't be based on them without listening.
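As a rough illustration of that last kind of graph, the difference signal and its RMS can be computed in a few lines (a hypothetical sketch assuming two time-aligned arrays of equal length; the resulting number says nothing about audibility, since masking decides what is actually heard):

```python
import numpy as np

def difference_signal_rms(original, decoded) -> float:
    """RMS of the difference signal (overall quantization noise).
    Useful when tweaking an encoder, but not a measure of audible quality."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(decoded, dtype=np.float64)
    diff = a - b  # the difference signal itself is what would be plotted
    return float(np.sqrt(np.mean(diff ** 2)))
```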
Juha Laaksonheimo


Reply #37
Quote
Quote
But I also wonder why you did not include the standard 64 kbps bitrate for mp3PRO (or 80 kbps, CBR or VBR), because that's where it is at its best, especially if you are also considering other codecs at low bitrates like AACEnc -tape or WMA9 VBR 25%. The higher you go with the bitrate in mp3PRO, the less you gain in quality compared to other codecs that are meant to be used at ~128 kbps and above. This has to do with SBR, how it works, and where it works best.

I excluded mp3PRO at lower bitrates because there have already been lots of tests of that codec at that bitrate. And you have to remember that at the start of our project, its main goal was finding the best possible codec/bitrate for online game music, and mp3PRO@64 is definitely not good for such purposes.

Well, I only know of two other tests with mp3PRO, and neither used 64 kbps with VBR or even 80 kbps. I also wonder why you think that mp3PRO at 64 kbps "is definitely not good for such purposes" while at the same time considering PsyTEL AACEnc with the -tape preset, because I know for sure that the latter will sound worse than mp3PRO at 64 kbps. But if the main aspect of your test was to use codecs that have never been tested with certain settings before - OK...

Quote
Quote
And I would like to know whether you used the resampling option with both of the low PsyTEL settings, because at these bitrates it is necessary to use -resample 32000; otherwise it will probably not sound too good, at least not as good as it could.

Concerning AAC: you can check the encoding command lines by moving the mouse cursor over the preset in the table. I used the standard presets for AAC, so if a preset doesn't include resampling by default, those files weren't resampled. I'm just a rookie, so I don't know lots of things out there.


That's OK, but maybe you should have asked here first... And the resample option is explicitly mentioned in the index.pdf within the complete PsyTEL package. By the way, using -profile 2 with AACEnc (which is not a standard setting at all) had no effect in my recent tests, so I'm not sure whether Long Term Prediction even works in v2.15, just like Intensity Stereo.

But don't get me wrong: I think it's a good idea to "roll out" a listening test on a large scale. I would also like to ask whether you are planning to keep all of these files (800 MB) on your web server, so that they could serve as a library or archive later (then with their correct names, perhaps)?
ZZee ya, Hans-Jürgen
BLUEZZ BASTARDZZ - "That lil' ol' ZZ Top cover band from Hamburg..."
INDIGO ROCKS - "Down home rockin' blues. Tasty as strudel."


Reply #38
What about SoundExpert ?


Reply #39
Quote
Where can I read about this 6-codecs rule?

A book which I highly recommend (but it's not cheap):

http://www.amazon.com/exec/obidos/tg/detai...=books&n=507846

From Sensory Evaluation Techniques, 3rd Ed., describing a test which is the closest thing to ABC/HR:  "Rating of Several Samples,  used to rate three to six, certainly no more than eight, samples on a numerical intensity scale according to one attribute; it is a requirement that all samples be compared in one large set; generally 16 or more subjects."

However, one can perform a Balanced Incomplete Block design, which is used when there are too many samples (e.g., 7 to 15) to be presented together in one sitting.  "Balanced" means that all samples are evaluated an equal number of times and all pairs of samples are evaluated together an equal number of times.

There are a couple of concerns with a large number of samples for testing:  one is tester fatigue, and the other is statistical insensitivity.  The first concern is self-evident, but the second might not be obvious.  As the number of samples increases, it becomes much harder to find significant differences in the data.

Sensory Evaluation Techniques has a procedure for statistically analyzing incomplete block designs, although it will produce too many "false differences," or type 1 errors, because it appears not to adjust for having multiple samples (i.e., it uses a Fisher's LSD at the end).  However, the alternative to Fisher's LSD may well be too insensitive for a large number of samples.  Hence the reason for limiting sample size when I have tried to perform comparisons:  the fewer samples to compare, the more sensitive the statistical results.
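The sensitivity trade-off can be sketched with a simple Bonferroni-style adjustment (only an illustration; this is one of several corrections stricter than Fisher's LSD, and not necessarily the procedure the book uses; `per_pair_alpha` is a hypothetical helper):

```python
from math import comb

def per_pair_alpha(n_samples: int, family_alpha: float = 0.05) -> float:
    """Per-comparison significance threshold after adjusting for all
    pairwise comparisons among n_samples -- the more samples in the
    test, the stricter each individual comparison must be."""
    return family_alpha / comb(n_samples, 2)

print(per_pair_alpha(6))   # 0.05 / 15
print(per_pair_alpha(12))  # 0.05 / 66 -- far harder to reach
```

This is the sense in which fewer samples give more sensitive statistical results: each pairwise difference has to clear a lower bar.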

ff123


Reply #40
Zombiek, do away with the graphs. Regardless of what the websound crowd's consensus on them may be, they do NOT make a test objective. Your reluctance is understandable - ixbt and websound have based their codec reviews mostly on comparing graphical representations of samples for a long time now - but it's about time to bury that moronic and pseudoscientific methodology.


Reply #41
Quote
There are a couple of concerns with a large number of samples for testing:  one is tester fatigue, and the other is statistical insensitivity.

Don't forget the dreadful download size. Many people won't take part in the test if the download is over 15 MB.


Reply #42
Oops!

I was going to download just one sample (JM Jarre), but the download size, 200 MB, is too much for me!


Reply #43
Quote
Oops!

I was going to download just one sample (JM Jarre), but the download size, 200 MB, is too much for me!

It's not necessary to download ALL the files. You can test any group in any order. Only group 1A is a must-have; all the others are optional.