Topic: New 128 kbit/s listening test at SoundExpert (Read 37078 times)

New 128 kbit/s listening test at SoundExpert

Reply #26
This is not a traditional listening test; if I am not mistaken, you only hear the difference between the encoded sample and the reference.

Not exactly. In SoundExpert listening tests you listen to artificial test items with artifacts amplified to some controlled extent. To be more specific – for each natural test item (produced by a coder) at least three artificial test items are generated, with artifacts amplified to different extents. The final score is calculated analytically on the basis of the listening-test results for these artificial items. More details are in the AES paper in the pro zone.

Figuratively speaking, you have a “sound magnifying glass” with a magnification coefficient that you know. And your final conclusion about the sound you are “looking” at is formed from what you see and from what the coefficient is.
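The “magnifying glass” can be sketched in a few lines of code: a test item is the reference plus the coder's difference signal scaled by a known gain. This is our own illustration of the idea, not SoundExpert's actual processing; all names and numbers are made up:

```python
# Illustrative sketch of the "sound magnifying glass": mix the reference
# with a scaled copy of the difference signal (encoded - reference).
# Hypothetical names and values, not SoundExpert's code.

def amplified_item(reference, encoded, k):
    """Return reference + k * (encoded - reference), sample by sample."""
    return [r + k * (e - r) for r, e in zip(reference, encoded)]

reference = [0.0, 0.5, -0.5, 0.25]
encoded   = [0.0, 0.49, -0.52, 0.26]   # coder output with small errors

# k = 1 reproduces the plain encoding; k > 1 "magnifies" the artifacts.
plain = amplified_item(reference, encoded, 1.0)
assert all(abs(p - e) < 1e-12 for p, e in zip(plain, encoded))

louder = amplified_item(reference, encoded, 4.0)
# every error is now larger than in the plain encoding
assert abs(louder[1] - reference[1]) > abs(encoded[1] - reference[1])
```

Knowing the gain k at which listeners start hearing the difference is what lets the final conclusion be scaled back to the unamplified encoding.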
keeping audio clear together - soundexpert.org


Reply #27
One issue, of course, is that if the coding error is below the perceptual threshold, then from the point of view of the codec and of the user the encoding is perfect, but amplifying the coding error will break things again.

Now, the stronger the amplification, the more what the listener hears in this test differs from what the actual encoding sounds like. I would guess that at 320 kbps the amplification is very strong, and its intrinsic problems start dominating the results, i.e. people are grading "distortions" that are really impossible to hear, and you suddenly get very illogical and wrong results.


Reply #28
Now, the stronger the amplification, the more what the listener hears in this test differs from what the actual encoding sounds like. I would guess that at 320 kbps the amplification is very strong, and its intrinsic problems start dominating the results, i.e. people are grading "distortions" that are really impossible to hear, and you suddenly get very illogical and wrong results.

That’s exactly how this technology works. The amplification can be strong, but listeners still hear three (or more) variants of the test items – with just-noticeable differences, with slightly annoying differences and with annoying ones. There are no special intrinsic problems with amplification, high amplification in particular. The fewer the artifacts, the more amplification is needed to make them “just noticeable”.

Furthermore, it’s not likely that the same methodology works for wma@320, mp3@320, aac@320, atrac@256 and doesn’t work for he-aac@320.

In any case, the validity of the methodology can be proved only by comparing its results with those of standard listening tests. And you know what it means to organize a reliable listening test with @320 encoded material. Without support from a serious research institution it’s hardly possible. But I still hope. If somebody knows an organization which could be interested in verifying the method, please let me know.
keeping audio clear together - soundexpert.org


Reply #29
There are no special intrinsic problems with amplification, high amplification in particular. The fewer the artifacts, the more amplification is needed to make them "just noticeable".


This is a statement that I would very strongly disagree with. Either it artifacts and it's audible, or it doesn't artifact and it's not audible even in the best conditions, for psychoacoustic reasons, which again boil down to physiological limitations of the human hearing system. Amplification of an artifact can change this, and hence it's flawed, because you are making something audible which is, in fact, never going to be audible if it's not amplified.

I would expect that if you set up a listening test of modern codecs at 320 kbps, you will in fact conclude that a) the codecs are transparent in almost all cases and b) the listeners can't discriminate between the codecs. (I base this on the results of the 128 kbps test, which already came very close to this result.)

Now, if that were the result, what would be your conclusion about your method, which does make differences suddenly appear there?

I'm of the opinion that methodologies like yours, even if they have limitations, can be useful for furthering the development of audio coding technologies, just as for example EAQUAL has limitations and flaws but we still use it for some things. However, let's not pull ostrich tactics and ignore the flaws, please, because that will reduce the usefulness of the method.



Furthermore, it's not likely that the same methodology works for wma@320, mp3@320, aac@320, atrac@256 and doesn't work for he-aac@320.


By the way, on what data are you basing your claim that "the methodology works" for these codecs? I for sure am not aware of any relevant result.


Reply #30
... Either it artifacts and it's audible, or it doesn't artifact and it's not audible even in the best conditions, for psychoacoustic reasons, which again boil down to physiological limitations of the human hearing system. Amplification of an artifact can change this, and hence it's flawed, because you are making something audible which is, in fact, never going to be audible if it's not amplified.
Ok. Let me try to make things simple.

1.   Any lossy audio processing alters a signal in some way. Psychoacoustic coding/decoding is no exception.

2.   The process of altering the sound (input) signal can be conveniently represented as the addition of some extrinsic signal, usually called the difference signal (just mathematics).

3.   If this extrinsic signal is below the threshold of human audibility (taking into account all psychoacoustic effects), you don’t hear it. If it is not, you hear it and call it a sound artifact*.

4.   In the general case the threshold of human audibility is an ill-defined term because it depends on the nature of the input signal, the nature of the extrinsic signal, personal hearing abilities and the listening environment.

5.   A listening test is a research experiment with the latter two variables fixed. It aims to investigate how a given group of individuals under given listening circumstances reacts to a set of sound stimuli. This set of stimuli is the full permutation of the first two variables – input signals (test samples) and extrinsic signals (introduced by coders, for example) – i.e. all test samples processed by all coders under research**. Results of such experiments have to be generalized with great caution, as you can’t use an unlimited number of test samples, especially if the listening environment was not perfect and the listeners not trained enough. In other words, you can’t be sure that another thoroughly organized listening test will not reveal more artifacts. This process is asymptotically infinite, and you need some conventional limits for defining transparency. ITU-R recommendations concerning the listening environment, the training procedure and the percentage of listeners who discovered impairments play the role of such conventional limits.

6.   The SoundExpert amplification method just increases the level of the extrinsic signal up to the threshold of human audibility, making it perceptible (making an artifact out of it, if you like). The more amplification is needed, the greater the perceived audio quality margin.

7.   Because of the infinite nature of defining transparency, some quality margin is useful. Most people apply this intuitively clear idea in practice when compressing music with somewhat higher quality settings (or even lossless) than recommended.

*Sound artifact is a perceptual entity and signal artifact is a measurable entity.
** EDIT: Seems I was too clever by half.

I would expect that if you set up a listening test of modern codecs at 320 kbps, you will in fact conclude that a) the codecs are transparent in almost all cases and b) the listeners can't discriminate between the codecs. (I base this on the results of the 128 kbps test, which already came very close to this result.)

Now, if that were the result, what would be your conclusion about your method, which does make differences suddenly appear there?
A perfect listening environment, trained expert listeners and the most critical sound material will reveal sound artifacts (spatial ones in particular) @320 for sure. In some cases the artifacts are clearly audible without any training (iTunes LC-AAC@320, glockenspiel, ringing extra sound). Sound artifact amplification (SARTAMP) just makes it easier to reveal them. So comparing the results of two listening tests – with and without amplification – is, I believe, possible and makes sense.

I'm of the opinion that methodologies like yours, even if they have limitations, can be useful for furthering the development of audio coding technologies, just like for example EAQUAL has limitations and flaws, but we still use it for some things. However, let's not pull ostrich tactics and ignore the flaws, please, because that will reduce the usefullness of the method.
Agreed. But I think the major flaws of this methodology concern not the amplification of artifacts itself but the way it is implemented, and the completely unresearched question of the inherent accuracy of the method. That’s why the SoundExpert project is, of course, highly experimental.

By the way, on what data are you basing your claim that "the methodology works" for these codecs? I for sure am not aware of any relevant result.
Apart from Hi-HE-AAC@320 the results (ratings) are theoretically expected. But, once again, comparison with standard listening tests is necessary. To be honest, I don’t have reliable proof. The situation with Hi-HE-AAC@320 has to be researched additionally. For now I’m inclined to trust the results.
keeping audio clear together - soundexpert.org


Reply #31
Finally I decided to include the following codecs:

1. aac LC CBR (Winamp 5.2)
2. aac HE CBR (Winamp 5.2)
3. ATRAC3 (LP2)

Nero AAC will be added as soon as the new version is released (end of April – early May 2006) and WMA10 when public beta testing begins (also May 2006). ATRAC3 (LP2) is my own initiative; the MiniDisc community is still silent.

Thank you all for the advice. Special thanks to Garf for an interesting discussion.


Serge.
keeping audio clear together - soundexpert.org


Reply #32
Quote
A perfect listening environment, trained expert listeners and the most critical sound material will reveal sound artifacts (spatial ones in particular) @320 for sure.


Do we have any proof of that?  Especially for the "spatial ones" (for sure) – I can't see a reason why a modern codec should alter the stereo field at 320 kbps.

Quote
In some cases the artifacts are clearly audible without any training (iTunes LC-AAC@320, glockenspiel, ringing extra sound).


This is obviously a codec bug; AAC @320 kbps has enough SNR in the spectral bands to be far better (less noisy) than a conservative JND.

Quote
Sound artifact amplification (SARTAMP) just makes it easier to reveal them. So comparing the results of two listening tests – with and without amplification – is, I believe, possible and makes sense.


I don't think it does – if you have a clearly transparent signal, say AAC @320 kbps, and you amplify the artifacts, what you get is actually a distortion of the real judgment, because you amplified something that was inaudible before.

Suppose you have two codecs - codec A and codec B - encoding at 320 kbps.  Both codecs have enough room for transparent coding, so they have to overcode.

Codec A overcodes by linear scalefactor attenuation.
Codec B overcodes by employing some equal-loudness noise-shaping filter prior to attenuation.

Both signals are perceptually indistinguishable from the original.

Now, your tool performs artificial noise amplification, if I am not mistaken?  What will happen is that codec B's noise pattern will be "equalized" to match the equal-loudness curve, and it would, of course, sound better to human ears than the artificially noise-amplified codec A.

How fair is that?  Both would be clearly transparent - but your method would clearly favor Codec B.


Reply #33
A perfect listening environment, trained expert listeners and the most critical sound material will reveal sound artifacts (spatial ones in particular) @320 for sure.


I do agree it's possible to construct some sound signal that is a pathological case for any given transform codec, but what does that mean? Not really anything relevant, I fear.

Quote
6. SoundExpert amplification method just increases the level of extrinsic signal up to the threshold of human audibility making it to be perceived (making artifact out of it, if you like). The more amplification, the more perceived audio quality margin.


But what kind of amplification? In what circumstances? Ivan made a specific point regarding two practical coder approaches, but the problem is really that, more generally, the above assumption IMHO doesn't hold, because it trivializes the amplification. If you could accurately model this, it would probably vastly improve existing coders.


Reply #34
I'd like to remind you of techniques such as PNS, Intensity Stereo and Vorbis' Noise Normalization. Sure, they introduce a (huge) error, but their intent is to retain certain properties (like energy) to which we are more sensitive. Amplification of the difference will also result in huge changes to these properties, which breaks the ideas behind these techniques.

So, something with a low SNR could sound as good as something with a high SNR. By amplifying the error you make the first sample sound pretty bad, while the second sample won't change much due to its high SNR.

Your intention is clear (make artifacts more audible / testing simpler). But you need to ensure the following to keep the results meaningful: consider a and b to be the output of two contenders, and let the amplification step map a to a' and b to b'. Then
a sounds better than b <=> a' sounds better than b'
(better = closer to the original)
must hold. If you can't guarantee that (and I seriously doubt it is true, for the reasons mentioned), this amplification step is pointless.
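The condition above can be phrased as an executable check. The sketch below is ours, with a deliberately naive stand-in for "sounds better" based on RMS error – precisely the kind of simplification that real perception defies, which is the point of the objection – but it shows the shape of the guarantee that would be needed:

```python
# Toy harness for the rank-preservation condition:
# "a sounds better than b <=> a' sounds better than b'".
# The quality model here (negative RMS error) is a crude hypothetical
# stand-in; a real perceptual model is exactly what it fails to capture.

def rms_error(reference, signal):
    return (sum((r - s) ** 2 for r, s in zip(reference, signal))
            / len(reference)) ** 0.5

def amplify(reference, signal, k):
    """Scale the difference signal by k."""
    return [r + k * (s - r) for r, s in zip(reference, signal)]

def ranking_preserved(reference, a, b, k, quality):
    """True if the a-vs-b ordering under `quality` is the same before
    and after amplifying both difference signals by k."""
    before = quality(reference, a) > quality(reference, b)
    after = (quality(reference, amplify(reference, a, k))
             > quality(reference, amplify(reference, b, k)))
    return before == after

ref = [0.0, 1.0, -1.0, 0.5]
a = [0.01, 1.0, -1.0, 0.5]      # small error
b = [0.05, 1.05, -0.95, 0.5]    # larger error
q = lambda r, s: -rms_error(r, s)

# With a metric monotone in the error level, the ranking always survives;
# the dispute is whether human judgment behaves this way.
assert all(ranking_preserved(ref, a, b, k, q) for k in (1, 2, 5, 10))
```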

Sebi


Reply #35
This is obviously a codec bug; AAC @320 kbps has enough SNR in the spectral bands to be far better (less noisy) than a conservative JND.

Is it the only bug of the coder, or are there other, less obvious ones? It seems it doesn’t matter to the end user whether it is a bug or a feature that produces the artifacts. He needs a measurement scheme that discovers all kinds of artifacts – obvious and less audible alike.

I don't think it does – if you have a clearly transparent signal, say AAC @320 kbps, and you amplify the artifacts, what you get is actually a distortion of the real judgment, because you amplified something that was inaudible before.

There is a problem with the expression “real judgment”. What do you mean by it – PC speakers, headphones, an audio installation in a dedicated room, or maybe some midi stereo system with an equalizer and “wide stereo” turned on? You have probably noticed that post-processing usually “amplifies something that was inaudible before”. And if artifacts are too close to the threshold of audibility, they will be discovered more easily in some of those cases. Thus it’s useful to know how large the quality margin is.

Codec A overcodes by linear scalefactor attenuation.
Codec B overcodes by employing some equal-loudness noise-shaping filter prior to attenuation.

Both signals are perceptually indistinguishable from the original.

Now, your tool performs artificial noise amplification, if I am not mistaken?  What will happen is that codec B's noise pattern will be "equalized" to match the equal-loudness curve, and it would, of course, sound better to human ears than the artificially noise-amplified codec A.

How fair is that?  Both would be clearly transparent - but your method would clearly favor Codec B.

The method will definitely reveal that Codec B has the better quality margin. I suppose that is what noise shaping is for.

I do agree it's possible to construct some sound signal that is a pathological case for any given transform codec, but what does that mean? Not really anything relevant, I fear.

Initially my proposal was to compare the results of two listening tests – with and without artifact amplification. Not necessarily @320 kbit/s; let it be @256 or even @192 kbit/s. The first one would be performed by ordinary listeners and the other by experts in a perfect listening environment (in order to reveal artifacts).

But what kind of amplification? In what circumstances? Ivan made a specific point regarding two practical coder approaches, but the problem is really that, more generally, the above assumption IMHO doesn't hold, because it trivializes the amplification. If you could accurately model this, it would probably vastly improve existing coders.

It’s just a measurement scheme which combines an objective distortion measurement with a listening test. It can’t show how to make the difference signal less audible.

So, something with a low SNR could sound as good as something with a high SNR. By amplifying the error you make the first sample sound pretty bad, while the second sample won't change much due to its high SNR.

Not necessarily. It depends on the specific SN ratios and spectral structures of those noises. Indeed, artifact amplification is only one element of the measurement scheme. Another important element is the psychometric curve, which shows the relationship between the level of the difference signal and perceived quality (some of them are in the papers). Usually the audibility of correctly shaped noise increases more slowly than that of unshaped noise. So in your case it depends.

Your intention is clear (make artifacts more audible / testing simpler). But you need to ensure the following to keep the results meaningful: consider a and b to be the output of two contenders, and let the amplification step map a to a' and b to b'. Then
a sounds better than b <=> a' sounds better than b'

Yes, this statement is correct (the mechanism is shown above). The early works on “coding margin” even proposed measuring this margin as the amplification level of the difference signal necessary for the latter to become “just noticeable”.
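That early "coding margin" measurement can be sketched as a search for the gain at which the difference signal first crosses an audibility threshold. In the sketch below the threshold model is a crude placeholder (a fixed RMS level) and all names and numbers are ours, not from the cited works; a real model would account for masking by the useful signal:

```python
def rms(x):
    return (sum(v * v for v in x) / len(x)) ** 0.5

# Placeholder audibility model: the difference signal counts as "just
# noticeable" once its RMS crosses a fixed level. Purely illustrative.
JND_RMS = 0.01

def coding_margin(reference, encoded, max_gain=1000.0):
    """Smallest gain k at which k * diff reaches the JND level.
    Bisection stands in for the general case where audibility is not a
    simple linear function of gain. Large margin = error far below audibility."""
    diff = [e - r for r, e in zip(reference, encoded)]
    base = rms(diff)
    if base == 0:
        return float("inf")          # bit-identical: unbounded margin
    lo, hi = 0.0, max_gain           # invariant: audible at hi, not at lo
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid * base >= JND_RMS:
            hi = mid
        else:
            lo = mid
    return hi

ref = [0.0, 0.5, -0.5, 0.25]
enc = [0.0, 0.5005, -0.4995, 0.25]   # tiny coding error
margin = coding_margin(ref, enc)
# at the returned gain the amplified error sits right at the threshold
diff_rms = rms([e - r for r, e in zip(ref, enc)])
assert abs(margin * diff_rms - JND_RMS) < 1e-6
```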
keeping audio clear together - soundexpert.org


Reply #36
This is obviously a codec bug; AAC @320 kbps has enough SNR in the spectral bands to be far better (less noisy) than a conservative JND.

Is it the only bug of the coder, or are there other, less obvious ones? It seems it doesn’t matter to the end user whether it is a bug or a feature that produces the artifacts. He needs a measurement scheme that discovers all kinds of artifacts – obvious and less audible alike.


But we are comparing apples and oranges here – for that particular bug of Apple's AAC you don't need any tool, as the bug is clearly audible. What your tool does is amplify noise that would otherwise most likely be masked, especially at 320 kbps.

So we are basically comparing "artifacts that are not audible", and it is a completely different matter whether your particular scaling/amplification strategy is good or not.

Quote
I don't think it does – if you have a clearly transparent signal, say AAC @320 kbps, and you amplify the artifacts, what you get is actually a distortion of the real judgment, because you amplified something that was inaudible before.

There is a problem with the expression “real judgment”. What do you mean by it – PC speakers, headphones, an audio installation in a dedicated room, or maybe some midi stereo system with an equalizer and “wide stereo” turned on? You have probably noticed that post-processing usually “amplifies something that was inaudible before”. And if artifacts are too close to the threshold of audibility, they will be discovered more easily in some of those cases. Thus it’s useful to know how large the quality margin is.


Most listening tests are done with a proper set-up including headphones.  It is known that speakers introduce additional echoes that mask artifacts. Most modern codecs (AAC, WMA, Vorbis) are quite transparent at 320 kbps, even for the most critical material and most likely for a statistical majority of listeners.

How can you be sure that amplifying the artifacts and evaluating them on some normalized loudness scale will reveal which codec is better?  Especially when you have ignored the temporal and frequency-spreading masking of the original signal (you amplified the noise) – frequency spreading is a psychoacoustic property that is quite constant among listeners.

Quote
Codec A overcodes by linear scalefactor attenuation.
Codec B overcodes by employing some equal-loudness noise-shaping filter prior to attenuation.

Both signals are perceptually indistinguishable from the original.

Now, your tool performs artificial noise amplification, if I am not mistaken?  What will happen is that codec B's noise pattern will be "equalized" to match the equal-loudness curve, and it would, of course, sound better to human ears than the artificially noise-amplified codec A.

How fair is that?  Both would be clearly transparent - but your method would clearly favor Codec B.

The method will definitely reveal that Codec B has the better quality margin. I suppose that is what noise shaping is for.


My question is - how can you be sure?

I think your method makes perfect sense for testing which codec is better for tandem coding (transcoding), but I think it is a big mistake to rank one transparent codec "better" than another just because its artificially amplified masking pattern sounds subjectively "worse".

Add to that what SebastianG (I think?) said about other types of artifacts than quantization noise – like PNS, or SBR, or noise normalization, etc. – and we are in a real minefield.


Reply #37
We all know how bad SNR measurements are at judging audio quality and comparing codecs. But the test itself is repeatable and reliable.
I see this as an effort to make a test more reliable while still having the "weighting" of average listeners when it comes to how bad or good artifacts sound.

The only argument against this would be if codec A were judged better than codec B when the artifacts are at or close to the threshold of hearing,
but the score were reversed when the artifacts are amplified by, say, 20 dB.

My experience and intuition tell me that if artifact A sounds worse than artifact B, amplifying them will not change the winner. And here lies the dispute.
BUT this should be possible to verify with listening tests. Perception/masking etc. may change with amplification, but does the relative difference between artifacts (codecs) change?
I may be wrong, but I think the codec with less annoying artifacts and a lower weighted noise level will win, regardless of amplification. And this is the goal: to accurately judge which one is the best, which one has the most margin. As we all know, people's hearing is not homogeneous, and most audio people would like to know which codec has margin for that hard-to-encode song, or for that time in your life when you can afford headphones in the 10k-dollar range.


Reply #38
I didn’t say anywhere that some high-bitrate codec is “better”. I said it has a higher quality margin, which could be useful in some cases, such as a high-definition audio installation, heavy post-processing of the decoded signal, transcoding, etc. Maybe in some cases this quality margin is pointless, but that doesn’t mean the difference between, say, @320 and @256 encoded material is nonexistent. It does exist, but it’s up to you whether it is essential in your particular case.

When doing listening tests we are talking about probabilities. This is where all those “quite transparent”, “most likely”, “statistical majority” come from. In fact, we are trying to predict the probability of encountering artifacts in real life on the basis of those listening test results. And “perceived quality margin” is a direct indicator of that probability: the higher the margin, the lower the probability of encountering artifacts.

And please don’t oversimplify the method. It measures the level of the difference signal (the objective part) and estimates how well this signal is masked by the useful signal (the subjective part). The final score (on an infinite-grade impairment scale) is calculated with the help of a psychometric curve which is plotted each time for the particular useful signal and difference signal. So it doesn’t matter by what technology (PNS, SBR, rounding error, filtering, harmonic distortion, with or without application of psychoacoustic effects, etc.) the difference signal was created. The method objectively measures it, subjectively estimates it and computes the score.
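The scoring pipeline described here can be sketched roughly: collect grades at several known amplification gains, fit grade versus log-gain, and read the score off at gain 1. All names and numbers below are illustrative; the real SoundExpert curve fitting is certainly more involved:

```python
import math

# Hypothetical data: subjective grades (5 = imperceptible ... 1 = very
# annoying) collected for the same item at three known amplification gains.
gains = [4.0, 8.0, 16.0]
grades = [5.0, 4.0, 3.0]

# Least-squares fit of grade = a + b * log2(gain), a two-parameter line.
xs = [math.log2(g) for g in gains]
mx = sum(xs) / len(xs)
my = sum(grades) / len(grades)
b = (sum((x - mx) * (y - my) for x, y in zip(xs, grades))
     / sum((x - mx) ** 2 for x in xs))
a = my - b * mx

def score(gain=1.0):
    """Extrapolated grade for the unamplified item (gain 1, log2 = 0).
    Can exceed 5 -- an 'infinite grade impairment scale'."""
    return a + b * math.log2(gain)

# The fit passes through the measured points...
assert abs(score(8.0) - 4.0) < 1e-9
# ...and the extrapolated score above 5 expresses the quality margin:
assert score() > 5.0
```

The extrapolated value above the top of the ordinary grading scale is one way to express "better than imperceptible", i.e. a quality margin.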
keeping audio clear together - soundexpert.org


Reply #39
Your intention is clear (make artifacts more audible / testing simpler). But you need to ensure the following to keep the results meaningful: consider a and b to be the output of two contenders, and let the amplification step map a to a' and b to b'. Then
a sounds better than b <=> a' sounds better than b'

Yes, this statement is correct (the mechanism is shown above). The early works on “coding margin” even proposed measuring this margin as the amplification level of the difference signal necessary for the latter to become “just noticeable”.

I don't think so! Where is it shown?
You're oversimplifying the perception of "artefacts".

So, something with a low SNR could sound as good as something with a high SNR. By amplifying the error you make the first sample sound pretty bad, while the second sample won't change much due to its high SNR.

Not necessarily. It depends on the specific SN ratios and spectral structures of those noises. Indeed, artifact amplification is only one element of the measurement scheme. Another important element is the psychometric curve, which shows the relationship between the level of the difference signal and perceived quality (some of them are in the papers). Usually the audibility of correctly shaped noise increases more slowly than that of unshaped noise. So in your case it depends.

No, let's put the spectral/temporal shape of "noise" aside. Please reread what I've written and think of operating locally in a small time/frequency region. An encoding with PNS enabled for a certain time/frequency region can sound better than an encoding with coarsely quantized spectral coefficients (metallic sound), but the latter will have the higher SNR (the smaller "error" signal). If you amplify the difference, it's likely that the first encoding will sound worse than the second one; thus
a sounds better than b <=> a' sounds better than b'
would be violated, which makes the amplification pointless.
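The PNS point can be made concrete numerically: regenerating a noise band with the correct energy (PNS-like) preserves the property we hear but yields a huge sample-wise error, while coarse quantization keeps every sample close to the original. A rough sketch, with made-up signals standing in for real codec output:

```python
import math, random

random.seed(1)

def snr_db(reference, signal):
    """Signal-to-noise ratio of `signal` against `reference`, in dB."""
    sig = sum(r * r for r in reference)
    err = sum((r - s) ** 2 for r, s in zip(reference, signal))
    return 10 * math.log10(sig / err)

# A "band" of noise-like content a codec must reproduce.
band = [random.gauss(0.0, 0.1) for _ in range(4096)]

# Encoding A (PNS-like): discard the samples, regenerate noise with the
# same energy. Perceptually similar, but almost uncorrelated sample by
# sample, so the error signal is huge.
energy = math.sqrt(sum(v * v for v in band) / len(band))
pns = [random.gauss(0.0, energy) for _ in range(len(band))]

# Encoding B: coarse uniform quantization (step 0.05) -> "metallic"
# artifact, yet every sample stays close to the original.
step = 0.05
coarse = [round(v / step) * step for v in band]

# The coarse (worse-sounding) version wins on SNR:
assert snr_db(band, coarse) > snr_db(band, pns)
```

So amplifying the raw difference signal would punish the PNS-style encoding far more, regardless of how the two actually sound, which is exactly the objection above.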

Sebi


Reply #40
The method doesn’t rely solely on SNR (or the level of the difference signal). In your example the first encoding may need more amplification to make the diff. signal “just noticeable”, and most likely it will sound more “pleasant” to the human ear (this would be revealed in listening tests). The second encoding, on the contrary, will need less amplification, because that kind of artifact (metallic sound) is very noticeable. So the final score will definitely be higher for the first one. The score also depends on the shape of the psychometric curve, which shows how quickly the diff. signal becomes annoying as it increases. But now, without listening tests, it’s hard to predict what these curves will look like for the two encodings.

Once again, SNR is only half of the story, if not one third. At least the method correctly “grades” dithering with higher scores than simple bit-depth reduction, in spite of the higher SNR of the latter. That was the first thing I checked when I started to play with the method two years ago.
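That dithering sanity check is easy to reproduce: plain truncation to a low bit depth gives a higher SNR than TPDF-dithered quantization, yet dither is the perceptually preferred treatment because its error is uncorrelated noise rather than distortion that follows the signal. A minimal sketch (our own, not SoundExpert's test):

```python
import math, random

random.seed(0)

def snr_db(reference, signal):
    """Signal-to-noise ratio of `signal` against `reference`, in dB."""
    sig = sum(r * r for r in reference)
    err = sum((r - s) ** 2 for r, s in zip(reference, signal))
    return 10 * math.log10(sig / err)

# A sine, quantized to 8 bits with and without TPDF dither.
n = 8192
sine = [math.sin(2 * math.pi * 50 * i / n) for i in range(n)]
step = 2.0 / 255                     # 8-bit step over [-1, 1]

plain = [round(v / step) * step for v in sine]
# TPDF dither: sum of two uniform variables, +-1 LSB peak, added
# before rounding.
dithered = [round(v / step + random.uniform(-0.5, 0.5)
                           + random.uniform(-0.5, 0.5)) * step
            for v in sine]

# Dither adds noise power, so its SNR is LOWER...
assert snr_db(sine, plain) > snr_db(sine, dithered)
# ...yet listeners prefer it: SNR alone ranks these two the wrong way round.
```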
keeping audio clear together - soundexpert.org


Reply #41
I checked IGIS.pdf, but I can't find any basis which justifies your approach. You just tell us what you do (more or less clearly). BTW: given that p(X,Y) is the plain old normalized cross-correlation value (non-mean-removed), the difference value should depend on the DC offsets, if I'm not mistaken.

Edit: I just found this old but related thread.

Sebi


Reply #42
The best evaluation of this method would be to run two listening tests at the same time, with the same listeners – one using the original decoded sequences and a second using Serge's amplification method.

If the number of samples is fairly big, as well as the number of listeners, perhaps some correlation score could be calculated and compared to, say, the score of advanced PEAQ.


Reply #43
The best evaluation of this method would be to run two listening tests at the same time, with the same listeners – one using the original decoded sequences and a second using Serge's amplification method.

If the number of samples is fairly big, as well as the number of listeners, perhaps some correlation score could be calculated and compared to, say, the score of advanced PEAQ.

Just so. The method is pretty similar to PEAQ and has to be validated the same way. The validation would be even more reliable if the participants of those tests were different (not the same) and the sound material had really small impairments.

@SebastianG: I used the standard definition of the correlation coefficient, as the zeroth lag of the covariance function. It is exactly DC-offset independent.
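The two definitions being argued about can be checked directly: the covariance-based (mean-removed) correlation coefficient is unchanged by a DC offset, while the non-mean-removed version is not. A small sketch with made-up signals:

```python
def mean(x):
    return sum(x) / len(x)

def corr_cov(x, y):
    """Correlation coefficient as normalized zeroth-lag covariance
    (means removed) -- the definition claimed in the post above."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def corr_raw(x, y):
    """Non-mean-removed cross-correlation SebastianG is warning about."""
    num = sum(a * b for a, b in zip(x, y))
    return num / (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5

x = [0.1, -0.3, 0.7, -0.5]
y = [0.12, -0.28, 0.66, -0.49]
y_dc = [v + 1.0 for v in y]          # same signal with a DC offset

# mean-removed: the offset cancels out exactly
assert abs(corr_cov(x, y) - corr_cov(x, y_dc)) < 1e-12
# non-mean-removed: the offset leaks into the value
assert abs(corr_raw(x, y) - corr_raw(x, y_dc)) > 1e-3
```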
keeping audio clear together - soundexpert.org


Reply #44
I have a question: can someone send me a sample of music encoded with the new WMA10 codec at 48 kb/s stereo? I don't have Vista or WMP11.


Reply #45
128 kbit/s listening test STARTED! Finally the following codecs were included in the test:

• AAC LC CBR (Winamp 5.21)
• AAC HE CBR (Winamp 5.21)
• ATRAC3 (LP2)
• MPC 1.15v

New Nero Digital Audio Reference Quality MPEG-4 & 3GPP Audio Codec will be added till the end of this week.

The new Microsoft codec from Windows Vista will join the competition as soon as public beta testing of the new OS opens (by the end of May).

During this listening test ALL test files from the “128 kbit/s” group will be available for download (NOT only the new ones). But the probability of getting the old ones is substantially smaller, as they have already been graded in the previous 128 kbit/s test. Here is a direct link to a test file for your convenience. Thank you in advance for participating!
keeping audio clear together - soundexpert.org


Reply #46
I just saw that the test favors:

128 kbps SBR over 128 kbps LC...
320 kbps SBR (!?!?!?!!!!!)... independent stereo (maaan)

May I point to the ITU-R BS.1116-conforming double-blind listening tests in AES papers dealing with oversampled SBR: even at 96 kbps, where LC-AAC has big trouble finding enough bits to mask the noise, SBR was in fact NOT better.

At 128 kbps I find it quite hard to believe that LC-AAC would fare worse against SBR than it does at 96 kbps.

So I don't really see how SBR could be better than LC at 128 kbps, unless your artifact amplification method does something which obviously favors SBR and most likely contradicts what a properly set-up blind listening test would show.

Regarding 320 kbps stereo.. and SBR + "Independent Stereo" - no comment   

I guess the above will become a new source of information for braindead and plainly misconfigured encodings 

How about testing this properly with the new Nero AAC Codec (LC) @320 kbps, which handles 320 kbps quite well?


New 128 kbit/s listening test at SoundExpert

Reply #47
Let me explain. I'm not a codec developer. Actually, I don't know in detail how all those SBR, PNS, PS etc. work and, to be honest, don't even want to. All codecs are almost the same to me. When a new one is released I usually run a short pretest, and if the results are promising I add it to the SE service. The procedure for preparing test files is the same for all codecs – whether they are 128 or 320 or 793…

Concerning the new codecs from Winamp: I understand what you are saying about comparing LC and HE modes. But if you run some tests with both of them from Winamp @128 kbit/s you'll see that HE is slightly better. I don't know why – maybe CT codecs have a better SBR implementation, or maybe, on the contrary, a weak LC implementation. I can only add that the new free Nero encoder (according to my preliminary tests) with –q 0.462 (128.0 kbit/s) performs even better than both Winamp ones. The test files are almost ready, and tomorrow the free AAC codec will be added to the 128 listening test.

The aacPlus (HE-AAC) High Bitrate Encoder from Winamp 5.21 is definitely optimized for high bitrates, as it performs poorly @128, very well @192 (according to halb27's tests) and “too well” @320. I don't rule out possible flaws in my methodology, but for now I don't see any proof of them, only various explanations of why this codec shouldn't be so good. Does anybody know whether SBR was really implemented in that codec, and how exactly? I see only the name of the codec, no other technical details. And finally, there are two stereo modes for this codec @320 kbit/s – “independent stereo” and “dual channel”. I assume that only the first one takes the interdependency between channels into account, or am I wrong?

Quote
Regarding 320 kbps stereo.. and SBR + "Independent Stereo" - no comment

I guess this above will be a new source of information for braindead and plain wrong configured encodings

Maybe. But I'm not sure those encodings will be worse than the “–q -2pass” ones 

Quote
How about testing this properly with new Nero AAC Codec (LC) @320 kbps which does 320 kbps quite very well?

What do you mean by “properly”? I can add it to SE. What settings?
keeping audio clear together - soundexpert.org

New 128 kbit/s listening test at SoundExpert

Reply #48
There are 3 possibilities:

1) 128 kbps and 320 kbps HE-AAC is really great and we should use SBR at those bitrates, which also means that Coding Technologies (the inventors of SBR) were completely wrong in their real listening test published in AES, all of our internal testers are completely wrong too, the common sense of everybody who works on developing coding stuff is wrong, and your methodology is right.

2) The codec is in fact NOT running in SBR/HE-AAC mode.

3) Your methodology, which I, Ivan and SebastianG have explained to have serious issues, is in fact so flawed that it gives nonsensical results.

New 128 kbit/s listening test at SoundExpert

Reply #49
@128: I said only that this particular HE-AAC encoder from Winamp 5.21 sounds better than its LC brother. This was revealed during my short preliminary listening tests, and the first SE results also support it (they may change in future, though). In order to confirm or refute this, independent ABX tests of these coders are needed. Plain and simple. Could someone do this for the sake of science? Please, no more explanations, AES papers, previous experience, common sense etc. Just listen. And I wouldn't draw any conclusions about the usefulness of SBR on the basis of this listening test, because you don't know exactly what these codecs are made of.
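For anyone taking up the ABX suggestion: a result is only meaningful if the number of correct answers is unlikely to happen by guessing. A quick binomial significance check (standard statistics, not a SoundExpert or forum procedure) can be sketched as:

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials by pure guessing (each trial is a 50/50 flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 correct out of 16 already gives p < 0.05; 11/16 does not.
p = abx_p_value(12, 16)
```

So a listener who cannot score at least 12/16 has not demonstrated an audible difference at the usual 5% significance level.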

@320: Maybe the inventors of SBR have developed the technology further, or just adapted it to high bitrates, or maybe they have invented something else completely new? I don't know. Otherwise it would have been very illogical for CT to present their new high-bitrate encoder with target bitrates of 128-320 in spite of their own listening tests.

@Garf: I think the truth is somewhere around your item #2.
keeping audio clear together - soundexpert.org
