Topic: New Public Multiformat Listening Test (Jan 2014)

New Public Multiformat Listening Test (Jan 2014)

Reply #350
I don't get your reaction.

Nor I yours.

Does it make sense to take extra precautions to ensure samples do not clip during decoding to fixed-point?

New Public Multiformat Listening Test (Jan 2014)

Reply #351
Yes, please. Many samples have gone offline.

OK. I'm on vacation right now; I'll provide the necessary samples around Jan 8. By the way, I found this thread and remembered that I also have a backup of the samples from the now defunct/hijacked website ff123.net/samples. I can provide those as well.

Regarding clipping during decoding: I tend to avoid clipping in any case in my own listening test. That's why my 2010 HA set peaks at around -1 dBFS.

Chris
If I don't reply to your reply, it means I agree with you.

New Public Multiformat Listening Test (Jan 2014)

Reply #352
As one of the organizers, I want to express (just express, not get drawn into endless debates) my opinion on some of the questions that were discussed here recently.

1. Quality settings selection. What is actually an indicator of encoding efficiency? It is the "quality/size" ratio, of course, or better, the "quality/average bitrate" ratio. If we want to compare the quality of several encoders at some bitrate, we first of all need essentially identical bitrates. The problem is how to achieve that in VBR mode. Here we must first understand who uses encoding and how, i.e. who the target audience of our test is. Then we must recognize that people encode very different kinds of music. So to get the average bitrate for a given quality setting, ideally we would analyze all the music contained in the libraries of our target audience, or at least make a selection that represents, for example, the genre distribution of those libraries. This is what we are actually trying to do, and the only right way is to:
1. Maximize the number of analyzed tracks.
2. Maximize their diversity.
3. Bring the genre distribution (I mean the percentage ratio) as close as possible to the actual distribution in our target audience's libraries.

So the greater the number of tracks and their diversity, the closer our result is to the "true value". By the same logic, if we decrease the amount of analyzed material, the random error rises and the results become less accurate. So analyzing the resulting bitrate of just 20-40 samples is not the way to go, because we need a value that holds on average for as large a music library as possible.
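To make the idea concrete, here is a minimal Python sketch (not part of the test procedure) of how a library-wide average bitrate for one quality setting could be estimated. The encoder command (opusenc at 96 kbit/s), the folder name and the WAV-only assumption are illustrative placeholders, not anything agreed in this thread:

import subprocess
import wave
from pathlib import Path

def wav_seconds(path: Path) -> float:
    """Duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()

total_bits = 0.0
total_seconds = 0.0
for wav in sorted(Path("music_library").rglob("*.wav")):   # hypothetical library folder
    out = wav.with_suffix(".opus")
    # Hypothetical encoder invocation; substitute the encoder and quality setting under test.
    subprocess.run(["opusenc", "--bitrate", "96", str(wav), str(out)], check=True)
    total_bits += out.stat().st_size * 8
    total_seconds += wav_seconds(wav)

print(f"Library-wide average bitrate: {total_bits / total_seconds / 1000:.1f} kbit/s")

The point of computing total bits over total duration (rather than averaging per-track bitrates) is that the result reflects the whole library, which is exactly the "true value" the larger track count is meant to approach.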

On the other hand, I saw some fair comments here about sample selection. But that is another step; see below.

2. Sample selection. Again, ideally we should orient ourselves by our target audience's taste. So, about that comment: we really do need a set of samples that represents the variety of music people listen to in real life. But in that case we must be absolutely clear about who our target audience is, because if we analyze the average musical taste of all people in the world, we will get mostly pop music. So we need to think about who would use these mostly "enthusiast" audio formats, and for what. Then, again, we would have to make a statistical analysis of their music libraries, group the recordings by genre (or some other attributes), and then make a random selection of a small number of samples from each group. But that is the ideal case. It would lead to a proper analysis of average encoder efficiency over the library of the average audio enthusiast. In practice, however, this is unimplementable, so we need to go another way.

Above I mentioned "some other attributes" for grouping the music material. This is also a reasonable decision: we group our samples by their kind: sharp transients, pure tones, wide stereo. Note that we should select genuinely complex samples (but not just killer samples, and especially not completely synthetic non-musical samples), because in real life most music is easy to encode and will show no audible differences after encoding.
So in this case we have the following requirements:

1) Collect a large number of hard-to-encode (but musical) samples.
2) Group them by some attributes, for example by signal type (pure tone / transient / stereo); within each group we may then include representatives of each music genre to make the test even more objective. We should also include in each group samples with and without vocals, and so on.
3) After grouping, make a random selection of samples from each group (subgroup).

This way we will get results that are less objective than in the first case (analyzing the percentage ratio of each genre), but much more informative. After testing we can present not only average results over all samples, but also results for each group of samples, so we will be able to evaluate the behaviour of each codec on each group (for example, to compare which codec handles transients better). That is why we need to increase the number of samples: we must have at least a few samples in each subgroup.

We also need to make the selection of samples presented to each listener (since no one will test all of them) completely random.
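As an illustration only, here is a minimal Python sketch of that grouped (stratified) random selection. The group names and file names are hypothetical; in practice they would come from a tagged sample bank:

import random

# Hypothetical tagged sample bank, grouped by signal type.
sample_bank = {
    "transients":  ["castanets.wav", "glockenspiel.wav", "drum_solo.wav"],
    "pure_tones":  ["harpsichord.wav", "pitch_pipe.wav", "organ.wav"],
    "wide_stereo": ["applause.wav", "choir.wav", "orchestra.wav"],
}

def pick_test_set(bank, per_group, seed=2014):
    """Draw the same number of samples at random from every group."""
    rng = random.Random(seed)   # fixed seed so the draw can be reproduced and audited
    chosen = []
    for group, candidates in bank.items():
        chosen.extend(rng.sample(candidates, per_group))
    return chosen

print(pick_test_set(sample_bank, per_group=2))

Drawing the same number of samples from every group is what later allows per-group results (e.g. "codec X on transients") to be reported alongside the overall average.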
🇺🇦 Glory to Ukraine!

New Public Multiformat Listening Test (Jan 2014)

Reply #353
3. And going back to the question of clipping prevention. I raised this question, and now, after some deliberation, I want to say what I think about it. I think we must completely eliminate clipping from our test, because we are testing encoders and encoding, and during encoding an excess over full scale does not by itself mean a loss of quality. Clipping is a problem of the subsequent decoding and processing, and we must test the maximum potential of each codec, which includes decoding to 32-bit float, for example; in that case there is no additional quality loss (clipping).

So eventually I suggest encoding the original samples, then decoding to 32-bit float and peak-normalizing every sample to 0 dBFS (with an equal gain applied to each group of decoding results). For example, we encode Sample 1 to AAC, Opus and MP3. After decoding to 32-bit float we get the maximum peak value on LAME: +2 dBFS. Then we decrease the level of each decoding result by 2 dB and convert the data from 32-bit float to 16 bit (which is supported by ABC/HR). This is an example of how to play lossy audio with maximum quality (e.g. using foobar2000 ReplayGain with clipping prevention).
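A minimal sketch of that step, assuming the decoded results already exist as 32-bit float WAV files and that the numpy and soundfile Python packages are available (the file names are hypothetical):

import numpy as np
import soundfile as sf

decoded = ["sample1_aac.wav", "sample1_opus.wav", "sample1_mp3.wav"]

# Read all decodes of this sample and find the highest peak among them.
signals, rate = [], None
for name in decoded:
    data, rate = sf.read(name, dtype="float32")
    signals.append(data)
peak = max(float(np.max(np.abs(x))) for x in signals)

# Apply the same gain to every decode, so relative levels within the group are
# preserved while the loudest one lands at 0 dBFS, then write 16-bit PCM for ABC/HR.
gain = 1.0 / peak
for name, x in zip(decoded, signals):
    sf.write(name.replace(".wav", "_16bit.wav"), x * gain, rate, subtype="PCM_16")

Using one gain for the whole group (rather than normalizing each decode separately) keeps the codecs' relative loudness intact, which matches the "equal gain per group" requirement above.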
🇺🇦 Glory to Ukraine!

New Public Multiformat Listening Test (Jan 2014)

Reply #354
we must test the maximum potential of each codec, which includes decoding to 32-bit float, for example

Providing some headroom at the input to the coder seems safest. I hope this means that post-decoding normalisation would not be needed, but if it is, loudness-based rather than peak-based seems safer.

In case the input sample is not correctly band-limited (due to loudness-wars smashing, or due to synthesised waveforms), band-limiting the sample before coding is also a good precaution.
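For illustration, a minimal Python sketch of those two precautions, assuming the numpy, scipy and soundfile packages; the -1 dBFS headroom target and the 20 kHz cutoff are arbitrary example values, not something decided in this thread:

import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

data, rate = sf.read("sample.wav", dtype="float32")    # hypothetical input sample

# Headroom at the encoder input: attenuate so the peak sits at about -1 dBFS.
target_peak = 10 ** (-1.0 / 20.0)
peak = float(np.max(np.abs(data)))
if peak > target_peak:
    data = data * (target_peak / peak)

# Optional band-limit for material that is not correctly band-limited.
sos = butter(8, 20000, btype="lowpass", fs=rate, output="sos")
data = sosfiltfilt(sos, data, axis=0)

sf.write("sample_prepared.wav", data, rate, subtype="PCM_24")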

New Public Multiformat Listening Test (Jan 2014)

Reply #355
1) Collect a large number of hard-to-encode (but musical) samples.
2) Group them by some attributes, for example by signal type (pure tone / transient / stereo); within each group we may then include representatives of each music genre to make the test even more objective. We should also include in each group samples with and without vocals, and so on.
3) After grouping, make a random selection of samples from each group (subgroup).

This way we will get results that are less objective than in the first case (analyzing the percentage ratio of each genre), but much more informative. After testing we can present not only average results over all samples, but also results for each group of samples, so we will be able to evaluate the behaviour of each codec on each group (for example, to compare which codec handles transients better). That is why we need to increase the number of samples: we must have at least a few samples in each subgroup.

I like this idea too. Maintaining a properly segmented bank of sound samples would be helpful for many audio researchers and enthusiasts. Along with killer samples, such a bank could contain "ordinary" ones of different types: genres, voices, noisy/dirty, clipped, etc. Some system of tags could be sufficient for the purpose. Then, depending on the goal of the test, samples with the appropriate tags could be selected at random. Using ordinary (usual) sound samples in listening tests is common practice, especially when testing low-bitrate encoders. For tests at 96 kbit/s and above, usual sound material is of little use, and that is another reason not to use sample sets that are representative of some population of music in 96+ tests: it ends up producing a large set of usual samples which are very hard to test at 96+. On the other hand, this approach could be interesting for testing at 24-64 kbit/s.
keeping audio clear together - soundexpert.org

New Public Multiformat Listening Test (Jan 2014)

Reply #356
we must test the maximum potential of each codec, which includes decoding to 32-bit float, for example

Providing some headroom at the input to the coder seems safest.


Maximizing this way requires two or more passes to tune the gain value (we can't predict how far each encoder will go over the original peak value). These are just needless difficulties and give no advantage over after-decoding normalisation (which is performed in foobar2000's ABX component, for example).

I hope this means that post-decoding normalisation would not be needed, but if it is, loudness-based rather than peak-based seems safer.

Loudness-based normalization will be done by ABC/HR as well.

In case the input sample is not correctly band-limited (due to loudness-wars smashing, or due to synthesised waveforms), band-limiting the sample before coding is also a good precaution.

I don't really think this is needed; maybe it's even wrong (it does not correspond to real encoding conditions), because we test the whole encoding algorithm, including low-pass filtering, and we must keep the material for encoding in its original form.
🇺🇦 Glory to Ukraine!

New Public Multiformat Listening Test (Jan 2014)

Reply #357
In case the input sample is not correctly band-limited (due to loudness-wars smashing, or due to synthesised waveforms), band-limiting the sample before coding is also a good precaution.

I don't really think this is needed; maybe it's even wrong (it does not correspond to real encoding conditions), because we test the whole encoding algorithm, including low-pass filtering, and we must keep the material for encoding in its original form.


I agree that we should not be doing band-limiting.

New Public Multiformat Listening Test (Jan 2014)

Reply #358
Oops

New Public Multiformat Listening Test (Jan 2014)

Reply #359
3. And going back to the question of clipping prevention. I raised this question, and now, after some deliberation, I want to say what I think about it. I think we must completely eliminate clipping from our test, because we are testing encoders and encoding, and during encoding an excess over full scale does not by itself mean a loss of quality. Clipping is a problem of the subsequent decoding and processing, and we must test the maximum potential of each codec, which includes decoding to 32-bit float, for example; in that case there is no additional quality loss (clipping).

If many decoders clip, my recommendation is to let it clip. LAME deals with it by decreasing the volume (by about 2%), so it won't be painfully bad. If you decode to 32-bit float and decrease the volume by 25%, almost all clipping can be avoided, but I want the test to be close to people's actual listening conditions, including portable players.
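For reference, a quick Python check of what those volume reductions correspond to in decibels (assuming "2%" and "25%" refer to linear amplitude scaling, which is my reading, not something stated explicitly):

import math

for factor in (0.98, 0.75):     # LAME's ~2% attenuation, and a 25% reduction
    print(f"scale x{factor:.2f} -> {20 * math.log10(factor):+.2f} dB")
# scale x0.98 -> -0.18 dB
# scale x0.75 -> -2.50 dB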

New Public Multiformat Listening Test (Jan 2014)

Reply #360
3. And going back to the question of clipping prevention. I raised this question, and now, after some deliberation, I want to say what I think about it. I think we must completely eliminate clipping from our test, because we are testing encoders and encoding, and during encoding an excess over full scale does not by itself mean a loss of quality. Clipping is a problem of the subsequent decoding and processing, and we must test the maximum potential of each codec, which includes decoding to 32-bit float, for example; in that case there is no additional quality loss (clipping).

If many decoders clip, my recommendation is to let it clip. LAME deals with it by decreasing the volume (by about 2%), so it won't be painfully bad. If you decode to 32-bit float and decrease the volume by 25%, almost all clipping can be avoided, but I want the test to be close to people's actual listening conditions, including portable players.


In real conditions there are also cases where simplified decoders reduce playback quality, and many other things that can affect quality. Those are problems of the device's design.

Also, for portable players I recommend using the MP3Gain/AACGain utilities. If users want maximum quality, they should run these utilities before uploading tracks to their portable devices (or other hardware). And again, we must focus on encoding quality and irreversible quality losses only.

Finally, clipping can introduce unwanted deviations into the quality comparison, deviations which would not apply to properly played-back audio (with clipping prevention) in real life.

I think the problem of clipping (and its audibility) should be investigated in a separate study.
🇺🇦 Glory to Ukraine!

New Public Multiformat Listening Test (Jan 2014)

Reply #361
Hello, Guys.


Some of you have already received an invitation to a new forum, "Audio Folks".
It's a provisional one until we get an official site ready. http://audiofolks.proboards.com/

It's good to have an alternative. Now we have more flexibility to organize different topics, more fluid discussion, etc.
We are considering moving everything there over the next few days. So we're waiting for you there. Please register and help us run a good test, as we have done previously.

STATUS of a Public Multiformat Listening Test

New Public Multiformat Listening Test (Jan 2014)

Reply #362
As people have started to register at the new place, this thread is considered obsolete.

No more discussion here. We are winding it down.

New Public Multiformat Listening Test (Jan 2014)

Reply #363
A kind of OT, or not.

...
It's a provisional one until we get an official site ready. http://audiofolks.proboards.com/

It's good to have an alternative. Now we have more flexibility to organize different topics, more fluid discussion, etc.
We are considering moving everything there over the next few days. So we're waiting for you there. Please register and help us run a good test, as we have done previously.
...

I'm confused. A new site to continue talking about the same listening test? What can't be done on HA that would require a new site?
I don't have much time and am not eager to spend more time visiting yet another audio site.

Edit: while I was writing this post, IgorC posted that this discussion thread will be ended right now. I guess I won't participate in the listening test. Very confusing.

New Public Multiformat Listening Test (Jan 2014)

Reply #364
Alexxander,

I have sent you a PM.

It is the end here.

New Public Multiformat Listening Test (Jan 2014)

Reply #365
Alexxander,

There is no change.
We're still the same people (participants, organizers, etc.) who conducted the previous test.
Just a new place.

New Public Multiformat Listening Test (Jan 2014)

Reply #366
I will ask the administrators to close this discussion.

Organization of this test has moved to Audio Folks.

Thank You.