lowpass reference file

This is an email conversation I had with 2BDecided today:

I've had a thought about analysing your listening tests. As a man who is clearly getting into statistics, I doubt you'll like this... ;-)

The hypothesis: people fall into three groups:
1) those who can hear high frequencies
2) those who can't hear high frequencies
3) those who can't discern anything useful

or maybe
1) those who prefer extended bandwidth and more artefacts
2) those who prefer reduced bandwidth and fewer artefacts
3) those who can't discern anything useful

In your data, group (3) adds noise. Nothing else to say about them really!


Intuitively, groups 1 and 2 shouldn't be "groups" at all, since people will fall somewhere on a sliding scale, depending on their high frequency acuity. However, some codecs exhibit a frequency cut-off beyond which they act very differently (I'm thinking about mp3 codecs here), and this cut-off actually splits your listening panel into two distinct groups. So, for the purpose of this experiment, the members of group 1 can hear over 16 kHz, and the members of group 2 can't.

Consider this experiment. Take an audio codec, choose a fixed bitrate and a single audio sample, and encode it several times, varying the audio bandwidth (lowpass value) between versions. Plot a graph of MOS against bandwidth. This will enable you to find the optimum lowpass value at that bitrate (for that sample!), because that'll be where the MOS is greatest.
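A minimal Python sketch of the analysis step in that experiment: average the scores at each lowpass value and pick the maximum. The ratings, the listener ids and the function names are all invented for illustration; they are not data from any listening test.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (lowpass in kHz, listener id, score on a 1.0-5.0 scale).
# Invented numbers, one sample at one fixed bitrate.
ratings = [
    (12, "a", 3.8), (12, "b", 3.6),
    (14, "a", 4.2), (14, "b", 4.0),
    (16, "a", 4.4), (16, "b", 3.4),
    (18, "a", 3.9), (18, "b", 3.1),
]


def mos_by_lowpass(rows):
    """Return {lowpass_khz: mean opinion score}, sorted by lowpass value."""
    scores = defaultdict(list)
    for lowpass_khz, _listener, score in rows:
        scores[lowpass_khz].append(score)
    return {khz: mean(vals) for khz, vals in sorted(scores.items())}


def best_lowpass(curve):
    """Return the lowpass value with the highest MOS (first one wins a tie)."""
    return max(curve, key=curve.get)


curve = mos_by_lowpass(ratings)
optimum = best_lowpass(curve)
print(curve, optimum)
```

Plotting `curve` gives the MOS-against-bandwidth graph; `optimum` only holds for that one sample at that one bitrate.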

Now, I propose you'll find two peaks on that graph...

One corresponds to the point where the bandwidth is at the maximum value it can take WITHOUT introducing any audible artefacts. This peak comes from the opinions of group 2.

The other peak falls at a higher bandwidth. It corresponds to the point where either (a) increasing the bandwidth further doesn't add any more audible high frequencies for anyone, or (b) increasing the bandwidth further adds hideous artefacts that no-one can stand. This peak comes from the opinions of group 1.


This would be far-fetched, if I hadn't seen it done. I can't find the reference, but I've heard several times that some people prefer wide-bandwidth artefacted sound, while others prefer narrower bandwidth artefact-free sound. This is mainly at lower bitrates, but it seems apparent that (even at higher bitrates) a coding strategy that wrecks high frequencies but preserves lower frequencies will suit group 2, and a coding strategy that maintains higher frequencies (at the slight expense of lower frequency accuracy) will suit group 1. Some people hear only bandwidth - others perceive quality. But many of the ones who perceive quality don't have the ears to perceive bandwidth!


The point of all this? I think you need to segregate the groups. I may be confusing two separate issues here (i.e. the ability to hear high frequencies, and the preference for wider bandwidth or fewer artefacts) but I'm assuming that they are related (if not equivalent).

Hence, if you can remove group 3 (I suggest ABX testing would achieve this) then you may still have to separate groups 1 and 2, because (essentially) they're hearing (or preferring) completely different things, and to combine the results from both groups may be unhelpful.
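A rough Python sketch of why pooling the groups may be unhelpful: drop group 3, then compute one MOS curve per remaining group and look at where each one peaks. The group labels, the ratings and the helper name `curves_by_group` are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical group labels: 1 = hears high frequencies, 2 = does not,
# 3 = cannot discern anything useful. Invented for illustration.
group = {"a": 1, "b": 2, "c": 3, "d": 1, "e": 2}

# Hypothetical ratings: (lowpass in kHz, listener id, score on a 1.0-5.0 scale).
ratings = [
    (13, "a", 3.0), (16, "a", 4.0), (19, "a", 4.5),
    (13, "d", 3.2), (16, "d", 4.2), (19, "d", 4.3),
    (13, "b", 4.0), (16, "b", 4.4), (19, "b", 3.4),
    (13, "e", 4.2), (16, "e", 4.6), (19, "e", 3.6),
    (13, "c", 3.0), (16, "c", 3.0), (19, "c", 3.0),
]


def curves_by_group(rows, groups, drop=(3,)):
    """Return {group: {lowpass_khz: MOS}}, skipping listeners in `drop`."""
    per_group = defaultdict(lambda: defaultdict(list))
    for khz, listener, score in rows:
        g = groups[listener]
        if g in drop:
            continue  # group 3 only adds noise, so leave it out
        per_group[g][khz].append(score)
    return {
        g: {khz: mean(vals) for khz, vals in sorted(by_khz.items())}
        for g, by_khz in sorted(per_group.items())
    }


curves = curves_by_group(ratings, group)
# The lowpass value each group rates highest.
peaks = {g: max(curve, key=curve.get) for g, curve in curves.items()}
print(peaks)
```

With these made-up numbers the two groups peak at different lowpass values, which is the two-peak picture described above; real data may or may not show it.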


Back to your current listening tests. Is there evidence that (in the people who aren't in group 3!) some people prefer one thing, and others the other - falling into two groups? I don't know how you test this hypothesis - I fear I can only point to totally anecdotal evidence. But you can analyse the data - does it show this?


If this is all rubbish, then please ignore it! I fear the differences between what people hear may segregate us into more than two groups - but I wondered if there was any way (and any use) in sorting people into those groups? Maybe (if it's not possible from the current data itself) you could add a pre-test to the next set of tests. This pre-test would check people's hearing (and, implicitly, their audio equipment). Then, in the analysis, you could see if there was a correlation between hearing ability and codec preference/ranking/whatever.
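One way the correlation check might look, as a sketch only: a plain Pearson correlation between a pre-test hearing measure and a preference score. Both variables are assumptions. `cutoff_khz` stands for the highest lowpass each listener could still tell from the original, and `preference` for their rating of a wide-bandwidth encode minus a narrow-bandwidth one. The numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical pre-test result per listener (kHz), invented for illustration.
cutoff_khz = [12, 13, 15, 16, 18]
# Hypothetical preference per listener: wide-bandwidth score minus
# narrow-bandwidth score, same listener order as above.
preference = [-0.8, -0.5, 0.1, 0.4, 0.9]

r = pearson(cutoff_khz, preference)
print(round(r, 3))
```

A value of `r` near +1 would suggest that listeners with better high frequency hearing lean towards the wider bandwidth; a value near 0 would suggest no such link in the data.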

Just a thought.

-------------------------------------------------------

My response:

In fact, this sort of thing can be analyzed by statistics.  However, the first thing that is needed is a good way of splitting up listeners into the various groups.  Ideally, an audiogram of each person is run prior to the listening test.  Since that is impractical for Internet tests, the next best option is to encode another reference file, lowpassed at various frequencies: I would choose 13 kHz as a start (I am almost certain to fail to detect this, by the way), and tell people to report ABX results of this against the original.  This would distinguish groups 1 and 2 during post-screening, at the cost of adding yet another file to download (but a specialized audio tool could incorporate an ABX of a file with various lowpasses as pre-screening tests) and another arduous ABX test.  To weed out group 3, something like ABC/HR is also needed, so that the original is rated each time a sample is rated.  Those who consistently rate the original less than 5.0 would be eliminated.  I would prefer ABC/HR over ABX to reduce tester burden.
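A sketch of how that post-screening could be scored in Python. The ABX part is the usual one-sided binomial test against guessing. The 0.05 significance level, the reading of "consistently" as "more than half the time", and all function names are my assumptions for illustration, not part of any existing tool.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def hears_lowpass(correct, trials, alpha=0.05):
    """True if the ABX result against the lowpassed file beats guessing.

    alpha=0.05 is an assumed threshold, not one given in the discussion.
    """
    return abx_p_value(correct, trials) <= alpha


def passes_reference_check(reference_scores, max_miss_fraction=0.5, top=5.0):
    """False if the hidden original was rated below `top` too often.

    max_miss_fraction=0.5 is an assumed reading of "consistently".
    """
    if not reference_scores:
        raise ValueError("need at least one rating of the hidden original")
    misses = sum(1 for score in reference_scores if score < top)
    return misses / len(reference_scores) <= max_miss_fraction


def assign_group(abx_correct, abx_trials, reference_scores):
    """Return 1, 2 or 3 following the grouping in the first email."""
    if not passes_reference_check(reference_scores):
        return 3  # cannot reliably pick out the original
    return 1 if hears_lowpass(abx_correct, abx_trials) else 2


print(abx_p_value(12, 16))
print(assign_group(12, 16, [5.0, 5.0, 4.0, 5.0]))
```

With 16 trials, 12 correct passes this assumed threshold and 11 does not. A stricter alpha would be sensible if many listeners are screened.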

I think 16 kHz is a rather high distinguishing point.  I suggested 13 kHz because it seems that many people are unaware of, for instance, FhG's (buggy version of its) Alternate codec doing bad things between 13 and 16 kHz.

The problem with using ABX to post-screen is that currently I can't be sure it is performed, or performed properly.

There was anecdotal evidence of the dichotomy you mention even from the comments in the first test -- read what Hans wrote about WMA8, then read what I wrote.  I believe the split opinion is caused by the way WMA8 behaves at frequencies below 12 kHz (which is my limit in music), which is to say quite well most of the time at 128 kbit/s.  However, those who can hear high frequencies well report nasty metallic artifacts.  I have no way to analyze this split currently with statistics, though.  I think your comments are excellent and deserve a wider audience.

ff123