Topic: MPC vs OGG

MPC vs OGG

Reply #50
Quote
Originally posted by layer3maniac
As I said before, I DID ABX the mpc decode 16 out of 16 times. What does that prove? Can you duplicate it? Can Garf? That proves NOTHING.
Even if you ABXed it only 15 out of 16 times, that proves to everybody with better than 99.9% confidence that there's an audible difference.
It doesn't matter whether anybody else can duplicate it or not.
The only other factor is whether you are a credible person (that is, not reporting false results, although I'm not implying that you are).
Juha Laaksonheimo
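
For readers who want to check the confidence figures quoted in this thread, here is a minimal sketch of the usual calculation, assuming each ABX trial is an independent 50/50 guess when no difference is audible. The function name `abx_p_value` is made up for illustration and is not taken from any ABX tool.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

p_15 = abx_p_value(15, 16)  # 17 outcomes out of 65536
p_16 = abx_p_value(16, 16)  # 1 outcome out of 65536
print(f"15/16: p = {p_15:.6f}, confidence = {1 - p_15:.4%}")
print(f"16/16: p = {p_16:.6f}, confidence = {1 - p_16:.4%}")
```

Under that assumption, 15/16 leaves a guessing probability of 17 in 65536, which is where a figure of better than 99.9% comes from.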

Reply #51
The three of you are beyond me in terms of knowledge, and for all I know you are good friends and the apparent ripping on each other is done tongue in cheek. But if that's not the case, and the disagreement has real unfriendly overtones (at least on one side), a conclusion will probably be reached more easily if the points of disagreement are not personalized (e.g., "shows how much you know" and similar comments).

My training is in social science, so I feel qualified to comment on bias, double-blindness and the like. ABX is pretty well set up as double blind. As such, it cannot be affected by bias, to the extent that bias refers to a normative interest in one codec coming out as better than another. But as hearing perception is individual, some people may pick up on differences that others are unable to, or may differentiate between samples using a different methodology (e.g., I may notice general brightness while someone else may notice a specific artifact). But the setup of a double-blind test requires that a person be able to differentiate between the samples (which is the whole point of the test) before any normative bias can be exercised. Finally, phrases such as "audible difference" must be carefully specified to answer the question: audible to whom?
God kills a kitten every time you encode with CBR 320

Reply #52
layer3maniac,

What sounds wrong with the MPC insane file?  Achieving 16/16 on an ABX is not meaningless to the outside world, obviously, because it proves with extremely high confidence that you heard a real difference -- which is the purpose of ABX.  To just say that you heard a difference without providing ABX results is far less convincing (unless I have a lot of experience with you as a listener).

As for the best way to evaluate near-transparent codec quality: that would be to take a panel of trained and sensitive listeners (say 30 if you can get them) and, under controlled conditions, conduct listening tests on many different samples. The MPEG verification tests (AAC) that Ivan pointed to are an example of such a study.

ff123

Reply #53
Quote
ABX is pretty well set up as double blind.  As such, it cannot be affected by bias, to the extent that bias refers to a normative interest in one codec coming out as better than another. 
Bias is OBVIOUSLY present in an ABX test to evaluate codecs. Think about this: If you are comparing a preferred codec to another one, the odds are that you're NOT going to hear the same problems that someone who DOESN'T prefer that codec does. It takes a certain amount of concentrated effort to do an accurate ABX, and it is only human nature to try harder to find problems on a sample from a codec you don't prefer. Now if you do a TRULY blind test, like ff123's listening test, then you have removed this bias.

Reply #54
I also think the pro-mpc bias of some of the people on this board is evident in their response to THIS thread. Look where results favored mpc over wavpack lossy. Do you see all these SAME people whining about the unreliability of the "inferior psychmodel" used by eaqual and stating that these results mean nothing? No, it's only when THEIR preference comes out badly in a test. Then they get all bent out of shape. This is the sort of bias which leads me to place very little credence in abx tests comparing codecs from these same people. You want the truth? You can't handle the truth!

Reply #55
Quote
Originally posted by layer3maniac
Bias is OBVIOUSLY present in an ABX test to evaluate codecs. Think about this: If you are comparing a preferred codec to another one, the odds are that you're NOT going to hear the same problems that someone who DOESN'T prefer that codec does. It takes a certain amount of concentrated effort to do an accurate ABX, and it is only human nature to try harder to find problems on a sample from a codec you don't prefer. Now if you do a TRULY blind test, like ff123's listening test, then you have removed this bias.
Uhh? I gotta ask: have you really done any ABX testing at all?? You are comparing an unknown X sample to A (let's say the original) and B (let's say the encoded file). You don't know if X is the original or the encoded file, and you gotta choose whether X is A or B. If you get a high-confidence result, it proves to everybody that there's a difference. If you don't get a high-confidence result, it doesn't prove anything to anybody.
Excuse me, but this does start to sound like you haven't done any ABX testing at all?? Bias has nothing to do with a high-confidence result.
Juha Laaksonheimo

Reply #56
Good point, layer3maniac. While ABX does make it more difficult for bias to come in because it is not known whether X is the original or the encoded version, normative bias (a specific desire to see one codec "beat" another) may still work its way in, by the listener paying attention specifically to the problem areas of one codec. In ff123's test you don't know what codec is being tested against the original wav, so you can't be listening for specific problem areas of any given codec. However, the more general bias, unmotivated and simply dependent on the hearing ability of the individual, will be present in both ABX and a fully blind test like ff123's, and I think that this type of bias is the larger factor in the vast majority of cases.
God kills a kitten every time you encode with CBR 320

Reply #57
Quote
Originally posted by timcupery
Good point, layer3maniac. While ABX does make it more difficult for bias to come in because it is not known whether X is the original or the encoded version, normative bias (a specific desire to see one codec "beat" another) may still work its way in, by the listener paying attention specifically to the problem areas of one codec.
No, this is not so. Once again: the confidence of ABX is based on the iterative nature of blind testing. You don't know whether X is the original or the encoded file, so if you get a high-confidence result like 15/16, it doesn't matter whether you prefer that codec or not; all it shows is that you can hear a difference, with 99.9% confidence.

Please FF123, you continue...
Juha Laaksonheimo

Reply #58
Here's the only way that I can see bias affecting an ABX test:

Assume that a person wants to test the results of two different codecs against the original wav file. The person wants, either consciously or subconsciously, codec a to win out and be more difficult to tell apart from the original wav. This would result in the person listening more carefully for artifacts in codec b's problem areas and less carefully for the problem areas of codec a. But I admit that this is a far-fetched scenario, and the basic "bias" (if it may be called bias) is simply the ability of different people's ears to pick up on different things. And in this sense, a "fully blind test" - where the listener doesn't know the type of codec which is being compared to the original wav - is no different from ABX. Someone would have to try very, very hard to be motivationally biased in an ABX test.
God kills a kitten every time you encode with CBR 320

Reply #59
JohnV,

layer3maniac's point is that in an ABX test, you know which codec you're testing. So let's say a listener has a conscious or unconscious bias against WMA8 and a similar (but opposite) bias for MPC. The contention is that such a person will try harder to find a difference in the WMA8 case than in the MPC case. This is at least a plausible hypothesis, but I don't know of a way to prove or disprove it, or how to estimate its possible effects. A tool such as the one I described in another thread would remove this possible source of bias.

ff123

So layer3maniac, what was wrong with the MPC encode?

Reply #60
Yeah, of course you can randomize the encoded sample, and then there's none of that kind of "favorite codec" bias in ABX.

I don't think this kind of bias applies to me, however, and I would think it works the other way: I would like to know if my favorite codec's sample is not transparent, so I would ABX it even more rigorously. This is definitely the case in codec tweaking.

I would not bother to ABX a not-very-good codec as carefully... but I would probably get a high-confidence result anyway.

And as has been said, the only thing that matters in ABX is a high-confidence result. Anything else doesn't prove anything (except, for the individual at that moment, that he could not hear a difference).
Juha Laaksonheimo
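
One way to randomize the encoded samples so the listener cannot tell which codec is under test might look like the sketch below. It only builds a relabelling plan. The file names and the `blind_labels` helper are hypothetical, the decoded WAVs are assumed to exist already, and a second person would have to do the actual copying and hold on to the key until the listening is finished.

```python
import random
from typing import Optional

def blind_labels(files_by_codec: dict, seed: Optional[int] = None):
    """Return (playlist, key).

    playlist maps a neutral label to the file the helper should copy under
    that label; key maps the same label back to the codec and stays hidden
    from the listener until all sessions are done.
    """
    rng = random.Random(seed)
    codecs = list(files_by_codec)
    rng.shuffle(codecs)  # the order no longer reveals the codec
    key = {f"sample_{i + 1}": codec for i, codec in enumerate(codecs)}
    playlist = {label: files_by_codec[codec] for label, codec in key.items()}
    return playlist, key

# Hypothetical decoded files, named only for this illustration.
decoded = {"mpc": "decoded_from_mpc.wav", "ogg": "decoded_from_ogg.wav"}
playlist, key = blind_labels(decoded)
print(sorted(playlist))  # the listener only ever sees the neutral labels
```

Each neutral sample is then ABXed against the original as usual, and the key is opened only after every result has been written down.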

Reply #61
I assume that the vast majority of people join you in wanting to critically test their favorite codecs. I suspect that listener fatigue is more of an issue, and thus more of a reason to randomize the codecs, than any desire for one codec to be "proven" better than another. So in general, I have no qualms about the ABX testing procedure, because X remains unknown.
God kills a kitten every time you encode with CBR 320

Reply #62
timcupery took the words out of my keyboard...

But keep in mind that even though people with good ears favor good codecs (mainly mpc), they are striving to reach even better results, which perhaps leads to an open mind when comparing codecs. (It could happen, you know!)

I don't know anything about the inner workings of EAQUAL but it seems like it uses some sort of psychoacoustic model and afaik every one of those models was developed using listening tests conducted by humans. So in the end you are not working with a tool with mathematical precision but with an estimate of how much some audible signal differs from the original.

I wonder how you can be less sceptical about the results of a computer program than about the results of a huge number of people.

If you could use for example the xing model to judge codec quality, would you trust it even though many have proven that it is not accurate?

If you developed a program to simulate the effects of a new type of medicine (and, as with EAQUAL, you knew the simulation's accuracy was limited) and the software told you the medicine wasn't harmful, but 80% of a test group complained about illness, would you still bring it to market? With this piece of software you are able to rule out a large number of potentially harmful substances, but it cannot guarantee that a substance is harmless. It only tells you the substance is worth a closer inspection.

To make things short: the author has said not to make any final judgements with this tool. So don't.

How about this: someone encode and decode the sample with ogg and mpc and provide the wavs (lpac or whatever) so those trained ears out there can have a listen?

Reply #63
Quote
  A tool such as the one I described in another thread would remove this possible source of bias.
Bingo! 
Quote
So layer3maniac, what was wrong with the MPC encode?
It has to do with the background scratching sound in the first ten seconds. With MPC it is scratchier, harsher sounding. If you focus on that, the difference becomes quite obvious. This has always been a difficult album to encode, you should see what it sounds like with Xing or Blade. Ugh. I was so unhappy with the way this particular file gets encoded that I scrapped all my lossy files and went completely lossless with my computer music collection over two years ago...

Reply #64
Quote
Originally posted by JohnV
What? ABX IS a blind test. Sure, you know A and B, but you don't know X. That's what makes it blind. The idea is to find out if you can hear any difference. The only usable result from ABX is that you can say with high confidence that there's a difference. It doesn't matter if some people don't have very good hearing and can't get a high-confidence ABX result, as long as even ONE person can provide a high-confidence result that there's a difference. That's enough to prove there's an audible difference, and that's the purpose of ABX.


That's still not really rigorously "blind."  If you are, for example, biased towards the codec, you might not try as hard to hear artifacts.  If you are biased against the codec, you might read up ahead of time on what sorts of artifacts the particular codec is most likely to produce and listen very carefully multiple times to try to hear them.  A truly blind test would be where the tester doesn't know what codec he is testing.

So I'd say if you *do* ABX a sample a significant number of times, it is "proof" that you can reliably hear a difference. But if you *don't* ABX the sample (i.e. get 8/16 or something), it's not proof that the codec is transparent or nearly transparent - you could just not be listening hard enough or to the right things. And thus ABX can't be used to say "codec A is better than codec B because I could ABX codec A 16/16 but I couldn't for codec B" - that could just be because you're biased towards codec B so didn't listen for artifacts hard enough (either consciously or subconsciously). The only thing it could be reliably used for then would be to find artifacts (i.e. distinguish between real and imagined artifacts).
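
The asymmetry described here can be made concrete with a rough power calculation. This is only a sketch under simplifying assumptions: every trial is independent, a listener has a fixed "true" hit rate, and the pass mark is 12/16 (the lowest 16-trial score whose guessing probability is below 5%). The hit rates are illustrative, not measurements of anyone in this thread.

```python
from math import comb

def tail(n: int, k: int, p: float) -> float:
    """P(at least k successes in n independent trials, each with success rate p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

TRIALS, PASS_MARK = 16, 12  # guessing reaches 12/16 with probability 2517/65536

print(f"pure guesser passes:   {tail(TRIALS, PASS_MARK, 0.5):.3f}")
for true_rate in (0.6, 0.7, 0.8, 0.9):
    print(f"true hit rate {true_rate:.0%}:     {tail(TRIALS, PASS_MARK, true_rate):.3f}")
```

Under these assumptions, a listener who really does hear the artifact but only identifies X correctly 70% of the time reaches the pass mark in fewer than half of such sessions, so a failed ABX says little about transparency.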

Reply #65
Quote
Originally posted by Delirium


That's still not really rigorously "blind."  If you are, for example, biased towards the codec, you might not try as hard to hear artifacts.
Well, I already answered this concern. ABX itself of course doesn't say that you have to know which codec produced the encoded file, so this is not an ABX problem. The tester can randomize the encoded samples if wanted.

Quote
If you are biased against the codec, you might read up ahead of time on what sorts of artifacts the particular codec is most likely to produce and listen very carefully multiple times to try to hear them.  A truly blind test would be where the tester doesn't know what codec he is testing.
Sure, but this only matters if the tester doesn't hear a difference. If he does, then it doesn't matter if he's biased or not. But as I said, I personally test my favorite codecs even more rigorously than not so good codecs.

Quote
So I'd say if you *do* ABX a sample a significant number of times, it is "proof" that you can reliably hear a difference.  But if you *don't* ABX the sample (i.e. get 8/16 or something), it's not proof that the codec is transparent or nearly transparent - you could just not be listening hard enough or to the right things.
Yes, this is what I've said in this thread n+1 times already.

Quote
And thus ABX can't be used to say "codec A is better than codec B because I could ABX codec A 16/16 but I couldn't for codec B" - that could just be because you're biased towards codec B so didn't listen for artifacts hard enough (either consciously or subconsciously).  The only thing it could be reliably used for then would be to find artifacts (i.e. distinguish between real and imagined artifacts).
Yes, this is true. I personally don't use ABX to judge which codec sample I prefer audibly more. For that I use blind iterative ABC testing. (A is always the original and B/C is either codec_1 or codec_2 changing randomly). The result is of course subjective.
Juha Laaksonheimo
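
The ABC procedure described above could be sketched roughly as follows. A is always the original, and the two codecs are shuffled between slots B and C on every trial. The function names and the list of picks are made up for illustration, and this is not the software actually used for such tests.

```python
import random
from collections import Counter

def make_abc_trials(n_trials: int, seed=None) -> list:
    """Build n_trials slot assignments; the listener sees only A, B and C."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        b, c = rng.sample(["codec_1", "codec_2"], 2)  # random order per trial
        trials.append({"A": "original", "B": b, "C": c})
    return trials

def tally(trials: list, picks: list) -> Counter:
    """picks[i] is 'B' or 'C': the slot the listener preferred in trial i."""
    return Counter(trial[pick] for trial, pick in zip(trials, picks))

trials = make_abc_trials(8)
picks = ["B", "C", "C", "B", "B", "C", "B", "C"]  # illustrative answers only
print(tally(trials, picks))
```

Because the slot order changes on every trial, a consistent preference for one codec can only come from how it sounds, although the preference itself remains subjective.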

Reply #66
Quote
Neither a big listening test nor an objective measurement tool, no matter if good or bad, can yield an objective proof of quality, because every proof of quality can only be *subjective*. Everyone has the right to say about any rating, listening test or measurement, that this doesn't reflect his opinion.


I wonder how you conclude that a big, well-conducted listening test is not an objective measure.

The individual results may not be objective in themselves, but the outcome of the test should be, provided that those who conducted it did not bias the test in any way.

You can take a large amount of subjective measurements and use that to make an objective conclusion about them.

You can argue towards the meaning of the result of a listening test, but not against its objectivity.

--
GCP

Reply #67
Quote
At least with a tool like this we get completely unbiased results.


The results are biased towards a psymodel that corresponds better with that of the tool itself.

That should be a capital offense in perceptual coding.

For comparisons within a single psymodel (e.g. parameter fine-tuning), that should not matter as much.

--
GCP

Reply #68
Quote
Originally posted by layer3maniac
I said that ON THIS SAMPLE, THIS TOOL says that ogg is 18.5 times better than mpc. And that's TRUE, like it or not...


If you define 'better' as 'produces output that corresponds more to what the very limited psymodel of the tool considers to be good output', then I'm fine with you.

I personally prefer a psymodel that produces more transparent output for human hearing. That's my definition of 'better'.

--
GCP

Reply #69
Quote
Originally posted by layer3maniac
As I said before, I DID ABX the mpc decode 16 out of 16 times. What does that prove? Can you duplicate it? Can Garf? That proves NOTHING. That's why I DID the eaqual test on THIS sample. EVERYONE can duplicate the eaqual test, which proves more than one individual's skewed abx test.


First of all, it's perfectly possible to reproduce the ABX tests. Try the sample again, and you should get a significant score again. The same will be true for me. This is perfectly ok: we obviously have different hearing, and we just proved neither sample is transparent.

Obviously, for me the MPC is better, and for you the Ogg. This immediately means the tool will be wrong for one of us, no matter how accurate it is.(*)

Secondly, I want to point out that the tool cannot prove anything, and a big listening test can.

The reason is simple. After a listening test we can calculate the confidence in the results. Getting out who is 'best' from a listening test is easy, just add up the results. But as you have seen in the past, getting a confidence margin out of it is _hard_. I and especially ff123 worked hard to get anything usable out of it.

The tool has no such thing. It gives a result with no error margin, nothing. Anybody who does experimental science will understand that this means it can never prove anything, since you have no idea what you can conclude from its output.

Is a given result a coincidence, or within the tolerance of its measurement? You cannot tell...

(*) Note that if it had a confidence level, it would be possible for the tool to be 'right' in this case.

--
GCP
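
To illustrate what a confidence margin on a listening test can look like, here is a minimal sketch that puts an approximate 95% interval around the mean rating difference between two codecs rated by the same listeners. The ratings are invented purely for illustration, and the normal approximation is a simplification (a t-distribution would give a wider, more honest interval for a small panel). It is not the analysis that was actually carried out on the earlier listening test.

```python
from statistics import NormalDist, mean, stdev

def mean_diff_interval(ratings_a, ratings_b, confidence=0.95):
    """Approximate interval for the mean of paired differences (a - b).

    Needs at least two listeners; uses a normal approximation.
    """
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    centre = mean(diffs)
    std_err = stdev(diffs) / len(diffs) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * std_err, centre + z * std_err

# Hypothetical ratings from eight listeners on a 1-5 scale (not real data).
codec_x = [4.5, 4.0, 4.8, 3.9, 4.2, 4.6, 4.1, 4.4]
codec_y = [4.2, 4.1, 4.5, 4.0, 4.0, 4.3, 4.2, 4.1]
low, high = mean_diff_interval(codec_x, codec_y)
print(f"mean difference lies roughly between {low:.2f} and {high:.2f}")
```

If the interval includes zero, the test has not shown that either codec is preferred. A single number from a measurement tool offers no equivalent check.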

Reply #70
ABX is meant for comparing A and B.
If you can't find a difference between A and B (let's say 15/20), that proves NOTHING.
If one person (assuming he is not a liar) can find a difference between A and B (let's say 30/30 to rule out luck, but 20/20 should be quite enough), that proves that at least one person on earth can tell them apart, so THERE IS a difference.
There is no bias here; the mind, the ears, the hardware and the feelings of this person don't count.

Now, you can't say that codec foo is better than codec bar because you got 17/30 with codec foo versus the original and 30/30 with codec bar versus the original.

Reply #71
Quote
Originally posted by layer3maniac
Bingo!   It has to do with the background scratching sound in the first ten seconds. With MPC it is scratchier, harsher sounding. If you focus on that, the difference becomes quite obvious. 


Reading porcelain.wav ...
Reading porcelain.mpc ...
2*16 bit 44100 Hz*15.468 sec
Listening  X    Vote for X:=B  OK    1/1    1.00000                     
Listening  X    Vote for X:=B  OK    2/2    2.00000  1.00 bit           
Listening  X    Vote for X:=B  OK    3/3    4.00000  1.00 bit           
Listening  X    Vote for X:=A  -      3/4    1.60000  0.22 bit           
Listening  X    Vote for X:=B  OK    4/5    2.66667  0.35 bit           
Listening  X    Vote for X:=B  -      4/6    1.45455  0.10 bit           
Listening  X    Vote for X:=A  OK    5/7    2.20690  0.19 bit           
Listening  X    Vote for X:=A  OK    6/8    3.45946  0.25 bit           
Listening  X    Vote for X:=B  OK    7/9    5.56522  0.30 bit           
Listening  X    Vote for X:=B  OK    8/10    9.14286  0.35 bit           
Listening  X    Vote for X:=B  OK    9/11    15.2836  0.39 bit           
Listening  X    Vote for X:=B  OK    10/12    25.9241  0.42 bit           
Listening  X    Vote for X:=B  OK    11/13    44.5217  0.45 bit           
Listening  X    Vote for X:=A  OK    12/14    77.2830  0.48 bit           
Listening  X    Vote for X:=B  -    12/15    28.4444  0.34 bit           
Listening  X    Vote for X:=B  OK    13/16    47.0129  0.37 bit

The difference is that in the original the noisiness of the scratching is constant, while in the MPC it increases as the string gets louder.

I cannot hear the difference itself (edit: in overall noise level I mean), but I can hear that it gets louder in the MPC.

--
GCP
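
As a cross-check on the log above, the sketch below replays its OK/- column and prints the running probability of doing at least that well by guessing. It assumes independent 50/50 trials and does not try to reproduce the last two numeric columns of the abx.c output, whose exact definition is not given in this thread.

```python
from math import comb

def guess_tail(correct: int, trials: int) -> float:
    """Chance of at least `correct` right answers in `trials` pure guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Outcome column of the log: "OK" means X was identified correctly, "-" means not.
outcomes = "OK OK OK - OK - OK OK OK OK OK OK OK OK - OK".split()

correct = 0
for trial, outcome in enumerate(outcomes, start=1):
    correct += outcome == "OK"
    print(f"{correct:2d}/{trial:<2d}  p = {guess_tail(correct, trial):.4f}")
```

The final 13/16 corresponds to 697 of the 65536 possible guessing outcomes, a p-value of about 0.0106, or roughly 98.9% confidence.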

Reply #72
Quote
Originally posted by jojolapin
ABX is meant for comparing A and B.
If you can't find a difference between A and B (let's say 15/20), that proves NOTHING.


15 out of 20 is 98% confidence.

Unless you did lots of tests, that certainly proves something.

Quote
If one person (assuming he is not a liar) can find a difference between A and B (let's say 30/30 to rule out luck, but 20/20 should be quite enough),


20 out of 20 has a luck probability of about 0.0001% (1 in 1,048,576).

No need to go to 30

I think I agree with the rest of what you wrote, though. Doing 13/16 on one clip and 16/16 on another doesn't prove the second is better than the first.

--
GCP
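
The caveat "unless you did lots of tests" can be put into numbers. The sketch below assumes every session is an independent run of pure guessing and asks how likely it is that at least one of several sessions reaches 15/20 by luck. The session counts are arbitrary examples.

```python
from math import comb

def guess_tail(correct: int, trials: int) -> float:
    """Chance of at least `correct` right answers in `trials` pure guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def chance_of_fluke(correct: int, trials: int, sessions: int) -> float:
    """P(at least one of `sessions` independent guessing runs scores >= correct)."""
    return 1 - (1 - guess_tail(correct, trials)) ** sessions

print(f"single 15/20 session: p = {guess_tail(15, 20):.4f}")
for sessions in (1, 5, 10, 20):
    print(f"{sessions:2d} sessions: fluke chance = {chance_of_fluke(15, 20, sessions):.3f}")
```

A single 15/20 has a guessing probability of about 2%, but someone who runs twenty sessions and reports only the best one would see such a score by luck roughly a third of the time.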

Reply #73
Garf, how do you get that cool formatted output on your ABX results?

Reply #74
Quote
Originally posted by layer3maniac
Garf, how do you get that cool formatted output on your ABX results?


I use the abx.c from LAME, under Linux, and copy & paste directly from a terminal window.

--
GCP