Hydrogenaudio Forums

Hydrogenaudio Forum => General Audio => Topic started by: audioslut512 on 10 November, 2007, 12:59:47 PM

Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 12:59:47 PM
I recently did some ABX tests of music encoded in FLAC levels 0-8.

To my surprise I could tell the difference more often than not.

Yet I couldn't help but wonder, what does this really prove?

It seems that if an audible difference exists between two tracks, that difference wouldn't necessarily indicate anything as to which track is of higher quality, but rather, that those two tracks are merely different from one another...

Am I missing the point in all of this?
Title: What does ABX really prove?
Post by: user on 10 November, 2007, 01:05:27 PM
Please show details of your ABX tests.
Do you have logs?

And more important,

describe more exactly what you have ABXed.

flac -0 vs. flac -8 of the same wav source file?

Well, if you really have shown that you can hear a difference between these files, then either you suffer from a strong imagination or your ABX test was not blind. Either way, don't generalize from your personal test to ABX tests as a whole.
What an ABX test can prove always depends on the specific setup.
Title: What does ABX really prove?
Post by: tgoose on 10 November, 2007, 01:06:38 PM
An ABX test tells you how likely it is that you can tell the difference between two tracks. If by "more often than not" you mean that you got it right more than 50% of the time, that doesn't prove anything. If you get significantly above 50% over a large number of trials (more than 20) and it's repeatable then that might imply that there is something wrong with the FLAC encoder. Do you have the ABX log saved to post so I can see what's happened?

And yes, you're right, ABX can never tell you which is "better", only that there is a difference. So it's only useful for tuning something for transparency or to test for problems with encoding, not really many other things.
Title: What does ABX really prove?
Post by: maksm on 10 November, 2007, 01:07:24 PM
No, you're not.
If there is a difference, and the only source of differences is an encoding (because we encoded the same track), then it must be the encoding.
And we "know" (this isn't necessarily true) that the encodings have a (mathematical) scale of their quality.
Therefore the lower quality one must be "worse".
Title: What does ABX really prove?
Post by: tgoose on 10 November, 2007, 01:11:05 PM
No, you're not.
If there is a difference, and the only source of differences is an encoding (because we encoded the same track), then it must be the encoding.
And we "know" (this isn't necessarily true) that the encodings have a (mathematical) scale of their quality.
Therefore the lower quality one must be "worse".


No?

If you had a poorly tuned 320kbps MP3 and a well tuned VBR MP3 from a different encoder, and you ABXed the two then if you could tell a difference there's not necessarily a way of telling which is better without referencing the original uncompressed version.
Title: What does ABX really prove?
Post by: greynol on 10 November, 2007, 01:15:02 PM
There are other tests to rank differences.  The point of ABX is to determine if you can distinguish a difference.
Title: What does ABX really prove?
Post by: Nikaki on 10 November, 2007, 01:20:19 PM
Doing ABX on FLAC and actually being successful in it, while a binary comparison of the resulting wav files shows no difference, is an indication that you're imagining things
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 01:36:21 PM
There are other tests to rank differences.  The point of ABX is to determine if you can distinguish a difference.


This is more along the lines of what I was asking.  It seems that ABX only helps you determine whether there is  or is not a difference between two tracks encoded from the same source, and not necessarily whether those differences make one track of higher quality than another...
Title: What does ABX really prove?
Post by: greynol on 10 November, 2007, 01:40:09 PM
An ABX test cannot prove that there isn't a difference.  It only has the ability to demonstrate that the specific person taking the test was able to determine that there was a difference.  These are not the same thing.
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 01:43:17 PM
Actually, an ABX test cannot prove that there isn't a difference, it only has the ability to demonstrate that the specific person taking the test was able to determine there was a difference.


Agreed.
Title: What does ABX really prove?
Post by: greynol on 10 November, 2007, 01:50:37 PM
I concur with the first response you were given, btw.  Telling us that you can tell an audible difference between flac compression levels brings your methodology into serious question.
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 02:03:26 PM
I concur with the first response you were given, btw.  Telling us that you can tell an audible difference between flac compression levels brings your methodology into serious question.



To clarify, I didn't mean to say I could tell the difference between FLAC 0-8 more often than not...

Rather, that of some of the tracks that I tested, which I had previously encoded from the same source (using the same FLAC 1.2.1 codec), difference to my own ear was indicated to me (and only me) by a probability that was statistically significant.

HOWEVER, I did NOT mean to start some sort of argument as to the means by which I performed the tests or for that matter, anything to do with my testing at all (I probably shouldn't have even mentioned those tests to begin with, it was just for context).

I was just trying to frame a much more general question, pertaining to the difference between two tracks sounding different as opposed to one track sounding better than another...

My apologies if I came across otherwise!
Title: What does ABX really prove?
Post by: greynol on 10 November, 2007, 02:08:25 PM
You're not understanding.

If you think you're able to tell the difference between two flac encodings and feel that the difference lies only in the encodings themselves, then you aren't conducting a proper ABX test.
Title: What does ABX really prove?
Post by: saratoga on 10 November, 2007, 02:08:48 PM
To my surprise I could tell the difference more often than not.


To clarify, I didn't mean to say I could tell the difference between FLAC 0-8 more often than not...


difference to my own ear was indicated to me (and only me) by a probability that was statistically significant.


Wait, so which is it?
Title: What does ABX really prove?
Post by: Cosmo on 10 November, 2007, 02:31:01 PM
It seems that if an audible difference exists between two tracks, that difference wouldn't necessarily indicate anything as to which track is of higher quality, but rather, that those two tracks are merely different from one another...

This is absolutely true.

But you shouldn't ever hear a difference between two lossless files, because regardless of the type or amount of compression, they are always decoded (bit identical to original) at playback. So if a difference IS audible, it must be due to improper test methodology or buggy software or somesuch reason...
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 05:47:28 PM

To my surprise I could tell the difference more often than not.


To clarify, I didn't mean to say I could tell the difference between FLAC 0-8 more often than not...


difference to my own ear was indicated to me (and only me) by a probability that was statistically significant.


Wait, so which is it?


The ABX tests which I performed were on tracks which were contained within the set: FLAC levels 0-8.

If I performed an ABX test, the files tested were limited to FLAC 1,2,3,4,5,6,7,8.

That does NOT indicate that I tested ALL levels 0-8.

OF THE TRACKS WHICH I DID TEST, I was able to discern a difference more often than not.

Statistically significant is that which is unlikely to occur by chance.

A statistically significant difference in this case, merely implies that statistical evidence indicates that a discernible difference has a greater probability of existing than it does a probability of not existing.

Whether that difference is due to errors in encoding, playback, or whatever, is entirely irrelevant.
Title: What does ABX really prove?
Post by: eevan on 10 November, 2007, 06:17:19 PM
No offence, but having read all your posts, I'm not sure exactly what you are implying. Why should this be of any interest if you say that it's irrelevant to pinpoint the cause of your test result.
Title: What does ABX really prove?
Post by: AndyH-ha on 10 November, 2007, 06:31:23 PM
In the laboratory, and in most audiophile living rooms, people can be psyched into discerning a difference where none exists. The only thing that accounts for that is belief or expectation creating abnormal conditions inside the listener's head. A statistically significant score, even one approaching certainty, does not make any actual difference come into existence.

FLAC encoding/decoding can be shown to produce zero differences, and often has been so shown, both mathematically and empirically. Proper ABX tests disallow any possibility of expectation or belief affecting test scores. With no data differences, and no opportunity for perceptual bias, any indicated difference, no matter what the test score, means something is wrong, something not normal in the data or the process.
Title: What does ABX really prove?
Post by: greynol on 10 November, 2007, 06:39:19 PM
What an incredible waste of time!

I propose that this guy submit samples and an ABX log indicating that he can distinguish a difference or the thread be closed.

audioslut512:
It is universally accepted that foobar2000 is able to perform proper ABX tests.  Make sure you use it and spare us the fluffy language about statistical evidence which is ringing quite hollow.

http://www.hydrogenaudio.org/forums/index....showtopic=16295 (http://www.hydrogenaudio.org/forums/index.php?showtopic=16295)
Title: What does ABX really prove?
Post by: odyssey on 10 November, 2007, 06:51:41 PM
How can you discuss this? 

Am I missing the point in all of this?

YES! Here is why:
I recently did some ABX tests of music encoded in FLAC levels 0-8.

You can't DO that! FLAC is a lossless format. That means the source is NOT degraded, no matter which level of compression you choose. The only differences the levels of a lossless encoder produce are in file size and in encoding and decoding speeds.

If you try to ABX these, you are actually comparing the same stuff, and YES, that's why you can't distinguish them.

You will usually use ABX tests to determine if you are able to distinguish a lossy encoded file from the original. It's very useful to make people shut up when they believe they can tell high bitrate encoded lossy files from the source.
Title: What does ABX really prove?
Post by: kwanbis on 10 November, 2007, 07:11:50 PM
I recently did some ABX tests of music encoded in FLAC levels 0-8.

You know that FLAC is lossless, right? It means that no matter what the compression level, you always get the same "sound"; the only difference is the size of the file.

What you are saying is like saying you compressed a TXT file with ZIP at two compression levels, and when you uncompressed them, they were different.
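
The ZIP analogy is easy to try out. Below is a minimal sketch using Python's standard `zlib` module (the same DEFLATE family that ZIP uses); the sample text is made up for illustration and has nothing to do with FLAC itself.

```python
import zlib

# Hypothetical sample text standing in for the TXT file in the analogy.
original = ("What does ABX really prove?\n" * 200).encode("utf-8")

fast = zlib.compress(original, 1)  # lowest compression effort
best = zlib.compress(original, 9)  # highest compression effort

# The compressed sizes may differ between the two levels...
print(len(original), len(fast), len(best))

# ...but both streams decompress to exactly the original bytes.
assert zlib.decompress(fast) == original
assert zlib.decompress(best) == original
```

The compression level only changes how hard the compressor works and how small the result gets, not what comes back out.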
Title: What does ABX really prove?
Post by: guruboolez on 10 November, 2007, 07:29:45 PM
OF THE TRACKS WHICH I DID TEST, I was able to discern a difference more often than not.

Statistically significant is that which is unlikely to occur by chance.


May I ask you the score? « More often than not » is not very precise. From time to time I can also reach some 55% or 60% correct trials while ABXing lossless (and everybody should be able to achieve the same performance). It also works with a WAV vs WAV ABX test, and even without listening to anything. You just need to be a bit lucky during the first trials and limit them to ~20...30, not more. The overall score will naturally tend to 50% in the long term, but if you only keep the first ones you may sometimes get a score higher than 50% and maybe the illusion of success.

That's why I'd like some more details: what are the scores of all tests (not only the most favorable session)? Did you fix the number of trials before starting?
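
Why fixing the number of trials beforehand matters can be worked out exactly. The sketch below is only an illustration under assumed rules (a pure guesser, a "pass" meaning the guessing probability drops below 5%, and at most 30 trials); the function names are made up for this example.

```python
from math import comb

def guess_probability(correct, trials):
    """Chance of at least `correct` right answers in `trials` pure guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def fixed_length_rate(trials, alpha=0.05):
    """Chance that a guesser passes when the trial count is fixed in advance."""
    return sum(
        comb(trials, c) / 2 ** trials
        for c in range(trials + 1)
        if guess_probability(c, trials) < alpha
    )

def stop_when_lucky_rate(max_trials, alpha=0.05):
    """Chance that a guesser passes if allowed to stop at the first lucky moment."""
    running = {0: 1.0}  # correct answers so far -> probability (sessions still running)
    passed = 0.0
    for n in range(1, max_trials + 1):
        after_trial = {}
        for correct, prob in running.items():
            for gained in (0, 1):  # wrong or right guess, 50% each
                key = correct + gained
                after_trial[key] = after_trial.get(key, 0.0) + prob / 2
        running = {}
        for correct, prob in after_trial.items():
            if guess_probability(correct, n) < alpha:
                passed += prob  # session ends here and is reported as a success
            else:
                running[correct] = prob
    return passed

print(f"fixed at 30 trials:         {fixed_length_rate(30):.3f}")
print(f"stop any time up to trial 30: {stop_when_lucky_rate(30):.3f}")
```

With the length fixed in advance the guesser passes less than 5% of the time, as intended; being allowed to stop as soon as the running score looks good pushes that rate above 5%.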
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 08:59:00 PM
just forget it kids.
Title: What does ABX really prove?
Post by: saratoga on 10 November, 2007, 09:16:26 PM
I still don't really understand what you did.  Could you just post the logs?
Title: What does ABX really prove?
Post by: audioslut512 on 10 November, 2007, 10:56:20 PM
I still don't really understand what you did.  Could you just post the logs?



I would if I had them.  All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.  I matched the correct tracks to one another 9 of the 10 times I ran it.  That's all.  Maybe I screwed something up while encoding them that caused one to sound different from the other.  Ultimately I had no idea which one was which, I just managed to notice the differences and match the tracks accordingly.  Hence my post...
Title: What does ABX really prove?
Post by: shadowking on 10 November, 2007, 11:07:20 PM
I wouldn't even want to see any lossless abx logs - assuming all samples were bit identical.
Title: What does ABX really prove?
Post by: William on 11 November, 2007, 12:14:24 AM
audioslut512:

What is your definition of ABX? What is your testing method?
Title: What does ABX really prove?
Post by: audioslut512 on 11 November, 2007, 01:54:23 AM
audioslut512:

What is your definition of ABX? What is your testing method?


I just used the ABX in foobar2000.
Title: What does ABX really prove?
Post by: Bad Monkey on 11 November, 2007, 04:11:04 AM
Just run the bit-comparator (http://www.foobar2000.org/components/index.html) to check you are in fact comparing like for like... it will tell you if there is any difference already...

On 2nd thoughts, why is this nonsense being allowed on HA? I thought I was on Head-Fi for a moment...
Title: What does ABX really prove?
Post by: tgoose on 11 November, 2007, 04:18:07 AM
An ABX test cannot prove that there isn't a difference.  It only has the ability to demonstrate that the specific person taking the test was able to determine that there was a difference.  These are not the same thing.

Well in actuality, it can never even prove that the person was able to determine a difference, it can only show how likely it is that the listener can tell a difference. That may seem academic but it's important for the next point:



All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.  I matched the correct tracks to one another 9 of the 10 times I ran it.  That's all.  Maybe I screwed something up while encoding them that caused one to sound different from the other.  Ultimately I had no idea which one was which, I just managed to notice the differences and match the tracks accordingly.  Hence my post...

I can't remember exactly how the output of fb2k looks (or if it's even the same as last time I used it) but it should have had a percentage at the end which says how likely it is you were just guessing. 9/10 is quite high but if you conduct enough ABX tests clicking randomly you will get every result from 0/10 to 10/10. If you can get the same result again even with ten tests then it becomes much more likely that you have a problem somewhere in your encoding. Otherwise it's probably just chance.
Title: What does ABX really prove?
Post by: Alexxander on 11 November, 2007, 05:06:48 AM
I would if I had them.  All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.  I matched the correct tracks to one another 9 of the 10 times I ran it.  That's all.  Maybe I screwed something up while encoding them that caused one to sound different from the other.  Ultimately I had no idea which one was which, I just managed to notice the differences and match the tracks accordingly.  Hence my post...

You sure must have done something wrong somewhere between putting the CD in the drive and clicking the ABX utility in foobar2000. As said before in another post, bit-compare the decoded FLAC files and you'll see whether the resulting wav files are different.
Title: What does ABX really prove?
Post by: kjoonlee on 11 November, 2007, 06:34:20 AM
"More often than not" does not equal "statistically significant".

Let's say you got 6 out of 10 correct: this is usually written 6/10 at HA. 6/10 has a 37.6% chance of being sheer luck. This is no good.

2/3, 3/4, 3/5, 4/6, 4/7, 5/8, 5/9, 6/10, 6/11... these are all no good.

If you're saying something like 60/100 then that might mean something, but unless you give us those numbers, you're just being vague and might well be fooling yourself.
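
For anyone who wants to check such numbers, here is a minimal sketch of the usual calculation: the chance of scoring at least that well by guessing, assuming each trial is an independent 50/50 guess. The function name is made up for this example.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

for correct, trials in [(6, 10), (9, 10), (60, 100)]:
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.3f}")
```

6/10 comes out at 386/1024, a bit under 38%, while 9/10 comes out at 11/1024, about 1.1%.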
Title: What does ABX really prove?
Post by: robert on 11 November, 2007, 07:03:49 AM
All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.
Do I get you right, you ripped it twice from the CD? And what does the difference you hear sound like? Some popping noise, some click?
Title: What does ABX really prove?
Post by: audioslut512 on 11 November, 2007, 05:38:59 PM
All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.
Do I get you right, you ripped it twice from the CD? And what does the difference you hear sound like? Some popping noise, some click?



It was actually quite interesting...

When I listened to substantial chunks of each track I couldn't tell the difference at all.

I then came up with a little method of switching between A, X, Y, B in rapid succession, with about 1-2 seconds between each switch.

I then noticed that when I did this, cycling through the four buttons clockwise for example, I would hear shifts in the overall sound (I'm not sure how to describe those shifts, I just referred to them in my mind as up and down).

So in doing so, if I heard upshift, downshift, upshift, downshift, I would then answer accordingly, and got it right 9 out of 10 times.

Like I said before though, I was never challenging ABX, and admitted that I might have screwed up the encoding processes of the tracks, and maybe that was what caused me to be able to discern the difference.

The tracks were both ripped from the same CD however, and both were ripped under the same conditions (i.e, no other programs running at the time).

I suppose I could do it again to see if it's repeatable, and maybe to stop so many people from bashing me, but my post was never about the ABX test I did on those tracks to begin with, it was about a question that those tests led me to...
Title: What does ABX really prove?
Post by: greynol on 11 November, 2007, 05:43:32 PM
The point is that you cannot make a proper conclusion from an ABX test if you don't know how to conduct one correctly.

eevan really hammered the point home if you ask me:
Quote
Why should this be of any interest if you say that it's irrelevant to pinpoint the cause of your test result.
Title: What does ABX really prove?
Post by: audioslut512 on 11 November, 2007, 05:44:57 PM
The point is that you cannot make a proper conclusion from an ABX test if you don't know how to correctly conduct one.


ok well what did I do wrong then?
Title: What does ABX really prove?
Post by: greynol on 11 November, 2007, 06:00:18 PM
You chose not to accept the fact that you couldn't ABX substantial chunks and you did not take the proper steps to ensure that there wasn't an error in the encodings or that they used the exact same source.

Some good may come out of this discussion after all since you may have identified a problem with the way foobar2000's ABX test works.  Too bad it took all the way to the 34th post for some useful albeit still qualitative information to come out.

You are aware that foobar2000 decodes the files you wish to compare to wave and places them in a temporary directory?  Being that these were all lossless encodes, you could have done a binary comparison as a simple sanity check.

It's quite possible that subsequent rips of the same track can produce different results, even under seemingly similar circumstances; and yes, these would have been revealed with a binary comparison.  A ripping error is the only logical conclusion if a binary comparison revealed differences.  Lossless is lossless after all!
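
A binary comparison of two decoded files takes only a few lines. The sketch below uses Python's standard `wave` module and assumes both files are plain PCM WAV; the function name and the file names in the usage comment are hypothetical, and this is not how foobar2000's bit-comparator is implemented.

```python
import wave

def same_audio(path_a, path_b, chunk_frames=65536):
    """Return True if two WAV files have the same format and identical PCM data."""
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        fmt_a = (a.getnchannels(), a.getsampwidth(), a.getframerate(), a.getnframes())
        fmt_b = (b.getnchannels(), b.getsampwidth(), b.getframerate(), b.getnframes())
        if fmt_a != fmt_b:
            return False
        while True:
            chunk_a = a.readframes(chunk_frames)
            chunk_b = b.readframes(chunk_frames)
            if chunk_a != chunk_b:
                return False  # first differing chunk found
            if not chunk_a:
                return True   # both files ended without any difference

# Usage with hypothetical file names:
# print(same_audio("kid_a_level4.wav", "kid_a_level8.wav"))
```

A `False` result only says that the decoded audio differs; it does not say whether the rip, the encode, or something else caused it.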
Title: What does ABX really prove?
Post by: Heresiarch on 11 November, 2007, 06:09:32 PM
There is just no point in ABXing lossless files. If you suspect an encoding error you should do a bit comparison. Either there is no difference or there is. No point in listening for a difference in either case.
Title: What does ABX really prove?
Post by: greynol on 11 November, 2007, 07:13:09 PM
There is just no point in ABXing lossless files.
It all depends on what you're trying to compare.  What if a track gives me ripping errors and I want to see if I can hear them?
Title: What does ABX really prove?
Post by: Bad Monkey on 11 November, 2007, 08:08:50 PM
He doesn't even know if there are errors. It's the first thing to do. Run the bit comparator already!
Title: What does ABX really prove?
Post by: sld on 11 November, 2007, 11:34:23 PM
just forget it kids.

You're tacitly admitting you're trolling this forum?
Title: What does ABX really prove?
Post by: audioslut512 on 12 November, 2007, 12:57:55 AM
You chose not to accept the fact that you couldn't ABX substantial chunks and you did not take the proper steps to ensure that there wasn't an error in the encodings or that they used the exact same source.

Some good may come out of this discussion after all since you may have identified a problem with the way foobar2000's ABX test works.  Too bad it took all the way to the 34th post for some useful albeit still qualitative information to come out.

You are aware that foobar2000 decodes the files you wish to compare to wave and places them in a temporary directory?  Being that these were all lossless encodes, you could have done a binary comparison as a simple sanity check.

It's quite possible that subsequent rips of the same track can produce different results, even under seemingly similar circumstances; and yes, these would have been revealed with a binary comparison.  A ripping error is the only logical conclusion if a binary comparison revealed differences.  Lossless is lossless after all!



It's not that I chose not to accept that I couldn't ABX substantial chunks, it's just that I didn't know any better, and for that matter still don't.

Now that you mention binary comparison, I will look into it.  That is yet another thing I am trying to understand...
Title: What does ABX really prove?
Post by: audioslut512 on 12 November, 2007, 01:09:38 AM

just forget it kids.

You're tacitly admitting you're trolling this forum?



Trolling for whom?

I am just trying to understand how this stuff works and get crucified for it.

Thanks again.
Title: What does ABX really prove?
Post by: Light-Fire on 12 November, 2007, 02:04:54 AM

I still don't really understand what you did.  Could you just post the logs?



I would if I had them.  All I did was ABX Radiohead's Kid A, encoded in FLAC level 4 and level 8, both directly from the CD.  I matched the correct tracks to one another 9 of the 10 times I ran it.  That's all.  Maybe I screwed something up while encoding them that caused one to sound different from the other.  Ultimately I had no idea which one was which, I just managed to notice the differences and match the tracks accordingly.  Hence my post...


You more likely screwed up when testing and not when encoding. You are saying that you perceived a difference between two things that are EXACTLY the same.
Title: What does ABX really prove?
Post by: SiriusB on 12 November, 2007, 02:25:51 AM
Statistically significant is that which is unlikely to occur by chance.

A statistically significant difference in this case, merely implies that statistical evidence indicates that a discernible difference has a greater probability of existing than it does a probability of not existing.

Whether that difference is due to errors in encoding, playback, or whatever, is entirely irrelevant.



If you repeatedly, 'successfully' (>16 trials each, p<0.05) ABXd flac from flac, or flac from .wav, all using the same track, then your flac encoding/decoding is broken, or your ABX test was crap.  Seriously. If you understood what you are claiming, you'd know this must be true.
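
To put a number on "p<0.05" for a given session length, here is a small sketch that finds the lowest score a guesser would reach less than 5% of the time. It assumes independent 50/50 trials; the function names are made up for this example.

```python
from math import comb

def guess_probability(correct, trials):
    """Chance of at least `correct` right answers in `trials` pure guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct(trials, alpha=0.05):
    """Smallest score whose guessing probability is below alpha, or None if none is."""
    for correct in range(trials + 1):
        if guess_probability(correct, trials) < alpha:
            return correct
    return None

for trials in (10, 16, 20):
    print(f"{trials} trials: need at least {min_correct(trials)} correct")
```

For 16 trials this gives 12 correct, and for 10 trials it gives 9. Very short sessions cannot reach the threshold at all, so the function returns `None` for them.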
Title: What does ABX really prove?
Post by: Vitecs on 12 November, 2007, 03:14:42 AM
Leaving aside what the author is doing with his ABXing, a note for the people who say that ABXing lossless is pointless in every case: I have one example (theoretical, though), which is testing the decoder. Could a lack of processing power in a portable player lead to decoding errors at higher compression levels? Or could software bugs make one lossless format sound better than another? You can't easily bit-compare DAP output...
Title: What does ABX really prove?
Post by: cliveb on 12 November, 2007, 03:54:20 AM
Could a lack of processing power in a portable player lead to decoding errors at higher compression levels?

There's a heated argument that occasionally flares up over on the Slim Devices forum that's related to this speculation:

Some people claim that if they decode their FLACs on the server and stream uncompressed WAV to a Squeezebox, it sounds different than if they stream FLAC and have it decoded in the Squeezebox.

The hypothesis is that the additional CPU activity required to decode the FLAC in the Squeezebox might possibly generate some RFI that affects the analogue circuitry. (Note: there is absolutely no suggestion that the two processes result in different digital data). If you have an open mind, you have to concede that this is at least possible, however unlikely. (On the other hand, those who have taken the trouble to do a blind comparison failed to distinguish streaming FLAC and WAV).

I'm sure the same people who believe FLAC and WAV sound different in the above scenario would have no trouble speculating that the different CPU activity required to decode different levels of FLAC compression might result in different RFI which would in turn affect surrounding analogue circuitry in different ways.

My view? It seems unlikely in the extreme that any such effect (if it exists) will be audible. Life is certainly too short to bother testing it rigorously.
Title: What does ABX really prove?
Post by: sld on 12 November, 2007, 04:06:21 AM
Trolling for whom?

I am just trying to understand how this stuff works and get crucified for it.

Thanks again.

I'm sorry, then.what.was.the.quote.for?

just forget it kids.


Wasn't it your closed mind that refused to accept the logical explanations provided by the more knowledgeable individuals in this forum? And you blame others for rightly 'crucifying' your obstinacy?

The entire forum is still waiting for your bit-comparisons and ABX logs with p values < 0.1%.
Title: What does ABX really prove?
Post by: MedO on 12 November, 2007, 04:33:11 AM
Well, I just noticed that even I *might* be able to ABX FLAC -4 from -8, given the CPU usage pattern explanation... you see, my CPU emits a (very low-volume) high-pitched whine when it's idle, that changes when the CPU is used. The problem is well known for the Core2-based MacBook IIRC, but also occurs on other systems including my laptop, apparently.

Well, if decoding the two files causes different CPU usage, it might very well be possible to hear a small difference because of this phenomenon. Because of its high pitch, I do hear the noise over music I play back when running Linux (it's worse there, don't ask me why), but I usually have something running in the background (Seti@Home, Prime95) to keep the processor occupied, which stops the whine.

If both files are decoded completely before playback however, I don't think it should be possible.
Title: What does ABX really prove?
Post by: SamHain86 on 12 November, 2007, 05:43:04 AM
@MedO-
When you run any ABX test, I have always found the testing program to create two WAVs of the input files, eliminating any decoding lags or "hiccups." I have seen this with a standalone ABX utility, and I believe FB2K does this. There is no way, to my understanding, that this ABX was performed properly. EDIT: It is unlikely that the processor lag or hardware strain noise could give one an advantage when performing an ABX test on two lossless sources.
Title: What does ABX really prove?
Post by: sld on 12 November, 2007, 09:16:50 AM
MedO, it also happens on my Core Duo laptop, though I can use RMClock to disable it. Nevertheless, see post above, Foobar2000 ABX test doesn't have a problem with that.