Biophysics, Limitations of Shannon and Issues with ABX Testing

Reply #50
What I'm trying to discuss is not people suffering delusions, but the validity and reliability of a test based on human sensory impressions/conscious awareness/memory as the source of data.
...but we're testing whether people using their sensory impressions and conscious awareness* can remember hearing a difference.

That's relevant to the choice of audio codec or hi-fi equipment. 

* - I included "conscious awareness" just to copy your sentence, but I don't think it's necessarily a restriction of DBTs. Many people who do ABX testing say it reveals audible differences that they were barely if at all "consciously" aware of. They thought they were guessing throughout, but they could guess correctly with statistical significance.
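
As a rough illustration of what "guess correctly with statistical significance" means here, the sketch below computes the one-sided chance of scoring at least a given number of correct answers in an ABX run by pure guessing. The function name `abx_p_value` and the example scores are assumptions made for illustration; they do not come from any particular ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers in `trials`
    ABX trials if every answer is a coin flip (chance = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must lie between 0 and trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Illustrative scores out of 16 trials (hypothetical numbers).
for score in (8, 12, 14, 16):
    print(f"{score}/16 correct: p = {abx_p_value(score, 16):.5f}")
```

A listener who feels they are guessing but still lands well into the tail of this distribution is producing the kind of result described above.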

Quote
I guess I like my data to be more raw and empirical.  Brains are really amazing biological devices, but they make for piss poor lab equipment.
...but we listen to audio codecs or hi-fi equipment with our ears and brains, not lab equipment.

I think what you are wanting is irrelevant to the task at hand. The point of hi-fi, audio coding, Hydrogenaudio etc is to listen to audio (music, speech, whatever).

The question you are asking seems to be quite different. It might make an interesting scientific study, but I don't think you've made a convincing argument that it has any relevance to listening to music.

Cheers,
David.

Reply #51
I just hope you guys in the industry don't end up with the wrong codec because you're using a rubber yardstick.  (I kid.)  I have it easy.  I've been moved to tears by a highly compressed Beethoven's 9th on a table radio.

Thanks for the interesting discussion, and for not banning me for random questions from the peanut gallery!

Reply #52
In light of the limits of serial audio testing raised by the OP, I'm pondering two possible conclusions:

"We've reached the limits of differences the subjects can hear"

vs.

"We have reached the limits of resolution for using human memory as the measuring instrument in an AB test".

Though the OP's point is generating some discussion, it seems to be sidestepping his basic criticism (which seems logical, particularly when considering current knowledge about perception).



Do you have some means for comparing two sounds that does not depend on human memory?  Please do tell!

Reply #53
* - I included "conscious awareness" just to copy your sentence, but I don't think it's necessarily a restriction of DBTs. Many people who do ABX testing say it reveals audible differences that they were barely if at all "consciously" aware of. They thought they were guessing throughout, but they could guess correctly with statistical significance.


That is an interesting observation.  Lots more sensory data is received and processed than what actually gets through to conscious awareness.  I wonder if the subjects could maintain their batting average over a larger sample size. 
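
The sample-size question can be put in numbers. The sketch below holds a hypothetical hit rate of 60% fixed and shows how the chance of reaching that score by guessing alone changes as the number of trials grows. The 60% figure and the trial counts are assumptions picked only for illustration.

```python
from math import comb


def p_at_least(correct: int, trials: int) -> float:
    """Chance of `correct` or more hits in `trials` fair coin flips."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Same hypothetical 60% "batting average", increasingly long runs.
for trials in (10, 20, 50, 100):
    correct = trials * 6 // 10  # integer arithmetic avoids float rounding
    print(f"{correct}/{trials}: p = {p_at_least(correct, trials):.4f}")
```

A modest hit rate that looks like noise over ten trials becomes much harder to attribute to luck if the listener can sustain it over a hundred.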

Do you have some means for comparing two sounds that does not depend on human memory?  Please do tell!


Can't really think of one.  But for that matter I am not trying to develop audibly transparent compression codecs. 

What I wonder is this: can it be done simply by the numbers?  We know thresholds of audibility, why not design to that, instead of the less precise ability of humans to differentiate?  Is it because of actual loss of data in the compression schemes, a matter of finding that happy place between the fat milk and the skim that folks will still find palatable?


Reply #54
We know thresholds of audibility, why not design to that

What makes you think this isn't done?

Something is used to determine where to reduce precision in order to make a signal easier to compress, right?
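
For readers unfamiliar with what "designing to a threshold" looks like in numbers, here is a minimal sketch of one commonly cited closed-form approximation of the threshold in quiet, usually attributed to Terhardt. It is shown only as an example of such a curve; nothing in this thread says which model any given codec uses, and real encoders add masking on top of it. The function name is made up for this sketch.

```python
import math


def threshold_in_quiet_db_spl(freq_hz: float) -> float:
    """Approximate threshold in quiet (dB SPL) for a pure tone.

    Closed-form fit commonly attributed to Terhardt; intended for the
    audio band and only meaningful for freq_hz > 0.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


for f in (100, 1000, 3300, 10000, 16000):
    print(f"{f:>6} Hz: {threshold_in_quiet_db_spl(f):6.1f} dB SPL")
```

The curve dips to its lowest point around 3 to 4 kHz and rises steeply at both ends of the band, which is why quantization noise can be placed more freely at the extremes.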

Reply #55
Do you have some means for comparing two sounds that does not depend on human memory?  Please do tell!


Can't really think of one.  But for that matter I am not trying to develop audibly transparent compression codecs. 


What ever does what Arnold said have to do with what you said?
-----
J. D. (jj) Johnston

Reply #56
What ever does what Arnold said have to do with what you said?


We're discussing serial audio testing of human subjects, which necessarily relies on memory. 

Arny asked if I had means of comparing sound without using memory, I replied in the negative (assuming human subjects again, not the use of some sort of measurement equipment).  I also pointed out that I am not developing anything that requires discrimination testing of this sort.

Reply #58
The point is that ABX is useful for more than the development of lossy codecs.


I get that. 

The rather confrontational, terse one-line interrogatives and comments give the distinct impression you guys are preparing to tee off on me, which is great if that sort of behavior meets your psychosocial needs, but it would be like a college professor demeaning a freshman.  I just visited to learn, not be treated with scorn.  Have a nice day, fellas.

Reply #59
The point is that ABX is useful for more than the development of lossy codecs.


I get that. 


Then your reply above doesn't really make sense, or perhaps you have misunderstood what you've quoted above.

The rather confrontational, terse one-line interrogatives and comments give the distinct impression you guys are preparing to tee off on me,


Hello and welcome to the internet where people are not always going to agree with you.

Reply #60
I live on the West Coast so there's still time, and I definitely will.  Thanks.

My apologies for not noticing that you stopped carrying the torch for this discussion as of 8/6.

Reply #61
What ever does what Arnold said have to do with what you said?


We're discussing serial audio testing of human subjects, which necessarily relies on memory. 

Arny asked if I had means of comparing sound without using memory, I replied in the negative (assuming human subjects again, not the use of some sort of measurement equipment).  I also pointed out that I am not developing anything that requires discrimination testing of this sort.


If you want to do an audio test, how else would you do this? By the way, sequential tests with proper windowing are documented as the best way to extract the most reliable answers from subjects.

If you're not interested in audio testing, why are we having this discussion? Seriously. I am confused.
-----
J. D. (jj) Johnston

Reply #62
Kees de Visser, who started that thread, mentions a difference in the click sounds (and gives some examples for us to listen to) in software ABX testing, in this post. His belief at the time, if I understood correctly, was that it was random in nature [a good thing] however his response to a question here seemed to suggest otherwise to me.
Sorry for being late. The artifacts I found are indeed random. They also appear when switching between identical audio, which is strange since this is a trivial task and should be 100% lossless, e.g. with a simple linear fade. The artifacts I was/am worried about will only appear when switching between different audio streams, which is not a lossless process, not a trivial task and might be audible, depending on many variables.
Hope this helps.

Reply #63
What I wonder is this: can it be done simply by the numbers?  We know thresholds of audibility, why not design to that, instead of the less precise ability of humans to differentiate?  Is it because of actual loss of data in the compression schemes, a matter of finding that happy place between the fat milk and the skim that folks will still find palatable?


In the early 1980s I sat through an AES presentation about the development of lossy encoders based on the thresholds of hearing. It was tough going for the developers. Couldn't get worthwhile amounts of data compression. Later on masking became better understood, and development of lossy encoders shifted into high gear.

The point being that the thresholds of audibility overestimated the working sensitivity of the human ear.

While the golden ears rant and rave about how wrong Fletcher and Munson were, they were actually overly optimistic.



Reply #64
Kees de Visser, who started that thread, mentions a difference in the click sounds (and gives some examples for us to listen to) in software ABX testing, in this post. His belief at the time, if I understood correctly, was that it was random in nature [a good thing] however his response to a question here seemed to suggest otherwise to me.
Sorry for being late. The artifacts I found are indeed random. They also appear when switching between identical audio, which is strange since this is a trivial task and should be 100% lossless, e.g. with a simple linear fade. The artifacts I was/am worried about will only appear when switching between different audio streams, which is not a lossless process, not a trivial task and might be audible, depending on many variables.
Hope this helps.


Two things come to mind:

1) bad time alignment, although with 2 identical files that would seem unlikely
2) bad crossfade window design

If the click when switching between A/X is more noticeable than B/X, you can be sure subjects will latch on to that.
-----
J. D. (jj) Johnston
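
The two suspects named above can be illustrated with a minimal sketch of a linear crossfade. It assumes the switch is a simple complementary ramp between two sample lists; the function name `linear_crossfade`, the 1 kHz test tone and the 256-sample fade are hypothetical choices, and this is not how any specific ABX program is known to switch.

```python
import math


def linear_crossfade(outgoing, incoming, fade_len):
    """Fade from `outgoing` to `incoming` over `fade_len` samples.

    Written as out + w * (in - out), so identical inputs give a zero
    difference term and come back bit-exact.
    """
    if fade_len <= 0:
        raise ValueError("fade_len must be positive")
    n = min(len(outgoing), len(incoming))
    mixed = []
    for i in range(n):
        w = min(i / fade_len, 1.0)  # ramp weight: 0 -> 1, then hold at 1
        mixed.append(outgoing[i] + w * (incoming[i] - outgoing[i]))
    return mixed


RATE = 44100
tone = [math.sin(2 * math.pi * 1000 * i / RATE) for i in range(2048)]

# Case 1: switching between identical, time-aligned streams.
same = linear_crossfade(tone, tone, fade_len=256)
print("identical streams unchanged:", same == tone)

# Case 2: the same audio, but the incoming stream is one sample early.
blend = linear_crossfade(tone[:-1], tone[1:], fade_len=256)
worst = max(abs(m - t) for m, t in zip(blend, tone))
print("largest deviation with a 1-sample offset:", worst)
```

With aligned identical streams the switch is transparent by construction, so an audible click in that case points at the alignment or the window rather than at the audio itself.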

Reply #65
Kees de Visser, who started that thread, mentions a difference in the click sounds (and gives some examples for us to listen to) in software ABX testing, in this post. His belief at the time, if I understood correctly, was that it was random in nature [a good thing] however his response to a question here seemed to suggest otherwise to me.
Sorry for being late. The artifacts I found are indeed random. They also appear when switching between identical audio, which is strange since this is a trivial task and should be 100% lossless, e.g. with a simple linear fade. The artifacts I was/am worried about will only appear when switching between different audio streams, which is not a lossless process, not a trivial task and might be audible, depending on many variables.
Hope this helps.


Two things come to mind:

1) bad time alignment, although with 2 identical files that would seem unlikely
2) bad crossfade window design

If the click when switching between A/X is more noticeable than B/X, you can be sure subjects will latch on to that.


That's for sure!  One good test is to run an ABX test with no music playing.  If there are switching artifacts or background noises that relate to one alternative but not the other, a careful listener can detect them and do pretty well!

While none of the switchboxes that were sold by the ABX company had this problem, I did encounter a sample of a competitive product that I could score 16/16 with, with nothing attached to it at all. 

This sort of problem can be inherent in the switchbox, or it can be due to an error in the setup of the test.

It is also a problem to have noises that are truly random.  They are a less severe problem, but if noticeable enough they can distract the listener and reduce the probability of reliable detection when it is possible.

Reply #66
As a scientist (biophysics) and an audiophile, I have my feet in both worlds, and have thus enjoyed this discussion.


Here are some more details about the earliest paper I can find that describes sequential ABX testing:

http://scitation.aip.org/content/asa/journ....1121/1.1917190

"
An understanding of the over-all process of hearing depends upon proper interpretation of the results of many individual experiments. In the field of subjective experimentation the problem has been complicated by the wide variety of test procedures that characterize available data. If a common technique could be applied to the many different types of auditory tests, such as thresholds of acuity, masking tests, difference limens, etc., the organization of these data would be facilitated. The purpose of the present paper is to describe a test procedure which has shown promise in this direction and to give descriptions of equipment which have been found helpful in minimizing the variability of the test results. The procedure, which we have called the “ABX” test, is a modification of the method of paired comparisons. An observer is presented with a time sequence of three signals for each judgment he is asked to make. During the first time interval he hears signal A, during the second, signal B, and finally signal X. His task is to indicate whether the sound heard during the X interval was more like that during the A interval or more like that during the B interval. For a threshold test, the A interval is quiet, the B interval is signal, and the X interval is either quiet or signal. For a masking test, A is the masking signal, B is the masking signal plus the signal being masked, and X is either A or B repeated. The apparatus for the ABX test is mechanized so all details of the method can be duplicated for each observer, and the variability of manual operation eliminated. The entire test is coded on teletype tape to reduce the time and effort of collecting large quantities of data.
"
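
The trial structure in the abstract (A, then B, then an X that is secretly A or B) can be sketched in a few lines. This is a toy model only: `make_schedule`, `run_listener` and the detection probability `p_hear` are hypothetical names and assumptions, not part of the 1950 apparatus or of any later ABX comparator.

```python
import random


def make_schedule(n_trials, rng):
    """Pick the hidden identity of X ('A' or 'B') for each trial."""
    return [rng.choice("AB") for _ in range(n_trials)]


def run_listener(schedule, p_hear, rng):
    """Count correct answers for a toy listener who truly hears the
    difference with probability `p_hear` and otherwise guesses."""
    correct = 0
    for x in schedule:
        if rng.random() < p_hear:
            answer = x                 # difference heard: answer is right
        else:
            answer = rng.choice("AB")  # nothing heard: coin flip
        correct += answer == x
    return correct


rng = random.Random(2014)  # fixed seed so the run is repeatable
schedule = make_schedule(16, rng)
for p_hear in (0.0, 0.3, 1.0):
    print(f"p_hear={p_hear}: {run_listener(schedule, p_hear, rng)}/16 correct")
```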

To reiterate something that many of us are all too aware of, and that is mentioned above: the methodology quoted here is not the same as the method described in Clark's 1972 JAES article, which was designed to overcome many of the limitations of the method described in the 1950 JASA paper.

Conflating these two very different methodologies is a not infrequent mistake, one that was recently repeated in a highly touted AES conference paper, 9174, "The audibility of typical digital audio filters in a high-fidelity playback system".

Reply #67
I don't know what you mean by "sequential" ABX testing. Any ABX test that I am aware of allows the subject to listen to any of the three samples, in any order, as many times as he/she pleases until ready to make a choice.

Also, you cannot "eliminate" false positives just as you cannot eliminate false negatives. You can only make the null hypothesis statistically improbable.
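
The point about false positives can be made concrete with a minimal sketch. For a chosen significance level it finds the smallest score that guessing alone would reach no more often than that level, which bounds the false-positive rate without ever making it zero. The name `min_correct_for_alpha` and the 0.05 default are assumptions for illustration.

```python
from math import comb


def chance_of_at_least(correct, trials):
    """Probability of `correct` or more hits in `trials` pure guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def min_correct_for_alpha(trials, alpha=0.05):
    """Smallest score whose guessing probability is <= alpha,
    or None if even a perfect score cannot get that low."""
    for correct in range(trials + 1):
        if chance_of_at_least(correct, trials) <= alpha:
            return correct
    return None


for trials in (4, 10, 16):
    needed = min_correct_for_alpha(trials)
    if needed is None:
        print(f"{trials} trials: no score reaches p <= 0.05")
    else:
        rate = chance_of_at_least(needed, trials)
        print(f"{trials} trials: need {needed}, guessing passes with p = {rate:.4f}")
```

Even at the required score, a pure guesser still passes with the small probability printed, so the null hypothesis is made improbable rather than eliminated.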


The point is that it's sequential regardless of the order, and thus relies on memory -- you can't listen to all three tracks simultaneously.    If you reread that section of my post carefully I think you'll see what I'm getting at.



Well, your original post here, having some very unoriginal opposition to ABX testing in it, and using a completely inappropriate comparison between vision and hearing (they are not the same, one can be static, one can not, for instance, making the "memory issue" for a properly subject-controlled switching system (with proper switching) completely moot), appears to have been guided by someone or something to play "god of the gaps" reasoning.

You can NEVER listen to 3 tracks simultaneously, not during the test, or ever, any time, any place. It's not how human perception works. So what you're really objecting to is how evolution designed our ability to sense atmospheric vibrations.  I'm not sure what the point of that is.
-----
J. D. (jj) Johnston

Reply #68
I fear the battle is lost on the issue of honesty controls themselves (I guess due to 30+ yrs of derisive amusement), so the new war is to attack the most "common" method, ABX.
http://www.stereophile.com/content/listening-143

Purportedly it presents a "higher cognitive load" than ABC/HR et al, or good ol' LLT...Long Term "Listening". Evidence seems scant, but on we go.

cheers,

AJ
Loudspeaker manufacturer