Help with test control

I’m wondering whether I could tap into the HA bank of experience in perceptual tests. Over on CA two members decided to carry out a test to see whether one of them could ABX two different settings on a software music player.

The two settings apparently have bit identical outputs.

As I understand it, the setup had the listener alone in the room with only the power amps and speakers. The computer fed a DAC via S/PDIF. Both computer and DAC were situated in the basement below the listening room.

As I understand it the test was conducted by one party (Mansr) operating the computer software using RDP from another room.

At first the test was conducted as an AB XXXXXXXXXX
The test subject was not able to identify x as A or B in two sets of 10 (4/10 and 4/10)
However, he then repeated it as ABX,  ABX..... and scored 9/10.

This has generated some excitement. Mansr is a very smart engineer and has been carrying out all sorts of analysis on the output of the DAC with both software settings to identify what the difference might have been.

I have expressed a concern that what was being identified in the test might have been a switching glitch or similar tell.
The problem is that the test could not (prima facie, I assume) be done using the usual software ABX comparator one uses on audio files.

I was not there but in the light of the setup, it seems unlikely that a cue from the computer (say fan noise or something, or a light going on) would be perceptible. I understand that there was a delay when the software setting was changed. Måns said he tried to allow for it, but it is a concern that in an ABx (especially where A and B stay the same) some difference between playing the same setting again (BB) and changing (BA) might cause some tell. After all the test subject only has to be able to detect whether there is a software change from B or not.

With that long preamble (sorry), I was hoping that some more experienced and wiser heads might be able to help me with the following:
- Am I crazy? I was under the impression that this was the sort of thing which it was normal to control for in perceptual tests. If I’m barking up the wrong tree and all analysis should be confined to the hypothesis that the software changes the DAC output when playing music, then let me know.
- How would one go about testing the hypothesis that a switching difference may be responsible?
- How would one go about repeating the experiment while controlling for this possibility?
- Any other suggestions as to avenues of inquiry?
I don’t know how to run a proper ABX for software settings of this sort. Obviously it would be great if a computer could randomise and allow the test subject to listen to A, B and X.
One possibility which occurred to me was to have a recording of the DAC output playing with both settings. One could then just use these as files in the normal way. I suspect that the objection might be made that there was an additional ADC/DAC stage involved.
Is there any way of taking two live streams and putting them through a comparator program?

All constructive suggestions would be gratefully received.
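
One way to keep a switching cue out of an ABX run like the one described above is to have the operator go through the same stop/apply/wait/play routine before every presentation, whether or not the setting actually changes, and to draw each X independently at random. A minimal Python sketch of such an operator script follows. The function names, the fixed pause length and the step layout are assumptions made for illustration; this is not what was done in the CA test.

```python
import random

def make_abx_schedule(n_trials, pause_s=10.0, seed=None):
    """Build an operator script for n_trials ABX trials.

    Every presentation uses the same routine (stop, apply setting, wait a
    fixed pause, play), even when the setting is unchanged, so that a
    "no change" transition looks the same to the listener as a change.
    """
    rng = random.Random(seed)
    schedule = []
    for trial in range(1, n_trials + 1):
        x = rng.choice("AB")  # independent coin flip per trial
        steps = [
            {"label": label, "setting": setting, "pause_s": pause_s}
            for label, setting in (("A", "A"), ("B", "B"), ("X", x))
        ]
        schedule.append({"trial": trial, "x": x, "steps": steps})
    return schedule

def count_transitions(schedule):
    """Count how often X repeats the preceding B setting versus changes it."""
    same = sum(1 for t in schedule if t["x"] == "B")
    return {"same": same, "change": len(schedule) - same}

if __name__ == "__main__":
    for t in make_abx_schedule(10, seed=1):
        print(t["trial"], [f'{s["label"]}={s["setting"]}' for s in t["steps"]])
```

The key (the "x" entries) would stay with the operator until the run is over; the listener's answer sheet only needs the trial numbers.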

Re: Help with test control

Reply #1
Quote
[full quote of the opening post snipped]

I think that one of the best ways for learning about doing DBTs is hands-on experience with DBTs.

While you may not be able to do the tests of your dreams with foobar2000 and its ABX plug-in, you can use it to do tests that are designed to be as sensitive to any knowable artifact as you wish. Once you are comfortable with tests on that basis you should be in a better place for marching out on your own.

The test as described seems to have a serious problem - you don't know for sure what you are comparing. You may think you know but this is the real world and everybody who is trying hard enough gets surprised.

The tool of choice for technical test validation is a PC with the best recording audio interface you can line up. For example, used M-Audio AP24192s can often be had for under $50 on eBay and, when combined with analytical software, can duplicate or exceed all but the very best audio test gear around. Focusrite 2i2 interfaces are external, connect over USB (so they work with most laptops) and are about 10 dB worse, but still over the 100 dB sweet spot.

Re: Help with test control

Reply #2
You don't need to fully quote the person immediately above.

Re: Help with test control

Reply #3
If it helps I am familiar with the foobar ABx comparator and Mansr is able to analyse the living daylights out of the dac output.
Unfortunately that is not the question(s). I appreciate that my OP may have been tiresomely long to read through, for which I apologise again.

Re: Help with test control

Reply #4
If you just keep doing an abx test over and over, sooner or later you'll get a "success" even with random guessing. Usually you do it only once, or if you have to rerun it, you at least do more trials the additional times so that you don't increase the odds of getting it by chance too much.
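
To put rough numbers on that point: the chance of scoring 9 or better out of 10 by pure guessing is 11/1024, or about 1.1%, and it grows if several runs are treated as interchangeable attempts. The sketch below works this out with a plain binomial calculation; the function names are made up for illustration, and the "3 runs" figure only applies if all three runs in the test are counted as equivalent tries, which is itself debatable.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more correct answers in n trials when guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def p_any_run(p_single, runs):
    """Chance that at least one of several independent runs hits that score."""
    return 1 - (1 - p_single) ** runs

if __name__ == "__main__":
    p = p_at_least(9, 10)
    print(f"9/10 or better by guessing: {p:.4f}")
    print(f"in at least one of 3 runs:  {p_any_run(p, 3):.4f}")
```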

Re: Help with test control

Reply #5
AFAICT what adamdea is saying is that they got two 'no difference' results with two blind A/B tests (where A and B were two settings of the same software player), then got a highly robust 'difference' result with an A/B/X test of the same settings. So they were using two different protocols, meaning you can't say they got success simply by chance from doing the same test over and over in this case.

Such paradoxical results suggest a protocol flaw.

The A and B outputs need not go through A/D conversion to be compared using ABX software. There is software that can record output directly to a file, before D/A conversion, is there not? E.g. Audacity, using WASAPI? If so, the two recorded outputs could be compared using e.g. fooABX, without constant switching of the actual player software.
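
Once two such digital captures exist, checking whether they really are bit-identical is straightforward. A minimal sketch using only the Python standard library follows; it assumes both captures were saved as WAV files and trimmed to exactly the same start and end point (any extra leading or trailing samples would make them compare as different), and the function names are placeholders.

```python
import hashlib
import wave

def pcm_digest(path):
    """Return (format, sha256 of the raw PCM frames) for a WAV file."""
    with wave.open(path, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        digest = hashlib.sha256()
        while True:
            chunk = w.readframes(65536)
            if not chunk:
                break
            digest.update(chunk)
    return fmt, digest.hexdigest()

def same_audio(path_a, path_b):
    """True if both files hold identical PCM data in the same format."""
    return pcm_digest(path_a) == pcm_digest(path_b)
```

Comparing the decoded PCM frames rather than the files themselves means that differences in WAV header metadata do not cause a false mismatch.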







Re: Help with test control

Reply #6
AFAICT what adamdea is saying is that they got two 'no difference' results with two blind A/B tests (where A and B were two settings of the same software player), then got a highly robust 'difference' result with an A/B/X test of the same settings. So they were using two different protocols, meaning you can't say they got success simply by chance from doing the same test over and over in this case.
Such paradoxical results suggest a protocol flaw.
What they did first as far as I can tell is to play A and B  at the beginning and then (without repeating A and B) play a sequence of Xs. This seems like an odd methodology to me and I am not surprised that the results were negative [although it would probably be surprising to those who believe in long term listening tests and ignore audio memory] . I'm not sure if that was really a preference test. I was initially suspicious of cherry-picking but on balance I think this part of the test can be discounted for now.

Let's assume that the 9/10 is prima facie significant. Is it reasonable to investigate whether what is being (might be) detected is an artefact of the process of switching or not switching software settings in the final ABx test?
Quote
The A and B outputs need not go through A/D conversion to be compared using ABX software. There is software that can record output directly to a file, before D/A conversion, is there not? E.g. Audacity, using WASAPI? If so, the two recorded outputs could be compared using e.g. fooABX, without constant switching of the actual player software.
If I am understanding you correctly, we could record the output of the computer's S/PDIF out and then see if the test subject can A/B the recordings. This would potentially remove one variable (switching the computer software settings live), but I suspect it might not satisfy those who claim the software settings sound different, because the mechanism they propose is via some jitter noise effect on the DAC.* If a recorder can record the output of the software bit-perfectly, then the mechanism will disappear.* I can't now remember, but we may even have such a recording already (I can't remember whether this was done with test tones).

I suspect that proponents of the sounds different hypothesis will only be satisfied by a test which operates on a live stream from the computer to a dac.

I wonder if there is a way to get two live streams [one computer running two versions of the software, or two computers each with one version] so that the streams could be switched in foobar ABX before being sent to the DAC.
Otherwise perhaps we are stuck with having to record the outputs of the DAC on each setting and trying to ABX that.

Anyway, aside from my ramblings, I would welcome any suggestions as to how [starting from scratch] we might test the hypothesis that changing the settings in the music player software [with output remaining bit-perfect] audibly affects the output of the test subject's DAC.++




-----------------------------------------
* please don't shoot the messenger here
++ one further hypothesis is that the test subject's dac has hopeless jitter tolerance; and it may be that the test can be repeated with   something like a benchmark dac, but for the time being let's leave that.

Re: Help with test control

Reply #7
Then record the DAC output, played by different software, with a transparent ADC. Whether the ADC is transparent or not can be verified by comparing the original file and the recorded file, with very careful timing and volume matching, then do some ABX tests.

Also, can you post the CA link about that test?
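
The timing and volume matching mentioned above can be sketched as a simple null test: slide the recording against the original, scale it for the best fit, and see how much is left over. The pure-Python version below is a rough illustration under several assumptions: the signals are already decoded to lists of float samples, the offset is a whole number of samples within `max_lag`, and there is no clock drift between playback and recording. A real comparison would need sub-sample alignment and would normally use a numerical library; `null_test` is a made-up name.

```python
import math

def null_test(ref, rec, max_lag=50):
    """Align rec to ref, match gain, and return (lag, gain, residual_db).

    gain is the factor applied to the recording; residual_db is the RMS of
    (ref - gain * aligned rec) relative to the RMS of ref, so more negative
    means the recording is closer to the original.
    """
    best = None
    for lag in range(max_lag + 1):
        n = min(len(ref), len(rec) - lag)
        if n <= 0:
            break
        x = ref[:n]
        y = rec[lag:lag + n]
        yy = sum(v * v for v in y)
        if yy == 0:
            continue
        gain = sum(a * b for a, b in zip(x, y)) / yy  # least-squares gain
        err = sum((a - gain * b) ** 2 for a, b in zip(x, y))
        ref_pow = sum(a * a for a in x)
        ratio = err / ref_pow if ref_pow else float("inf")
        if best is None or ratio < best[2]:
            best = (lag, gain, ratio)
    if best is None:
        raise ValueError("no overlap between signals")
    lag, gain, ratio = best
    db = 10 * math.log10(ratio) if ratio > 0 else float("-inf")
    return lag, gain, db
```

A residual far below the level of the music suggests the ADC loop is adding little of its own; how low is low enough for a given ABX test is a judgment the sketch does not make.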


Re: Help with test control

Reply #9
Actually mansr accepts on the first page. But the test results come much later.

Re: Help with test control

Reply #10
It's a long long thread. It's many pages in before two members (confusingly called mansr and manis...) agree to do the test

https://www.computeraudiophile.com/forums/topic/38493-blue-or-red-pill
Too much to read. What I gathered is that a software player with audiophile parameters can mess up the output enough to make a DIY DAC sound different without breaking the data.
This is what they call the different sound of bit-identical files, while I can imagine it is more that broken DAC designs can behave strangely without a buffer :)
It would be nice if they did a thread one day with the most important parameters in one place.

Re: Help with test control

Reply #11
It's a long long thread. It's many pages in before two members (confusingly called mansr and manis...) agree to do the test

https://www.computeraudiophile.com/forums/topic/38493-blue-or-red-pill
Too much to read. What I gathered is that a software player with audiophile parameters can mess up the output enough to make a DIY DAC sound different without breaking the data.
This is what they call the different sound of bit-identical files, while I can imagine it is more that broken DAC designs can behave strangely without a buffer :)
It would be nice if they did a thread one day with the most important parameters in one place.
Yes, that may be the analysis.
That said, I had a go at ABXing recordings of the analogue out of the DAC on the two settings and got nowhere.

Files can be downloaded here if anyone is interested:
https://www.computeraudiophile.com/forums/topic/38493-blue-or-red-pill/?do=findComment&comment=807806

Re: Help with test control

Reply #12
If I am understanding you correctly we could record the outputs of the computer's S/PDIF out and then see if the test subject can A/B them.


Not really. What I am suggesting is that you direct the output of the player software into the input of some recording software, internally, i.e. without physically accessing the soundcard's digital or audio output jacks.


Quote
This would potentially remove one variable (switching the computer software settings live) but I suspect it might not satisfy those who claim the software settings sound different because the mechanism they propose is via some jitter noise effect on the dac.*

9/10 on the odd ABX test they did that you described, would not likely be due to 'jitter noise'.  Blaming jitter is typically just audiophile superstition/handwaving. 




Re: Help with test control

Reply #13
If I am understanding you correctly we could record the outputs of the computer's S/PDIF out and then see if the test subject can A/B them.


Not really. What I am suggesting is that you direct the output of the player software into the input of some recording software, internally, i.e. without physically accessing the soundcard's digital or audio output jacks.

I guess we could do that, but my understanding is that a recording of the S/PDIF out has been taken and verified to be bit-perfect, so one would assume that the same would be the case for the player output.