
Audibility of "typical" Digital Filters in a Hi-Fi Playback

Reply #800
I think the suggestion that they'd get dither and quantisation the wrong way around, and not mention it, is going too far.
Sorry if I misread the phrase (pg.10): "this suggests that the effect of adding the RPDF dither on top of the 16-bit quantization and FIR filtering was to make it more difficult to identify that processing had been applied to the signal, which is perhaps counterintuitive."
Is this unambiguous for a native English reader like you?

 


Reply #801
I think the suggestion that they'd get dither and quantisation the wrong way around, and not mention it, is going too far.
Sorry if I misread the phrase (pg.10): "this suggests that the effect of adding the RPDF dither on top of the 16-bit quantization and FIR filtering was to make it more difficult to identify that processing had been applied to the signal, which is perhaps counterintuitive."
Is this unambiguous for a native English reader like you?



It comes down to how you interpret 'adding ... on top of'. In most contexts a native English reader would interpret that as meaning 'after'. (Well, this native English reader would, at least.)

But in the context of a peer-reviewed, prize-winning AES presentation, it would be rather remarkable for them to have done it the 'wrong' way around like that. (Though stranger things have happened in science papers...) So I guess we should assume that 'adding dither on top of quantization and filtering' here means that the combined effect of dither, truncation, and filtering [done in the right order] was to make it more difficult, etc.

It would be very easy to disambiguate in the next draft.


Reply #802
Without having read the paper, I would take them at their word: dither on top of (ie: after) 16-bit quantization.

But it didn't appear to matter:


Reply #803
Without having read the paper, I would take them at their word: dither on top of (ie: after) 16-bit quantization.

But it didn't appear to matter:



In this interpretation of the phrase on p. 10, the Meridian researchers purposely did something 'wrong' that they would *expect* to matter... yet it didn't.

That works too. 

A little disambiguation would go a long way.

(Amir posts his images to a service called 'smugmug.com'?  That's awesome. )



Reply #804
It comes down to how you interpret 'adding ... on top of'. In most contexts a native English reader would interpret that as meaning 'after'. (Well, this native English reader would, at least.)

You would unless there was good reason not to, when the phrase would mean "in addition to" without prescribing that the thing "on top of" the others came last.

I would be surprised if the authors would call noise added after quantisation "dither".

Maybe I'm being too generous, but (apart from the conjecture sections) I found the paper cautious, honest and straightforward. It just needs a few more details.

Cheers,
David.


Reply #805
I would be surprised if the authors would call noise added after quantisation "dither".

Indeed. The paper is written in a somewhat academic style and would be primarily aimed at people with a knowledge of the subject area. It is elementary knowledge that for dither to be beneficial it must be added prior to the reduction in bit depth, or in conjunction with it. To have simply added noise to the signal, after an earlier quantisation process by truncation, would have made no sense. It would have been bizarre.  Even a non-ideal implementation of dither would not consist merely of adding random noise to an already truncated signal. That would not be adding dither. That would be adding noise.

So, to make sense, the reference to adding dither "on top of" the quantisation to 16 bits must be read as "as an additional factor present in the processing", not as "subsequent to earlier processing". 
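To make the ordering point concrete, here is a minimal Python sketch. It is not taken from the paper: the 0.4 LSB sine, the 48 kHz rate, the TPDF dither and the fixed seed are all illustrative assumptions. It quantizes a tone smaller than half an LSB three ways: undithered, with dither added before rounding, and with the same kind of noise added after rounding.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer LSB (mid-tread quantizer)."""
    return math.floor(x + 0.5)

def tpdf(rng):
    """Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform draws."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def recovered_amplitude(samples, freq, fs):
    """Amplitude of the sine component at freq, found by projecting onto a sine."""
    acc = sum(s * math.sin(2 * math.pi * freq * i / fs)
              for i, s in enumerate(samples))
    return 2.0 * acc / len(samples)

def run_demo(amp_lsb=0.4, freq=1000.0, fs=48000, n=48000, seed=1):
    """Return the tone amplitude (in LSB) that survives each processing order."""
    rng = random.Random(seed)
    clean = [amp_lsb * math.sin(2 * math.pi * freq * i / fs) for i in range(n)]
    undithered = [quantize(x) for x in clean]               # plain rounding
    dither_first = [quantize(x + tpdf(rng)) for x in clean]  # dither, then round
    noise_after = [quantize(x) + tpdf(rng) for x in clean]   # round, then add noise
    return {
        "undithered": recovered_amplitude(undithered, freq, fs),
        "dither_first": recovered_amplitude(dither_first, freq, fs),
        "noise_after": recovered_amplitude(noise_after, freq, fs),
    }

if __name__ == "__main__":
    for name, amp in run_demo().items():
        print(f"{name:>12}: {amp:.3f} LSB")
```

Under these assumptions the tone should survive only when the dither goes in before the rounding. Rounding a 0.4 LSB tone on its own gives all zeros, and adding noise afterwards just gives noise.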

To use a loose analogy, if at a meal a person consumes additional alcohol by way of drinking wine "on top of" consuming a Christmas pudding made with rum, there is no suggestion as to exactly when the person drank the wine in relation to when they consumed the pudding. The important point is that the consumption of the wine was not the only source of alcohol for the meal, whether the wine was drunk early in the meal, or late in the meal. It was "on top of" the effect of the alcohol present in the pudding. It could be that everyone at the meal had the Christmas pudding but only a few chose to have wine "on top of" that. 

I see a thesaurus entry at http://www.thesaurus.com/browse/on+top+of  as follows:

on top of

also 

adv. in addition to

additionally 
again 
along 
along with 
and 
as well 
as well as 
besides 
conjointly 
further 
furthermore 
in conjunction with 
in like manner 
including 
likewise 
more 
more than that 
moreover 
on top of 
over and above 
plus 
still 
to boot 
together with




Elsewhere in the paper we see further detail. On page 5:

"After filtering with either FIR filter, the signals were either unchanged or were quantized to 16-bit. The quantization either included RPDF (rectangular probability density function) dither or did not. We chose to use undithered quantization as a probe and -- although we would normally recommend TPDF dither for best practice -- we considered rectangular dither to be more representative of the non-ideal dither or error-feedback processing found in some commercial A/D and D/A filters."

I note that error-feedback is used in certain dither implementations. What we might question is the claim that rectangular dither would be representative of non-ideal situations. It raises the question of how common such non-ideal situations might be.


Quote
Maybe I'm being too generous, but (apart from the conjecture sections) I found the paper cautious, honest and straightforward. It just needs a few more details.

I would agree.

I found the conjectural comment on page 10, "this suggests that the effect of adding the RPDF dither on top of the 16-bit quantization and FIR filtering was to make it more difficult to identify that processing had been applied to the signal, which is perhaps counterintuitive", puzzling. If a recording of music has very low noise, then a simple truncation could be expected to give rise to audible distortion at critical points in the recording (e.g. the tail end of reverberation). If even a non-ideal dither is applied at the time of quantization, in this case RPDF dither, that would reduce the distortion. So I am not sure why the authors suggested that RPDF dither apparently aiding transparency was "perhaps counterintuitive". Perhaps what they had in mind was the possibility of audible noise modulation with RPDF dither.



Reply #807
Make pedantic excuses for what amounts to poor communication all you want, the data suggests the issue is moot. Does it not?

I'm not sure I'd agree with some of the posters in this thread that the communication was all that poor, particularly if one is able to read the paper as a whole, but I would agree that the data as presented suggests the question is moot, there being such a low percentage of correct answers (not much above 50%, albeit "statistically significant") when the results from all sections are lumped together.

The real issue to my mind is the suggestion that filtering by itself was allegedly audible. The question of quantization and whether or not dither was applied becomes secondary. I expressed that view earlier in this thread when I said:
For those who have not read the whole of this thread, I'd note that the question of the use of no dither when quantising to 16 bits (test conditions 2 and 5), or rectangular dither when quantising to 16 bits (test conditions 3 and 6), could be regarded as a subsidiary matter. This is because the paper reports that there was a statistically significant correct identification of an audible difference for test condition 1, i.e. filtering to emulate a resampling to 44.1kHz, and without any quantisation to 16 bits. (As for condition 4, a filtering to emulate resampling to 48kHz, and without any quantization to 16 bits, "the t-test just failed to reach significance at the 5% level".)

If it is true that a mere filtering at 24-bit depth for the 22.05kHz Nyquist limit of 44.1kHz sampling was of itself identifiable (in particular for certain "high yield" [easier to spot differences] sections of the music), it becomes a subsidiary matter what effect a subsequent quantisation to 16 bits might have had. The "damage" or "impairment" had already occurred, or so it would appear.


I think it would be helpful if two or three of the "high yield" sections (which are relatively short) could be released in their reference 24/192 form, together with their MATLAB filtered forms (the 48kHz sample rate filter emulation, and the 44.1kHz emulation).  Then anyone with a system capable of playing back 24/192 files, with ABX switching, could listen for themselves. The listener could then determine whether they could hear any difference at all; and, if so, whether this made a significant difference to the listening experience or was at the outer limits of their auditory perception.  They could also provide subjective comments.

Given that high definition versions of the reference files (but apparently with a reduction of around 2dB in amplitude) have been available for download free of charge for some years, it is hard to understand why some short sections of the reference files could not be hosted, without objection by the copyright holder.


Reply #808
I think it would be helpful if two or three of the "high yield" sections (which are relatively short) could be released in their reference 24/192 form, together with their MATLAB filtered forms (the 48kHz sample rate filter emulation, and the 44.1kHz emulation).  Then anyone with a system capable of playing back 24/192 files, with ABX switching, could listen for themselves.

It will never happen. Making the tests accessible to all isn't in the best interests of the hi-re$ promoters, and besides, even if any of us somehow got hold of the test segments, all failures to obtain any statistically significant differentiation have already been cleverly dealt with, preemptively, built right into the wording of the 2nd conclusion of the opening abstract itself:

"Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD; and second, an audio chain used for such experiments must be capable of high-fidelity reproduction."

Any attempts using a lesser, budget conscious, say, um, $23K speaker instead of his $46K ones used for the test will be dismissed as "not being of high enough fidelity". [From Audiophile 101, RULE#1: Attack your critics' gear as being "pedestrian" and of "low resolution", so they wouldn't hear the difference themselves.]

As I pointed out earlier, there was no evidence to support that second conclusion. A cheap mini-system from a department store might have done just as well for all we know, but putting this line in the paper's opening abstract helps keep the truth seekers at bay.


Reply #809
... I did some quick tests to see if one can easily tell the difference between no dither, RPDF dither and TPDF dither being applied to a conversion of a pure -60 dB sine wave from 32 bit floating point to 16 bit fixed point. The effect of no dither is clearly shown in the output 65k-point FFT analysis, but the difference between RPDF and TPDF is not clear at all to me.


The problem, as I understand it, is that RPDF dither [unlike TPDF] suffers from noise modulation, key word "modulation", so the problem wouldn't occur with a fixed-level recording, which I believe is what you attempted.


That is correct, and my first set of tests was thus in error. Thank you for the assistance!

My second set of tests prepared eight 32-bit files.

Files 1 & 2 were 32 bits with a -60 dB sine wave

Files 3 & 4 were 32 bits with a -120 dB sine wave

I converted 1 & 3 to 16 bits using RPDF dither

I converted 2 & 4 to 16 bits with TPDF dither

I converted all files back to 32 bits and notched out the 1 kHz tone in each

The two TPDF files contained noise at the same level, -96 dB, as expected.

The RPDF files, with dramatically different signal levels, contained noise at dramatically different levels: -151 dB and -96 dB.

Noise modulation due to RPDF dither was thus confirmed. The noise floor varied dramatically in the RPDF dither files as the signal level changed dramatically. With TPDF dither the noise was greater than with RPDF in one case, but it was consistent.  The consistent noise floor would be appreciated by listeners. Audible noise modulation bugs me and most other people that I've seen exposed to it.
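For anyone who wants to see the mechanism without preparing files, here is a rough Python sketch under simplifying assumptions that are mine, not part of the tests above: a constant input level stands in for a signal that moves slowly across one quantizer step, levels are in units of 1 LSB, and the seed and sample count are arbitrary. It measures the total error power after dithered rounding.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer LSB."""
    return math.floor(x + 0.5)

def rpdf(rng):
    """Rectangular dither, 1 LSB peak-to-peak."""
    return rng.random() - 0.5

def tpdf(rng):
    """Triangular dither, 2 LSB peak-to-peak (sum of two rectangular draws)."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def error_power(level_lsb, dither, n=20000, seed=1):
    """Mean squared error (LSB^2) when a constant level is dithered and rounded."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        err = quantize(level_lsb + dither(rng)) - level_lsb
        total += err * err
    return total / n

if __name__ == "__main__":
    print("level   RPDF    TPDF")
    for level in (0.0, 0.25, 0.5):
        print(f"{level:5.2f}  {error_power(level, rpdf):.3f}  {error_power(level, tpdf):.3f}")
```

In theory the RPDF error power at a fractional position f between two steps is f(1 - f) LSB^2, so it falls to zero when the input sits on a step and peaks at 0.25 halfway between steps. The TPDF figure stays at 0.25 LSB^2 wherever the input sits. That level dependence is the noise modulation, and it is consistent with the RPDF noise floor above being much lower for one file than the other.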

Files 5 & 6 were 32 bits with a -60 dB sine wave and -70 dB brown (double pink) noise to simulate architectural noise in recording

Files 7 & 8 were 32 bits with a -120 dB sine wave and -70 dB brown (double pink) noise to simulate architectural noise in recording

I converted 5 & 7 to 16 bits using RPDF dither

I converted 6 & 8 to 16 bits  using TPDF dither

I converted all files back to 32 bits and notched out the 1 kHz tone in each.

All 4 files had the same noise level -82 dB.

Thus adding simulated consistent architectural noise at a reasonable level (which constituted relatively high level Gaussian dither) was able to erase any observable difference between TPDF and RPDF dither at the levels they would be used in real world conversions to 16 bits. 

The PDF of Gaussian dither is far closer to TPDF dither than RPDF dither. In this example, the RPDF and TPDF dithers had a flat PSD, while the brown noise dither had a PSD with a very pronounced downward slope.
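The masking effect of pre-existing noise can be sketched the same way. This Python example is only an approximation of the test above: it uses white Gaussian noise at an assumed 10 LSB RMS rather than brown noise, constant input levels rather than a sine, and an arbitrary seed. It compares the total noise floor with RPDF and TPDF dither once that noise is already present ahead of the quantizer.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer LSB."""
    return math.floor(x + 0.5)

def rpdf(rng):
    return rng.random() - 0.5

def tpdf(rng):
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def noise_floor_db(level_lsb, dither, noise_rms_lsb=10.0, n=20000, seed=1):
    """Total error power in dB re 1 LSB^2 with Gaussian noise ahead of the quantizer."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        noisy = level_lsb + rng.gauss(0.0, noise_rms_lsb)  # stand-in for room noise
        err = quantize(noisy + dither(rng)) - level_lsb
        total += err * err
    return 10.0 * math.log10(total / n)

if __name__ == "__main__":
    for level in (0.0, 0.5):
        print(f"level {level}: RPDF {noise_floor_db(level, rpdf):.2f} dB, "
              f"TPDF {noise_floor_db(level, tpdf):.2f} dB")
```

With the assumed noise about 20 dB above one LSB, the four figures should land within a fraction of a dB of each other, because the Gaussian noise dominates and the choice of dither contributes almost nothing to the total.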




Reply #810
Elsewhere in the paper we see further detail. On page 5:

"After filtering with either FIR filter, the signals were either unchanged or were quantized to 16-bit. The quantization either included RPDF (rectangular probability density function) dither or did not. We chose to use undithered quantization as a probe and -- although we would normally recommend TPDF dither for best practice -- we considered rectangular dither to be more representative of the non-ideal dither or error-feedback processing found in some commercial A/D and D/A filters."

I note that error-feedback is used in certain dither implementations. What we might question is the claim that rectangular dither would be representative of non-ideal situations. It raises the question of how common such non-ideal situations might be.


Quote
Maybe I'm being too generous, but (apart from the conjecture sections) I found the paper cautious, honest and straightforward. It just needs a few more details.

I would agree.

I found the conjectural comment on page 10, "this suggests that the effect of adding the RPDF dither on top of the 16-bit quantization and FIR filtering was to make it more difficult to identify that processing had been applied to the signal, which is perhaps counterintuitive", puzzling. If a recording of music has very low noise, then a simple truncation could be expected to give rise to audible distortion at critical points in the recording (e.g. the tail end of reverberation). If even a non-ideal dither is applied at the time of quantization, in this case RPDF dither, that would reduce the distortion. So I am not sure why the authors suggested that RPDF dither apparently aiding transparency was "perhaps counterintuitive". Perhaps what they had in mind was the possibility of audible noise modulation with RPDF dither.



In almost every recording that is converted to 16 bits the dither added to the conversion itself is far smaller in amplitude than residual noise that is already in the recording due to other noise sources in the production chain. In the days of analog recording the tape machines themselves were major sources of noise. In modern times the musicians, the room, the microphones, and the other analog components in the production chain are the major sources of noise. There may be some deterministic components to this room tone or background noise, but in a well-controlled environment the deterministic components provide the smaller contributions. Deterministic noises are generally asynchronous with the music and can still help address correlated quantization distortion, which is the problem that dither is there to address.

In any case TPDF dither in appropriate quantities should be added as a safeguard and it generally is inherent in the very common Sigma Delta ADCs, but other natural or at least inherent non-digital sources in the recording process generally render the details of dithering of the digital conversion moot.


Reply #811
Thank you for the academic exercise, speculation about noise and making mention of self-dither.

Let's get back on topic now.


Reply #812
"Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD;"



The authors cannot conclude that from these tests. There are two possible effects that caused these results:

1. The filters had an effect on the audible portion of the signal.

2. Allowing the ultrasonic frequencies to remain in the signal caused the introduction of audible artifacts due to non-linearities in the processing/amplification/transduction chain.

Until some effort is made to measure and analyze the signal that is actually radiated by the speaker with and without the filter(s) in place it is not possible to eliminate 2 and conclude 1.
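Effect 2 is easy to illustrate numerically. The Python sketch below is hypothetical throughout: the 24 kHz and 27 kHz tones, the 1 kHz audible tone, the second-order coefficient of 0.1 and the idealized brick-wall "filter" (which simply omits the ultrasonic tones) are assumptions chosen for clarity, not measurements of the system used in the paper. It shows a weak nonlinearity turning ultrasonic content into a 3 kHz difference tone that disappears once the ultrasonics are filtered out.

```python
import math

FS = 192000   # sample rate in Hz, matching the 24/192 source material
N = 1920      # 10 ms of signal, so every tone below sits on a 100 Hz DFT bin

def tone(freq, amp):
    """A sine of the given frequency (Hz) and peak amplitude."""
    return [amp * math.sin(2 * math.pi * freq * i / FS) for i in range(N)]

def mix(*signals):
    """Sample-wise sum of several signals."""
    return [sum(vals) for vals in zip(*signals)]

def weakly_nonlinear(x, k2=0.1):
    """Hypothetical playback chain: unity gain plus a small second-order term."""
    return [s + k2 * s * s for s in x]

def amplitude_at(x, freq):
    """Peak amplitude of the component at freq, from a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * i / FS) for i, s in enumerate(x))
    im = sum(s * math.sin(2 * math.pi * freq * i / FS) for i, s in enumerate(x))
    return 2.0 * math.hypot(re, im) / N

audible = tone(1000, 0.5)
ultrasonic = mix(tone(24000, 0.5), tone(27000, 0.5))

unfiltered = weakly_nonlinear(mix(audible, ultrasonic))
filtered = weakly_nonlinear(audible)   # idealized low-pass: ultrasonics removed

imd_unfiltered = amplitude_at(unfiltered, 3000)   # 27 kHz - 24 kHz difference tone
imd_filtered = amplitude_at(filtered, 3000)

if __name__ == "__main__":
    print(f"3 kHz product, unfiltered: {imd_unfiltered:.4f}")
    print(f"3 kHz product, filtered:   {imd_filtered:.4f}")
```

In this toy model the listener could tell the filtered and unfiltered versions apart purely from an in-band distortion product, without hearing anything above 20 kHz. Only a measurement of what the speaker actually radiates could rule that out.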


Reply #813
"Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD;"


The authors cannot conclude that from these tests.

Notice the wording Fred. Are "offered". Not "Are".

You're not buying? 

cheers,

AJ
Loudspeaker manufacturer


Reply #814
I think it would be helpful if two or three of the "high yield" sections (which are relatively short) could be released in their reference 24/192 form, together with their MATLAB filtered forms (the 48kHz sample rate filter emulation, and the 44.1kHz emulation).  Then anyone with a system capable of playing back 24/192 files, with ABX switching, could listen for themselves. The listener could then determine whether they could hear any difference at all; and, if so, whether this made a significant difference to the listening experience or was at the outer limits of their auditory perception.  They could also provide subjective comments.


But would the Meridian group approve?....the cognitive load might be  too high, you see...:

Quote
There is a more general problem with listening tests of this kind [referring to Meyer & Moran 2007], which concerns the testing procedure. ABX tests are viewed as the "gold standard" for objective measures of listening. In an ABX test, a listener is required to listen to two reference sounds, sound A and sound B, and then to listen to sound X, and to decide whether sound X is the same as sound A or sound B. ABX tests have a high sensitivity, that is, the proportion of true-positive results out of total positive results is high. However, ABX tests also have low specificity, meaning that the proportion of true-negative results out of total negative results can be spuriously low. Translating this into outcomes in psychophysical tests, the proportion of the time that a listener scores well on an ABX test by chance is low, but the proportion of the time that a listener can score poorly on a test in spite of being able to discriminate the sounds is high. An ABX test requires that a listener retains all three sounds in working memory, and that they perform a minimum of two pair-wise comparisons (A with X and B with X), after which the correct response must be given; this results in the cognitive load for an ABX test being high.


from Jackson et al, 2014 "The audibility of typical digital audio filters in a high-fidelity playback system"

Their own test was A/B, where A was always the reference (unfiltered); the task for listeners was to decide if B was the same as A or not. They were allowed unlimited switching between A and B, and were also allowed to adjust playback level (it is not clear whether they were allowed to do this once, at the start of the trials, or throughout the test, or even between A and B).

Listeners also had instant feedback on whether each choice was correct or incorrect.


Reply #815

I think it would be helpful if two or three of the "high yield" sections (which are relatively short) could be released in their reference 24/192 form, together with their MATLAB filtered forms (the 48kHz sample rate filter emulation, and the 44.1kHz emulation).  Then anyone with a system capable of playing back 24/192 files, with ABX switching, could listen for themselves.



FWIW, the fragments themselves could be recreated using the Appendix chart that gives start and stop times for each one.  That just leaves the Matlab-filtered versions...



Reply #816
I thought the claim that their A/B test was better than an ABX test was strange. I mean, you can always participate in an ABX test just playing A and X if you want, and that turns it into their test.

Sometimes I do that, sometimes I don't. It's nice to have the choice. You don't get the choice with their test.

I like the option to know if I get each trial right though. AFAIK as long as you pick the number of trials beforehand and stick to it anyway then knowing the result of each trial doesn't break the test, does it?

Cheers,
David.


Reply #817
I thought the claim that their A/B test was better than an ABX test was strange. I mean, you can always participate in an ABX test just playing A and X if you want, and that turns it into their test.

Sometimes I do that, sometimes I don't. It's nice to have the choice. You don't get the choice with their test.

I like the option to know if I get each trial right though. AFAIK as long as you pick the number of trials beforehand and stick to it anyway then knowing the result of each trial doesn't break the test, does it?

Cheers,
David.



It shouldn't.  I'm assuming the number of trials was decided in advance , as per:

Quote
The extract presented in each trial was selected randomly from the 17 sections into which the piece had been divided based on musical phrases. Twelve trials were presented within a "block", with the results of the last 10 being counted; the two uncounted trials were included in order to familiarise listeners with the task and processing before beginning the test. For each block, the type of filtering used was the same. Each listener completed 2 blocks for each condition, giving a total of 12 blocks per subject.


So the number of trials was pre-set at 12 per block (though only the last 10 were counted, the first 2 being 'warm-ups'), with 2 blocks per condition and 6 conditions. Thus, if I understand correctly, the pre-set total was:

12*2*6 = 144 trials per subject in total,  of which

10*2*6 = 120 were counted

For each condition, 10*2 = 20 trials per subject were counted towards results.  Results for  each condition were reported as pooled results from  all 8 subjects.  Thus for each condition a total of 20*8 = 160 trials were counted.  (as per :  "160 trials combined across listeners for each condition", p. 8)

It's interesting that results are not broken down by subject anywhere. (NB Meyer and Moran 2007 didn't do that comprehensively either, though they did report some per-subject results: "The “best” listener score, achieved one single time, was 8 for 10, still short of the desired 95% confidence level. There were two 7/10 results. All other trial totals were worse than 70% correct.")
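For reference, the trial counts and the "8 for 10" remark can be checked with a few lines of Python. One caveat: the paper reports t-tests, so the exact one-sided binomial test used here is a simplified stand-in rather than the authors' analysis, and the function names are mine.

```python
from math import comb

def p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct(trials, alpha=0.05):
    """Smallest score whose guessing probability is at or below alpha."""
    for k in range(trials + 1):
        if p_value(k, trials) <= alpha:
            return k
    return None

# Counts taken from the quoted test description.
trials_per_block, counted_per_block = 12, 10
blocks_per_condition, conditions, listeners = 2, 6, 8

presented = trials_per_block * blocks_per_condition * conditions    # per subject
counted = counted_per_block * blocks_per_condition * conditions     # per subject
pooled = counted_per_block * blocks_per_condition * listeners       # per condition

if __name__ == "__main__":
    print("presented per subject:", presented)
    print("counted per subject:  ", counted)
    print("pooled per condition: ", pooled)
    print("p for 8/10:           ", round(p_value(8, 10), 4))
    print("min correct of", pooled, "at 5%:", min_correct(pooled))
```

8 out of 10 works out to 56/1024, just over 5%, which agrees with Meyer and Moran's "still short of the desired 95% confidence level". With 160 pooled trials the threshold for significance sits only modestly above 50% correct, which is why a score "not much above 50%" can still be statistically significant.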


Reply #818
AFAIK as long as you pick the number of trials beforehand and stick to it anyway then knowing the result of each trial doesn't break the test, does it?  Cheers, David.


If one is of the mind that, "People will be under tremendous, undue stress using ABX, having to juggle three distinct sounds in their heads to determine which is which" [I'm not one of those people, but Stuart seems to be] then I'd think one could equally argue that giving feedback to the listener may have adverse effects as well.

Here's an example. In taking a test one is on their honor to not only not cheat, but also to honestly give it their focused attention and to do the best they can, rather than randomly selecting answers without giving it a good listen. Everyone with me so far? OK, say while taking a test, three quarters through, you notice your correct number of responses is exactly 50% of the number of trials taken so far, clearly implying no real ability to differentiate A from B. Tell me, can we REALLY expect such test subjects to continue with the remaining trials giving it their "very best" effort? I doubt it.

Although it still exists in the training mode, mid-test feedback has been removed from the current foobar ABX v.2 testing, now in beta, and when I saw that I thought it was a good idea. Scolding or praising a test subject during a test ["You are doing great!" vs. "You can't hear a thing. You might as well be flipping a coin."] will influence at least the mood and disposition of the test subject, possibly increasing stress, even if it doesn't technically bias the test results in one specific way or another.



Reply #820
If one is of the mind that, "People will be under tremendous, undue stress having to juggle three distinct sounds in their heads to determine which is which" [I'm not one of those people, but Stuart seems to be...

What explains this difficulty then?  http://www.avsforum.com/forum/91-audio-the...ml#post26141122
Quote from: mzil on AVS
My selections of A and B were done entirely by my hearing alone, of an extremely subtle difference that took me over an hour to pull off, but the important point is I had no outside assistance from dogs, analyzers, etc.. [Using such external tools as these would have been correctly deemed "cheating", since it is then no longer a test of human hearing at all.]


If the difference was always there, why so much difficulty as to take an hour?

In the parallel thread I post the results of a double blind test with this note from the tester: http://www.hydrogenaud.io/forums/index.php...st&p=883067


Clearly Stuart is right about the stress that triple stimulus creates when differences are small.

Quote
Here's an example. In taking a test one is on their honor to not only not cheat, but also to honestly give it their focused attention and to do the best they can, rather than randomly selecting answers without giving it a good listen. Everyone with me so far? OK, say while taking a test, three quarters through, you notice your correct number of responses is exactly 50% of the number of trials taken so far, clearly implying no real ability to differentiate A from B. Tell me, can we REALLY expect such test subjects to continue with the remaining trials giving it their "very best" effort? I doubt it.

Instead of hypotheticals let's look at real results of a real sample:

Quote from: amirm on WBF Forum
foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/11 06:18:47

File A: C:\Users\Amir\Music\AIX AVS Test files\Mosaic_A2.wav
File B: C:\Users\Amir\Music\AIX AVS Test files\Mosaic_B2.wav

06:18:47 : Test started.
06:19:38 : 00/01  100.0%
06:20:15 : 00/02  100.0%
06:20:47 : 01/03  87.5%
06:21:01 : 01/04  93.8%
06:21:20 : 02/05  81.3%
06:21:32 : 03/06  65.6%
06:21:48 : 04/07  50.0%
06:22:01 : 04/08  63.7%
06:22:15 : 05/09  50.0%
06:22:24 : 05/10  62.3%
06:23:15 : 06/11  50.0% <---- difference found reliably.  Note the 100% correct votes from here on.
06:23:27 : 07/12  38.7%
06:23:36 : 08/13  29.1%
06:23:49 : 09/14  21.2%
06:24:02 : 10/15  15.1%
06:24:10 : 11/16  10.5%
06:24:20 : 12/17  7.2%
06:24:27 : 13/18  4.8%
06:24:35 : 14/19  3.2%
06:24:40 : 15/20  2.1%
06:24:46 : 16/21  1.3%
06:24:56 : 17/22  0.8%
06:25:04 : 18/23  0.5%
06:25:13 : 19/24  0.3%
06:25:25 : 20/25  0.2%
06:25:32 : 21/26  0.1%
06:25:38 : 22/27  0.1%
06:25:45 : 23/28  0.0%
06:25:51 : 24/29  0.0%
06:25:58 : 25/30  0.0%

06:26:24 : Test finished.

----------
Total: 25/30 (0.0%)


Notice what the feedback loop of results allowed me to do.  I was able to positively identify a revealing segment and complete the test successfully.  Without that feedback I could not determine that and stay with that segment.  Doing this in trial mode does not work because once you think you have found the difference, you have to go and run the test again and by then you may forget what you had heard.  The newer foobar abx plug-in makes that near impossible anyway because there is no help with re-selection of the precise segment.

Another variation is second guessing yourself which is a serious, serious problem.  You identify a difference and you listen and get a bunch of trials right.  Without feedback you may wonder, "what if I am getting this wrong?"  That is all that is needed to change the perception you had of the difference.  Placebo works both ways.  It can easily erase differences or make them sound different.  Without feedback you would then get a bunch of trials wrong.  With feedback you would know that you got off track and get back on and see confirmation of that in correct answer after correct answer.

Our goals in these tests must be to do everything in our power to discover differences.  Not see how many ways we could encourage a negative outcome by handcuffing the listeners.

Quote
Although it still exists in the training mode, mid-test feedback has been removed from the current foobar ABX v.2 testing, now in beta, and when I saw that I thought it was a good idea. Scolding or praising a test subject during a test ["You are doing great!" vs. "You can't hear a thing. You might as well be flipping a coin."] will influence at least the mood and disposition of the test subject, even if it doesn't technically bias the test results in one specific way or another.

Per above, it is only a "good idea" if you want to force more negative outcomes.
Amir
Retired Technology Insider
Founder, AudioScienceReview.com
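For readers unfamiliar with the foo_abx report format: the percentage on each line appears to be the probability of getting at least that many correct out of that many trials by pure guessing. A short Python sketch (my own reconstruction, not code from the plug-in) reproduces a handful of the rows in the log above.

```python
from math import comb

def guess_probability(correct, trials):
    """Chance of at least `correct` right answers in `trials` fair coin flips."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# (correct, trials) pairs copied from selected rows of the quoted log.
log_rows = [(0, 1), (1, 3), (1, 4), (4, 7), (6, 11), (13, 18), (25, 30)]

if __name__ == "__main__":
    for correct, trials in log_rows:
        pct = 100 * guess_probability(correct, trials)
        print(f"{correct:02d}/{trials:02d}  {pct:.1f}%")
```

The figure is cumulative over the whole session, so the early misses stay in the total. It says nothing by itself about how many trials were planned in advance.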


Reply #821
A certain amount of stress is beneficial.


Reply #822
"Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD;"


The authors cannot conclude that from these tests.

Notice the wording Fred. Are "offered". Not "Are".

You're not buying? 

cheers,

AJ


Not buying it without more evidence as to what the panel was really hearing. I'd like to see the tests re-run with different hardware (speakers in particular) and measurements of the nonlinearities in the system in the ultrasonic frequency range before the possibility of a conclusion is even raised.


Reply #823
A certain amount of stress is beneficial.
Taking a test is by definition somewhat "stressful"; however, we generally want to minimize any additional stress as best we can, otherwise any failure to hear distinctions can be dismissed as being due to the "added stress".

This is why, when I designed my blind amp testing of my audiophile, expert listener friend [to settle a small bet with 2:1 odds in his favor], I allowed him to decide almost EVERY single aspect of the test: he picked the who, what, where, when, why, and how of the entire test. My only provisions were that the number of trials he selected must show statistical significance to my satisfaction [we ended up agreeing on 16 trials total, >12 correct to win] and, since he was a part-time recording engineer, he wasn't allowed to use his own private recordings and had to limit himself to any commercially released CDs or SACDs of his choosing. [Of course the amps weren't allowed to be driven beyond their safe operational range, at any time, and were level matched.]

Forms of possible added stress I successfully avoided, which this Stuart et al. paper's listeners might have theoretically complained about, include:

- test listeners weren't in control of what music was selected

- test listeners weren't allowed to practice with the music and gear for an indefinite period of time, of their choosing, before the test

- test listeners weren't in control of the test transition points/segments

- test listeners weren't in control of what switching methodology was used

- test listeners, I assume, were told when and where to show up for testing, whereas my guy picked both

- test listeners didn't select the room

- test listeners didn't select the speakers used and other gear.

My guy received no correct answer feedback, mid-test, but never asked for any either. I think I would have allowed it had he asked prior to the start, but I think he would have been shooting himself in the foot by doing so [at least in retrospect], since his results were barely different than random chance in the end, and seeing that he was only guessing correctly about 50% of the time, mid test, might have potentially upset him and put him in a bad mood.