Topic: Audibility of phase shifts and time delays

Audibility of phase shifts and time delays

2015-04-20 13:10:09
In the MQA hype I've found the following claim:

MQA Hype

"However from more recent research it turns out that although humans can not hear frequencies above 20kHz, they are sensitive to timing of sounds to about 10 microseconds. So first you notice the arrival of a sound (quick change in air pressure, a very high frequency) and later on you actual hear what sound it is. To preserve this timing info in the audio signal 96kHz is therefor not enough, we actually need 192kHz."

Both the claim that "(humans) are sensitive to timing of sounds to about 10 microseconds" and

"To preserve this timing info in the audio signal 96kHz is therefor not enough, we actually need 192kHz"

seem to be completely wrong, based on what I know about human hearing and digital audio.

Audibility of phase shifts and time delays

"So first you notice the arrival of a sound (quick change in air pressure, a very high frequency) and later on you actual hear what sound it is."
This must be their way of describing the terrific PRE-ringing!
With 24bit music you can listen to silence much louder!

Audibility of phase shifts and time delays

x-axis is in us (microseconds). Notice how the dots are spaced at 44.1 kHz or roughly 23 us apart.
y-axis is linear in 0.1 divisions, with the max shown being ~0.6 or about -4.4 dBFS.

We take some impulsive signal, quantize it and visualize it as red dots. The corresponding blue line shows the upsampled version of the signal before quantization (the 'ideal' or target).
We then delay this signal by 10us, quantize it and visualize it as green dots. The blue line is again the upsampled, this time delayed signal before quantization.

The dots match our target... Even if we attenuate this signal by 60 dB first, we will see the effects of quantization and dither, yes, but the 10us delay does not disappear.
"I hear it when I see it."
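
For anyone who would rather check this numerically than by eye, here is a rough Python sketch of the same kind of experiment. The Gaussian pulse, its 100 us width, the TPDF dither and the phase-slope delay estimator are all assumptions made for illustration; they are not necessarily what was used for the plot described above.

```python
import numpy as np

FS = 44100.0     # CD sample rate in Hz
DELAY = 10e-6    # the 10 us shift under test, in seconds
N = 4096         # buffer length in samples


def pulse(t, center, width=100e-6):
    """Gaussian pulse peaking at 0.6; the 100 us width (an assumption)
    keeps its spectrum far below FS/2, so sampling it is effectively alias-free."""
    return 0.6 * np.exp(-0.5 * ((t - center) / width) ** 2)


def quantize16(x, rng):
    """Add TPDF dither (+/-1 LSB peak), then round to 16-bit steps."""
    lsb = 1.0 / 32768.0
    dither = (rng.random(x.shape) - rng.random(x.shape)) * lsb
    return np.round((x + dither) / lsb) * lsb


def estimate_delay(a, b, fs=FS, f_max=4000.0):
    """Delay of b relative to a, from the slope of the cross-spectrum phase."""
    spec_a, spec_b = np.fft.rfft(a), np.fft.rfft(b)
    f = np.fft.rfftfreq(len(a), 1.0 / fs)
    band = (f > 0) & (f < f_max)
    cross = spec_b[band] * np.conj(spec_a[band])
    weight, phase, fb = np.abs(cross), np.angle(cross), f[band]
    # A delay tau gives phase = -2*pi*f*tau; weighted least-squares fit through 0.
    return -np.sum(weight * fb * phase) / (2 * np.pi * np.sum(weight * fb ** 2))


def measured_delay(seed=0):
    """Quantize a pulse and its 10 us delayed copy, then recover the delay."""
    rng = np.random.default_rng(seed)
    t = np.arange(N) / FS
    center = t[N // 2]
    original = quantize16(pulse(t, center), rng)
    delayed = quantize16(pulse(t, center + DELAY), rng)
    return estimate_delay(original, delayed)


if __name__ == "__main__":
    print(f"recovered delay: {measured_delay() * 1e6:.3f} us")
```

The recovered value should land within a small fraction of a microsecond of 10 us, even though the samples themselves are roughly 23 us apart. Scaling the pulse down before quantizing makes the estimate noisier, but the shift is still there.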

Audibility of phase shifts and time delays

In short, 10 microseconds is nowhere near the actual limit of 44/16 (44.1 kHz, 16-bit) when it comes to conveying timing differences among signals. That limit is: 1/(44100*65536) =

3.460042871315193e-10  seconds. Less than a nanosecond.

Audibility of phase shifts and time delays

The information that I am aware of related to the audibility of time differences is summarized here:

David L. Clark's AES paper about the audibility of time

"
Despite the possibility of "digital effects," the modest quality of the speakers, and the fact that the listeners were in effect being tested, every one of approximately 12 participants was able to match the time delays to within +/- 40 microseconds (about 1/2 inch). This finding confirms the claims of those who espouse the virtues of arrival time compensated loudspeakers.
"

Audibility of phase shifts and time delays

Cool, words about audio from someone who doesn't understand sampling.

Audibility of phase shifts and time delays

1/(44100*65536) = 3.460042871315193e-10  seconds. Less than a nanosecond.

Well, using that formula, the 'resolution' would pass 1 us at about 70 dB down*. At 90 dB down it would be about 10 us.

*) This does not translate directly into what you see in a spectrogram. A 1ms long tone burst that reaches full scale would show up below -50 to -60 dB with a 65k FFT size in Audition.
"I hear it when I see it."
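
A quick way to check those figures is to scale the number of usable levels by the signal level. This is only a sketch of the rule of thumb 1/(fs * levels) quoted above; the function name and the `level_db` parameter are made up for illustration.

```python
def timing_resolution(fs_hz=44100, bits=16, level_db=0.0):
    """Rule-of-thumb timing resolution 1/(fs * levels), in seconds.

    level_db is the signal level relative to full scale (0 or negative);
    a quieter signal only exercises a fraction of the 2**bits levels.
    """
    levels = (2 ** bits) * 10 ** (level_db / 20.0)
    return 1.0 / (fs_hz * levels)


if __name__ == "__main__":
    for level in (0, -70, -90):
        print(f"{level:>4} dBFS: {timing_resolution(level_db=level):.3e} s")
```

At full scale this reproduces the 3.46e-10 s figure; at -70 dB it gives a bit over 1 us and at -90 dB a bit over 10 us.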

Audibility of phase shifts and time delays

Arny, they are not talking about phase in an absolute sense, but the phase relationship of the content within a track.
"I hear it when I see it."

Audibility of phase shifts and time delays

In the MQA hype I've found the following claim:

MQA Hype

"However from more recent research it turns out that although humans can not hear frequencies above 20kHz, they are sensitive to timing of sounds to about 10 microseconds. So first you notice the arrival of a sound (quick change in air pressure, a very high frequency) and later on you actual hear what sound it is. To preserve this timing info in the audio signal 96kHz is therefor not enough, we actually need 192kHz."

Working again...

Audibility of phase shifts and time delays

I didn't mean allpass filters either. That would be a constant phase shift at a given frequency.

I guess what they're talking about is some sort of quantization distortion along the x-axis. It doesn't make sense, but it doesn't need to for FUD.
"I hear it when I see it."

Audibility of phase shifts and time delays

In short, 10 microseconds is nowhere near the actual limit of 44/16 (44.1 kHz, 16-bit) when it comes to conveying timing differences among signals. That limit is: 1/(44100*65536) =

3.460042871315193e-10  seconds. Less than a nanosecond.

Well, 1 / (2 * pi * bandwidth * number_of_levels)

This hardly disproves your point, naturally!
-----
J. D. (jj) Johnston
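
To put a number on that expression, here is a small sketch. The bandwidth of 22050 Hz (half the 44.1 kHz sample rate) and the 2^16 levels are assumptions; the post does not state which values to plug in.

```python
import math


def timing_resolution_bw(bandwidth_hz=22050.0, levels=2 ** 16):
    """Timing resolution 1 / (2 * pi * bandwidth * number_of_levels), in seconds."""
    return 1.0 / (2.0 * math.pi * bandwidth_hz * levels)


def timing_resolution_fs(fs_hz=44100.0, levels=2 ** 16):
    """The earlier estimate, 1 / (sample_rate * number_of_levels), in seconds."""
    return 1.0 / (fs_hz * levels)


if __name__ == "__main__":
    corrected = timing_resolution_bw()
    earlier = timing_resolution_fs()
    print(f"corrected: {corrected:.4e} s")
    print(f"earlier:   {earlier:.4e} s")
    print(f"ratio:     {earlier / corrected:.4f}")
```

With the bandwidth taken as half the sample rate, the corrected figure is smaller than the earlier one by a factor of pi, so roughly 0.11 ns instead of 0.35 ns.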

Audibility of phase shifts and time delays

In short, 10 microseconds is nowhere near the actual limit of 44/16 (44.1 kHz, 16-bit) when it comes to conveying timing differences among signals. That limit is: 1/(44100*65536) =

3.460042871315193e-10  seconds. Less than a nanosecond.

Well, 1 / (2 * pi * bandwidth * number_of_levels)

This hardly disproves your point, naturally!

Thanks for the correction, JJ. Your correction seems to make the resolution even finer than my calculation said: 1.0990333934666109990166543854704e-10 seconds, or about 0.1 nanoseconds. There is no perceiving that with the ears, or its effects on frequency response.

Audibility of phase shifts and time delays

So in order for that to degrade to 1us we're down ~80 dB.
"I hear it when I see it."

Audibility of phase shifts and time delays

So in order for that to degrade to 1us we're down ~80 dB.

The way I see it, the perceptual limit is about 40 us, or 40 x 10^-6 seconds, and the resolution limit is about 1.1 x 10^-10 seconds. So the resolution limit is a factor of roughly 3.6 x 10^5 below the perceptual limit, which is about 111 dB down.

When we are talking about timing, there are a lot of ways to quantify it.

For example Clark is talking about trying to synchronize two identical sounds that are operating concurrently. Cancellations which lead to steady-state frequency response differences are a big part of this.

When we are talking about hearing the difference between two different musical sounds, there are no steady-state frequency response differences to listen for. This is more like the ABX test I did about a week ago on two files that were not mixed but were displaced in time. It's like hearing an echo, which puts the perceptual limit up into the range of milliseconds, not microseconds.

Thus we find two false claims that are being used to sell MQA.

One is that the ears are anywhere from 4 to over a thousand times more sensitive to timing differences than they actually are. The other is that 44/16 is at least 10,000 times worse than it actually is at conveying timing information.

Audibility of phase shifts and time delays

"However from more recent research it turns out that although humans can not hear frequencies above 20kHz, they are sensitive to timing of sounds to about 10 microseconds."
"recent"?!

Klumpp, R. G.; and Eady, H. R. (1956).
Some Measurements of Interaural Time Difference Thresholds.
Journal of the Acoustical Society of America, vol. 28, no. 5, Sept., pp. 859-860.

It's the earliest reference I know to 11 us.

Some of it is reproduced on pages 4+5 of this...
http://web.mit.edu/hst.723/www/ThemePapers.../Grantham95.pdf

Cheers,
David.

Audibility of phase shifts and time delays

Are we entering Kunchur-land again here?

Audibility of phase shifts and time delays

The way I see it, the perceptual limit is about 40 us, or 40 x 10^-6 seconds, and the resolution limit is about 1.1 x 10^-10 seconds. So the resolution limit is a factor of roughly 3.6 x 10^5 below the perceptual limit, which is about 111 dB down.

That's not what I meant. 2^16 values are only available to a tone that reaches full-scale. 80 dB (roughly 13 bits) down we're left with 2^16 * 10^(-80/20) values.
"I hear it when I see it."
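
The arithmetic behind that point can be sketched as follows. The function names are made up for illustration; the only inputs are the 16-bit word length and the 80 dB figure from the post.

```python
import math


def levels_left(bits=16, level_db=-80.0):
    """Number of quantization levels a signal at level_db (re full scale) spans."""
    return (2 ** bits) * 10 ** (level_db / 20.0)


def bits_left(bits=16, level_db=-80.0):
    """The same quantity expressed as an equivalent number of bits."""
    return math.log2(levels_left(bits, level_db))


if __name__ == "__main__":
    print(f"levels left at -80 dB: {levels_left():.2f}")
    print(f"bits left at -80 dB:   {bits_left():.2f}")
```

A signal 80 dB down spans only about 6.5 of the 65536 levels, which is under 3 bits' worth, consistent with roughly 13 bits being used up by the attenuation.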

Audibility of phase shifts and time delays

"However from more recent research it turns out that although humans can not hear frequencies above 20kHz, they are sensitive to timing of sounds to about 10 microseconds."
"recent"?!

Klumpp, R. G.; and Eady, H. R. (1956).
Some Measurements of Interaural Time Difference Thresholds.
Journal of the Acoustical Society of America, vol. 28, no. 5, Sept., pp. 859-860.

It's the earliest reference I know to 11 us.

Some of it is reproduced on pages 4+5 of this...
http://web.mit.edu/hst.723/www/ThemePapers.../Grantham95.pdf

Thanks. Clark's experiment was based on loudspeaker listening, while it appears that Grantham95 is based on headphones. That one would be 4 times the other seems to make sense.

The big mistake in the MQA hype is then that somehow 44/16 can't hack reproducing time delays on the order of a few microseconds. It can hack it, and its resolution goes many orders of magnitude below that.

Audibility of phase shifts and time delays

Are we entering Kunchur-land again here?

Read here: Kunchur 2007 Temporal Resolution Paper

His faux pas about serious problems with the temporal resolution possible with 44/16 seems to have been avoided, but then he goes down the Stuart road.

However in the same year, same journal he published this:

Other Kunchur time resolution paper

"
Every component's bandwidth limit
(even if it behaves perfectly linearly) causes it to have a
finite relaxation time of τ ~ 1/ω_max; use of digital carriers
limits the shortest resolvable time interval to about half
the sampling interval (which for CD would be 11 μs);
"

Audibility of phase shifts and time delays

Even if this was audible, I have serious doubts that loudspeakers of the same make and model would be able to reproduce this timing accurately anyway. People would have to calibrate their speakers to within microseconds so they matched. Marketing madness.

Audibility of phase shifts and time delays

The way I see it, the perceptual limit is about 40 us, or 40 x 10^-6 seconds, and the resolution limit is about 1.1 x 10^-10 seconds. So the resolution limit is a factor of roughly 3.6 x 10^5 below the perceptual limit, which is about 111 dB down.

That's not what I meant. 2^16 values are only available to a tone that reaches full-scale. 80 dB (roughly 13 bits) down we're left with 2^16 * 10^(-80/20) values.

Are you suggesting that all future resolution specifications be based on FS = -80 dB?

At that rate 4416 only has 16 dB dynamic range.