Topic: Interesting Papers re temporal resolution

  • ncdrawl
  • [*]
Interesting Papers re temporal resolution

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #1
I'm not sure whether they have actually tested the impact of high frequencies on human hearing or just the impact of a single-capacitor low pass filter.
  • Last Edit: 25 July, 2009, 02:03:44 PM by rpp3po

Interesting Papers re temporal resolution
Reply #2
Nice to see this topic here where it can be discussed calmly and rationally.

--Ethan
I believe in Truth, Justice, and the Scientific Method

  • hellokeith
  • [*][*][*][*]
Interesting Papers re temporal resolution
Reply #3
from the link:
Quote
Our recent behavioral studies on human subjects proved that humans can discern timing alterations on a 5 microsecond time scale, indicating that digital sampling rates used in common consumer audio (such as CD) are insufficient for fully preserving transparency.


Exactly how does some air vibration < 18kHz only last 5 microseconds?

Also,

1 / 0.000005 s = 200 kHz

Does this mean we need > 400kHz sampling rates?
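
The arithmetic behind that question can be written out as a small sketch. It only reproduces the poster's reading, which treats the 5 microsecond figure as the period of a tone and then applies the Nyquist factor of two; whether that reading is the right one is what the rest of the thread argues about. The function and variable names are made up for illustration.

```python
def period_to_frequency_hz(period_s: float) -> float:
    """Frequency of a tone whose full period lasts period_s seconds."""
    return 1.0 / period_s


def nyquist_sample_rate_hz(max_freq_hz: float) -> float:
    """Lower bound on the sample rate needed to represent max_freq_hz."""
    return 2.0 * max_freq_hz


resolution_s = 5e-6  # the 5 microsecond figure from the quoted abstract
equivalent_freq_hz = period_to_frequency_hz(resolution_s)
required_rate_hz = nyquist_sample_rate_hz(equivalent_freq_hz)

# For comparison: the spacing between two consecutive CD samples.
cd_sample_period_us = 1e6 / 44100.0

print(f"5 us as a period     -> {equivalent_freq_hz / 1000:.0f} kHz")
print(f"Nyquist rate for it  -> above {required_rate_hz / 1000:.0f} kHz")
print(f"CD sample period     -> {cd_sample_period_us:.2f} us")
```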

  • C.R.Helmrich
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #4
Does this mean we need > 400kHz sampling rates?

Well, I guess what we can conclude from Prof. Kunchur's research is that indeed, you might need 400 kHz to digitally represent a 7-kHz square wave transparently. I'm not listening to such square waves in my free time very often, though.

Would be curious to see what vinyl makes out of a "perfect" 7-kHz square wave.

Chris
If I don't reply to your reply, it means I agree with you.

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #5
Well, I guess what we can conclude from Prof. Kunchur's research is that indeed, you might need 400 kHz to digitally represent a 7-kHz square wave transparently. I'm not listening to such square waves in my free time very often, though.


Well, 400kHz aside, if he was anywhere in the vicinity of being right - which I'm not willing to swallow yet - the long-time objectivist argument would be broken:

  • Humans can't hear anything above 20kHz.
  • 44.1kHz sample rate is enough to cover all that completely according to Nyquist.
  • -> The Redbook storage format is completely sufficient for transparency.



  • hellokeith
  • [*][*][*][*]
Interesting Papers re temporal resolution
Reply #6
Well, after reading through the first 3 PDFs and the FAQ, I surmise (from my novice knowledge of digital audio concepts) that his main point is centered on arrival times / phase differences.  His blind testing groups could identify differences with good confidence down to about 5 microseconds.  Apparently bandwidth restriction (44.1 kHz sampling for example) and loudspeaker placement (within a few millimeters) can each independently introduce timing variances > 5 microseconds that can be blind-test identified.  Also there is a mention of two ultrasonic off-phase samples which cause (unwanted) lower sonic harmonics that can be identified as well.

I wonder what kind of design changes and production costs could accommodate typical electronic audio hardware sampling at > 192 kHz or even > 400 kHz? And what software encoding scheme would be required?
  • Last Edit: 25 July, 2009, 08:27:42 PM by hellokeith
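
To put the "few millimeters" of loudspeaker placement next to the 5 microsecond figure, here is a rough conversion between path-length offset and arrival-time difference. It is a sketch under one stated assumption: a speed of sound of 343 m/s (dry air at roughly 20 degrees C), which is not taken from the papers. The helper names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def path_delay_us(offset_mm: float) -> float:
    """Arrival-time difference caused by a path-length offset in millimeters."""
    return offset_mm / 1000.0 / SPEED_OF_SOUND_M_S * 1e6


def offset_for_delay_mm(delay_us: float) -> float:
    """Path-length offset that produces the given arrival-time difference."""
    return delay_us * 1e-6 * SPEED_OF_SOUND_M_S * 1000.0


for mm in (1.0, 2.0, 5.0):
    print(f"{mm:.1f} mm -> {path_delay_us(mm):.1f} us")

print(f"5 us corresponds to about {offset_for_delay_mm(5.0):.2f} mm")
```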

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #7
I'm still wondering why the usually quite vocal "44.1kHz ought to be enough for anybody"-crowd doesn't take a stand on this...

I have prepared a set of audio files to verify Professor Kunchur's claims for the domain of digital standard vs. high rez audio, that is not subject to Kunchur's lowpass circuitry. 7kHz square waves were generated directly in the corresponding output formats. It was quite difficult to get done; even Audition 3 could not generate 7kHz squares without notable artifacts. The results are interesting! You'll need a high end DAC, though.

Use files of equal bit rate for ABX testing! They are normalized to -10db and have short fade-in/-outs applied to prevent transient clicks while looping:

32 bit, 192 kHz:
[ Specified attachment is not available ]
32 bit, 192 kHz upsampled from 44.1kHz (Sox VHQ):
[ Specified attachment is not available ]

32 bit, 110 kHz, optimized for Benchmark DAC1s:
[ Specified attachment is not available ]
32 bit, 110 kHz (DAC1) upsampled from 44.1kHz (Sox VHQ):
[ Specified attachment is not available ]

32 bit, 44.1 kHz (for reference only):
[ Specified attachment is not available ]

Please re-download! I had accidentally uploaded the wrong set of files.

PS These are 32 bit integer files, which is Sox' default. Some applications (Audition and as reported even Foobar) have trouble playing them. Convert them to 32 bit float if you are affected. I also opted for integer because they are easier to verify with a hex editor.
  • Last Edit: 28 July, 2009, 06:28:01 AM by rpp3po

  • ExUser
  • [*][*][*][*][*]
  • Read-only
Interesting Papers re temporal resolution
Reply #8
rpp3po, the 192kHz versions are still different. The 44.1kHz sample matches the upsampled sample, but not the raw 192kHz sample, verified by ears and spectrograms.
  • Last Edit: 27 July, 2009, 09:37:52 PM by Canar

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #9
Ok, I have re-generated them again and am going to re-upload.

In the meantime, does anyone see any flaws here?

Code:
mbp:~ rpp3po$ sox -r 44100 -n 44100.wav synth 5 square 7000 gain -10
mbp:~ rpp3po$ sox -r 192000 -n 192000.wav synth 5 square 7000 gain -10
mbp:~ rpp3po$ sox 44100.wav 44_1kHz.wav fade .010 0 .010
mbp:~ rpp3po$ sox 192000.wav 192kHz.wav fade .010 0 .010
mbp:~ rpp3po$ sox 44_1kHz.wav 192kHz-from-44_1kHz.wav rate -v 192000


PS This is the source code of Sox' square wave generator.
  • Last Edit: 27 July, 2009, 10:35:03 PM by rpp3po

  • Axon
  • [*][*][*][*][*]
  • Members (Donating)
Interesting Papers re temporal resolution
Reply #10
Don't even bother using a square wave generator in an audio editor - in order to ensure that the aliasing is below a 16-bit noise floor, you'd need 65536x oversampling....

Instead, construct the square wave by hand using additive synthesis based on the Fourier series expansion:

Amplitude(n) = 1/n, odd n; 0, even n
Phase(n) = 0

(Or, if you are sure your audio editor uses a technique immune to aliasing issues, like this one, use it.)

Such techniques are well documented - it's quite a shame that so many audio applications (and Dr. Kunchur, and other audiophiles) continue to use bad code.
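
A minimal sketch of that additive recipe, assuming nothing beyond the amplitude and phase rule above: sum the odd harmonics with amplitude 1/n and zero phase, and stop before the Nyquist frequency. The function name and the choice of plain Python lists are illustrative; scaling to a target level and writing a WAV file are left out.

```python
import math


def bandlimited_square(freq_hz: float, sample_rate_hz: float, num_samples: int):
    """Return (harmonic numbers used, samples) for a band-limited square wave.

    Only odd harmonics strictly below Nyquist are summed, each with
    amplitude 1/n and phase 0, so nothing can alias.
    """
    nyquist_hz = sample_rate_hz / 2.0
    highest = int(nyquist_hz // freq_hz)
    harmonics = [n for n in range(1, highest + 1, 2) if n * freq_hz < nyquist_hz]

    samples = []
    for i in range(num_samples):
        t = i / sample_rate_hz
        samples.append(
            sum(math.sin(2.0 * math.pi * n * freq_hz * t) / n for n in harmonics)
        )
    return harmonics, samples


harmonics_44k, wave_44k = bandlimited_square(7000.0, 44100.0, 63)
harmonics_192k, wave_192k = bandlimited_square(7000.0, 192000.0, 192)
print("44.1 kHz uses harmonics:", harmonics_44k)
print("192 kHz uses harmonics: ", harmonics_192k)
```

At 44.1 kHz only the fundamental and the third harmonic fit below Nyquist, which is why the result does not look square in an editor.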
  • Last Edit: 27 July, 2009, 09:57:01 PM by Axon

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #11
Yes, I asked myself why Adobe would even include such a broken feature. It is very obviously broken and they should have seen that. Sox' results look fine to me now, though. If not, I'm open for feedback.

Such techniques are well documented - it's quite a shame that so many audio applications (and Dr. Kunchur, and other audiophiles) continue to use bad code.


I thought that Kunchur had used an analog square wave generator?
  • Last Edit: 27 July, 2009, 10:15:47 PM by rpp3po

  • saratoga
  • [*][*][*][*][*]
Interesting Papers re temporal resolution
Reply #12
I only skimmed the paper, but IIRC he tried a digital one, couldn't get it to work (for unspecified reasons), and then used an analog one.  I presumed it was because he had a high end analog synthesizer handy (they're pretty common in labs since people used them for all sorts of stuff in the days before cheap digital DAQs).

  • ExUser
  • [*][*][*][*][*]
  • Read-only
Interesting Papers re temporal resolution
Reply #13
Now the WAV files won't load in foobar2000...

Edit: Hacked around with them, got them loading in foobar2000, but all of them have subharmonics well under 7000Hz. At first I thought I was just hearing some weird IMD, but the subharmonics are there.

Edit 2: Synthesized my own versions, using Axon's cited additive synthesis technique:
http://benjamincook.ca/441.wav - 44.1kHz square, harmonics at 7k (gain 1) and 21k (gain 1/3).
http://benjamincook.ca/192-441.wav - 44.1kHz square, harmonics at 7k (gain 1) and 21k (gain 1/3), resampled to 192kHz using sox 441.wav 192-441.wav rate -v 192000
http://benjamincook.ca/192.wav - 192kHz square, harmonics at 7k (gain 1), 21k (gain 1/3), 35k (gain 1/5), ..., 91k (gain 1/13)

These really don't look square in any editor, but they should be mathematically-acceptable.  I really can't ABX these. It hurts my ears, and I don't have a DAC that handles 192kHz nicely.

Edit 3: For the curious, this is simply the <math.h> sin function, at 32-bit floating-point precision. Fixed some numbers in Edit 2.
  • Last Edit: 28 July, 2009, 12:17:06 AM by Canar

  • krabapple
  • [*][*][*][*][*]
Interesting Papers re temporal resolution
Reply #14
Nice to see this topic here where it can be discussed calmly and rationally.


I'm wondering why this thread isn't getting more attention. It should be much more original HA territory than the "Why we need audiophiles" juggernaut.



Kunchur's claims were introduced here at HA by moi two weeks ago:

http://www.hydrogenaudio.org/forums/index....mp;#entry646398

and yes, rpp3po , he is saying the Redbook is broken in terms of transparency.

Hence the uproar on Stereophile's forum, where it's delightful to see what fulsome respect the letters 'PhD' can garner from audiophiles when they really want to believe. 
  • Last Edit: 28 July, 2009, 12:11:01 AM by krabapple

  • Woodinville
  • [*][*][*][*][*]
Interesting Papers re temporal resolution
Reply #15
Well, 400kHz aside, if he was anywhere in the vicinity of being right - which I'm not willing to swallow yet - the long-time objectivist argument would be broken:

  • Humans can't hear anything above 20kHz.
  • 44.1kHz sample rate is enough to cover all that completely according to Nyquist.
  • -> The Redbook storage format is completely sufficient for transparency.


Just as an aside, no, that's not the case.

In order to contain the bandwidth of a signal, you have to filter it. It is possible (i.e. it is done with ridiculous filters which I cheerfully stipulate are not useful in any real sense) that filters might have a slight, tiny effect, maybe, kinda sorta, PERHAPS, at 44.1. Even less likely at 48, and not at all at 64.  Nobody has shown this with sensible filters, by which I mean filters that have decent transition bandwidth (i.e. not as tight as humanly possible), ripple, and stopband rejection.

And, of course, if Dr. K's argument is mistaken, that shows nothing, for or against.
-----
J. D. (jj) Johnston

  • rpp3po
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #16
These really don't look square in any editor, but they should be mathematically-acceptable.  I really can't ABX these. It hurts my ears, and I don't have a DAC that handles 192kHz nicely.

Edit 3: For the curious, this is simply the <math.h> sin function, at 32-bit floating-point precision. Fixed some numbers in Edit 2.


Could anybody enlighten me why a sine function, that supposedly outputs something that is not square (can't check - files are offline right now), should be a better approximation of a square wave than successive sequences of -x,-x,-x,-x,-x,-x,-x,-x,+x,+x,+x,+x,+x,+x,+x,+x values and x being a constant?
  • Last Edit: 28 July, 2009, 06:38:45 AM by rpp3po

  • lvqcl
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #17
Could anybody enlighten me why a sine function

Not a sine, but a sum of sines:
Quote
Amplitude(n) = 1/n, odd n; 0, even n
Phase(n) = 0

is 1*sin(f*x) + 1/3*sin(3*f*x) + 1/5*sin(5*f*x) + ... + 1/N*sin(N*f*x), where N*f < 2*pi*Nyquist_frequency.

that supposedly outputs something that is not square (can't check - files are offline right now), should be a better approximation of a square wave than successive sequences of -x,-x,-x,-x,-x,-x,-x,-x,+x,+x,+x,+x,+x,+x,+x,+x values and x being a constant?

Because of aliasing. Take an analog square wave and sample it without lowpassing it at the Nyquist frequency. You'll get that square digital wave; it contains the frequencies below the Nyquist limit plus aliases of the frequencies above it.
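
Where those aliases land can be estimated with a short sketch. It assumes the generator simply samples an ideal 7 kHz square wave at 44.1 kHz with no band-limiting, which may or may not be what any particular tool does. The function name is hypothetical.

```python
def alias_frequency(freq_hz: int, sample_rate_hz: int) -> int:
    """Frequency in [0, fs/2] at which a sampled tone of freq_hz shows up."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


SAMPLE_RATE_HZ = 44100
FUNDAMENTAL_HZ = 7000

# Odd harmonics of the ideal square wave and where each one ends up.
for n in range(1, 16, 2):
    harmonic_hz = n * FUNDAMENTAL_HZ
    landed_hz = alias_frequency(harmonic_hz, SAMPLE_RATE_HZ)
    note = "" if landed_hz == harmonic_hz else "  (alias)"
    print(f"harmonic {n:2d}: {harmonic_hz:6d} Hz -> {landed_hz:5d} Hz{note}")
```

Under this assumption the 7th and 13th harmonics fold down to 4900 Hz and 2800 Hz, both below the 7 kHz fundamental, which would be consistent with the subharmonics reported earlier in the thread.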

  • Nick.C
  • [*][*][*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #18
Why not just create a 7.35kHz square wave "manually", i.e. for 44.1kHz sample rate 6x 32767 followed by 6x -32768 <repeat>; 15x for 110.25kHz and 24x for 176.4kHz?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848| FLAC -5 -e -p -b 512 -P=4096 -S-
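
A sketch of the "manual" construction, with one interpretation flagged as an assumption: the counts 6, 15 and 24 equal the sample rate divided by 7350 Hz, so they are treated here as samples per full period. Six high samples followed by six low samples would span twelve samples, which is 3.675 kHz at 44.1 kHz. The function names are made up for illustration.

```python
def samples_per_period(sample_rate_hz: int, freq_hz: int) -> int:
    """Whole number of samples in one period, or ValueError if it does not divide."""
    count, remainder = divmod(sample_rate_hz, freq_hz)
    if remainder:
        raise ValueError("frequency does not divide the sample rate evenly")
    return count


def manual_square(sample_rate_hz: int, freq_hz: int, periods: int,
                  high: int = 32767, low: int = -32768) -> list:
    """16-bit square wave with equal high and low halves, built sample by sample."""
    period = samples_per_period(sample_rate_hz, freq_hz)
    if period % 2:
        raise ValueError("odd period cannot be split into equal halves")
    half = period // 2
    return ([high] * half + [low] * half) * periods


for rate in (44100, 110250, 176400):
    period = samples_per_period(rate, 7350)
    symmetric = "yes" if period % 2 == 0 else "no"
    print(f"{rate} Hz: {period} samples per period, symmetric halves: {symmetric}")
```

Note that at 110.25 kHz the period is 15 samples, which cannot be split into equal halves.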

  • saratoga
  • [*][*][*][*][*]
Interesting Papers re temporal resolution
Reply #19
Could anybody enlighten me why a sine function, that supposedly outputs something that is not square (can't check - files are offline right now), should be a better approximation of a square wave than successive sequences of -x,-x,-x,-x,-x,-x,-x,-x,+x,+x,+x,+x,+x,+x,+x,+x values and x being a constant?


The sum of sines approach is exact for a band limited square wave (since a band limited square wave is by definition the Fourier series of a non-limited square wave truncated at the band limit).  Flipping between +/- x and then low pass filtering is only an approximation that's limited by the quality of the filtering applied.

  • Axon
  • [*][*][*][*][*]
  • Members (Donating)
Interesting Papers re temporal resolution
Reply #20
Because that places unacceptable restrictions on the desired wavelength.

OK, so, my paper is taking a little too long to get out the door, so I will provide an executive summary here. I am trying as fast as possible to get it out, but the discussion is about to pass me by, soooo...
  • His definition of the power difference at 14kHz, ΔLp(2), is not derived. When one derives it, one discovers that he fails to correctly sum the 14kHz components resulting from the (7+7) and (21-7) terms - in fact, he explicitly treats the calculation of ΔLp(2) as a "peak" level difference, when in fact the summed component quite obviously has a constant level. (That Dr. Kunchur makes such a basic trig error is profoundly disturbing to me.)
  • His definition of ΔLp(2) also uses the theoretical values of Δφ instead of the values he already documented in his own measurements.
  • When these issues are corrected, the values of ΔLp(2) for the RC-filter experiment fall from 1.4db down to the 0.2-0.3db range, and do not materially differ between the 3.9us and 4.7us cases. It therefore becomes extremely difficult to justify the results of the RC filter test due to nonlinear mixing.
  • Paradoxically, Dr. Kunchur never mentions nonlinear mixing as a theoretical justification for his other test (speaker alignment), and in fact explicitly ascribes the results to an unknown cause. But when the corrected ΔLp(2) calculation is run, the values found are even larger than for the RC filter test, and vary quite substantially between test configurations (2.9mm: 0.76db; 6.2mm: 1.71db; 10.3mm: 3.59db). The bizarre conclusion that must be reached is that Dr. Kunchur's analyses are mutually contradictory between his two later papers (Acta Acustica/Technical Acoustics). Either nonlinear mixing is the cause (which means the RC filter results are unexplained), or it's not (which means the RC filter's results are incorrectly justified).
  • When ΔLp(2) is restated in terms of the before/after level difference of the 21khz level, and ignoring the differences in 7khz before/after levels, one finds that ΔLp(2) is strongly correlated with changes in 21khz level. If Dr. Kunchur truly wishes to justify audibility based on nonlinear mixing, and on ΔLp(2) being above a threshold, he must also admit that arbitrarily raising the 21khz level must also raise ΔLp(2) further above threshold, leading to arbitrarily lower measured temporal resolutions.
  • The chosen 7khz square wave is unusually, profoundly rich in ultrasonics: The 21khz level is only 10db below fundamental, where the vast majority of tonal musical instruments have ultrasonic components far below that. That is: the input signal is entirely unrepresentative of actual music.
  • Putting the above two points together yields quite possibly the most important objection to make here: Kunchur's results advocating a 5us temporal resolution are completely meaningless, because the process he uses to generate that result ought to be extended indefinitely to generate arbitrarily low resolutions. These results have no basis in psychoacoustic reality because they rely on signals further and further disconnected from realistic situations.
  • Even if one discounts the nonlinear mixing justification for all of this there are still extremely good reasons to believe that the measured resolutions are inversely proportional to ultrasonic level. For instance, if one adopts a level- or peak-detection model, a signal with faster rise time ought to result in a lower measured resolution - but this faster rise time must be accomplished with increased high frequency components (keeping signal amplitude constant).
  • Furthermore, his choice of 7khz square wave, far from being representative, is very carefully tuned. Any lower of a frequency would result in the 3rd harmonic becoming audible, which would invalidate his listening tests involving the manipulation of that component. Any higher of a frequency would move the 2nd harmonic further up in frequency, where human hearing is that much less sensitive; the risk becomes too great that the component would be made completely inaudible. In other words, results obtained for this type of input cannot be generalized for arbitrary music signals.
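
The first bullet's point about the (7+7) and (21-7) terms can be illustrated numerically. The sketch below is not Dr. Kunchur's ΔLp(2) definition, which is not reproduced in this thread. It assumes a pure square-law term as a stand-in for any weak second-order nonlinearity and measures the 14 kHz product of a 7 kHz tone plus a 21 kHz tone. The analysis rate, window length and function names are arbitrary choices for illustration.

```python
import math

F0_HZ = 7000.0
ANALYSIS_RATE_HZ = 1_400_000.0  # assumption: high enough that no product aliases
NUM_SAMPLES = 2000              # exactly 10 periods of the 7 kHz fundamental


def tone_amplitude(samples, freq_hz, rate_hz):
    """Amplitude of one sinusoidal component, by projection onto cos and sin."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * i / rate_hz)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * i / rate_hz)
             for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


def second_order_14k(third_amp: float, third_phase: float) -> float:
    """14 kHz amplitude of x**2 for x = sin(wt) + third_amp * sin(3wt + phase)."""
    squared = []
    for i in range(NUM_SAMPLES):
        wt = 2.0 * math.pi * F0_HZ * i / ANALYSIS_RATE_HZ
        x = math.sin(wt) + third_amp * math.sin(3.0 * wt + third_phase)
        squared.append(x * x)  # square-law stand-in for the nonlinearity
    return tone_amplitude(squared, 2.0 * F0_HZ, ANALYSIS_RATE_HZ)


for label, amp, phase in (("7 kHz alone", 0.0, 0.0),
                          ("plus 21 kHz at 1/3, in phase", 1.0 / 3.0, 0.0),
                          ("plus 21 kHz at 1/3, inverted", 1.0 / 3.0, math.pi)):
    level = second_order_14k(amp, phase)
    print(f"{label:30s} -> 14 kHz amplitude {level:.4f}")
```

In this model the two contributions add as phasors into a single 14 kHz sinusoid of constant amplitude, and that amplitude moves with both the level and the phase of the 21 kHz component.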

Of course, I'm only listing the comments here that relate to Dr. Kunchur's main thesis - I'm leaving out the rest of the points relating to his comments on 44.1khz digital audio, on signal synthesis, on high-end audio, etc...

ncdrawl, hold off on sending these to Kunchur just yet - I'll be able to give you/him a nicely typeset LaTeX file pretty soon with all of these points more fully fleshed out.

  • ExUser
  • [*][*][*][*][*]
  • Read-only
Interesting Papers re temporal resolution
Reply #21
If I may grossly oversimplify your argument Axon, you're partially arguing that he's choosing an extreme edge case to test. However, if his intent is to map the boundaries of audibility, wouldn't an edge case be acceptable? I find the conclusion that ultrasonics are perceptible fascinating, and if he's found a case in which they actually are audible, should we not hear it out? Even though it does not represent most cases, if they are truly audible in this case, isn't that worth considering?

As an archivist, I want transparency in all cases, so I don't have to worry about the edge cases. That's why I use FLAC and not MP3. If there is any case where 44.1kHz is not sufficient, isn't that worth devising solutions for?

  • Axon
  • [*][*][*][*][*]
  • Members (Donating)
Interesting Papers re temporal resolution
Reply #22
If I may grossly oversimplify your argument Axon, you're partially arguing that he's choosing an extreme edge case to test. However, if his intent is to map the boundaries of audibility, wouldn't an edge case be acceptable? I find the conclusion that ultrasonics are perceptible fascinating, and if he's found a case in which they actually are audible, should we not hear it out? Even though it does not represent most cases, if they are truly audible in this case, isn't that worth considering?

As an archivist, I want transparency in all cases, so I don't have to worry about the edge cases. That's why I use FLAC and not MP3. If there is any case where 44.1kHz is not sufficient, isn't that worth devising solutions for?


The use of a 7khz square wave as an input here, in this context, seems particularly unrepresentative to me, as a -10db ultrasonic third harmonic, with a signal completely absent of energy at 14khz from other sources, is a profoundly special case. It's not merely that it's an edge case - it is way, way over the edge to begin with. It's like arguing that 16 bits is insufficient because you can hear the noise with the gain raised ~20db above normal (as even shown by Meyer/Moran). Of course you can - but that situation never actually happens in the real world, where music is normalized near 0dbFS and released for an audience that actually wishes to listen to it. More generally, Kunchur never really justifies that input signal very well, and without careful delineation, nothing's stopping anybody from boosting 21khz levels arbitrarily high to get arbitrarily low measured thresholds (like with, for instance, a bipolar pulse train).
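
For reference, the "-10db" figure for the third harmonic follows from the 1/n amplitude rule of an ideal square wave. A minimal sketch (the function name is made up):

```python
import math


def square_harmonic_level_db(n: int) -> float:
    """Level of the n-th harmonic of an ideal square wave relative to the fundamental."""
    if n < 1 or n % 2 == 0:
        raise ValueError("an ideal square wave only has odd harmonics")
    return 20.0 * math.log10(1.0 / n)


for n in (1, 3, 5):
    print(f"harmonic {n} ({n * 7} kHz): {square_harmonic_level_db(n):.2f} dB")
```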

In the final reductio ad absurdum, it's hard to tell his conclusions apart from a claim that (say) 200khz bandwidth is necessary for audio, because if you play extremely powerful 200khz and 202khz tones, the inevitable intermodulation is audible. The existence of any form of intermodulation, combined with the existence of an ultrasonic bandwidth, necessarily implies that some classes of signals will show audible differences when filtered before distortion. Moreover, this audibility will exist at any amount of filtering greater than zero, because I'll always be able to hand you a signal that will break threshold at the intermodulation frequency.

A test with ultrasonic content at ranges more representative of real situations would restore validity, but I think that is not going to save his conclusions. In that case, if audibility is shown in the first place, it will almost certainly be above 22us. And at that point it no longer has anything to do with time resolution. But it would be a convincing proof of CD's insufficiency - but before that point is reached, the question is, how would that be possible when every prior attempt has failed?
  • Last Edit: 28 July, 2009, 03:18:27 PM by Axon

  • saratoga
  • [*][*][*][*][*]
Interesting Papers re temporal resolution
Reply #23
As an archivist, I want transparency in all cases, so I don't have to worry about the edge cases. That's why I use FLAC and not MP3. If there is any case where 44.1kHz is not sufficient, isn't that worth devising solutions for?


I think his results are really interesting, but to an archivist, they're not relevant until they're shown to apply to something approaching actual audio.  After all if you just want to store square waves, you shouldn't be using PCM in the first place because of its nasty requirement that signals be band limited . . .

  • NullC
  • [*][*][*]
  • Developer
Interesting Papers re temporal resolution
Reply #24
I think his results are really interesting, but to an archivist, they're not relevant until they're shown to apply to something approaching actual audio.  After all if you just want to store square waves, you shouldn't be using PCM in the first place because of its nasty requirement that signals be band limited . . .


…Because other sampling methods don't require band-limited signals?

For archival purposes there is a decent argument for going beyond redbook "just in case" ... Perhaps the lizard people who will take over the earth after we nuke ourselves will have decent ultra-sonic hearing and want the full experience?  It's not like the behaviours of professional archivists have much relationship to the behaviour of normal people anywhere else.  (Or will you be micro-scribing my message onto a nickel plate?)