
temporal issue for absolute threshold in quiet

The psychoacoustic model in a modern perceptual audio coder identifies the frequency components that are energetically masked, or that fall below the absolute threshold of hearing, and are therefore imperceptible. However, it mainly considers simultaneous effects and does not go deeply into those in the temporal domain. Temporal processing needs additional memory to store the necessary spectral data and increases control complexity and latency. Some algorithms implement pre-echo control and backward and forward temporal masking, but none links the absolute threshold in quiet to temporal processing.

Evidence that the absolute threshold is lower for longer stimuli is presented in

“A unifying basis of auditory thresholds based on temporal summation”
P Heil, H Neubauer - PNAS, 2003 - pnas.org
http://scholar.google.com/url?sa=U&q=http:...s%3B100/10/6151

This paper argues that the first-spike latency following stimulus onset contributes to the physiological substrate of the absolute auditory threshold. The latency is a function of both the amplitude and the onset duration of the stimulus. This temporal integration occurs at the inner hair cell, where presynaptic calcium accumulates and is cleared.

A computer model of the auditory periphery is described that has realistic first-spike latency properties in

Ray Meddis, “Auditory-nerve first-spike latency and auditory absolute threshold: A computer model”, JASA 119 (1), Jan 2006.

Its firing thresholds decline with increasing stimulus duration in a manner similar to the psychophysical observations mentioned above. The model is a cascade of two stages: the auditory periphery and a cochlear nucleus chopper neuron. However, it is far from real-time on any current desktop computer. Computation at auditory-nerve resolution is quite expensive, especially for use in an audio compression engine.

If we draw an analogy between the MPEG psychoacoustic model and the model above, the mapping from the spectral domain (FFT or MDCT) to the Bark scale is similar to auditory periphery processing, and the spreading convolution with some post-compensation (where the absolute threshold in quiet is applied) is like the integration in the cochlear nucleus. Say we attach a quiet-reservoir to each Bark band that accumulates the incoming signal and drains whatever has faded over time. A set of thresholds, distributed tonotopically, is then applied to the quiet-reservoirs to decide perceptibility. With this temporal implementation, we can still preserve long-duration, low-magnitude components.

There is no clear evidence that this mismatch in the absolute threshold in quiet has caused any detrimental effect on quality. It might not be worth devoting more thought to identifying thresholds for the quiet-reservoirs and increasing the computational load of the psychoacoustic model. On the other hand, the dominance of masking tones and noise is derived only from simultaneous accumulation over neighboring bands. If the same quiet-reservoir mechanism were introduced there, the computed masking ability could decrease or increase once both time and level are taken into account.
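To make the quiet-reservoir idea concrete, here is a minimal sketch of one leaky accumulator per Bark band with a per-band threshold. Everything in it is an assumption made for illustration: the class name `QuietReservoir`, the `leak` factor, the linear energy units, and the threshold value of 1.5 are not taken from the papers above or from any encoder.

```python
class QuietReservoir:
    """Leaky integrator per Bark band (hypothetical helper, not from any codec)."""

    def __init__(self, n_bands, leak=0.5):
        # leak: fraction of the stored energy that survives to the next frame
        self.leak = leak
        self.level = [0.0] * n_bands

    def update(self, band_energy):
        """Drain the old content, then add this frame's energy in each band."""
        self.level = [self.leak * old + new
                      for old, new in zip(self.level, band_energy)]
        return self.level

    def audible(self, thresholds):
        """Per-band decision: has the reservoir filled past its threshold?"""
        return [lvl >= thr for lvl, thr in zip(self.level, thresholds)]


def run_demo():
    reservoir = QuietReservoir(n_bands=2, leak=0.5)
    thresholds = [1.5, 1.5]  # illustrative values, arbitrary linear units
    # Band 0: faint steady tone for three frames, then silence.
    # Band 1: a single-frame blip of the same energy.
    frames = [[1.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
    decisions = []
    for frame in frames:
        reservoir.update(frame)
        decisions.append(reservoir.audible(thresholds))
    return decisions
```

In this toy run the sustained tone in band 0 crosses the threshold on the second frame and drops out again once it stops, while the one-frame blip in band 1 never does. A static per-frame comparison against 1.5 would have rejected both.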

Psychoacoustic models for audio encoders pay considerable attention to energetic masking but ignore informational masking, which is receiving growing attention in speech technology. Masking is explained as a consequence of the physiological behavior of the inner hair cell. Maybe we should rename the psychoacoustic model the physioacoustic model until informational masking is integrated.

Reply #1
Interesting reading. However, it does not suggest that the ATH level is adjusted only locally according to the integration of sound level. What this paper describes is that the ATH level slowly decreases when a low-amplitude tone is played in bursts. The problem is that we do not know what the results would have been with bursts of low-amplitude tones at different frequencies.

From an evolutionary perspective, it would make sense that it works better when the tones keep the same frequency, and this is backed up by the physiological experiments.

In LAME we adapt the ATH level based on sound loudness, i.e. a low-amplitude sound in isolation will become audible after a few ATH adaptation periods. However, we adapt the ATH level independently of the "frequency steadiness" of the sound. When presented with noise, we also adjust the ATH level, which might be an error.
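As a rough illustration of loudness-driven ATH adaptation in general, here is a small sketch. It is not LAME's code: the function names, the 2 dB step per frame, the -20 dB floor and the 0 dB reference loudness are all hypothetical values chosen for the example. Like the behavior described above, it looks only at overall loudness and ignores the frequency steadiness of the sound.

```python
def adapt_ath_offset(offset_db, frame_loudness_db, ref_loudness_db=0.0,
                     floor_db=-20.0, step_db=2.0):
    """Move the ATH offset one step toward a loudness-dependent target.

    All parameters are illustrative assumptions, not LAME settings.
    """
    # Quiet frames pull the target down (more sensitive), loud frames up to 0.
    target = max(floor_db, min(0.0, frame_loudness_db - ref_loudness_db))
    if offset_db > target:
        return max(target, offset_db - step_db)
    return min(target, offset_db + step_db)


def frames_until_audible(tone_db_re_ath, frame_loudness_db, max_frames=100):
    """Count frames until a tone below the static ATH clears the adapted ATH."""
    offset = 0.0
    for n in range(1, max_frames + 1):
        offset = adapt_ath_offset(offset, frame_loudness_db)
        if tone_db_re_ath >= offset:
            return n
    return None  # never becomes audible within max_frames
```

With these made-up numbers, a tone 10 dB below the static ATH in an otherwise quiet passage is kept after a handful of adaptation steps, while a tone below the floor of the adaptation range never is.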

Reply #2
Copied from Meddis's 2006 JASA paper:

"When spiking activity is aggregated across a number of similar high spontaneous-rate fibers and used as the input to a model of a cochlear nucleus coincidence neuron, its response can be used to judge whether or not a stimulus is present."

Simulated cochlear nucleus models with different parameters for the accumulation and clearance of calcium are sensitive to tones of different frequencies, and their experimental responses match animal data. This is for your reference regarding the "locally adjusted" argument.

 

Reply #3
Quote
Copied from Meddis's 2006 JASA paper:

"When spiking activity is aggregated across a number of similar high spontaneous-rate fibers and used as the input to a model of a cochlear nucleus coincidence neuron, its response can be used to judge whether or not a stimulus is present."

Simulated cochlear nucleus models with different parameters for the accumulation and clearance of calcium are sensitive to tones of different frequencies, and their experimental responses match animal data. This is for your reference regarding the "locally adjusted" argument.


Wow! Technical stuff. Unfortunately, my knowledge of psychoacoustic theory is quite elementary.