HydrogenAudio

Hydrogenaudio Forum => Scientific Discussion => Topic started by: ncdrawl on 2013-02-09 14:20:50

Title: Human hearing beats FFT
Post by: ncdrawl on 2013-02-09 14:20:50
http://phys.org/news/2013-02-human-fourier...-principle.html (http://phys.org/news/2013-02-human-fourier-uncertainty-principle.html)
Title: Human hearing beats FFT
Post by: probedb on 2013-02-09 14:51:40
http://phys.org/news/2013-02-human-fourier...-principle.html (http://phys.org/news/2013-02-human-fourier-uncertainty-principle.html)


A summary of the article would be useful.
Title: Human hearing beats FFT
Post by: ncdrawl on 2013-02-09 15:02:15
http://phys.org/news/2013-02-human-fourier...-principle.html (http://phys.org/news/2013-02-human-fourier-uncertainty-principle.html)


A summary of the article would be useful.



erm..

it is right in the first paragraph
For the first time, physicists have found that humans can discriminate a sound's frequency (related to a note's pitch) and timing (whether a note comes before or after another note) more than 10 times better than the limit imposed by the Fourier uncertainty principle. Not surprisingly, some of the subjects with the best listening precision were musicians, but even non-musicians could exceed the uncertainty limit. The results rule out the majority of auditory processing brain algorithms that have been proposed, since only a few models can match this impressive human performance.

Read more at: http://phys.org/news/2013-02-human-fourier...nciple.html#jCp (http://phys.org/news/2013-02-human-fourier-uncertainty-principle.html#jCp)
Title: Human hearing beats FFT
Post by: lvqcl on 2013-02-09 15:10:04
From the comments:

Quote
The Fourier uncertainty principle doesn't set a limit on how accurately, say, the centre of a gaussian wavepacket, and the central frequency of a wavepacket, can be simultaneously be determined. There's no theorem in mathematics that says this is so and this experiment clearly demonstrates there is no such limit. So the title is a bit misleading.
Title: Human hearing beats FFT
Post by: greynol on 2013-02-09 17:00:59
That title?  What about the title of this discussion?!?
Title: Human hearing beats FFT
Post by: Alexey Lukin on 2013-02-09 19:13:21
Love the title
Title: Human hearing beats FFT
Post by: greynol on 2013-02-09 19:17:53
...with a blunt stick.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-10 03:41:51
And the Gabor limit does not address known frequencies, either. Geez.
Title: Human hearing beats FFT
Post by: Kees de Visser on 2013-02-10 07:31:36
And the Gabor limit does not address known frequencies, either. Geez.
JJ, does this study add anything new to the current understanding of how our auditory system works ?
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-10 11:51:14
And the Gabor limit does not address known frequencies, either. Geez.
JJ, does this study add anything new to the current understanding of how our auditory system works ?


It does show, again, the value of extensive training. It settles what had been claimed anecdotally, which I suppose is some new understanding. It does not break the Gabor bound or do anything astounding; however, it does confirm something that has been pretty much taken at face value.
Title: Human hearing beats FFT
Post by: Porcus on 2013-02-10 16:15:23
does this study add anything new to the current understanding of how our auditory system works ?


The authors themselves give a yes-and-no answer to whether they are surprised, the “no” part due to indications from as far back as the 1970s that the auditory system does not work the way it would if the hypotheses of the uncertainty principle were applicable.  This does not answer the “anything” part of your question, but the authors certainly acknowledge that it isn't unheard of.
Title: Human hearing beats FFT
Post by: probedb on 2013-02-11 08:26:26
it is right in the first paragraph
For the first time, physicists have found that humans can discriminate a sound's frequency (related to a note's pitch) and timing (whether a note comes before or after another note) more than 10 times better than the limit imposed by the Fourier uncertainty principle. Not surprisingly, some of the subjects with the best listening precision were musicians, but even non-musicians could exceed the uncertainty limit. The results rule out the majority of auditory processing brain algorithms that have been proposed, since only a few models can match this impressive human performance.

Read more at: http://phys.org/news/2013-02-human-fourier...nciple.html#jCp (http://phys.org/news/2013-02-human-fourier-uncertainty-principle.html#jCp)


Yes, but posting a link to another web page is not a thread, is it? I don't want to go to another website just to find out what your post is about.
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-11 09:25:17
Is it worth spending $25 to read this in full?

This kind of thing fascinates me, though I think this is beyond the limits of my understanding.

I understand time/frequency uncertainty, but I think you have to draw a distinction between knowing exactly what's there (in an absolute analytical way), and knowing something is different from something else without knowing (in an absolute analytical way) what either actually is. To put it another way, the FFT has an analytical time/frequency limit, but it can be lossless and reversible - meaning that any differences are preserved and therefore detectable. e.g. A computer can do a mathematical ABX test on the output of an FFT just as well as on the input audio signal. It will get both right every time. The FFT does not force you to lose information.
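The reversibility point can be checked numerically. What follows is a minimal sketch, assuming a plain textbook DFT written in pure Python rather than any particular FFT library; the test signals and all names are made up for illustration. It shows that a transform and its inverse return the input to within rounding error, and that two slightly different inputs still differ after the transform.

```python
import cmath
import math

def dft(x):
    """Textbook O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

# Hypothetical test signal: a click followed by a short tone burst.
signal = [0.0] * 32
signal[5] = 1.0
for t in range(16, 32):
    signal[t] += 0.5 * math.sin(2 * math.pi * 0.23 * t)

# Same signal with the click moved by one sample.
other = list(signal)
other[5], other[6] = 0.0, 1.0

# Round trip: forward then inverse transform gives the input back.
recovered = [v.real for v in idft(dft(signal))]
max_error = max(abs(a - b) for a, b in zip(signal, recovered))

# "Mathematical ABX": the difference is still present in the transform domain.
spectral_difference = max(abs(a - b) for a, b in zip(dft(signal), dft(other)))

print(max_error, spectral_difference)
```

The block length only changes how the information is laid out across bins, not whether it is there.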

In the ear, it is one thing to be able to hear an arbitrary signal with no prior knowledge and say "ah, that's three notes, the second one is higher than the first, the third one is really high, and oh wow all their durations are so short and they're so close together that I shouldn't be able to tell that". It's quite another to listen to lots of signals of a certain pattern, learn their sound, and then be able to identify another of that pattern.

Training gives you that latter ability.


I know the ear doesn't transduce sound using an FFT. But I think you have to be really careful saying "this proves it's not subject to the limitations of an FFT, because just an FFT couldn't do this" when, in fact, an FFT followed by a clever computer (brain anyone?) could do pretty much anything.


I'd be interested to hear more from people who understand this better. Have you got any relevant slides JJ?

Cheers,
David.
Title: Human hearing beats FFT
Post by: Kees de Visser on 2013-02-11 09:56:11
I just stumbled upon this blog. Are we heading towards another "Kunchur (http://www.hydrogenaudio.org/forums/index.php?showtopic=73598)" discussion ?
http://tapeop.com/blog/2013/02/06/human-he...inty-principle/ (http://tapeop.com/blog/2013/02/06/human-hearing-beats-fourier-uncertainty-principle/)

Quote
More pointedly: until scientists devise and conduct more tests like this one, we may need to continue with a skeptical stance toward the application of mathamatical formulas and AB/ABX testing as the end-all of our windows into human perception.
It's been a long time (80's) since I learned auditory basics at (rec.eng.) school. Lots of Blauert. But even the simplified models always emphasised the importance of time information. I still fail to see what's new in this study.
Title: Human hearing beats FFT
Post by: Porcus on 2013-02-11 10:46:03
Quote
More pointedly: until scientists devise and conduct more tests like this one, we may need to continue with a skeptical stance toward the application of mathamatical formulas and AB/ABX testing as the end-all of our windows into human perception.

Impressive ...
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-11 11:21:34
Quote
More pointedly: until scientists devise and conduct more tests like this one, we may need to continue with a skeptical stance toward the application of mathamatical formulas and AB/ABX testing as the end-all of our windows into human perception.

Impressive ...
Especially as psychoacoustic tests, by definition, are blind tests.

Can you imagine a sighted version?
Tester: "OK, I'm going to play you three very short notes - a low one, a slightly higher one, and a very high one, in that order."
beep beep beep
Tester: "Did you hear what I just described?"
Test subject: "Yes"
Tester: "It's amazing - your hearing is far better than humanly possible"



Cheers,
David.
Title: Human hearing beats FFT
Post by: db1989 on 2013-02-11 15:39:29
Given that the writer misspelled mathematical and uncertainty, I think it’d be fairly safe to ignore whatever they say.
Title: Human hearing beats FFT
Post by: greynol on 2013-02-11 15:48:44
Would someone mind correlating the article with the concept of the FFT?  The original poster, perhaps?
Title: Human hearing beats FFT
Post by: ojdo on 2013-02-11 16:13:38
Is it worth spending $25 to read this in full?


Access for free here: arxiv.org/pdf/1208.4611.pdf (http://arxiv.org/pdf/1208.4611.pdf)
(Link found in a comment by user AlexDSP in the otherwise highly off-topic discussion below the phys.org article.)
Title: Human hearing beats FFT
Post by: benski on 2013-02-11 17:32:32
Off the top of my head.  I might be wrong.  But part of the time/frequency uncertainty for FFT is the requirement to save the phase information so that it is reversible.  If we replaced the FFT with a bank of IIR bandpass filters, would we not be able to measure frequency magnitude per bin as well as timing information?  But this would be at the expense of losing precise phase information.
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-11 18:02:43
By taking an FFT, you are performing an (approximate, with issues) time/frequency analysis. You can picture it as a time/frequency grid. Click on "spectral view" in your favourite audio editor, and there it is (but only the amplitude is visible; FFT phase is vital too).

If you use a short FFT, you have fine time resolution, but wide frequency bins.
If you use a longer FFT, you have fine frequency resolution, but longer time bins.

e.g. sampling at 44.1kHz:
a 64-point FFT gives you blocks 1.5ms apart, and 689Hz apart.
a 1024-point FFT gives you blocks 23ms apart, and 43Hz apart.
a 65536-point FFT gives you blocks 1.5 seconds apart, and 0.7Hz apart.
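These figures follow from two divisions. As a small sketch, assuming non-overlapping blocks and a 44.1 kHz sample rate (the function name is hypothetical), the block duration is N / fs and the bin spacing is fs / N, so their product is always 1 regardless of N:

```python
def fft_grid(n_points, sample_rate=44100.0):
    """Return (block duration in seconds, bin spacing in Hz) for an N-point FFT."""
    block_seconds = n_points / sample_rate
    bin_hz = sample_rate / n_points
    return block_seconds, bin_hz

for n in (64, 1024, 65536):
    block_s, bin_hz = fft_grid(n)
    # The time step times the frequency step is 1 for every block size.
    print(f"{n:>6}-point FFT: {block_s * 1000:.1f} ms blocks, "
          f"{bin_hz:.1f} Hz bins, product {block_s * bin_hz:.3f}")
```

Overlapping the blocks would give more frequent time steps, but each block would still span the same duration.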

The analysis limit claimed for FFT in that paper (and derived right at the start), is 4*pi better than just hitting the nearest block in this spectral grid would give you.

The human auditory system is doing 10x better than this.

The human auditory system is stunningly good at picking out the start of sharp-onset notes - it amplifies the onset greatly.
IIRC the human auditory system doesn't use the (fairly poor) frequency resolution of its filterbank to determine pitch. There are a couple of different models of how pitch perception probably works, but counting cycles and/or tracking harmonics are the simplest (not entirely accurate) ones. They barely need the filter bank at all.

I'm hoping someone will explain this better though.

Cheers,
David.
Title: Human hearing beats FFT
Post by: Ethan Winer on 2013-02-11 22:05:25
Given that the writer misspelled mathematical and uncertainty, I think it’d be fairly safe to ignore whatever they say.

There are plenty of other reasons to ignore what Allen says. If you follow the link in his current blog to his older Neil Young blog (in the 3rd paragraph, hard to spot), you'll see enough belief to choke a religious cult.

This morning a poster in Lynn Fuston's 3dAudio forum summarized it nicely:

Quote
Posted by Andreas Lassak: So, I'm not saying that the headline of the article in question ("Human hearing beats the Fourier uncertainty principle") is wrong, but it could easily be interpreted the wrong way. The problem is, that people, who have no clue about the background of this study, may only be able to understand or, in the worst case, may only be willing to read the headline. They may read it like "New finding about human hearing renders current scientific knowledge worthless" or "Human hearing is much more accurate than previously assumed".

IMO this exactly describes that Tape Op blog post. All wishful thinking, with no evidence or even a credible theory.

Edit: BTW, I've seen this article put forth in at least two hi-fi forums as evidence that previous knowledge about audio and hearing is now proven wrong. Sigh.

--Ethan
Title: Human hearing beats FFT
Post by: krabapple on 2013-02-12 06:18:37
Here's what the audiophilosphere reliably takes away from articles like this, regardless of the actual content:

-- see, sighted claims of difference are valid, and blind testing is unnatural and misleading
-- see, we do need higher sample rates and wordlengths
-- see, analog is better. Especially vinyl.
Title: Human hearing beats FFT
Post by: knutinh on 2013-02-12 08:59:03
Is not some wavelet/filterbank transform more relevant than the FFT for comparing with human hearing? If the time-frequency space is sparse, would it not be possible to guess at time/frequency speculatively (incorporating knowledge about the waveform), and produce better results than a general approach could ever do?

I attended a speech by Richard Lyon on models of human hearing. He was very critical of our tendency to approximate it using linear systems, when non-linear components seem to be integral to the whole thing.

-k
Title: Human hearing beats FFT
Post by: scuttle on 2013-02-17 16:33:32
I thought it would be fun and informative to poke Allen by posting a comment:

Quote
>>> but I can see that this study shows that the assumed limits of human auditory perception as figured by the Fourier Uncertainty Principle were too narrow - especially when expert listeners (a pro musician and an electronic music producer) are tested<<<

What limits were so "figured"??? Very few people did believe that human auditory processing was FFT based, so these experiments are just confirming mainstream opinion. And what limits relevant to an ABX test would ever have been affected???

It really sounds to me as if you are using something you don't understand as an excuse to justify believing something that you want to believe is true....
Title: Human hearing beats FFT
Post by: Willakan on 2013-02-18 19:03:23
Good God, I have been reading/getting PMed on other sites with a variety of ridiculous stuff about this article. So far, it apparently disproves sampling theorem (!) and, by an utterly incredible chain of logic, renders all existing measurement techniques worthless, either because there are ENORMOUS DISTORTIONS HIDING INSIDE THE FOURIERS or because...well...human hearing is nonlinear...erm...therefore all linear measurements are stupid and wrong...therefore tube amps. Or something.
Title: Human hearing beats FFT
Post by: Porcus on 2013-02-18 19:22:28
or because...well...human hearing is nonlinear...erm...therefore all linear measurements are stupid and wrong...therefore tube amps. Or something.


That one was cute.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-18 23:44:39
Good God, I have been reading/getting PMed on other sites with a variety of ridiculous stuff about this article. So far, it apparently disproves sampling theorem (!) and, by an utterly incredible chain of logic, renders all existing measurement techniques worthless, either because there are ENORMOUS DISTORTIONS HIDING INSIDE THE FOURIERS or because...well...human hearing is nonlinear...erm...therefore all linear measurements are stupid and wrong...therefore tube amps. Or something.



Yeah, me too, intercourse-it.
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-19 09:53:51
It's amazing what people conclude, given that the experiment will have been carried out with digital audio signals (not analogue signal generators, and certainly not vinyl!), and the extreme "10x better than FFT" test clips will happily survive mp3 encoding.

Cheers,
David.
Title: Human hearing beats FFT
Post by: lvqcl on 2013-02-19 14:51:05
I didn't find any mention of FFT in this article. Only "Fourier uncertainty principle" and "uncertainty limit"
Title: Human hearing beats FFT
Post by: ExUser on 2013-02-19 15:46:11
The first thing that comes to my mind is: Now how do we design some frequency transform that provides better results than human hearing? I honestly don't know. FFT has been the de-facto frequency transform in my head for far too long. My own attempt to hack around with wavelets never gave me better time/frequency resolution than your typical STFT. What other options do we have?
Title: Human hearing beats FFT
Post by: greynol on 2013-02-19 15:56:39
I didn't find any mention of FFT in this article. Only "Fourier uncertainty principle" and "uncertainty limit"

I tried this already.  I guess I'm not the only one wondering how 10x better than "FFT" can survive going through an FFT process. 

EDIT: added scary quotes. I don't wonder how a reversible process can satisfy the requirements of a non-linear system.
Title: Human hearing beats FFT
Post by: Garf on 2013-02-19 16:27:08
Is not some wavelet/filterbank transform more relevant than the FFT for comparing with human hearing?


Yes, there's no reason to limit yourself to FFT. The most advanced psymodels don't use them for exactly that reason; they use QMF filterbanks or similar. (This already implies that what's in that article isn't as shocking as you'd think.)
Title: Human hearing beats FFT
Post by: Garf on 2013-02-19 16:32:40
The first thing that comes to my mind is: Now how do we design some frequency transform that provides better results than human hearing? I honestly don't know. FFT has been the de-facto frequency transform in my head for far too long. My own attempt to hack around with wavelets never gave me better time/frequency resolution than your typical STFT. What other options do we have?


Parallel bandpass filters (PEAQ Advanced). More accurate, very slow.
Wavelets on the MDCT coefficients (Opus). Fast, can switch the T/F tradeoff depending on the signal.

The ear works more like the parallel filters setup.
Title: Human hearing beats FFT
Post by: jmvalin on 2013-02-19 20:07:09
Yes, there's no reason to limit yourself to FFT. The most advanced psymodels don't use them exactly because of that reason, they use QMF filterbanks or similar. (This already implies that what's in that article isn't so shocking as you'd think)


FFTs, MDCTs, QMFs and other filter banks are all fundamentally bound by the uncertainty principle: the product of the frequency resolution and time resolution cannot be smaller than 1. This is the case for any non-parametric model/transform, i.e. when you don't make any particular assumptions about your signal. There are however parametric models one can use. The best example is a model where you directly fit sinusoids of arbitrary frequencies (as opposed to Fourier, which uses sinusoids of predetermined frequencies). With such a model, the resolution is only limited by practical concerns like noise, other sinusoids, and modulation effects. As a trivial example, if you give me three samples and promise that they represent only a single sinusoid (no noise or modulation), then I can calculate the exact frequency of that sinusoid. So in theory, sinusoidal modeling solves all the time-freq issues of the FFT. The only problem is that it's damn hard to use, especially when it comes to having a good enough analysis. And that's why we don't have any high-quality sinusoidal-based audio codecs.
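The three-sample example can be sketched as follows. This is one standard way to do it, not necessarily the method the poster had in mind: for a noiseless sinusoid x[n] = A sin(w n + p), consecutive samples satisfy x[n-1] + x[n+1] = 2 cos(w) x[n], so w follows from an arccosine. The function name and test values are hypothetical, and the sketch assumes the frequency is below Nyquist and the middle sample is non-zero.

```python
import math

def three_sample_frequency(x0, x1, x2, sample_rate):
    """Frequency in Hz of a single noiseless sinusoid from three consecutive samples."""
    if x1 == 0.0:
        raise ValueError("middle sample is zero; use a different triple of samples")
    # x0 + x2 = 2 * cos(w) * x1, where w is the angular frequency per sample.
    c = (x0 + x2) / (2.0 * x1)
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.acos(c) * sample_rate / (2.0 * math.pi)

# Hypothetical test tone: arbitrary amplitude and phase, neither is needed to find the frequency.
fs = 44100.0
true_hz = 1234.5
amp, phase = 0.8, 0.3
samples = [amp * math.sin(2.0 * math.pi * true_hz * n / fs + phase) for n in range(3)]

estimate = three_sample_frequency(samples[0], samples[1], samples[2], fs)
print(estimate)
```

Three samples span about 68 microseconds at this rate, far too short for any transform bin to resolve the tone, yet the single-sinusoid assumption makes the frequency recoverable. Any noise or a second component breaks the identity, which is the practical difficulty described above.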
Title: Human hearing beats FFT
Post by: Paulhoff on 2013-02-20 18:51:51
For those in the know..........

The Princess and the Pea.

That sums it all up.

Paul
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-21 03:55:56
It's still a confused headline. Recognizing one of a set of different sine waves is not limited by the Gabor limit.

Observing that that is not limited by the Gabor limit is like observing that white is not limited by aircraft.
Title: Human hearing beats FFT
Post by: Garf on 2013-02-21 06:54:09
It's still a confused headline.... is like observing that white is not limited by aircraft.


I'm willing to rename this thread to "white is not limited by aircraft" but I don't think it'll make things better
Title: Human hearing beats FFT
Post by: dhromed on 2013-02-21 09:02:10
Yeah, we don't want to turn HA into an extension of horse_ebooks.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-22 03:22:59
It's still a confused headline.... is like observing that white is not limited by aircraft.


I'm willing to rename this thread to "white is not limited by aircraft" but I don't think it'll make things better


No better. Just as meaningful.
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-25 12:14:41
Never mind the title, I still don't find a satisfactory answer in this thread.

I understand that the human ear uses a wobbling membrane as something like a filter bank, with a number of non-linear processes, and an amazing analysis of the signals coming from it, to deliver the hearing capacities that we can probe in listening tests and experience every day. I understand that this is nothing like an FFT. I understand that the frequency resolution of masked noise is not that critical, so we use FFTs in codecs in a place where their frequency resolution is far over-specified, rather than being an issue.

However, we often describe other things in audio and hearing with an FFT-like model. It crops up in sampling theory. We push all the audio through a comparable filterbank in most lossy codecs. It is true that these transforms are mathematically lossless/reversible - but if we're messing with things in the other domain, this is little comfort.

So, simply, what is the reason that this is OK?

Cheers,
David.
Title: Human hearing beats FFT
Post by: knutinh on 2013-02-25 14:32:44
Never mind the title, I still don't find a satisfactory answer in this thread.

I understand that the human ear uses a wobbling membrane as something like a filter bank, with a number of non-linear processes, and an amazing analysis of the signals coming from it, to deliver the hearing capacities that we can probe in listening tests and experience every day. I understand that this is nothing like an FFT. I understand that the frequency resolution of masked noise is not that critical, so we use FFTs in codecs in a place where their frequency resolution is far over-specified, rather than being an issue.

However, we often describe other things in audio and hearing with an FFT-like model. It crops up in sampling theory. We push all the audio through a comparable filterbank in most lossy codecs. It is true that these transforms are mathematically lossless/reversible - but if we're messing with things in the other domain, this is little comfort.

So, simply, what is the reason that this is OK?

Cheers,
David.

I guess the switching between two different time/frequency resolution transforms in many lossy codecs is a sort of "ad hoc" fix for not doing a proper modelling of our hearing apparatus?

Not all audio processing/transmission may need to include an accurate model of our hearing. Perhaps a crude STFT is simply sufficient for some applications.

So what if we devised an insanely complex, irregular, nonlinear filterbank (Volterra filterbank?). What could it be used for? Better lossy coding? (I think that there are other tradeoffs in lossy coding as well, such as signal compaction). Could we make better "frequency analyzers"? (what engineers would be able to interpret the plots from such a device?).

-k
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-25 16:24:50
I guess the switching between two different time/frequency resolution transforms in many lossy codecs is a sort of "ad hoc" fix for not doing a proper modelling of our hearing apparatus?
Not in the sense discussed in this paper. If that lossy codec filterbank and/or transform defined/trashed the performance that's measured in this paper (it doesn't), then even with optimal choice of transform length options and optimal switching between them, the result would be 10x worse than what the listeners achieved.

I think your other two paragraphs are right though. I'd just love to see a robust scholarly explanation, because I think we're going to need it after this paper.

Cheers,
David.
Title: Human hearing beats FFT
Post by: jmvalin on 2013-02-25 23:10:03
Never mind the title, I still don't find a satisfactory answer in this thread.

I understand that the human ear uses a wobbling membrane as something like a filter bank, with a number of non-linear processes, and an amazing analysis of the signals coming from it, to deliver the hearing capacities that we can probe in listening tests and experience every day. I understand that this is nothing like an FFT. I understand that the frequency resolution of masked noise is not that critical, so we use FFTs in codecs in a place where their frequency resolution is far over-specified, rather than being an issue.

However, we often describe other things in audio and hearing with an FFT-like model. It crops up in sampling theory. We push all the audio through a comparable filterbank in most lossy codecs. It is true that these transforms are mathematically lossless/reversible - but if we're messing with things in the other domain, this is little comfort.


If you want to think about this in terms of FFTs... consider the case of a 10 ms FFT window. The resolution of that FFT is 100 Hz. Does this mean we can't tell the frequency of a sinusoid with better than 100 Hz accuracy using that FFT? Absolutely not. First, we can use interpolation with the neighbouring bins to get a more precise value. If we have FFTs at other time offsets, we can do even better. We can look at phase changes for a certain bin and compute the exact (within noise limits) frequency of the sinusoid that's around that bin. So we've again "beaten Heisenberg", but only because we've assumed that we have a single sinusoid around that bin. AFAIK, the human ear is capable of similar phase processing to figure out the frequency. It has to do something like that because its "critical bands" are far wider than the bins of a 10 ms FFT. There are only ~25 critical bands for the entire 20 Hz - 20 kHz spectrum.
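The phase-change idea can be sketched in a few lines. This is an illustrative phase-vocoder-style estimate under stated assumptions, not code from the post: a single noiseless sinusoid, an 8 kHz sample rate so that a 10 ms window is 80 samples with 100 Hz bins, a Hann window, and a hop of 20 samples between two frames. All names and values are hypothetical.

```python
import cmath
import math

def windowed_bin(x, start, size, k):
    """Hann-windowed DFT value of bin k for the frame x[start:start+size]."""
    acc = 0j
    for n in range(size):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / size)
        acc += w * x[start + n] * cmath.exp(-2j * math.pi * k * n / size)
    return acc

def phase_refined_frequency(x, sample_rate, size, hop, k):
    """Estimate the frequency near bin k from the phase advance between two frames."""
    a = windowed_bin(x, 0, size, k)
    b = windowed_bin(x, hop, size, k)
    measured = cmath.phase(b * a.conjugate())   # phase advance, wrapped to (-pi, pi]
    expected = 2.0 * math.pi * k * hop / size   # advance of a tone exactly at bin centre
    deviation = (measured - expected + math.pi) % (2.0 * math.pi) - math.pi
    return (k / size + deviation / (2.0 * math.pi * hop)) * sample_rate

fs = 8000.0
size, hop = 80, 20                  # 10 ms window, 2.5 ms hop
true_hz = 1037.0
x = [math.sin(2.0 * math.pi * true_hz * n / fs + 0.4) for n in range(size + hop)]

k = round(true_hz * size / fs)      # nearest bin index
nearest_bin_hz = k * fs / size      # what reading the grid alone would report
estimate = phase_refined_frequency(x, fs, size, hop, k)
print(nearest_bin_hz, estimate)
```

The hop has to be short enough that the extra phase advance stays within half a turn, which here means fewer than 80 samples for a tone up to half a bin off centre. With two sinusoids inside the same bin the estimate is no longer meaningful, which is exactly the single-sinusoid assumption described above.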
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-26 03:30:12
There are a number of issues confused in this thread.

The first is the idea that the Gabor limit applies. The Gabor limit only applies when what you need to detect is completely unknown.

Hearing the difference between notes is not at all the same problem.

The second is that this 'beats FFT'. It beats the single-bin resolution of an FFT, but once you know you're dealing with a single cycle of a single sine wave, that problem becomes moot, because an FFT is 1:1 and onto, i.e. orthonormal, tight frame, etc, and the information is all retained. So, yes, it is there in the FFT that has wider bands, just not in the usual way one would extract it. The GABOR LIMIT DOES NOT APPLY TO THIS DETECTION ISSUE, and YES, Batman, the FFT can be used in such detection, it's just a dumb way to do it.

Third, the ear has about 60Hz bands until you get to the point where 1/4 octave is wider, and then they are 1/4 octave wide, give or take. This has little bearing on the actual frequency detection mechanism, because the phase of firing of neurons is radically different below and above the center frequency of a given hair cell. This alone, up to 500Hz, can suffice to demonstrate pitch detection ability. And since the filters are wide, they settle fast, and hence again we beat the Gabor limit, because we know we're looking for ONE set of frequencies, not any arbitrary frequency.

So, the headline is just confused, it's comparing an apple, an orange, and a crate full of bowling balls, and concluding that apples are orange-colored and weigh 12 lbs.

Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-02-26 09:57:04
Thank you JJ.
Title: Human hearing beats FFT
Post by: krabapple on 2013-02-27 00:51:44
http://arstechnica.com/science/2013/02/hum...3s-sound-worse/ (http://arstechnica.com/science/2013/02/human-hearing-beats-sounds-uncertainty-limit-makes-mp3s-sound-worse/)
Title: Human hearing beats FFT
Post by: greynol on 2013-02-27 01:44:34
Well, the quality of the comments looks pretty encouraging, though I imagine the section's entropy will increase, especially after the more informed people get tired of participating.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-02-27 04:55:22
Well, the quality of the comments looks pretty encouraging, though I imagine the section's entropy will increase, especially after the more informed people get tired of participating.


I don't belong to that particular site. If somebody would like to convey my feeling, please feel free.

I'm tired of dealing with what I can only describe as 'poo flinging' in most of the audio press.
Title: Human hearing beats FFT
Post by: Alexey Lukin on 2013-03-14 16:39:05
The first is that the Gabor limit applies. The Gabor limit only applies when what you need to detect is completely unknown.

Speaking of Gabor, there's a nice prior art from 1946 suggesting that “human hearing beats FFT”:
Quote
Actually, as noted by Dennis Gabor (best known for his invention of holography, but who also worked in audio) back in 1946, the ears actually analyse the frequency content of sounds in time faster than suggested by the uncertainty principle by a factor of about 7. The seeming logical contradiction with the fundamental theoretical limit of time/frequency resolution is avoided by the ear’s use of a-priori or previously assumed knowledge of the nature of typical sounds but at the expense of getting the analysis ‘wrong’ when sounds not of the assumed form occur.

(quote taken from M. Gerzon's paper (http://www.collinsaudio.com/Prosound_Workshop/Gerzon_Why_do_equalisers_sound_different.pdf))
Title: Human hearing beats FFT
Post by: 2Bdecided on 2013-03-15 11:15:57
Interesting paper, thank you.
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-02 01:03:49

EST is a new transform that can explain the results of the article.

Fourier-related transforms, like FFT, are just one way to find frequencies, and clearly not the best possible.

EST derives frequencies from samples and is unrelated to Fourier/FFT.
The process of EST is deterministic, does not use non-linear equations, and can handle noise.

In the ideal case of a noiseless signal composed of n sinusoids, the frequencies, amplitudes and phases are precisely recovered from 3n equally spaced real samples.

A noisy signal will require more samples, depending on noise level.

Other than the minimum for the ideal case, accuracy does not depend on the number of samples (time). The additional samples for a noisy signal are needed to handle noise.

EST can also transform samples into increasing/decreasing sinusoids, which is a better way to model audio. In such a case, for a noiseless signal, 4 samples are required per increasing/decreasing sinusoid, and more for a noisy signal.

EST can be evaluated using a demo program that implements it. There is also a paper that details the transform and its mathematical basis.

Those interested in seeing the paper and/or the demo program can email me at gringya atsign gmail dot com.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-04-02 23:47:51
Fourier-related transforms, like FFT, are just one way to find frequencies, and clearly not the best possible.

Which, of course, depends entirely on your definition of "Frequency", something that itself is trickier than some seem to realize.
Quote
EST derives frequencies from samples and is unrelated to Fourier/FFT.

What does "EST" stand for, in the first place? Does it use a complex exponential or a representation of a complex exponential?

Quote
The process of EST is deterministic, does not use non-linear equations, and can handle noise.

Which is true of the Fourier Transform, as well.
Quote
In the ideal case of a noiseless signal composed of n sinusoids, the frequencies, amplitudes and phases are precisely recovered from 3n
equally spaced real samples.

Sounds pretty good. What's the basis set you're using? Sounds a lot like a * sin(b*t + c), where a, b, c are fixed by the 3 samples. Not sure what "equally spaced" means here, unless you're referring to the fact you can characterize a sine wave with 3 non-degenerate points.
Quote
A noisy signal will require more samples, depending on noise level.

No surprise.
Quote
Other than the minimum for the ideal case, accuracy does not depend on the number of samples (time). The additional samples for a noisy signal
are needed to handle noise.

EST can also transform samples into increasing/decreasing sinusoids, which is a better way to model audio. In such a case, for a noiseless
signal, 4 samples are required per increasing/decreasing sinusoid, and more for a noisy signal.

So it's Laplace-based instead of Fourier based, then?

Instead of bombarding us with a bunch of not-very-specific qualities, why not just tell us what the basis set is, and how the analysis works?

I am aware of approximately infinite (well, literally infinite but obviously I haven't generated them all!) numbers of basis sets, many of which this could describe.
Title: Human hearing beats FFT
Post by: Alexey Lukin on 2013-04-02 23:59:42
Yaakov, also check out the Reassigned spectrogram mode in iZotope RX. It “beats FFT” in terms of time and frequency resolution: it can precisely localize impulsive events in time and precisely display frequencies of harmonics, assuming that they do not overlap in FFT spectrum.
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-03 01:42:13
EST stands for Exponential Sum Transform and it uses complex exponentials.

The basis is sigma(c*b^t), where b and c are non-zero complex numbers and the b values are distinct. If all b are on the unit circle, then it is simply a spectrum.

When all b are on the unit circle and the samples are real, this becomes sigma(a*cos(b*t+c)), where a, b and c are now real.
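
Written out, the step from the complex form to the cosine form looks as follows. This is a sketch under the assumption that, for real samples, the exponentials occur in complex-conjugate pairs; the subscripted symbols and omega_k are notation introduced here. Note that the a, b, c in the cosine form are not the same quantities as the b, c in the exponential form.

```latex
\begin{align}
x(t) &= \sum_k c_k\, b_k^{\,t}, \qquad b_k, c_k \in \mathbb{C}\setminus\{0\},\ b_k \text{ distinct} \\
b_k = e^{i\omega_k}: \quad
c_k\, e^{i\omega_k t} + \overline{c_k}\, e^{-i\omega_k t}
  &= 2\,|c_k| \cos\!\left(\omega_k t + \arg c_k\right)
\end{align}
```

So each conjugate pair contributes one real sinusoid with amplitude 2|c_k|, frequency arg(b_k) and phase arg(c_k).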

The samples must be equally spaced, not just non-degenerate.

It clearly looks more like Laplace than Fourier, but a specific relation, if one exists, is not known to me.

As for describing the analysis, I offered to send the detailed paper. Do you prefer an informal description?

Title: Human hearing beats FFT
Post by: ExUser on 2013-04-03 05:27:16
I think a lot of us here would be interested in a formal description, myself included. I think from what you've just said that we'll get it puzzled out though.
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-03 18:14:54
I think a lot of us here would be interested in a formal description, myself included. I think from what you've just said that we'll get it puzzled out though.

If I understand you correctly, you prefer a formal description of the process, and only that.
Title: Human hearing beats FFT
Post by: db1989 on 2013-04-03 18:31:37
If I may guess, I think he means that this site has a significant number of users who would appreciate detailed descriptions. However, that is not to stop you from providing less technical information (i.e. ‘layman’s terms’) if you want to; there are probably other users who would like that, too.
Title: Human hearing beats FFT
Post by: Porcus on 2013-04-03 20:34:23
I think I could very well use a formula or two ... point seven eighteen twentyeight ...

As for describing the analysis, I offered to send the detailed paper. Do you prefer an informal description?


I think I just got one that was a bit too rough, although I do suspect I have guessed the point.
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-03 22:10:18
The following link:

http://www.mediafire.com/view/?ce47jurz43wzjce (http://www.mediafire.com/view/?ce47jurz43wzjce)

is to a short document that describes the EST process for real noiseless samples.

Title: Human hearing beats FFT
Post by: Woodinville on 2013-04-11 11:09:59
Hm.  Define "noiseless".  Most instruments have a chaotic part of their performance that in fact is noiselike in that it does not repeat, is not entirely stationary, depends on technique, and so on.

So, I'm not quite sure I know what you mean by noiseless.
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-11 19:33:25
The paper described the mathematical basis of EST, which uses the ideal case of perfect increasing/decreasing sinusoids.

For realistic data, EST uses different processes that expect noise.

For audio, the EST process is as follows.
1. Find linear prediction coefficients, preferably using the covariance method and not the auto-correlation method.
2. Create the linear prediction polynomial.
3. Find the roots of the linear prediction polynomial to establish the basis set of an exponential sum function, as described in the paper.
4. Use the samples and the basis set to find the coefficients of the function.
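
The four steps above can be sketched with NumPy roughly as follows. This is a generic Prony-style fit written for illustration, not the EST implementation from the paper: the function name `exponential_sum_fit`, the plain least-squares solves, and the test signal are all assumptions. In this general form it needs at least 2 * order samples, because it does not exploit any constraint that the roots lie on the unit circle.

```python
import numpy as np

def exponential_sum_fit(x, order):
    """Fit x[t] ~= sum_k c[k] * b[k]**t for t = 0..len(x)-1.

    Hypothetical helper for illustration; needs len(x) >= 2 * order.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Step 1: linear prediction by the covariance method (no windowing):
    # least-squares solve x[t] = a[0]*x[t-1] + ... + a[order-1]*x[t-order].
    past = np.array([x[t - order:t][::-1] for t in range(order, n)])
    a, *_ = np.linalg.lstsq(past, x[order:], rcond=None)
    # Step 2: prediction polynomial z^p - a[0]*z^(p-1) - ... - a[p-1].
    poly = np.concatenate(([1.0], -a))
    # Step 3: the polynomial roots are the bases b[k].
    b = np.roots(poly).astype(complex)
    # Step 4: least-squares solve for the coefficients c[k].
    t = np.arange(n)
    basis = b[np.newaxis, :] ** t[:, np.newaxis]
    c, *_ = np.linalg.lstsq(basis, x.astype(complex), rcond=None)
    return b, c

# Two real sinusoids need order 4 (each is a conjugate pair of exponentials).
t = np.arange(16)
x = 1.5 * np.cos(0.3 * t + 0.4) + 0.7 * np.cos(1.1 * t - 0.2)
b, c = exponential_sum_fit(x, order=4)
frequencies = np.sort(np.abs(np.angle(b)))   # radians per sample
amplitudes = np.sort(2 * np.abs(c))          # each pair has amplitude 2*|c|
```

For noisy data one would presumably use more samples and a higher order than the bare minimum, which is where the accuracy versus speed trade-off comes in.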

The key point is that linear prediction coefficients and an exponential sum function are equivalent, with the exponential sum function having the distinct advantage of being an analytic function with a useful structure. The mathematical basis proves this equivalence.

Due to the equivalence, an exponential sum function models an audio signal with the same quality as linear prediction.

You may note that the best lossless audio compressors, like OptimFROG, use linear prediction. This is a strong indication of the power of linear prediction to model audio.

Since EST generates an analytic function, it is suitable for lossy audio compression, as well as other audio applications.

Once EST has generated an exponential sum function, you can do the following:
- Identify noise elements, using frequency and/or amplitude, and remove them.
- Identify inaudible elements, and remove them.
- Quantize the coefficients.
- Resample the audio signal, both sample rate and sample depth.
- And various other things.

Unlike Fourier related methods, which use a predefined basis, EST uses a basis derived from the data.

In short, EST for audio combines the flexibility and usefulness of an analytic function with the modeling power of linear prediction.
Title: Human hearing beats FFT
Post by: Woodinville on 2013-04-11 20:36:57
Unlike Fourier related methods, which use a predefined basis, EST uses a basis derived from the data.

In short, EST for audio combines the flexibility and usefulness of an analytic function with the modeling power of linear prediction.


Try applying EST to the first 30 seconds of the track "We Shall Be Happy" by Ry Cooder off the album titled "Jazz".  Let me know how big your covariance matrix is, too, ok?
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-04-11 21:32:17
Unlike Fourier related methods, which use a predefined basis, EST uses a basis derived from the data.

In short, EST for audio combines the flexibility and usefulness of an analytic function with the modeling power of linear prediction.


Try applying EST to the first 30 seconds of the track "We Shall Be Happy" by Ry Cooder off the album titled "Jazz".  Let me know how big your covariance matrix is, too, ok?


In a practical implementation the samples will be broken into blocks and there will be a chosen matrix size for that block size.

The size of the matrix and the block size will determine accuracy and an accuracy-speed trade-off.

This is also the way it is done when using linear prediction for lossless audio compression or for speech compression. The difference is that EST returns an analytic function.

30 seconds of audio will therefore be broken into many smaller blocks, and not treated as a single block.
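
To make the sizes concrete, here is a minimal sketch of the blocking. The sample rate (44.1 kHz mono), block size (4096) and prediction order (32) are illustrative assumptions, not values taken from EST, and the helper names are hypothetical. The point is only that the covariance-method matrix is order x order per block, regardless of how long the track is.

```python
import numpy as np

def split_into_blocks(samples, block_size):
    """Return full blocks as rows; a trailing partial block is dropped."""
    samples = np.asarray(samples, dtype=float)
    n_blocks = len(samples) // block_size
    return samples[:n_blocks * block_size].reshape(n_blocks, block_size)

def covariance_matrix(block, order):
    """Normal-equations matrix of covariance-method linear prediction."""
    # Row for time t holds the 'order' previous samples, newest first.
    past = np.array([block[t - order:t][::-1]
                     for t in range(order, len(block))])
    return past.T @ past

rng = np.random.default_rng(0)
audio = rng.standard_normal(30 * 44100)      # stand-in for 30 s of mono audio
blocks = split_into_blocks(audio, block_size=4096)
matrix = covariance_matrix(blocks[0], order=32)
print(blocks.shape, matrix.shape)
```

With these assumed numbers, 30 seconds yields 322 full blocks, each analysed with its own 32 x 32 matrix.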
Title: Human hearing beats FFT
Post by: Woodinville on 2013-06-04 01:51:24
Unlike Fourier related methods, which use a predefined basis, EST uses a basis derived from the data.

In short, EST for audio combines the flexibility and usefulness of an analytic function with the modeling power of linear prediction.


Try applying EST to the first 30 seconds of the track "We Shall Be Happy" by Ry Cooder off the album titled "Jazz".  Let me know how big your covariance matrix is, too, ok?


In a practical implementation the samples will be broken into blocks and there will be a chosen matrix size for that block size.

The size of the matrix and the block size will determine accuracy and an accuracy-speed trade-off.

This is also the way it is done when using linear prediction for lossless audio compression or for speech compression. The difference is that EST returns an analytic function.

30 seconds of audio will therefore be broken into many smaller blocks, and not treated as a single block.


I do know how coders work, so try your EST basis on We Shall Be Happy and get back to me, ok?  And tell me how many basis functions you need for that one, too. And how many are orthogonal. And then how many of those you have to code.
Title: Human hearing beats FFT
Post by: Specy on 2013-08-17 11:52:13
Over 10 years ago, for my master's thesis, I wrote an algorithm that determines nearly exact frequency values from an FFT. It can find any set of frequencies, as long as they are far enough away from each other and constant in pitch and level.

The method is pretty simple:
1. Create an FFT using a window that's a lot bigger than the block of audio that you use
2. Find the highest peak in the FFT domain. This is an estimation of the loudest frequency present.
3. Write down the found frequency, phase and amplitude
4. Generate an FFT based on the found freq, phase, amp (this can be optimized for speed, since it's only a single tone).
5. Subtract a small percentage of this (I found that 5-10% works well) from the original FFT from step 1.
6. Go back to step 2.

This gives you a whole lot of values. Next, you need to combine all the values that have approximately the same frequency. This can be done as follows:
- If a frequency is new (no data within 0.5 FFT bin size), this is a new frequency that we haven't seen before.
- Otherwise combine this new measurement with the measurement closest to it.
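
A rough Python sketch of the loop and the merging rule above, with several assumptions that are not from the thesis: step 1 is read as zero-padding the block to a much longer FFT, a Hann window is used, the "0.5 FFT bin" is taken relative to the unpadded block length, and merged measurements are combined by summing phasors and averaging frequency by amplitude. The function names and parameter defaults are hypothetical.

```python
import numpy as np

def iterative_peaks(block, pad_factor=16, step=0.1, iterations=200):
    """Steps 1-6: repeatedly pick the strongest peak, subtract a fraction."""
    n = len(block)
    nfft = pad_factor * n                      # assumed reading of step 1
    window = np.hanning(n)                     # assumed window choice
    t = np.arange(n)
    residual = np.fft.rfft(block * window, nfft)
    unit_peak = window.sum() / 2.0             # peak height of a unit cosine
    found = []
    for _ in range(iterations):
        k = int(np.argmax(np.abs(residual[1:-1]))) + 1   # skip DC, Nyquist
        freq = k / nfft                                  # cycles per sample
        amp = np.abs(residual[k]) / unit_peak
        phase = np.angle(residual[k])
        tone = amp * np.cos(2 * np.pi * freq * t + phase)
        residual = residual - step * np.fft.rfft(tone * window, nfft)
        found.append((freq, step * amp, phase))
    return found

def merge_peaks(found, n):
    """Combine measurements closer than half a bin of an n-point FFT."""
    half_bin = 0.5 / n
    merged = []   # entries: [sum(amp * freq), sum(phasor), sum(amp)]
    for freq, amp, phase in found:
        phasor = amp * np.exp(1j * phase)
        if merged:
            centers = [m[0] / m[2] for m in merged]
            i = int(np.argmin([abs(c - freq) for c in centers]))
            if abs(centers[i] - freq) <= half_bin:
                merged[i][0] += amp * freq
                merged[i][1] += phasor
                merged[i][2] += amp
                continue
        merged.append([amp * freq, phasor, amp])
    return [(m[0] / m[2], abs(m[1]), float(np.angle(m[1]))) for m in merged]

# Two tones that fall between the bins of a plain 512-point FFT.
n = 512
t = np.arange(n)
block = (1.0 * np.cos(2 * np.pi * (410 / 8192) * t + 0.3)
         + 0.5 * np.cos(2 * np.pi * (983 / 8192) * t - 1.0))
peaks = merge_peaks(iterative_peaks(block), n)
strongest = sorted(peaks, key=lambda p: -p[1])[:2]
```

In this sketch the frequency estimate is only as fine as the zero-padded grid, so the padding factor sets the precision.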

Tones that are 1 bin apart will not be found perfectly (frequency and amplitude might be very slightly wrong), but they still clearly show up as separate signals. Tones that are 2 or more bins apart show up nearly perfectly.

Test tones:
(http://masterthesis.hansvanzutphen.com/@x9bp-0.gif)

Real signal (voice):
(http://masterthesis.hansvanzutphen.com/@a.gif)


Signal and its peak data:
(http://masterthesis.hansvanzutphen.com/x_xf.gif)
(http://masterthesis.hansvanzutphen.com/x_pk.gif)
(http://masterthesis.hansvanzutphen.com/x_wp.gif)
Title: Human hearing beats FFT
Post by: Yaakov Gringeler on 2013-11-04 19:15:57
Several months ago, in posts in this topic, I provided some information about my transform, EST.

I now have a document with better explanations, actual results, and charts.

The link to the document is:
http://www.mediafire.com/?0bprdaoop81d0cx (http://www.mediafire.com/?0bprdaoop81d0cx)
Please note that viewing the document online will only display the text, and not the charts. It has to be downloaded to be fully viewed.

As a reminder, this topic followed an article that showed that human hearing performance in finding frequencies exceeds the Fourier uncertainty limit.

EST finds frequencies using a deterministic algorithm unrelated to Fourier transforms and not bound by the Fourier uncertainty principle.

This shows that the results of the article are not surprising.