
Topic: STFT vs FFT for pretty visualization?

  • bennetng
STFT vs FFT for pretty visualization?
When the bins are displayed on a log frequency scale, the low frequencies usually look ugly unless you use a big FFT size (assuming the same window type), but then the high frequencies are too dense and the monitor does not have enough pixels to show the detail. A bigger FFT size also reduces time resolution and uses more CPU.

Then I read about STFT, which I roughly understand as using different resolutions at different frequencies.

So my question is: human hearing works more like a log scale than a linear scale, so why can't I find any animated STFT analysis in typical DAWs? Because the FFT is faster? Because the STFT is not suitable for animated displays? Other reasons?

  • saratoga
Re: STFT vs FFT for pretty visualization?
Reply #1
Quote
Then I read about STFT, which I roughly understand as using different resolutions at different frequencies.

The STFT is a group of transforms that let you resolve a time-varying signal with a given time/frequency tradeoff. Usually they have a fixed resolution across all frequencies, though, unless it is a very exotic transform.
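To make the "fixed resolution" point concrete, here is a minimal sketch of a plain STFT, assuming NumPy and a Hann window. The function name `stft_magnitudes` and the `n_fft` and `hop` defaults are illustrative choices, not anything from a particular DAW or library.

```python
import numpy as np

def stft_magnitudes(x, n_fft=1024, hop=256):
    """Magnitude STFT: one windowed FFT per frame, frames `hop` samples apart.

    Every frame uses the same n_fft, so every bin is sample_rate / n_fft wide,
    at low and high frequencies alike.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < n_fft:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Shape: (n_frames, n_fft // 2 + 1)
    return np.abs(np.fft.rfft(frames, axis=1))
```

At 48 kHz with `n_fft=1024` the bins are about 46.9 Hz apart everywhere, which is plenty at 10 kHz and very coarse at 50 Hz. That is the tradeoff the original question runs into on a log axis.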

Quote
So my question is: human hearing works more like a log scale than a linear scale, so why can't I find any animated STFT analysis in typical DAWs? Because the FFT is faster? Because the STFT is not suitable for animated displays? Other reasons?

The typical spectrogram view you see in a lot of applications is based on the STFT or something very similar.  If you want a transform with log spaced frequency bins, that is much harder.  I think the usual approach is to just use a larger than required FFT and then throw away pixels as needed. 

Edit:  Wikipedia also suggests this, although I've never tried implementing it: 

https://en.wikipedia.org/wiki/Constant-Q_transform
  • Last Edit: 04 October, 2017, 10:22:57 AM by saratoga

  • bennetng
Re: STFT vs FFT for pretty visualization?
Reply #2
Quote
I think the usual approach is to just use a larger than required FFT and then throw away pixels as needed.

Yeah I am also doing this now. I am getting an array of bins then discarding some of them based on screen resolution, with an adjustable density.
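A rough sketch of that kind of bin-to-pixel reduction, assuming NumPy and magnitudes from `np.fft.rfft`. The name `log_columns` and its parameters are hypothetical, and keeping the peak of each column's bins (rather than simply dropping bins) is an assumption made here so that narrow peaks are not lost.

```python
import numpy as np

def log_columns(mags, sample_rate, n_columns, f_min=20.0, f_max=None):
    """Reduce linear FFT magnitudes to n_columns log-spaced display columns.

    mags: output of np.abs(np.fft.rfft(...)), length n_fft // 2 + 1.
    Returns (centre_freqs, values), one entry per column.
    """
    mags = np.asarray(mags, dtype=float)
    n_bins = len(mags)
    n_fft = 2 * (n_bins - 1)
    if f_max is None:
        f_max = sample_rate / 2
    bin_hz = sample_rate / n_fft
    # Column edges are evenly spaced on a log frequency axis.
    edges = np.geomspace(f_min, f_max, n_columns + 1)
    lo = np.clip(np.floor(edges[:-1] / bin_hz).astype(int), 0, n_bins - 1)
    hi = np.clip(np.ceil(edges[1:] / bin_hz).astype(int), lo + 1, n_bins)
    # Keep the largest bin inside each column (assumption: peak, not plain decimation).
    values = np.array([mags[a:b].max() for a, b in zip(lo, hi)])
    centres = np.sqrt(edges[:-1] * edges[1:])  # geometric centre of each column
    return centres, values
```

The number of columns plays the role of the adjustable density. At the low end several columns can land on the same FFT bin, which is where the blocky look comes from.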

It is a mobile app and it also displays other stuff at the same time so there will not be a lot of spare CPU time for older devices.

Quote
Relative to the Fourier transform, implementation of this transform is more tricky. This is due to the varying number of samples used in the calculation of each frequency bin, which also affects the length of any windowing function implemented.

Looks like a good answer from Wikipedia. At first I thought discarding bins was a waste of CPU time, but now it seems that doing what the quote above describes could use even more CPU.
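For illustration, a naive direct constant-Q transform might look like the sketch below. It assumes NumPy, a Hann window, and log-spaced bins starting at a hypothetical `f_min`; it is written for clarity rather than speed and is not how any particular library implements it.

```python
import numpy as np

def naive_cqt(x, sample_rate, f_min=55.0, bins_per_octave=12, n_bins=48):
    """Direct constant-Q magnitudes from the most recent samples of x.

    Each bin k uses its own window length n_k = ceil(Q * sample_rate / f_k),
    so low-frequency bins look at many more samples than high-frequency ones.
    """
    x = np.asarray(x, dtype=float)
    q = 1.0 / (2 ** (1.0 / bins_per_octave) - 1)
    freqs = f_min * 2 ** (np.arange(n_bins) / bins_per_octave)
    mags = np.zeros(n_bins)
    for k, f in enumerate(freqs):
        n_k = int(np.ceil(q * sample_rate / f))
        if n_k > len(x):
            raise ValueError("not enough samples for the lowest bins")
        n = np.arange(n_k)
        # A separate windowed complex sinusoid per bin; no shared FFT.
        kernel = np.hanning(n_k) * np.exp(-2j * np.pi * f * n / sample_rate)
        mags[k] = np.abs(np.dot(x[-n_k:], kernel)) / n_k
    return freqs, mags
```

With these defaults at an 8 kHz sample rate, the 55 Hz bin needs about 2400 samples while the top bin needs under 200. Every bin is its own dot product, which is why the direct form can cost more than one oversized FFT.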

  • saratoga
Re: STFT vs FFT for pretty visualization?
Reply #3
Quote
Looks like a good answer from Wikipedia. At first I thought discarding bins was a waste of CPU time, but now it seems that doing what the quote above describes could use even more CPU.

I'm not sure which is more efficient; it probably depends on how much resolution you need at low frequencies. Another option, if you just want a uniform (but coarse) representation, is to use a few parallel bandpass filters. For computing the response at 50 Hz, a bandpass filter is probably a lot fewer cycles than a whole transform. Of course, that doesn't scale very well to hundreds or thousands of bins.
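As a sketch of the bandpass idea, a single second-order (biquad) bandpass followed by an RMS measurement could look like this. The coefficients follow the widely used "Audio EQ Cookbook" bandpass with 0 dB peak gain; the function name `bandpass_level` and the default `q` are assumptions for illustration.

```python
import math

def bandpass_level(samples, sample_rate, centre_hz, q=4.0):
    """RMS level of `samples` after one biquad bandpass centred on centre_hz."""
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    # Normalised coefficients; b1 is zero for this bandpass form.
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    acc = 0.0
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        acc += y * y
    return math.sqrt(acc / len(samples))
```

Each band costs a handful of multiplies per sample, so ten or twenty bands are cheap, but the cost grows linearly with the number of bands.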

Re: STFT vs FFT for pretty visualization?
Reply #4
I find that fractional-octave displays of FFTs can be very informative to study. I understand that they are simply made by mathematically combining varying numbers of bins of a standard fixed-bandwidth FFT.
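One way such a display could be built is sketched below, assuming NumPy, a Hann window, and that "combining" means summing the power of all FFT bins that fall inside each band. The function name, the 1/3-octave default, and the 25 Hz starting frequency are illustrative assumptions, not a description of any specific analyzer.

```python
import numpy as np

def fractional_octave_bands(x, sample_rate, fraction=3, f_min=25.0):
    """Sum FFT bin power into 1/fraction-octave bands.

    Returns (centre_freqs, band_power). Higher bands cover more FFT bins,
    so the number of bins combined per band grows with frequency.
    """
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    half = 2 ** (1.0 / (2 * fraction))   # ratio from band centre to band edge
    step = 2 ** (1.0 / fraction)         # ratio between adjacent centres
    centres, levels = [], []
    fc = f_min
    while fc * half <= sample_rate / 2:
        in_band = (freqs >= fc / half) & (freqs < fc * half)
        if in_band.any():                # skip bands narrower than one bin
            centres.append(fc)
            levels.append(power[in_band].sum())
        fc *= step
    return np.array(centres), np.array(levels)
```

With an 8192-point FFT at 48 kHz, the lowest bands contain only one or two bins each while the top bands contain hundreds, so the low end is still limited by the FFT size.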