Scientific Discussion / Re: AudioWorklet-based filter bank spectrum analyzer
Last post by TF3RDL --
- 1/nth octave bands mode (equal-tempered scale, adjustable reference frequency, and supports transposing like the old foo_musical_spectrum component for foobar2000) and frequency bands mode (supports the Mel/Bark/ERB psychoacoustic scales and others such as hyperbolic sine, nth root, and period, in addition to the standard linear and logarithmic frequency scales)
- Adjustable frequency range, given in Hz for frequency bands equally spaced on an arbitrary frequency scale, and as a note index # (starting from C0 = ~16Hz) in octave bands mode
- Truly constant-Q (assuming a logarithmic frequency scale, since the bandwidth is determined automatically by each frequency band's lower and upper boundaries) and no longer limited to 32768 samples (~682.7ms at 48kHz sampling rate). It can be set to variable-Q by unchecking "Use constant-Q instead" if you want better time resolution at lower frequencies; in that case the minimum Hz resolution (to cope with the time/frequency resolution tradeoff) is specified by the "time resolution" parameter in milliseconds
- Bandwidth for this filter bank is adjustable; higher values make a smoother spectrum while also improving time resolution at bass frequencies, and vice versa
- Three filter bank types are supported:
  - Analog-style analyzer: Cascaded biquad bandpass filters; simply stacking them on one another is good enough even without the flat-top response of Butterworth bandpass filters. Q values are also prewarped (unless you opt out) for "truly" logarithmic resolution, since this filter bank is designed using the bilinear transform (borrowed from the RBJ EQ cookbook)
  - Sliding windowed infinite Fourier transform (SWIFT): a bank of IIR complex resonators; a 4th order cascaded SWIFT resembles a Gammatone filter
  - Variable-Q sliding DFT: Recursive FIR filter bank, complex-valued. It also has an option to use the NC method to enhance time/frequency resolution. Window function options are limited to cosine sums such as Hann and Hamming windows, but custom frequency-domain windowing is supported. Since it is an FIR type of filter bank, there is a "maximum time resolution" parameter that determines how long the circular buffer should be; the filter bank is no longer constant-Q if this gets too low
- For IIR filter bank modes: Adjustable filter order; higher values reduce leakage at the expense of some time resolution (especially at lower frequencies)
- Peak decay and exponential moving average-based smoothing, as well as an optional fading peaks effect. There is also an option to perform the time smoothing operation during calculation rather than after (in other words, per-sample instead of per-frame) for greater accuracy and framerate independence
- Linear and nth root amplitude scales are supported in addition to the logarithmic/dB scale. The dB range of this visualization can also be adjusted, and "Use absolute value" sets the minimum of the dB range to -∞ dBFS in linear/nth root amplitude mode
- Multiple X-axis scales supported:
  - Decade: A standard grid for a logarithmic scale
  - Octaves: Each label/line corresponds to a center frequency of a common 10-band graphic equalizer
  - Notes: Frequency gridlines are equidistantly spaced on a logarithmic frequency scale, following 1/12th octave bands, and each label displays a musical note instead of Hz
  - Automatic: Frequency gridlines and labels correspond to the actual frequency bands, much like foobar2000's built-in "Spectrum" visualization, though X-axis labels become cluttered with a large number of bands
- Y-axis labels are in dB even in linear/nth root scale, and the dB step for the Y-axis grid is adjustable. Common dB intervals to set are 6dB, 10dB, 12dB, and 20dB, though it can be almost any value
- As this project is aimed at audio analysis algorithms (in this case, IIR filter banks) and is not intended to be eye candy, color customization is very limited; it boils down to just light/dark mode and a switch between solid color and color gradient (gradients borrowed from foobar2000's built-in visualizations)
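The band layout and the constant-Q/variable-Q choice described in the list above can be sketched roughly as follows. This is an illustrative Python sketch, not the analyzer's actual code: the function names, the A4 = 440 Hz reference, and in particular the rule that variable-Q mode floors the bandwidth at 1000 / "time resolution" (ms) Hz are assumptions made for the example.

```python
import math

A4_HZ = 440.0                      # assumed reference pitch (adjustable in the analyzer)
C0_HZ = A4_HZ * 2.0 ** (-57 / 12)  # A4 is 57 semitones above C0, so C0 is about 16.35 Hz


def octave_band(index, bands_per_octave=12, c0_hz=C0_HZ):
    """Return (lower, center, upper) edges in Hz of one 1/n-octave band.

    `index` counts bands upward from C0 (index 0).
    """
    center = c0_hz * 2.0 ** (index / bands_per_octave)
    half_band = 2.0 ** (0.5 / bands_per_octave)  # half a band, as a frequency ratio
    return center / half_band, center, center * half_band


def band_width_hz(lower, upper, constant_q=True, time_res_ms=600.0):
    """Bandwidth used for one band.

    Constant-Q: the width is simply the distance between the band edges.
    Variable-Q (assumed behaviour): the width is not allowed to fall below
    1000 / time_res_ms Hz, which caps the filter's time span at low frequencies.
    """
    width = upper - lower
    if constant_q:
        return width
    return max(width, 1000.0 / time_res_ms)


if __name__ == "__main__":
    for idx in (0, 24, 57):
        lo, fc, hi = octave_band(idx)
        print(f"band {idx:3d}: fc = {fc:8.2f} Hz, "
              f"constant-Q width = {band_width_hz(lo, hi):6.2f} Hz, "
              f"variable-Q width = {band_width_hz(lo, hi, constant_q=False):6.2f} Hz")
```

With these defaults the C0 band is a little under 1 Hz wide in constant-Q mode, so the assumed variable-Q floor of about 1.67 Hz only takes effect on the lowest bands; higher bands keep the width set by their edges.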
This works best if you have a CPU that is much newer than the relatively ancient Intel Core 2 Duo series, as it demands more CPU than FFT-based analyzers using AnalyserNode.getFloatTimeDomainData() + an FFT library.
BTW, when using "Analog-style analyzer" mode with a "filter order" of 1, the frequency bars take a shape that reminds me of a hybrid of Windows Media Player's "Bars" and VLC media player's "Spectrum" visualizations: the former for the true log frequency scale and general shape, and the latter for the spectral leakage and apparently equal peak width.
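For anyone curious what a single stage of the "Analog-style analyzer" looks like, here is a minimal Python sketch of the RBJ cookbook band-pass biquad (constant 0 dB peak gain) and the magnitude response of several identical stages in series. It is a sketch under assumptions, not the analyzer's code: it uses the cookbook's bandwidth-in-octaves formula for bilinear-transform designs as a stand-in for the Q prewarping mentioned above (the project's exact prewarp may differ), and `order` here simply counts cascaded biquads, which may not match the analyzer's own definition of "filter order".

```python
import cmath
import math


def rbj_bandpass(f0_hz, bw_octaves, fs_hz=48000.0):
    """Biquad band-pass coefficients (constant 0 dB peak gain) from the RBJ cookbook.

    Uses the cookbook's bandwidth form for bilinear-transform designs, where
    the w0 / sin(w0) factor compensates for frequency warping near Nyquist.
    Returns (b, a), each normalised so that a[0] == 1.
    """
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) * math.sinh(math.log(2.0) / 2.0 * bw_octaves * w0 / math.sin(w0))
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def magnitude(b, a, f_hz, fs_hz=48000.0, order=1):
    """Magnitude response at f_hz of `order` identical biquads in series."""
    z1 = cmath.exp(-2j * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return abs(h) ** order


def to_db(x):
    """Convert a linear magnitude to decibels."""
    return 20.0 * math.log10(x)


if __name__ == "__main__":
    b, a = rbj_bandpass(1000.0, bw_octaves=1 / 12)
    for order in (1, 2, 4):
        print(f"order {order}: "
              f"{to_db(magnitude(b, a, 1000.0, order=order)):6.2f} dB at 1 kHz, "
              f"{to_db(magnitude(b, a, 2000.0, order=order)):7.2f} dB at 2 kHz")
```

The gain at the center frequency stays at 0 dB regardless of order, while the rejection an octave away grows with each added stage, which is the leakage versus time resolution tradeoff that the filter order setting controls.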