Hi,
I tried to implement the Harmonic Product Spectrum (HPS) as described, for instance, in this Introduction to Signal Processing chapter (http://www.scribd.com/doc/50529329/154/Harmonic-product-spectrum).
The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested. But I'm certainly doing something wrong, so I'll describe the process I've followed so far. First, the basics:
- Split the audio samples into windows of size N = 1024
- Apply a Hann window to these samples
- Run an FFT on each window to get N/2 + 1 bins
- Compute the magnitude buffer with hypot(re, im), giving a spectrum of length N/2 + 1
Those first steps are verified and OK, so I won't detail the implementation here.
So now, concerning HPS:
I first create an f0 histogram of length (N/2 + 1) / M, where M - 1 is the number of downsampled copies (here, M = 3). Each window's processing increments the histogram at the index of the fundamental frequency found. Here is the code run for each window:
float max = 0;
int freq_id = 0;
for (int i = 0; i < (N / 2 + 1) / M; i++) {
    // product of the magnitude spectrum (length N/2 + 1) and its
    // M - 1 downsampled copies
    float mul = 1;
    for (int n = 1; n <= M; n++)
        mul *= magnitude[i * n];
    // track the strongest product and its bin
    if (mul > max) {
        max = mul;
        freq_id = i;
    }
}
f0[freq_id]++;
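Wrapped up as a standalone function (with the bin-to-Hz conversion, and skipping the DC bin so bin 0 can't win by default), the idea is — function and parameter names are mine:

```c
/* Harmonic Product Spectrum over one magnitude buffer of length
 * len = N/2 + 1: multiply the spectrum with its copies downsampled by
 * 2..M and return the bin index of the strongest product. */
static int hps_peak_bin(const float *magnitude, int len, int M)
{
    int freq_id = 0;
    float max = 0.0f;

    for (int i = 1; i < len / M; i++) {   /* skip bin 0 (DC) */
        float mul = 1.0f;
        for (int n = 1; n <= M; n++)
            mul *= magnitude[i * n];
        if (mul > max) {
            max = mul;
            freq_id = i;
        }
    }
    return freq_id;
}

/* Convert a bin index to Hz for an fft_size-point FFT. */
static float bin_to_hz(int bin, int fft_size, float sample_rate)
{
    return bin * sample_rate / fft_size;
}
```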
And at the end I pick the highest value in f0 to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS result (a peak at freq_id = 0) is to be expected. So the question is: how is this really supposed to work?
> And at the end I pick the highest value in f0 to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS result (a peak at freq_id = 0) is to be expected.
Sorry, I don't know what you mean by "fundamental frequency of the whole song". I understand how the fundamental relates to a note or chord, but I don't know about a whole song... I would assume that means the lowest frequency in the song?
That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note/chord.
>> And at the end I pick the highest value in f0 to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS result (a peak at freq_id = 0) is to be expected.
> Sorry, I don't know what you mean by "fundamental frequency of the whole song". I understand how the fundamental relates to a note or chord, but I don't know about a whole song... I would assume that means the lowest frequency in the song?
I am looking for the overall pitch of the song, so the histogram is there to count the fundamental frequency of each window and grab the dominant one.
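For clarity, the "grab the dominant one" step is just an argmax over the histogram (function name is mine):

```c
/* Return the index of the most frequently detected fundamental bin
 * in the f0 histogram of length len. Ties go to the lower bin. */
static int dominant_bin(const unsigned *f0, int len)
{
    int best = 0;
    for (int i = 1; i < len; i++)
        if (f0[i] > f0[best])
            best = i;
    return best;
}
```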
> That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note/chord.
I'm looking for a way to extract the pitch of songs of any kind as well as possible; maybe HPS isn't what I need. Trying to filter out specific sounds might require a lot of heuristics I don't really want to deal with at first…
If you have a few samples where HPS applies, I'm interested in them: I could check whether the algorithm is at least implemented correctly and whether my target (a whole song instead of specific musical notes) is simply wrong.
Note that I'm kind of new to all of this, so I'm certainly mixing up a bunch of things (as you have probably already noticed).
> The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.
Maybe I'm wrong, but my guess would be that you should apply some sort of equal-loudness curve compensation to the spectrum.
Also, the window size probably has to be optimized, maybe even dynamically. Again, I can't tell you exactly how, but the word "autocorrelation" comes to mind.
>> The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.
> Maybe I'm wrong, but my guess would be that you should apply some sort of equal-loudness curve compensation to the spectrum.
> Also, the window size probably has to be optimized, maybe even dynamically. Again, I can't tell you exactly how, but the word "autocorrelation" comes to mind.
I can't easily change the window size in the context of my app, unfortunately. However, I started implementing the YIN method, and it seems much more effective, so I'll stick with that. It is autocorrelation-based, so no spectrum comes into play, but the results sound better.