Topic: pitch detection with Harmonic Product Spectrum

Hi,

I tried to implement the Harmonic Product Spectrum (HPS) as described, for instance, in this Introduction to Signal Processing chapter.

The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested. I'm certainly doing something wrong, so I'll describe the process I've followed so far. First, the basis:
  • Split the audio samples into windows of size N=1024
  • Apply a Hann window to these samples
  • Run an FFT on those samples to get N/2+1 bins
  • Compute the magnitude buffer with hypot(re, im), giving a spectrum of length N/2 + 1

Those first steps are verified and OK, so I won't detail the implementation here.

So now, concerning HPS:

I first create an f0 histogram of length (N/2 + 1) / M, where M is the number of spectra multiplied together: the original plus M-1 downsampled copies (here, M=3). Processing each window increments the histogram entry at the index of the fundamental frequency found. Here is the code run for each window:

    float max = 0;      // reset for every window
    int freq_id = 0;

    for (i = 0; i < (N/2 + 1) / M; i++) {
        // multiply the magnitude at bin i by those at bins 2i .. M*i,
        // i.e. the spectrum and its M-1 downsampled copies
        float mul = 1;
        for (n = 1; n <= M; n++)
            mul *= magnitude[i * n];

        // keep the largest product and the bin it was found at
        if (mul > max) {
            max = mul;
            freq_id = i;
        }
    }
    f0[freq_id]++;


And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS results (peak at freq_id=0) are to be expected. So the question is: how is this really supposed to work?
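
One detail of the per-window loop is worth spelling out: for i = 0, every index i * n is also 0, so the product is just the DC magnitude raised to the power M, and no harmonic relationship is being tested at all. A common variation, sketched here in C, starts the search at a minimum bin. This is only a sketch under assumptions: `hps_peak_bin` and its `min_bin` parameter are hypothetical names, and the thread does not establish that skipping the lowest bins is enough for full-band music.

```c
#include <assert.h>
#include <stddef.h>

/* Harmonic Product Spectrum over one magnitude spectrum.
 * mag       : n_bins magnitudes (n_bins = N/2 + 1 for an N-point FFT)
 * harmonics : number of spectra multiplied together (the M of the post)
 * min_bin   : first candidate bin (hypothetical parameter); must be >= 1
 *             so that the DC bin is never a candidate
 * Returns the candidate bin with the largest harmonic product. */
size_t hps_peak_bin(const float *mag, size_t n_bins,
                    size_t harmonics, size_t min_bin)
{
    assert(harmonics >= 1 && min_bin >= 1);
    assert(min_bin * harmonics < n_bins);

    size_t best_bin = min_bin;
    double best = -1.0;

    /* the highest harmonic index, i * harmonics, must stay inside mag[] */
    for (size_t i = min_bin; i * harmonics < n_bins; i++) {
        double product = 1.0;
        for (size_t h = 1; h <= harmonics; h++)
            product *= mag[i * h];
        if (product > best) {
            best = product;
            best_bin = i;
        }
    }
    return best_bin;
}
```

With a spectrum that has a large DC value, one loud low bin without harmonics, and a harmonic series at bins 20, 40 and 60, a call such as `hps_peak_bin(mag, 513, 3, 1)` selects bin 20, whereas a search that includes bin 0 would pick the DC bin.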

 

Reply #1
Quote
And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS results (peak at freq_id=0) are to be expected.
Sorry, I don't know what you mean by "fundamental frequency of the whole song". I understand how the fundamental relates to a note or chord, but not to a whole song. I would assume it means the lowest frequency in the song?

That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note or chord.

Reply #2
Quote
And at the end I pick the highest value in f0 in order to get the fundamental frequency of the whole song. But since the highest magnitudes are always in the lower frequencies, the HPS results (peak at freq_id=0) are to be expected.
Sorry, I don't know what you mean by "fundamental frequency of the whole song". I understand how the fundamental relates to a note or chord, but not to a whole song. I would assume it means the lowest frequency in the song?

I am looking for the overall pitch of the song, so the histogram counts the fundamental frequency found in each window, and I take the dominant one.

Quote
That might work for a solo instrument, but if you are analyzing a recording of a rock band, the "fundamental frequency" is probably the kick drum. If you want to analyze the musical notes, you might need to filter out (or ignore) the percussion. You might also need to ignore the attack and analyze the sustained part of the note or chord.

I'm looking for a way to extract the pitch of songs of any kind as well as possible, so maybe HPS isn't what I need. Trying to filter out some specific sounds might require a lot of heuristics that I don't really want to deal with at first…

If you have a few samples where HPS applies, I'm interested in them: I could at least check that the algorithm is implemented correctly and that it is only my target (a whole song instead of specific musical notes) that is wrong.

Note that I'm kind of new to all of this, so I'm certainly mixing up a bunch of things (you have certainly noticed already).

Reply #3
Quote
The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.


Maybe I'm wrong, but my guess would be you should apply some sort of equal loudness curve compensation to the spectrum.

Also, the window size probably has to be optimized, maybe even dynamically optimized. Again, I can't tell you how exactly, but the word "autocorrelation" comes to mind.
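
To put a number on the window-size remark, here is a small C sketch of the bin spacing arithmetic. It assumes a 44.1 kHz sample rate, which the thread does not state, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Centre frequency in Hz of an FFT bin. */
double bin_to_hz(size_t bin, size_t fft_size, double sample_rate)
{
    assert(fft_size > 0);
    return (double)bin * sample_rate / (double)fft_size;
}

/* Smallest power-of-two FFT size whose bin spacing (sample_rate / size)
 * is at most resolution_hz. */
size_t fft_size_for_resolution(double sample_rate, double resolution_hz)
{
    assert(sample_rate > 0.0 && resolution_hz > 0.0);
    size_t size = 1;
    while (sample_rate / (double)size > resolution_hz)
        size *= 2;
    return size;
}
```

With N = 1024 at 44.1 kHz the bins are about 43 Hz apart, while the notes E2 (82.4 Hz) and F2 (87.3 Hz) are only about 4.9 Hz apart. Resolving that spacing directly from bin positions would take a 16384-point FFT under these assumptions.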

Reply #4
Quote
The issue I have is that the peak is always detected at the lower frequencies with the various music samples I tested.

Maybe I'm wrong, but my guess would be you should apply some sort of equal loudness curve compensation to the spectrum.

Also, the window size probably has to be optimized, maybe even dynamically optimized. Again, I can't tell you how exactly, but the word "autocorrelation" comes to mind.


I can't easily change the window size in the context of my app, unfortunately. However, I started implementing the YIN method, and it seems to work much better, so I'll stick with that. It is autocorrelation-based, so no spectrum comes into play, and the results sound better.