HydrogenAudio

Hydrogenaudio Forum => Scientific Discussion => Topic started by: fooball on 2024-04-05 08:23:28

Title: Pitch Shifting in the Encoded Domain
Post by: fooball on 2024-04-05 08:23:28
This is a question for those who (unlike me) understand the nuts and bolts of compression algorithms.

I have an interest in high-quality real-time tempo shifting (which requires pitch shifting to restore the pitch after resampling). I presume one way of pitch shifting involves an FFT into the frequency domain, shifting the frequency components, and then an inverse FFT back into the time domain.
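
To put numbers on the resampling side effect described above, here is a minimal Python sketch. The function name `resample_side_effects` and the example speed of 1.10 are illustrative assumptions and do not come from any particular library.

```python
import math


def resample_side_effects(speed: float) -> dict:
    """Describe what plain resampling does when audio is played `speed` times faster."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    pitch_ratio = speed                      # every frequency is multiplied by `speed`
    semitones = 12 * math.log2(pitch_ratio)  # the same change on the equal-tempered scale
    return {
        "duration_ratio": 1 / speed,          # the clip gets shorter by this factor
        "pitch_ratio": pitch_ratio,
        "semitones": semitones,
        "correction_ratio": 1 / pitch_ratio,  # what the pitch shifter has to apply
        "correction_semitones": -semitones,
    }


effects = resample_side_effects(1.10)
print(f"pitch rises by {effects['semitones']:.2f} semitones")
print(f"pitch shifter must scale frequencies by {effects['correction_ratio']:.4f}")
```

At 10% faster the pitch rises by about 1.65 semitones, which is the amount the pitch shifter then has to take back out.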

What I am curious about is whether any compression formats store the payload as frequency components, so that pitch shifting could be performed directly on the compressed data prior to decoding for playback. E.g. wavelets.

Title: Re: Pitch Shifting in the Encoded Domain
Post by: DVDdoug on 2024-04-06 19:44:23
I believe MOST lossy compression works in the frequency domain. So it should be possible.

But existing editors like Mp3DirectCut that work without decompressing/recompressing don't go that "deep" into processing, i.e. they don't do equalization, mixing, etc.

Title: Re: Pitch Shifting in the Encoded Domain
Post by: saratoga on 2024-04-06 20:54:45
Quote from: fooball
"I have an interest in high quality real-time tempo shifting (which requires pitch shifting to restore the pitch after resampling). I presume one way of pitch shifting involves FFT into the frequency domain, shifting the frequency components, and then FFT back into the time domain."

There is a lot more to it. Take a look at this tutorial: https://www.guitarpitchshifter.com/algorithm.html

Quote from: fooball
"What I am curious about is whether any compression formats save the payload as frequency components, and could therefore perform the pitch shifting directly, in the compressed data prior to decoding for playback. Eg wavelets."

Almost all lossy formats do, but that doesn't really help you with tempo adjustment.
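
One possible reading of this point, offered as a sketch and not as the poster's own reasoning: the frequency components in codecs such as MP3, AAC and Vorbis are MDCT coefficients, which are real numbers that mix amplitude and phase. The Python below computes the MDCT of successive overlapping frames of a perfectly steady tone and tracks one coefficient over time. The helper names (`sine_window`, `mdct`, `peak_bin_over_time`) and the tiny frame size are assumptions chosen for illustration, not taken from any codec.

```python
import math


def sine_window(length):
    """Sine window, a common MDCT window shape."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame):
    """MDCT of one already-windowed frame of 2*M samples, giving M coefficients."""
    m = len(frame) // 2
    return [
        sum(frame[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for n in range(2 * m))
        for k in range(m)
    ]


def peak_bin_over_time(m=32, bin_index=5, frames=6):
    """Follow one MDCT coefficient across frames for a constant-amplitude tone."""
    omega = math.pi * (bin_index + 0.5) / m  # tone sits at the centre of the bin
    window = sine_window(2 * m)
    values = []
    for t in range(frames):
        start = t * m  # frames overlap by half, as in MDCT-based codecs
        frame = [window[n] * math.cos(omega * (start + n)) for n in range(2 * m)]
        values.append(mdct(frame)[bin_index])
    return values


for t, value in enumerate(peak_bin_over_time()):
    print(f"frame {t}: coefficient {value:+.2f}")
```

The tone never changes, yet the coefficient changes size and sign from frame to frame, because its value depends on where each frame catches the waveform. Moving such a coefficient to a neighbouring bin therefore does not amount to a clean frequency shift, and it does nothing about the frame timing that a tempo change would need to alter.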

Title: Re: Pitch Shifting in the Encoded Domain
Post by: fooball on 2024-04-06 23:16:31
Okay... I guess I knew that much already.

Is it true that the ear is only sensitive to frequency, not phase?

Supposing that to be true, then if a signal were represented as a series of overlapping wavelets (which themselves do not contain discontinuities), wouldn't replacing each wavelet with an equivalent wavelet at a shifted frequency achieve the effect of pitch shifting? That's my idea, anyway.
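
The substitution idea above can be tried numerically. The following is a minimal Python sketch under stated assumptions: Hann-windowed sinusoidal grains stand in for "wavelets", the grains overlap by half, and the input is a single steady tone shifted up two semitones. All names (`hann`, `overlap_add_grains`, `rms`) are hypothetical. It compares keeping each grain's original starting phase with recomputing the phase for the new frequency.

```python
import math


def hann(length):
    """Periodic Hann window; copies overlapped by half sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]


def overlap_add_grains(omega_in, omega_out, keep_original_phase,
                       grain_len=256, grains=40):
    """Rebuild a tone from overlapping grains at the shifted frequency omega_out."""
    hop = grain_len // 2
    window = hann(grain_len)
    out = [0.0] * (hop * (grains - 1) + grain_len)
    for t in range(grains):
        start = t * hop
        # Starting phase of this grain: either the phase the ORIGINAL tone had
        # at this point (phase ignored), or the phase the SHIFTED tone should have.
        phase = (omega_in if keep_original_phase else omega_out) * start
        for n in range(grain_len):
            out[start + n] += window[n] * math.cos(omega_out * n + phase)
    return out[grain_len:-grain_len]  # keep only fully overlapped samples


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


omega_in = 2 * math.pi * 8 / 256      # original tone, in radians per sample
omega_out = omega_in * 2 ** (2 / 12)  # two semitones higher
naive = rms(overlap_add_grains(omega_in, omega_out, keep_original_phase=True))
coherent = rms(overlap_add_grains(omega_in, omega_out, keep_original_phase=False))
print(f"RMS with original phases kept: {naive:.3f}")
print(f"RMS with phases recomputed:    {coherent:.3f}")
```

With recomputed phases the grains sum to a clean unit-amplitude tone (RMS close to 0.707). With the original phases kept, neighbouring grains partly cancel where they overlap, so the level drops and the tone is amplitude-modulated at the grain rate. Whatever the ear's sensitivity to the phase of a steady tone, the relative phase between overlapping grains changes the amplitude of their sum, so it has to be managed when the frequency is changed.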