Time Domain Harmonic Scaling

2019-11-29 06:07:53

I have put an implementation of this up on my GitHub account, including a Windows executable of a demo command-line utility. I originally wrote this in the '90s but have almost completely refactored it for this release. Specifically, I implemented a 2:3 scaling that I've never seen described, created a pitch detector optimized for speed, and made the handling of intermediate scaling ratios more accurate and flexible.

The basic effect that you can create with TDHS is to scale an audio clip's duration (i.e., speed) without changing its underlying pitch, and this is done entirely in the time domain. The other effect is that you can scale the pitch of a clip without changing the duration (which is basically the first effect with resampling afterward).
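To make the "time scaling plus resampling" relationship concrete, here is a minimal sketch of the arithmetic. It is not taken from the implementation described here: the names `pitch_shift_plan` and `resample_linear` are hypothetical, and plain linear interpolation stands in for whatever resampler a real tool would use. The idea is that raising the pitch by some ratio means first stretching the clip to that many times its duration (pitch unchanged), then reading the result back that many times faster.

```python
def pitch_shift_plan(semitones):
    """Return (stretch_ratio, resample_step) for a duration-preserving pitch shift.

    Hypothetical helper: both values equal the pitch ratio, because the
    time stretch and the resampling have to cancel each other in duration.
    """
    pitch_ratio = 2.0 ** (semitones / 12.0)
    stretch_ratio = pitch_ratio   # step 1: make the clip this many times longer
    resample_step = pitch_ratio   # step 2: read it back this many times faster
    return stretch_ratio, resample_step


def resample_linear(samples, step):
    """Read `samples` at a fractional stride of `step` with linear interpolation."""
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += step
    return out
```

For an octave up (12 semitones) both values come out as 2.0: the clip is stretched to twice its length and then resampled at a stride of 2, which brings it back to the original duration at double the pitch.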

One nice thing about TDHS is that audible degradation grows linearly as the scaling ratio diverges from unity, so it's possible to make slight adjustments to clip timing (or pitch) with no audible artifacts. On the other hand, it doesn't handle polyphonic music as well as algorithms implemented in the frequency domain do, but for the signals it does handle I think it sounds better (the frequency-domain one I experimented with is called Rubber Band).

For my testing I created some demonstration samples. The first is a sort of speaking fugue based on one of the SQAM samples.

Spoken 5-part fugue based on SQAM sample

I also created two alternate versions of Suzanne Vega's great a cappella Tom's Diner. The first is a 3-part harmony using the pitch scaling feature. The second uses the feature of the demo utility to alter the timing of a clip sinusoidally such that the resulting output exactly aligns with the input every 2π seconds.

Tom's Diner with two harmony parts

Tom's Diner with sinusoidal timing
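One way to picture the sinusoidal timing is as a mapping from output time to input time that wobbles around the identity and returns to it every 2π seconds. The sketch below is only an illustration of that idea, not the demo utility's actual method: the function names, the `depth` parameter, and the specific formula are all assumptions.

```python
import math


def input_position(t, depth=0.5):
    """Input time (seconds) heard at output time t (assumed mapping).

    The offset depth * (1 - cos(t)) is zero whenever t is a multiple of
    2*pi, so output and input line up again at those instants.
    """
    return t + depth * (1.0 - math.cos(t))


def playback_rate(t, depth=0.5):
    """Instantaneous speed: the derivative of input_position with respect to t.

    It swings between 1 - depth and 1 + depth, so abs(depth) must stay
    below 1 to keep playback moving forward.
    """
    return 1.0 + depth * math.sin(t)
```

With this mapping the speed swings above and below normal within each cycle, and the two halves cancel exactly over a full period, which is what lets the altered clip play back in sync with the original.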

Hopefully someone might find a use for this implementation (or have as much fun as I obviously have playing with it).
