How does audio downsampling work?

Hi, I'm wondering: if I had to manually convert 44.1 kHz audio taken from a CD to an 8 kHz sampling rate, how would I choose the 8,000 samples of audio that will be played every second? Then again, if I convert audio at 8 kHz to 44.1 kHz, what would those samples look like? Would there be some added silence or noise in those places?

Reply #1
Proper resampling would lowpass the audio so that its frequency range fits what an 8 kHz sample rate can represent, then calculate the new sample values. There is most likely more than one way to go about it, but one way to visualize the process is to convert the 44.1 kHz audio to analogue, then sample that analogue signal (A to D) at 8 kHz. None of the original samples from the 44.1 kHz data would transfer to the new output; each output sample value would be the best calculated result based on the slower clock. There would not actually be an analogue step in the process; the calculated results would just be equal to what would be obtained if there were.

Reply #2
I am enjoying the mental image of manually picking each sample...

Reply #3
Hi, I'm wondering: if I had to manually convert 44.1 kHz audio taken from a CD to an 8 kHz sampling rate, how would I choose the 8,000 samples of audio that will be played every second?


In theory:
(1) zero stuffing (inserting a fixed number of zeros between neighbouring samples) + scaling
(2) lowpass filtering
(3) picking every Nth sample and throwing the rest away

The effects:
Zero stuffing creates spectral images. You still have exactly the same frequencies in the original frequency band, but also lots of other mirrored and shifted ones in the higher part of the spectrum. Lowpass filtering is supposed to remove these image frequencies as well as those that cannot be represented at the target sampling rate. Picking every Nth sample out of the filtering result is then okay w.r.t. the sampling theorem, because the filter makes sure that no or little aliasing will be present in the final result.
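The three steps can be written down literally in a few lines of Python. This is only a minimal sketch, not an efficient or production-quality resampler: the names `windowed_sinc` and `resample_naive`, the Hamming window, and the 121-tap default are assumptions made for the example, and small up/down factors are used so that it runs quickly.

```python
import math

def windowed_sinc(num_taps, cutoff):
    """Hamming-windowed sinc lowpass with unity gain at DC.

    cutoff is in cycles per sample of the rate the filter runs at (0 .. 0.5).
    num_taps is assumed to be odd so the filter has a centre tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]

def resample_naive(signal, up, down, num_taps=121):
    # (1) Zero stuffing + scaling: up - 1 zeros after every sample,
    #     and a gain of `up` to make up for the inserted zeros.
    stuffed = []
    for s in signal:
        stuffed.append(s * up)
        stuffed.extend([0.0] * (up - 1))

    # (2) Lowpass at the lower of the two Nyquist limits, expressed
    #     relative to the intermediate rate. A real design would place
    #     the cutoff slightly lower to leave room for the transition band.
    taps = windowed_sinc(num_taps, 0.5 / max(up, down))
    delay = (num_taps - 1) // 2          # centre the filter: no time shift
    filtered = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            i = n + delay - k
            if 0 <= i < len(stuffed):
                acc += h * stuffed[i]
        filtered.append(acc)

    # (3) Keep every `down`-th sample and throw the rest away.
    return filtered[::down]

# A 10 Hz tone sampled at 300 Hz, converted to 200 Hz (up=2, down=3).
tone = [math.sin(2 * math.pi * 10 * n / 300) for n in range(200)]
converted = resample_naive(tone, up=2, down=3)
```

Away from the ends of the buffer, `converted[m]` should be close to `sin(2*pi*10*m/200)`, i.e. the same tone as if it had been sampled at 200 Hz in the first place.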

In practice:
Of course, it makes little sense to actually compute all "filtered samples" when in the next step many of'em are thrown away. Also, many samples of the zero-stuffed signal will be zero. This knowledge can be exploited during filtering. So all three steps can be combined into one step to save computation time.
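A minimal sketch of that combined step, assuming a Hamming-windowed sinc as the lowpass (the names `windowed_sinc` and `resample_combined` are made up for this example). The zero-stuffed signal is never built: for each output sample, only the filter taps that would have lined up with a real input sample are used.

```python
import math

def windowed_sinc(num_taps, cutoff):
    """Hamming-windowed sinc lowpass, unity DC gain, odd num_taps assumed."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [t / gain for t in taps]

def resample_combined(signal, up, down, taps):
    delay = (len(taps) - 1) // 2
    n_out = (len(signal) * up + down - 1) // down
    out = []
    for m in range(n_out):
        # Position in the imaginary zero-stuffed signal (filter centred).
        n = m * down + delay
        # Only taps k with (n - k) divisible by `up` meet a non-zero sample.
        k = n % up
        i = (n - k) // up                # index of that real input sample
        acc = 0.0
        while k < len(taps) and i >= 0:
            if i < len(signal):
                acc += taps[k] * signal[i]
            k += up                      # next useful tap ...
            i -= 1                       # ... meets the previous input sample
        out.append(acc * up)             # scaling for the skipped zeros
    return out

# A 10 Hz tone sampled at 300 Hz, converted to 200 Hz (up=2, down=3).
tone = [math.sin(2 * math.pi * 10 * n / 300) for n in range(200)]
taps = windowed_sinc(121, 0.5 / 3)
converted = resample_combined(tone, 2, 3, taps)
```

Per output sample this touches roughly `len(taps) / up` taps, instead of filtering every sample of the zero-stuffed signal at full length and then discarding most of the results.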

You'll find similar information if you google for the terms I mentioned as well as
- band-limited interpolation
- decimation

Then again, if I convert audio at 8 kHz to 44.1 kHz, what would those samples look like? Would there be some added silence or noise in those places?

The same procedure. Zero stuffing, filtering & throwing lots of samples away.

In the case of 8000 Hz <-> 44100 Hz conversion, the smallest sampling rate that is a multiple of both 8000 and 44100 is 3528000.

So, for the 44100->8000 conversion, you'd insert 79 zeros between each pair of neighbouring original samples to get to the sampling rate of 3528000. Then, you'd lowpass filter this signal to only preserve frequencies up to 4000 Hz. Then, you'd pick out every 441st sample and be done.

For the 8000->44100 conversion, you'd insert 440 zeros between each pair of neighbouring original samples to get to the sampling rate of 3528000. Then, you'd lowpass filter this signal to only preserve frequencies up to 4000 Hz (because all other frequencies are unwanted images). Then, you'd pick out every 80th sample and be done.
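The numbers above follow from the greatest common divisor of the two rates. A small sketch to reproduce them (the function name `conversion_plan` and the dictionary keys are just made up for this example):

```python
import math

def conversion_plan(rate_in, rate_out):
    common = math.gcd(rate_in, rate_out)
    up = rate_out // common       # zero-stuffing factor
    down = rate_in // common      # keep every `down`-th sample
    return {
        "intermediate_rate": rate_in * up,   # least common multiple
        "zeros_inserted": up - 1,
        "keep_every": down,
        "lowpass_hz": min(rate_in, rate_out) / 2,
    }

print(conversion_plan(44100, 8000))
# {'intermediate_rate': 3528000, 'zeros_inserted': 79, 'keep_every': 441, 'lowpass_hz': 4000.0}
print(conversion_plan(8000, 44100))
# {'intermediate_rate': 3528000, 'zeros_inserted': 440, 'keep_every': 80, 'lowpass_hz': 4000.0}
```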

Cheers!
SG

Reply #4
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?

Reply #5
if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?

Some of the crappy ones are probably crappy because the author is not familiar with the theory and not completely aware of how resampling is done properly.

Other crappy ones are probably crappy because the author made sacrifices in quality to speed up the computation. The theory is not that hard, but making efficient implementations or approximations of this ideal process is a bit more complicated, I suppose.

Reply #6
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?



room to mess up

- Doing the calculations at a low bit depth.
- Losing small details due to the way the computation process is undertaken.
- The dithering process used after the computations.



-
Owen.

Reply #7
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?

What about length of filters? Could even IIR filters be used, messing up phase and such? Using symmetric/asymmetric FIR filters?
If the lack of information technology knowledge is a common issue, I'm instantaneously losing my faith in (software) engineering.

Reply #8
What about length of filters? Could even IIR filters be used, messing up phase and such? Using symmetric/asymmetric FIR filters?

These are the areas in which there is a lot of debate, and which yield a lot of the differences, and probably the mistakes, in implementations. Yes, IIR and asymmetric FIR filters are used in some designs, and yes, this "messes up" phase, but there is no reason to suppose that this is a problem for audio: all-pass filters, specifically designed to "mess up" phase, are frequently used in mastering to smooth out 'rogue' peaks (well, they used to be; mostly, they just let 'em clip these days).

Cheers,
Rob

Reply #9
room to mess up

- Doing the calculations at a low bit depth.
- Losing small details due to the way the computation process is undertaken.
- The dithering process used after the computations.


Not really. Typical mess-ups in resampling filters are not understanding the need to lowpass, using too short a filter, or, well, botching the algorithm.
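To illustrate the first of those mess-ups, here is a small sketch of what happens when samples are simply dropped with no lowpass first. The 48000 Hz input rate is an assumption chosen only because it is an exact multiple of 8000; the effect is the same in kind for 44100 Hz.

```python
import math

RATE_IN = 48000               # assumed for the example, not from the thread
RATE_OUT = 8000
STEP = RATE_IN // RATE_OUT    # keep every 6th sample, no filtering

# A 5 kHz tone: above the 4 kHz limit of the 8 kHz target rate.
tone_5k = [math.sin(2 * math.pi * 5000 * n / RATE_IN) for n in range(4800)]
decimated = tone_5k[::STEP]

# A genuine 3 kHz tone sampled at 8 kHz, with inverted polarity.
alias_3k = [-math.sin(2 * math.pi * 3000 * m / RATE_OUT)
            for m in range(len(decimated))]

worst = max(abs(a - b) for a, b in zip(decimated, alias_3k))
print(f"largest difference from a 3 kHz tone: {worst:.2e}")
```

The decimated 5 kHz tone is indistinguishable from a 3 kHz tone (8000 - 5000 = 3000), which is exactly the aliasing that the lowpass is there to prevent.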

Reply #10
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?


Good resampling isn't terribly hard. Good resampling that is efficient is very hard.

Particularly back in the day. No sense having a high quality resampler that would have used 30% CPU time on a Pentium 4. Now it's less of a problem since processors are so much faster.

Reply #11
Also, if you need real-time resampling (e.g. during playback), you sometimes need to compromise the filter design in order to reduce latency.  This might mean using IIR filters (which are really bad, as floating point precision limits you to maybe 6th order at most), using a minimum phase sinc filter (although it's arguable whether or not this will have a subjective quality difference), or limiting the taps in an FIR filter to reduce how many samples you have to read ahead before producing your first output sample.
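As a rough feel for the read-ahead involved: a symmetric (linear-phase) FIR filter delays the signal by half its length. A tiny sketch, where the tap counts and the 44100 Hz rate are arbitrary example values rather than figures from any particular converter:

```python
def linear_phase_delay_ms(num_taps, rate_hz):
    """Delay of a symmetric FIR filter: (num_taps - 1) / 2 samples."""
    return (num_taps - 1) / 2 / rate_hz * 1000

for num_taps in (31, 201, 2001):
    delay = linear_phase_delay_ms(num_taps, 44100)
    print(f"{num_taps:5d} taps at 44100 Hz: {delay:.2f} ms")
```

The delay is counted at whatever rate the filter actually runs at, so a filter that runs at a high intermediate rate adds less delay per tap than the raw tap count suggests.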

Reply #12
...This might mean using IIR filters (which are really bad, as floating point precision limits you to maybe 6th order at most),

I haven't done any IIR stuff since school, but is this generally true regardless of topology?
Quote
using a minimum phase sinc filter

What does a minimum phase sinc look like? Do you mean applying an asymmetric window to the sinc?

-k

Reply #13
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?

In addition to what the others said: the key aspect is the length and design of the low-pass (interpolation/decimation) filter. The filter should attenuate well enough the unwanted frequency range (keyword: stop-band attenuation) and should keep the desired frequency range unchanged (keyword: pass-band ripple).
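Both quantities can be measured directly from the filter coefficients. A minimal sketch, assuming a Hamming-windowed sinc design and made-up band edges (pass-band up to 0.1, stop-band from 0.15, in cycles per sample); the helper names are hypothetical:

```python
import cmath
import math

def windowed_sinc(num_taps, cutoff):
    """Hamming-windowed sinc lowpass, unity DC gain, odd num_taps assumed."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [t / gain for t in taps]

def magnitude_db(taps, freq):
    """Magnitude response in dB at `freq` cycles per sample."""
    acc = sum(h * cmath.exp(-2j * math.pi * freq * n) for n, h in enumerate(taps))
    return 20 * math.log10(max(abs(acc), 1e-12))

def measure(taps, pass_edge, stop_edge, points=400):
    passband = [magnitude_db(taps, pass_edge * i / points)
                for i in range(points + 1)]
    stopband = [magnitude_db(taps, stop_edge + (0.5 - stop_edge) * i / points)
                for i in range(points + 1)]
    ripple_db = max(passband) - min(passband)    # smaller is better
    attenuation_db = -max(stopband)              # larger is better
    return ripple_db, attenuation_db

for num_taps in (15, 101):
    ripple, attenuation = measure(windowed_sinc(num_taps, 0.125), 0.1, 0.15)
    print(f"{num_taps:3d} taps: pass-band ripple {ripple:.3f} dB, "
          f"stop-band attenuation {attenuation:.1f} dB")
```

With the same band edges, the 15-tap filter is far too short to get from pass-band to stop-band in the space available, while the 101-tap one manages it; that is the "length" part of the design problem.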

Quote from: Sebastian
Of course, it makes little sense to actually compute all "filtered samples" when in the next step many of'em are thrown away. Also, many samples of the zero-stuffed signal will be zero. This knowledge can be exploited during filtering. So all three steps can be combined into one step to save computation time.

That's what they call polyphase implementations, I guess?

Chris
If I don't reply to your reply, it means I agree with you.

Reply #14
Some of the crappy ones are probably crappy because the author is not familiar with the theory and not completely aware of how resampling is done properly.
Best recent example:
Quote from: Microsoft KB2653312
When you convert the sample rate of an audio file from one frequency to another frequency on a computer that is running Windows 7 or Windows Server 2008 R2, the new audio file sounds distorted during playback.
[...]
This issue occurs because the sample rate converter uses linear interpolation when it converts audio files. This behavior creates noise on the audio file that is sensitive to the human ear.
(emphasis by me) 
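To see why linear interpolation is a poor resampler, here is a small sketch that doubles the sample rate of a 3 kHz tone (8 kHz -> 16 kHz) by linear interpolation and measures how much of the unwanted 5 kHz image is left. It is only an illustration of the technique named in the quote, not a reconstruction of the Windows converter; the rates and function names are assumptions for the example.

```python
import cmath
import math

def upsample_2x_linear(signal):
    out = []
    n = len(signal)
    for i in range(n):
        nxt = signal[(i + 1) % n]    # wrap around: the tone is periodic in the buffer
        out.append(signal[i])                    # original sample
        out.append(0.5 * (signal[i] + nxt))      # straight-line midpoint
    return out

def tone_amplitude(signal, freq_hz, rate_hz):
    """Amplitude of one sinusoidal component (exact for whole cycles)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / rate_hz)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

tone = [math.sin(2 * math.pi * 3000 * n / 8000) for n in range(800)]
doubled = upsample_2x_linear(tone)

wanted = tone_amplitude(doubled, 3000, 16000)
image = tone_amplitude(doubled, 5000, 16000)   # 8000 - 3000
image_db = 20 * math.log10(image / wanted)
print(f"3 kHz tone: {wanted:.3f}, 5 kHz image: {image:.3f} ({image_db:.1f} dB)")
```

The image ends up only about 7 dB below the wanted tone, where a proper lowpass would push it down by many tens of dB. Linear interpolation amounts to zero stuffing followed by a very short triangular filter, i.e. the "too short filter" mistake.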

Reply #15
...This might mean using IIR filters (which are really bad, as floating point precision limits you to maybe 6th order at most),

I haven't done any IIR stuff since school, but is this generally true regardless of topology?
Quote
using a minimum phase sinc filter

What does a minimum phase sinc look like? Do you mean applying an asymmetric window to the sinc?

-k




Reply #17
What makes the upper plot "sinc"?

-k


Hard to explain, but basically, it's a sinc wave with different phase (i.e. FFT Magnitude plot is the same as sinc, FFT Phase plot isn't). 

Academically, you design a minimum-phase filter by taking the zeroes of a linear-phase equivalent that lie outside the unit circle and reflecting them to their reciprocal positions inside it. In practice, however, that's difficult. One straightforward method of designing it is the cepstral method - http://www.eurasip.org/Proceedings/Eusipco...pers/cr1074.pdf
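A rough sketch of the usual cepstral (homomorphic) recipe, in plain Python with a slow textbook DFT so that it needs no libraries. This is the generic textbook procedure, not necessarily the exact algorithm of the linked paper; the function names, the 512-point transform size and the log floor are assumptions for the example, and the result is only approximately minimum phase.

```python
import cmath
import math

def windowed_sinc(num_taps, cutoff):
    """Hamming-windowed sinc lowpass (linear phase), unity DC gain."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [t / gain for t in taps]

def dft(values, inverse=False):
    """Direct O(n^2) DFT; fine for a demo, far too slow for real use."""
    n = len(values)
    sign = 1 if inverse else -1
    twiddle = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * twiddle[(k * t) % n]
        out.append(acc / n if inverse else acc)
    return out

def minimum_phase(taps, n_fft=512, floor=1e-8):
    padded = list(taps) + [0.0] * (n_fft - len(taps))
    # Log-magnitude spectrum (floored so the stop-band nulls stay finite).
    log_mag = [math.log(max(abs(v), floor)) for v in dft(padded)]
    # Real cepstrum of the magnitude response.
    cep = [v.real for v in dft(log_mag, inverse=True)]
    # Fold the cepstrum onto positive time: this keeps the magnitude
    # and replaces the phase by the minimum-phase one.
    half = n_fft // 2
    folded = [0.0] * n_fft
    folded[0] = cep[0]
    for k in range(1, half):
        folded[k] = 2 * cep[k]
    folded[half] = cep[half]
    spectrum = [cmath.exp(v) for v in dft(folded)]
    impulse = dft(spectrum, inverse=True)
    return [v.real for v in impulse[:len(taps)]]

linear = windowed_sinc(31, 0.2)
minimum = minimum_phase(linear)
peak_linear = max(range(len(linear)), key=lambda i: abs(linear[i]))
peak_minimum = max(range(len(minimum)), key=lambda i: abs(minimum[i]))
print("peak of linear-phase filter at tap", peak_linear)
print("peak of minimum-phase filter at tap", peak_minimum)
```

The magnitude response stays approximately the same, but the main peak moves from the centre of the filter to near its start, which is where the lower latency comes from.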

Reply #18
What about length of filters?

This converter claims to sound better than all the rest, and uses a fixed filter length of around 200, whilst this converter also claims to be better than all the rest and uses a default, but increasable, filter length of the order of 67 million, so there is definitely some difference of opinion on what makes a good converter.


Reply #20
This converter claims to sound better than all the rest, and uses a fixed filter length of around 200, whilst this converter also claims to be better than all the rest and uses a default, but increasable, filter length of the order of 67 million, so there is definitely some difference of opinion on what makes a good converter.

I think you are comparing apples and oranges. The first converter apparently refers to filter length in samples at the final sample rate, where the second apparently refers to intermediate samples during processing.

Reply #21
You're right. Selecting 88.2k->44.1k (so no intermediate stages) gives this with Brick (with steep filter):

Code:
Filter Length: 535067 FFT Size: 4096 K

Then, using sox (as it also spits out numbers), allowing aliasing above 20k (i.e. similar to iSRC's filter), the same conversion gives:

Code:
fir_len=185  dft_length=4096

 

Reply #22
I should probably study this post a little further when I get home, but I'm curious: if resampling is simple enough to be explained in a single post like the above, then why are some resampling filters "crappy" and some "good"? What room is there to mess up?


This site shows the results of any number of screw ups and successes in well known commercial resamplers:

Infinite Wave

To some degree it includes historical artifacts: products that are, or are based on code that is, a decade or more old.

The mess ups are all due to programmers that didn't take or didn't properly comprehend the training that they needed to obtain in numerical methods university level courses, whether as part of their formal education or self-education.  The fact that really good resamplers were on the market at the turn of the millennium should be instructive to us all. It's a good idea to do some research into what you are coding before you start coding it.