HydrogenAudio

Hydrogenaudio Forum => Uploads => Topic started by: MLXXX on 2008-05-06 12:07:51

Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-06 12:07:51
Having accepted that a well dithered 24-bit source (at least at 44.1KHz or above) appears to sound the same at 16 bits (at ordinary listening levels), I am now focussing my energies on the sample rate question: is 44.1KHz a sufficient sampling rate?

As part of that exercise, I am uploading three versions of a sound file.  These three versions are the result of converting a short 96/24 original sound file (of a triangle being struck) to 44.1/32:-

Audition version: [attachment=4441:attachment]
Cooledit version: [attachment=4442:attachment]
R8brain version:  [attachment=4444:attachment]


If anyone is interested in my amateur comments about these files, the relevant thread is Hydrogenaudio Forum > General Audio > The Emperor's New Sample Rate, MIX magazine wonders if "maybe CD is good enough", at post  #72 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=62478&view=findpost&p=563394).

EDIT: On reflection I am now suggesting that discussion on this topic is perhaps a bit specialised and might be better suited to this thread.  However please note there are already some relevant comments in the Emperor's New Sample Rate thread.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-07 10:13:52
is 44.1KHz a sufficient sampling rate?

Yes, it is. You can precisely describe any waveform content between 0 and 22049 Hz, which is, in most cases, more than you can hear or than audio equipment can reproduce.
There is, however, a problem with converters running at such a sampling rate: their analog low-pass filter below the Nyquist frequency needs to be very precise and steep, so that it doesn't attenuate or phase-distort the 18-22 kHz range much and, at the same time, silences anything above 22 kHz (even more crucial for ADCs).
Perhaps this is what caused CD players to be deemed "not good enough", with people sometimes complaining about unnatural treble (compared to vinyl). 16 bits of resolution might be another issue for very dynamic (e.g. orchestral) music, where quiet passages get only a few bits of resolution.
If you have content recorded at 96/192 kHz, the converter problem is much easier to handle (you need a decent analog lowpass filter with a transition band between, say, 25 kHz and 48/96 kHz, which is likely to distort the audible signal much less than one with a 21 kHz to 22 kHz transition).
If you use a good quality downsampling algorithm, you should retain all the relevant audio information even at the 44 kHz rate. There is, however, the caveat of playing it through a native 44 kHz DAC, which might degrade it (as mentioned above).
Most contemporary soundcards have DACs with native rate of 96 or 192 kHz and anything other than this gets digitally resampled to such rate. So there is a question why to downsample anything to 44 kHz when it is likely to get upsampled upon playback... Just to eliminate redundant information and save space?
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-07 16:55:24
Martel, thx for your comments.
So there is a question why to downsample anything to 44 kHz when it is likely to get upsampled upon playback...

One reason is to test the quality of the traditional CD format. If we are to compare the audio transparency of a traditional 44.1KHz sampling rate with, say, a 48KHz sampling rate, then one approach is to start with a clip at a very high sample rate (say 192KHz), downsample it to 48KHz and 44.1KHz, and compare the two conversions.

However I have realized that before attempting such an exercise we must be satisfied that the filtering necessary to avoid aliasing in a conversion to 44.1KHz is not of itself so detrimental to the sound quality that it would mask any difference due solely to any alleged insufficiency of 44.1KHz as a sample rate.
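
As a rough illustration of why that anti-alias filtering cannot simply be skipped, here is a minimal Python sketch. It is not from the thread, and the function name `alias_frequency` is made up for this example. It shows where a tone would land if a recording were reduced to 44.1KHz without first removing everything above 22.05KHz.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz shows up after sampling at fs_hz
    when no anti-alias filter has removed it first."""
    f = f_hz % fs_hz                              # f and f + k*fs give identical samples
    return fs_hz - f if f > fs_hz / 2 else f      # fold anything above Nyquist back down

# Ultrasonic content from a 96 kHz recording, taken to 44.1 kHz with no prefilter:
for tone in (20000, 25000, 30000, 40000):
    print(tone, "Hz ->", alias_frequency(tone, 44100), "Hz")
```

Under this simple model, anything between 22.05KHz and 44.1KHz folds back into the audible band, which is why the quality of the prefilter matters so much for the comparison described above.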

It has dawned on me that a useful testing process could be as follows:-

I imagine this exercise of evaluating the performance of specialised low pass filters has been attempted in the past.  However I have not stumbled upon the results.

I would like to compare the struck triangle 96/24 clip against the same clip after it has been subjected to a specialised digital low pass filter designed for use as a prefilter for a downsampling to 44.1KHz.


I ask:

A.  If a low pass filter has been identified with exceptionally good performance as a prefilter for a conversion to 44.1KHz, is it available as a plug-in, or as standalone software? [Or is the digital filter in Audition 3 about as good as it gets?]

B. If capturing analogue audio at 44.1KHz, are analogue pre-filters available that outperform the best digital filters that might be used in a downsampling as referred to in question A?
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-07 17:15:40
A.  If a low pass filter has been identified with exceptionally good performance as a prefilter for a conversion to 44.1KHz, is it available as a plug-in, or as standalone software? [Or is the digital filter in Audition 3 about as good as it gets?]

Filtering in Audition will probably be OK. Don't be tempted to make the filter too steep - a transition band from 20kHz to 22kHz is probably fine. Be careful of the "group delay" when ABXing though - depending on what Audition does, one of the files might end up delayed relative to the other. Zoom in and look - if it's only by a couple of samples it should be fine.

B. If capturing analogue audio at 44.1KHz, are analogue pre-filters available that outperform the best digital filters that might be used in a digital to digital downsampling as referred to in question A?

No, digital filters are better than analogue filters in nearly all non-realtime cases. In fact, you can convert filter designs from analogue to digital and vice versa. Having said that, many ADCs include digital filters anyway - due to their use of delta-sigma designs.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-07 17:18:10
B. If capturing analogue audio at 44.1Khz, are analogue pre-filters available that outperform the best digital filters that might be used in a downsampling as referred to in question A?
Please don't be offended, because it's a fascinating exercise, but the fact that you can even ask this question shows that you have terrifyingly little knowledge in the area, and haven't read much about it here, or elsewhere.

No one samples at 44.1kHz. They sample at many times this, then filter and downsample digitally - precisely because analogue filters are so pitiful (in this application) compared to digital. All but the worst ADCs and DACs are oversampled.

Hope this helps.

Now, have you seen the FAQ?

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-07 17:29:27
No one samples at 44.1kHz. They sample at many times this, then filter and downsample digitally - precisely because analogue filters are so pitiful (in this application) compared to digital. All but the worst ADCs and DACs are oversampled.

I daresay early sound cards did sample at 44.1KHz, and I recall early CD players did not use oversampling.  However what you have written emphasizes the importance of the filtering involved in downsampling: nowadays that filtering is part of the capture process whenever audio is captured directly at 44.1KHz.
Title: Resampling down to 44.1KHz
Post by: lvqcl on 2008-05-07 18:09:45
A.  If a low pass filter has been identified with exceptionally good performance as a prefilter for a conversion to 44.1KHz, is it available as a plug-in, or as standalone software? [Or is the digital filter in Audition 3 about as good as it gets?]


A comparison of some SRCs (96 -> 44.1) is available at http://src.infinitewave.ca/. But there's only Audition 2, not 3. Anyway, the "Adobe Audition 2 Pre/Post Filter" graphs look good enough.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-07 19:09:16
Thanks lvqcl.  The graphs look fine.  However I wonder what the performance sounds like to the human ear.


Can people so inclined please try to ABX the three files I uploaded at post #1?

I would have thought that the very best conversions to 44.1KHz would all sound the same as they would introduce negligible phase shifts, and negligible drop-off in frequency response in the audible range.  (Last night I tested my own hearing of sinewave tones and was able to hear up to 19.5KHz but that frequency was very very faint.)

One of our issues is that 44.1Khz may not be well implemented in some devices as it is being supplanted by 48KHz and 96Khz.  If anyone does get a positive ABX result (or a negative result for that matter) could they please describe what device they used for the digital to analogue playback, e.g. PC motherboard sound chip, sound card, audio video receiver; and whether using speakers or headphones?
Title: Resampling down to 44.1KHz
Post by: KikeG on 2008-05-07 19:48:05
The original triangle file has a problem: it has too short a silence before the actual sound begins. I have generated a version with this issue corrected.

Also, in order to have a better knowledge of what is going on, I have generated versions of this file lowpassed at different frequencies.

MLXXX, could you try to ABX the unfiltered file versus the lowpassed ones? The lowpassed ones have a suffix that goes from l1 to l4.

Edit: the applied lowpasses are:

triangleMod_2_2496_l1.flac : 46 KHz lowpass
triangleMod_2_2496_l2.flac : 30 KHz lowpass
triangleMod_2_2496_l3.flac : 26 KHz lowpass
triangleMod_2_2496_l4.flac : 21 KHz lowpass

Each lowpass has a transition band of approximately 2.5 KHz. For example, the 21 KHz lowpass is -0.1 dB at 19950 Hz and -90 dB at 22600 Hz.
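
For readers who want to see how a filter with a passband, a transition band and a stopband like this can be built, here is a minimal Python sketch of a windowed-sinc lowpass. It is an illustration only and is not the filter used for the files above: the Blackman window, the 255 taps and the 21.3 kHz cutoff are assumptions chosen for the example, and a Blackman design does not reach the -90 dB stopband quoted above.

```python
import cmath
import math

def blackman_sinc_lowpass(cutoff_hz, fs_hz, num_taps):
    """Linear-phase FIR lowpass: ideal sinc response shaped by a Blackman window."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz                        # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = (0.42
                  - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]            # normalise to 0 dB at DC

def gain_db(taps, f_hz, fs_hz):
    """Magnitude response of the filter at a single frequency, in dB."""
    h = sum(t * cmath.exp(-2j * math.pi * f_hz * n / fs_hz) for n, t in enumerate(taps))
    return 20 * math.log10(max(abs(h), 1e-12))

FS = 96000                                        # sample rate of the triangle files
taps = blackman_sinc_lowpass(21300, FS, 255)      # assumed cutoff and length
for f in (15000, 19000, 21300, 25000, 30000):
    print(f, "Hz:", round(gain_db(taps, f, FS), 2), "dB")
```

Because the taps are symmetric, every frequency is delayed by the same 127 samples, which is the "linear phase / constant group delay" property mentioned elsewhere in the thread. More taps give a narrower transition band at the cost of longer ringing.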
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-07 20:00:01
Will be pleased to do that Kike, but may be 24 hours before I get the opportunity to post here again. Cheers

EDIT: Couldn't resist a quick check using my AVR on your original vs 1 and your original vs 4. Dead easy.  Stereo image moves to the left with the non-original versions.
Title: Resampling down to 44.1KHz
Post by: AndyH-ha on 2008-05-07 20:04:48
I have no suggestions about what the current Creative cards do sound-wise, but my understanding is that they all resample everything to 48kHz, except for those with a 96kHz playback bypass option (Creative has been blatantly lying about their cards for years. Now that they lost a lawsuit over the issue, they do reveal the truth -- as indirectly as possible in the small print). I don’t know if this 96kHz thing has to be specially chosen or if one gets it automatically when playing a 96kHz file. Regardless, using an Audigy card that does such “mechanical” manipulations to most, or all, the audio passing through it seems a rather poor way to investigate what good software does with different sample rates.

I would suggest you need a card that isn’t oriented to gaming and multi-media needs. There are scores of them but of course money is involved.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-07 20:23:28
This really must be my last post for now!

My AVR is a Yamaha HTR-5750.  Would its DAC perform ok at 44.1KHz (fed from a pc running Vista) for the purpose of comparing the three files?

Alternatively if I burned a CD I could play it on a CD player.  Could a CD player be expected to perform sufficiently well to compare the three files [e.g. Denon DCM-370]?  Presumably I'd dither down to 16 bits.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-07 21:32:02
This really must be my last post for now!

My AVR is a Yamaha HTR-5750.  Would its DAC perform ok at 44.1KHz (fed from a pc running Vista) for the purpose of comparing the three files?

Alternatively if I burned a CD I could play it on a CD player.  Could a CD player be expected to perform sufficiently well to compare the three files [e.g. Denon DCM-370]?  Presumably I'd dither down to 16 bits.

Why do you need a REAL proof? You probably cannot provide fair enough conditions for such test.
First - you need a high-quality downsampling algorithm (which uses a steep filter with linear phase/constant group delay), are you sure you can get this? If you cannot, do not bother.
Second - you need a high-quality upsampling algorithm, this algorithm should be a part of the device you're going to use for playback (since the device will probably have 96/192 kHz DACs and must resample). Or you may manually upsample the 44kHz downsampled track to device's native sampling rate and hope it won't use its built-in (and possibly degrading) resampling algorithm. Are you sure you have this? If not, don't bother.
Third - you need exactly the same equipment for comparison. Otherwise, you will be comparing two pieces of hardware instead of the samples.
And last but not least - you have to realize under what paradigm you are going to compare. Theoretically, the 44kHz waveform is able to perfectly describe any content between 0 and 22050 (excluded) Hz. You're not likely to hear anything above 22kHz, so 44kHz sampling rate should be perfectly OK for you. If you think about hardware limits, then the suitability of 44kHz depends on how good the equipment you're able to get is. If your hardware contains a lame resampler/DAC then 96kHz is likely to offer better results for you.
Title: Resampling down to 44.1KHz
Post by: AndyH-ha on 2008-05-07 22:10:53
It sounds like you are confusing resampling with oversampling. Soundcards do not resample to 96 kHz, or any other high frequency. Resampling involves recalculating all sample values. DACs oversample, usually by much larger amounts (e.g. 128x to 5.6 MHz), which means interpolating additional values between the original samples, without changing the original values, shifting the alias images to a very high frequency range.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-08 10:38:35
Kike,
Thanks for preparing and uploading these files.

MLXXX, could you try to ABX the unfiltered file versus the lowpassed ones? The lowpassed ones have a suffix that goes from l1 to l4.

The lowpassed files all sounded quite different to the original file.  They were not as loud, and appeared to emanate more from the left.  (This was either with the high-definition sound chip of my HTPC, or with the DAC of my AVR, and using loudspeakers.)

I have not looked at the spectra of the files.  I have simply ABXd them.

I whipped through the 4 tests quite quickly in one sitting, with 40 out of 40 correct:-

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/08 19:10:52

File A: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496.flac
File B: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496_l1.flac

19:10:52 : Test started.
19:11:26 : 01/01  50.0%
19:11:35 : 02/02  25.0%
19:11:43 : 03/03  12.5%
19:11:51 : 04/04  6.3%
19:11:59 : 05/05  3.1%
19:12:06 : 06/06  1.6%
19:12:13 : 07/07  0.8%
19:12:21 : 08/08  0.4%
19:12:28 : 09/09  0.2%
19:12:35 : 10/10  0.1%
19:12:38 : Test finished.

----------
Total: 10/10 (0.1%)

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/08 19:13:23

File A: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496.flac
File B: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496_l2.flac

19:13:23 : Test started.
19:13:40 : 01/01  50.0%
19:13:48 : 02/02  25.0%
19:13:56 : 03/03  12.5%
19:14:02 : 04/04  6.3%
19:14:08 : 05/05  3.1%
19:14:13 : 06/06  1.6%
19:14:20 : 07/07  0.8%
19:14:26 : 08/08  0.4%
19:14:33 : 09/09  0.2%
19:14:47 : 10/10  0.1%
19:14:49 : Test finished.

----------
Total: 10/10 (0.1%)

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/08 19:15:19

File A: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496.flac
File B: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496_l3.flac

19:15:19 : Test started.
19:15:52 : 01/01  50.0%
19:16:02 : 02/02  25.0%
19:16:11 : 03/03  12.5%
19:16:15 : 04/04  6.3%
19:16:23 : 05/05  3.1%
19:16:28 : 06/06  1.6%
19:16:33 : 07/07  0.8%
19:16:38 : 08/08  0.4%
19:16:43 : 09/09  0.2%
19:16:50 : 10/10  0.1%
19:16:52 : Test finished.

----------
Total: 10/10 (0.1%)

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/08 19:17:25

File A: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496.flac
File B: C:\Users\Public\Downloads\DownloadsfromHA\triangleMod_2_2496_l4.flac

19:17:25 : Test started.
19:17:45 : 01/01  50.0%
19:17:50 : 02/02  25.0%
19:17:54 : 03/03  12.5%
19:18:01 : 04/04  6.3%
19:18:07 : 05/05  3.1%
19:18:12 : 06/06  1.6%
19:18:18 : 07/07  0.8%
19:18:24 : 08/08  0.4%
19:18:29 : 09/09  0.2%
19:18:41 : 10/10  0.1%
19:18:44 : Test finished.

----------
Total: 10/10 (0.1%)
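
For reference, the percentage shown in these reports is the probability of doing at least that well by pure guessing. A minimal Python sketch of that calculation follows. The function name `abx_p_value` is made up for this example, and it is an assumption that foo_abx computes its figure exactly this way, although the numbers agree with the logs above.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` answers right out of `trials`
    when every answer is a coin flip (probability 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(f"{abx_p_value(4, 4):.1%}")     # the 04/04 line of each log
print(f"{abx_p_value(10, 10):.1%}")   # the final 10/10 result
print(f"{abx_p_value(8, 10):.1%}")    # a less clear-cut hypothetical score
```

Note that a low figure only says that the listener reliably heard a difference. It says nothing about where in the playback chain that difference was created.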
Title: Resampling down to 44.1KHz
Post by: SebastianG on 2008-05-08 10:52:43
It sounds like you are confusing resampling with oversampling. Soundcards do not resample to 96 kHz, or any other high frequency. Resampling involves recalculating all sample values. DACs oversample, usually by much larger amounts (e.g. 128x to 5.6 MHz), which means interpolating additional values between the original samples, without changing the original values, shifting the alias images to a very high frequency range.

How's that not resampling?

"Resampling is the digital process of changing the sample rate or dimensions of digital imagery or audio" (Wiki quote I very much agree with)

So, resampling is of course involved in "oversampling ADCs/DACs".


Cheers,
SG
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-08 15:11:41
You probably cannot provide fair enough conditions for such test.
First - you need a high-quality downsampling algorithm (which uses a steep filter with linear phase/constant group delay), are you sure you can get this? If you cannot, do not bother.

Indeed, if I cannot find a high quality downsampling algorithm for converting the 96/24 recording to 44.1Khz, without introducing audible diminishment or artifacts as a result of the digital filtering settings, I cannot proceed further.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-08 18:18:25
I have uploaded some files to test, MLXXX, which you might find interesting. The first one is to test a theory of mine about intermodulation distortion (if you wouldn't mind indulging me), the next two are high quality lowpass filters - some ringing but zero group delay and very high stopband attenuation. The next two were included because you reported a change in stereo image caused by lowpass filtering.

Edit: You can actually ignore the two lowpassed samples - they don't have any advantages over the ones KikeG uploaded.
Title: Resampling down to 44.1KHz
Post by: KikeG on 2008-05-08 18:18:38
MLXXX, I think there is something wrong going on with your audio setup. The first lowpassed sample, triangleMod_2_2496_l1.flac, has been lowpassed just over 45 KHz with a high quality filter (sox filter with a 256 sample window: sox filter 0-46000 256). The files are different only over 45 KHz (they go up to 48 KHz). You can test this by subtracting the original and the filtered files.

There is also some pre-ringing before the triangle hit due to the filtering, but it is around 40 samples long (0.4 ms), has a frequency over 45 KHz and a level of around -65 dB, so it is not audible by any means.

Also, I don't think your speakers or headphones can go that high. So the difference is most likely due to something else. Maybe your card is resampling or it is clipping, or there is some strange kind of intermodulation going on. As for intermodulation, I find it strange that filtering such a small frequency range, where there is not any exceptionally high energy, causes an audible intermodulation difference in the audible range.

The right channel peaks at -0.47 dB before filtering. After filtering at 45 KHz, it peaks at -0.68 dB. Maybe the first one is clipping and the second one is not, but this is only speculation.
Title: Resampling down to 44.1KHz
Post by: SoleBastard on 2008-05-08 18:26:54
Indeed, if I cannot find a high quality downsampling algorithm for converting the 96/24 recording to 44.1Khz, without introducing audible diminishment or artifacts as a result of the digital filtering settings, I cannot proceed further.


According to infinitewave (http://src.infinitewave.ca/), SSRC (http://otachan.com/foo_dsp_ssrc_057.7z) which is freely available as a plugin for Foobar2000 seems pretty much perfect to me. In most graphs it comes pretty close to the white 'ideal' line. I use it to convert 5.1 channel 24bit/96kHz DVD-A rips. Why not give it a try?
Title: Resampling down to 44.1KHz
Post by: Alex B on 2008-05-08 19:43:27
MLXXX, I think there is something wrong going on with your audio setup. The first lowpassed sample, triangleMod_2_2496_l1.flac, has been lowpassed just over 45 KHz with a high quality filter (sox filter with a 256 sample window: sox filter 0-46000 256). The files are different only over 45 KHz (they go up to 48 KHz). You can test this by subtracting the original and the filtered files.

There is also some pre-ringing before the triangle hit due to the filtering, but it is around 40 samples long (0.4 ms), has a frequency over 45 KHz and a level of around -65 dB, so it is not audible by any means.

Also, I don't think your speakers or headphones can go that high. So the difference is most likely due to something else. Maybe your card is resampling or it is clipping, or there is some strange kind of intermodulation going on. As for intermodulation, I find it strange that filtering such a small frequency range, where there is not any exceptionally high energy, causes an audible intermodulation difference in the audible range.

The right channel peaks at -0.47 dB before filtering. After filtering at 45 KHz, it peaks at -0.68 dB. Maybe the first one is clipping and the second one is not, but this is only speculation.


The difference between the reference file and the #4 is clearly audible on my setup. No ABX is needed, but I passed a test 10/10 even without using headphones (I used small powered Genelec desktop monitors).

The biggest difference is already in the beginning of the sample. The immediate sound of the hit on the triangle is very different.

Actually I don't seem to have any tool that could create a transparent 44.1 kHz version or even just a transparent 22 kHz low pass filtered version (without resampling) of the reference file. Resampling and low pass filtering seem to create quite similar audible changes. I tried several tools and filters in Audition 2 and Wavelab 5 and also foobar's SSRC plugin.

I am sure I can't hear anything over 20 kHz. Possibly my limit is about 18 kHz with test signals. So apparently resampling or low-pass filtering this triangle sample causes changes in the audible range -- at least on my setup.

My Terratec DMX6 fire 24/96 sound card does not resample and I used ASIO and Kernel streaming output modes for bypassing Windows Kernel resampling.

For excluding possible software/device problems at certain sample rates I resampled the file to 44.1 and then back to 96 kHz. These two versions sound identical to me (couldn't ABX). For this test I used foobar's SSRC in the Ultra mode.

Something very odd happens with this triangle file.
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-05-08 22:13:37
Something very odd happens with this triangle file.
It seems that this triangle sample suffers from an "Intersample Overload". Although the sample values are below maximum, the reconstructed signal is clipping.
This is what the samples (white dots) and the reconstructed waveform look like in Izotope RX:

[attachment=4460:attachment]
This could be considered a non-legitimate signal since it might cause clipping in DA and sample-rate converters.
I'd suggest reducing the level of the triangle sample by 3 (or even 6) dB and redoing the tests.
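
To make the idea of an intersample overload concrete, here is a minimal Python sketch. It is not how Izotope RX or any DAC does it: it is a plain truncated-sinc reconstruction of a synthetic test tone (a tone at a quarter of the sample rate, sampled 45 degrees away from its crests), and all names are made up for the example.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Band-limited (sinc) interpolation of the sampled signal at fractional index t."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

def intersample_peak_db(samples, oversample=8, margin=150):
    """Largest reconstructed value, in dB relative to full scale. The first and
    last `margin` samples are skipped because the truncated sinc is inaccurate there."""
    lo, hi = margin, len(samples) - margin
    peak = max(abs(reconstruct(samples, lo + k / oversample))
               for k in range((hi - lo) * oversample))
    return 20 * math.log10(peak)

# Tone at fs/4, scaled so that every stored sample sits exactly at full scale (+/-1.0)
samples = [math.sin(math.pi / 2 * n + math.pi / 4) / math.sin(math.pi / 4)
           for n in range(401)]

sample_peak_db = 20 * math.log10(max(abs(s) for s in samples))
true_peak_db = intersample_peak_db(samples)
print(round(sample_peak_db, 2), "dB sample peak,", round(true_peak_db, 2), "dB reconstructed peak")
```

In this constructed case no stored sample exceeds full scale, yet the waveform between the samples rises roughly 3 dB higher, which is the margin the suggested 3 dB level reduction would restore.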
Title: Resampling down to 44.1KHz
Post by: Alex B on 2008-05-08 22:52:06
It seems that this triangle sample suffers from an "Intersample Overload". Although the sample values are below maximum, the reconstructed signal is clipping...

This could be considered a non-legitimate signal since it might cause clipping in DA and sample-rate converters.
I'd suggest reducing the level of the triangle sample by 3 (or even 6) dB and redoing the tests.


Seems like this can make some analog devices go nuts. I localized the problem to my old analog mixer that I have used between the sound card and the powered speakers.

I tested this with a difference signal of the reference and the #4 (invert mix paste). When the mixer is in the chain the difference signal sounds like a rather loud "pop". When I connect the speakers (or headphones) directly to the sound card the difference signal is almost inaudible. I can hear only a very faint high pitched click.
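
The "invert mix paste" difference test described here can be sketched in a few lines of Python. This is an illustration with toy data, not the actual files: decoding FLAC is left out, the two lists stand in for time-aligned samples scaled to the range -1.0 to 1.0, and the function names are made up.

```python
import math

def difference_signal(reference, processed):
    """Sample-by-sample difference, equivalent to inverting one file
    and mix-pasting it onto the other."""
    if len(reference) != len(processed):
        raise ValueError("files must be the same length and time-aligned")
    return [a - b for a, b in zip(reference, processed)]

def peak_dbfs(samples):
    """Peak level relative to full scale (1.0); -inf for digital silence."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

# Toy data standing in for the reference file and a filtered version of it
reference = [0.0, 0.5, -0.25, 0.75]
processed = [0.0, 0.5, -0.25, 0.74]

residual = difference_signal(reference, processed)
print(round(peak_dbfs(residual), 1), "dBFS residual peak")
```

If a filter introduces a delay, the two files have to be re-aligned before subtracting, otherwise the residual mostly reflects the time shift.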

The difference between the original reference and the #4 sample is so small that ABXing was no longer possible after removing the mixer from the chain.

I'll test reduced volume samples sometime later.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-08 23:10:01
My Terratec DMX6 fire 24/96 sound card does not resample and I used ASIO and Kernel streaming output modes for bypassing Windows Kernel resampling.

For excluding possible software/device problems at certain sample rates I resampled the file to 44.1 and then back to 96 kHz. These two versions sound identical to me (couldn't ABX). For this test I used foobar's SSRC in the Ultra mode.

Something very odd happens with this triangle file.

Yes, it appears that the 96KHz samples cannot be used at full amplitude as there are implied values between the samples that exceed 0dB.  Boy, another hurdle.  I wonder whether the people who developed the test page including the triangle clip were aware of that issue.

I will not be able to do any further testing myself for another 12 hours or so. Until then.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-09 08:46:45
For your ABXing pleasure - versions reduced by 3dB and 6dB.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-09 10:19:53
I localized the problem to my old analog mixer that I have used between the sound card and the powered speakers.

I tested this with a difference signal of the reference and the #4 (invert mix paste). When the mixer is in the chain the difference signal sounds like a rather loud "pop".
So that's one positive ABX result traced to faulty/poor equipment. I just need to convince MLXXX to look in the same direction, and we might get a sane conclusion to this discussion.

I should probably mention (to try to avoid sounding like I've chosen to disbelieve people for the heck of it) that I ran similar experiments at university, using a variety of signals, and a variety of DACs, amps, and transducers. It is near-impossible to get the most extreme samples to play back at high levels without introducing audible distortion below 10kHz (for example). This distortion is generated by signals above 20kHz (for example) intermodulating, and so disappears when the content above 20kHz is removed. The audible difference between the original and low pass filtered version is due only to this distortion - so on really high quality equipment which minimises the distortion, the audible difference goes away.

It also means that, on standard equipment, with extreme samples where a difference is heard, the version with content above 20kHz removed actually gives the correct sound. When you listen to the version with the content above 20kHz present, you are hearing extra distortion which is not present in the recording itself, and is therefore incorrect. If you prefer this version, it's simply due to the common and well known human bias to perceive louder sounds, and sounds with slightly more distortion, as being "better".


You can get similar problems where there is significant content near the Nyquist limit. Typical D>A filtering leaves significant aliases in the first few kHz above the Nyquist limit; if there is nothing there, fine; if there are strong spectral components there, you have a classic intermodulation distortion test signal, and it's no surprise that you often get audible intermodulation distortion as a result.
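
The mechanism can be demonstrated with a toy model in Python. Everything here is an assumption made for illustration: two inaudible tones at 24 kHz and 27 kHz, and a playback stage modelled as adding a small squared term. Real equipment is more complicated, but the principle is the same.

```python
import cmath
import math

FS = 96000
N = 9600   # 0.1 s, so 3 kHz, 24 kHz and 27 kHz all complete whole cycles

def two_tone(f1, f2, amp=0.5):
    """Sum of two equal-level sine tones."""
    return [amp * math.sin(2 * math.pi * f1 * n / FS)
            + amp * math.sin(2 * math.pi * f2 * n / FS) for n in range(N)]

def slightly_nonlinear(x, k2=0.1):
    """Toy model of an imperfect analogue stage: adds a small second-order term."""
    return [s + k2 * s * s for s in x]

def amplitude_at(signal, f_hz):
    """Amplitude of the component at f_hz, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * f_hz * n / FS) for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

ultrasonic = two_tone(24000, 27000)               # nothing below 20 kHz in the file
clean_3k = amplitude_at(ultrasonic, 3000)
distorted_3k = amplitude_at(slightly_nonlinear(ultrasonic), 3000)
print("3 kHz before the stage:", clean_3k)
print("3 kHz after the stage: ", distorted_3k)
```

The file itself contains nothing at 3 kHz, but after the nonlinear stage a difference tone appears at 27 - 24 = 3 kHz, squarely in the audible range. Lowpass filtering the file at 20 kHz removes both source tones, and with them the distortion product.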

Cheers,
David.

P.S. Of course bad resampling in itself can introduce audible differences.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-09 10:40:01
My guess is, if you meet all the "perfection" requirements (best resampling algorithms, avoid clipping due to filtering etc.), you will not be able to discern between the samples. At least this is what theory says (unless you can hear sounds above 22 kHz, which I think you can not).
Any practical failures will be due to factors other than the limitations of the 44 kHz waveform itself (e.g. lame software/hardware).
Title: Resampling down to 44.1KHz
Post by: AndyH-ha on 2008-05-09 12:34:21
It isn't unusual for the signal level to go above 0dB with all samples under 0dB. I don't have a ready reference to what I've read, I don't recall if the theoretical maximum is +8dB, +12dB, +14dB, but anyway, way above 0. Such extremes are rare in real music, but a couple of dB or so isn't. Good DACs handle this without working up a sweat. Clipping somewhere further along in the analogue chain is certainly a possibility if headroom is too little or if the volume control is set too high.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-09 13:25:17
Cabbagerat, thanks for the first set of uploads.  [The file with the added noise is very noisy indeed when played back on my systems. The noise gives the impression, on my equipment, of being well within the audible range.]

  • triangleMod_lp1.flac - This is the triangleMod sample with a lowpass filter nominally at 22kHz - the transition band is quite wide, but the filter has very high stopband attenuation.
  • triangleMod_lp2.flac - This is the triangleMod sample with a lowpass filter nominally at 20kHz - the transition band is quite wide, but the filter has very high stopband attenuation.


Quite interesting. The second file does sound duller but only if my ears are fresh.

So my experience when ABXing the two files goes something like this (and assuming the first unknown X is actually B):
[blockquote]Action ............  Result
Play A - solid energetic sound
Play B - not quite as intense an attack; a slightly 'plainer' sound as the triangle rings
Play X - aha! dull, just like B
Play Y - er, brighter than X but not as bright as A sounded
Select 'Y is A, X is B' answer.  Foobar says Correct.
Play X - dull
Play Y - er, dull
Take a break from the test, as ears can no longer distinguish between A and B.
Resume, get answer right but straight away be obliged to take another break.[/blockquote]

I conclude from this, that if I were listening  to loud music with lots of high frequency content my ears would rapidly lose their sensitivity to the highest frequency sounds that I can normally hear.

For your ABXing pleasure - versions reduced by 3dB and 6dB.



I wonder whether there was some real clipping when the recording was made (and not just occasional  intersample excesses).
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-09 14:00:40
Cabbagerat, thanks for the first set of uploads.  [The file with the added noise is very noisy indeed when played back on my systems. The noise gives the impression, on my equipment, of being well within the audible range.]
This, to me, seems to confirm what 2Bdecided said about intermodulation distortion. For the noise to sound as if it's in the audible range in this sample, something non-linear must be going on. Either your machines are resampling, or the soundcard is introducing IMD, or the amp is, or something else is going on. The noise energy in this sample is all above 21kHz, so if you could hear it, it would sound very, very high pitched.

On my machine (no resampling), the sample with added noise sounds exactly like the original. Oddly, if I fiddle with the bias of the amp attached to my PC*, and bias it into class B operation, the noise added sample gets a lot of hiss. From measurements done in the past, this amplifier exhibits lots of IMD in class B operation, so I am guessing it's that. 2Bdecided, or anybody else, what do you think?

* I'm not even going to try and explain why I would build an amp with a panel mounted bias adjustment.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-09 14:17:48
Well that noise sample of yours cabbagerat is very useful indeed.

It prompted me to try my XP computer driving my Audigy 4 hub into headphones, and guess what? - no noise!

I've also tried another AVR, an older unit (Denon AVR-1802) by sending SPDIF from the XP computer.  This time when playing the sample including noise above 22KHz:  no noise.

So it's my more modern setups that are currently causing problems.

N.B. With the old AVR and with the Audigy 4, comparing your two lowpass files, the 20KHz one still sounds a little duller.

EDITS:
1. With SPDIF on my XP computer set to 96KHz, I compared KikeG's version_1 (lowpass set to 45KHz) with cabbagerat's version_1 (lowpass set to 22KHz), and cabbagerat's version did sound duller.  This difference was easier for me to hear than the difference between cabbagerat's version_2 (lowpass set to 20KHz) and version_1.  That doesn't make a lot of sense unless perhaps the beginning of the rolloff of cabbagerat's 22KHz version is within an audible part of the spectrum.

2. KikeG's version_4, which from its spectrum looks like a 22KHz lowpass filter does appear to sound very slightly duller than his version_1, but I will try to get my Vista HTPC SPDIF operating at 96KHz, before making a more specific comment.
Title: Resampling down to 44.1KHz
Post by: Alex B on 2008-05-09 14:31:45
Here is the difference signal sample that I used in my test.

It should be dead silent or if you have truly exceptional high frequency hearing you might be able to hear some ultra high pitched sound.

If you hear anything that appears to be in the usual audible range (up to 20 kHz) it is caused by problems in the playback chain.


EDIT

It might not be a good idea to increase the volume level over what you would normally use. You might ruin your tweeters.
Title: Resampling down to 44.1KHz
Post by: KikeG on 2008-05-09 17:45:51
I have identified the lowpasses of the files I posted at the original post (here (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=63123&view=findpost&p=564023))
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-09 18:54:53
Edit: the applied lowpasses are:

triangleMod_2_2496_l1.flac : 46 KHz lowpass
triangleMod_2_2496_l2.flac : 30 KHz lowpass
triangleMod_2_2496_l3.flac : 26 KHz lowpass
triangleMod_2_2496_l4.flac : 21 KHz lowpass

The lowpass takes 2.5 KHz approx. For example, in case of the 21 KHz lowpass, it is -0.1 dB at 19950 Hz and -90 dB at 22600 Hz.

Would it be useful to ABX the |4 version against the |3 version?  If, when tested, |4 sounded duller, or different in some other way, would that be a basis for concluding that if |3 were downconverted from 96KHz to 44.1KHz, its perceptible sound quality would change to a similar degree?  (I am not across the details of current good practice for such conversions.)
Title: Resampling down to 44.1KHz
Post by: KikeG on 2008-05-09 18:59:03
Quote
would that be a basis for concluding that if |3 were downconverted from 96KHz to 44.1KHz, its perceptible sound quality would change to a similar degree?


Yes.
Title: Resampling down to 44.1KHz
Post by: AndyH-ha on 2008-05-09 21:22:26
Some CD players and stand-alone DACs "upsample" everything to 96kHz, 176.4kHz, 192kHz, or perhaps some other figure. Gaming soundcards and many multi-media soundcards resample everything to 48kHz. Whether these all then oversample by a large factor like most delta-sigma converters (64x and 128x are common) I am not sure, but by whatever labels, that and oversampling are two different processes. I have never seen nor heard of any soundcard that resamples everything to 96kHz or 192kHz, or any other such figure. The oversampling factor is applied to the data, so the resulting rate is not a constant; it depends on the audio's sample rate.
Title: Resampling down to 44.1KHz
Post by: Glenn Gundlach on 2008-05-10 05:01:52

A.  If a low pass filter has been identified with exceptionally good performance as a prefilter for a conversion to 44.1KHz, is it available as a plug-in, or as standalone software? [Or is the digital filter in Audition 3 about as good as it gets?]


A comparison of some SRCs (96 -> 44.1) is available on http://src.infinitewave.ca/ (http://src.infinitewave.ca/). But there's only Audition 2, not 3. Anyway, "Adobe Audition 2 Pre/Post Filter" graphs look good enough.


There is an Audition 3. I found my Athlon XP will not work with it (or the Orban loudness meter) because both require SSE2 instructions not present in Athlon XP. Anyway, here's a link to Audition 3

http://www.adobe.com/products/audition/ (http://www.adobe.com/products/audition/)

Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-05-10 06:00:26
I think what he meant was there is no comparison data for Audition 3 on the SRC site.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-10 06:01:31
2. KikeG's version_4, which from its spectrum looks like a 22KHz lowpass filter does appear to sound very slightly duller than his version_1, but I will try to get my Vista HTPC SPDIF operating at 96KHz, before making a more specific comment.

I have now been able to get the SPDIF out on my home theatre pc (which uses an Intel DG965WH motherboard) to work at up to 192KHz.  [I did this by rolling back to a SIGMATEL driver. SIGMATEL have been supplanted by IDT but the IDT driver for some reason on my mobo limits the SPDIF out to a maximum of 48KHz.]

I will have to ask others to attempt an ABX on KikeG's samples, as my own high frequency hearing is not quite up to the task.  When I ABX test I like to know that I have the correct answer, such that foobar merely confirms it.  However these samples are so close for my ears that, when comparing |3 and |4, I merely get impressions with varying degrees of conviction.

I find there is more of a difference between |1 and |4 and I suspect that on a good day I could probably get a long series of correct answers using ABX software.  When I do hear differences between |1 and |4 they are:
* The beginning of |1 has more brilliance
* |1 sounds sharper (in the sense of an overall impression of musical pitch) than |4
* The beginning of |4 is slightly 'hollow' sounding.

40 years ago when I was in my early teens I would have heard these two versions differently, than I hear them now.

I imagine that the young first time poster in the Emperor's New Sample Rate thread, gantrithor, could easily ABX even |3 and |4, using headphones that had a reasonably decent treble response.  His post in that thread is at #96 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=62478&view=findpost&p=564054).

If an individual who can hear these sorts of differences when the sample rate is kept at 96KHz can then be asked to listen to the same 96KHz sample downconverted to 44.1KHz, and if they then report that they hear similar differences, this might point towards 44.1KHz being an inadequate sample rate, for that individual, within the hardware and software constraints of current technology.

I have a disquiet about 44.1KHz, but I have not so far been able to prove that, for my own middle-aged hearing, it is an inadequate sample rate.
Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-05-10 06:40:14
I have a disquiet about 44.1KHz, but I have not so far been able to prove that, for my own middle-aged hearing, it is an inadequate sample rate.


Attenuation of ultrahigh frequencies in air is such that they have to be very energetic to make a significant contribution to the sound.  That means listening very close, or very loud, or both, to material that generates UHF content.  Most of us don't encounter loud triangle concertos up close (I'm imagining some 'more cowbell' scenario here); and if you are routinely listening to content loud enough for this to matter, your high-end hearing will degrade anyway (as it has for most of us who attended rock concerts regularly in our youths).

Your concern should  not be with the sample rate, it should be with the playback circuit architecture.  When the upper limit of the baseband is close to the limit of hearing, the output (anti-imaging) filtering tolerances become critical. An oversampling playback chain, common today, takes care of that.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-10 07:04:44
Your concern should  not be with the sample rate, it should be with the playback circuit architecture.

krabapple, I was leaving that to last.

At this stage (by comparing the |4 and |1 samples), I am trying to establish whether digital filters exist that can be placed in a 96KHz digital path so as to reduce the response in the digital domain at 22050Hz to a negligible level, but not affect the perceived played back sound using a 96KHz rate DAC.

By keeping the playback at 96KHz I am removing one set of variables from the analysis; though it appears creating another complexity in the analysis; as per my next paragraph.

What I have not read up on, is how current sample rate converters handle the pre-filtering necessary to avoid aliases. They appear to sidestep (or postpone) the issue somewhat by oversampling in the first instance such that pre-filtering requirements would be much laxer.  But still as part of the processing, any part of the source signal that was in the zone just under 22050Hz must be strongly attenuated if not eliminated.  Is there reason to suppose that the postponed pre-filtering would be more effective at not disturbing source frequencies in the zone below 22050Hz, than the digital filtering used for upload |4, provided to us by KikeG?

I come to this forum with a dated general knowledge of electronics, and a curiosity to understand why 96/24 is being so strongly promoted if 44.1/24 (or even 44.1/16) are really quite adequate, even in extreme circumstances such as a cowbell concerto!
Title: Resampling down to 44.1KHz
Post by: lvqcl on 2008-05-10 09:46:41
I think what he meant was there is no comparison data for Audition 3 on the SRC site.

Yes. (To all: sorry if my English isn't good enough.)
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-10 14:31:12
krabapple, I was leaving that to last.

At this stage (by comparing the |4 and |1 samples), I am trying to establish whether digital filters exist that can be placed in a 96KHz digital path so as to reduce the response in the digital domain at 22050Hz to a negligible level, but not affect the perceived played back sound using a 96KHz rate DAC.

By keeping the playback at 96KHz I am removing one set of variables from the analysis; though it appears creating another complexity in the analysis; as per my next paragraph.

What I have not read up on, is how current sample rate converters handle the pre-filtering necessary to avoid aliases. They appear to sidestep (or postpone) the issue somewhat by oversampling in the first instance such that pre-filtering requirements would be much laxer.  But still as part of the processing, any part of the source signal that was in the zone just under 22050Hz must be strongly attenuated if not eliminated.  Is there reason to suppose that the postponed pre-filtering would be more effective at not disturbing source frequencies in the zone below 22050Hz, than the digital filtering used for upload |4, provided to us by KikeG?

I come to this forum with a dated general knowledge of electronics, and a curiosity to understand why 96/24 is being so strongly promoted if 44.1/24 (or even 44.1/16) are really quite adequate, even in extreme circumstances such as a cowbell concerto!

http://martel.ic.cz/bordel/filtering.gif (http://martel.ic.cz/bordel/filtering.gif)
Applying a 21-22kHz lowpass filter on your original sample, you get this
http://martel.ic.cz/bordel/filtering2.gif (http://martel.ic.cz/bordel/filtering2.gif)
http://martel.ic.cz/bordel/triangle-2_2496...kHz-lowpass.wav (http://martel.ic.cz/bordel/triangle-2_2496-21-to-22kHz-lowpass.wav)
Inverting the filtered waveform and subtracting it from the original one, you get this
http://martel.ic.cz/bordel/filtering3.gif (http://martel.ic.cz/bordel/filtering3.gif)
http://martel.ic.cz/bordel/triangle-2_2496-difference.wav (http://martel.ic.cz/bordel/triangle-2_2496-difference.wav)

If you can get the difference sample through your audio chain without aliasing artifacts introduced into audible spectrum, you may test if you actually hear above 21kHz. If you do not hear anything on the difference sample, then 44kHz is good enough for you and/or your equipment.
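
The filter-and-subtract procedure above can be sketched in a few lines. This is only an illustration under stated assumptions, not the tool or settings Martel used: the Blackman-windowed sinc filter, the 21.5 kHz corner, the 255 taps and the two-tone test signal (5 kHz plus 30 kHz) are all made up for the example. The one point it is meant to show is that the filtered signal has to be time-aligned with the original before subtracting, otherwise the "difference" contains audible-band content that is only a delay artifact.

```python
import math

FS = 96000          # sample rate of the test signal, Hz
CUTOFF = 21500.0    # lowpass corner, Hz (assumed value inside the 21-22 kHz range)
TAPS = 255          # odd length, so the filter delay is a whole number of samples


def lowpass_taps():
    """Blackman-windowed sinc lowpass, normalised to unity gain at DC."""
    mid = (TAPS - 1) // 2
    fc = CUTOFF / FS
    taps = []
    for n in range(TAPS):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        phase = 2 * math.pi * n / (TAPS - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def filter_aligned(x, taps):
    """FIR filter with the (TAPS - 1) / 2 sample delay removed, so output lines up with input."""
    mid = (len(taps) - 1) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            i = n + mid - k
            if 0 <= i < len(x):      # samples outside the file are treated as silence
                acc += h * x[i]
        out.append(acc)
    return out


def tone_level(x, freq, start, length):
    """Amplitude of the component at `freq` over x[start:start + length]."""
    re = sum(x[n] * math.cos(2 * math.pi * freq * n / FS) for n in range(start, start + length))
    im = sum(x[n] * math.sin(2 * math.pi * freq * n / FS) for n in range(start, start + length))
    return 2 * math.hypot(re, im) / length


# Hypothetical test signal: one audible tone and one ultrasonic tone, each at amplitude 0.5.
original = [0.5 * math.sin(2 * math.pi * 5000 * n / FS)
            + 0.5 * math.sin(2 * math.pi * 30000 * n / FS) for n in range(2000)]
filtered = filter_aligned(original, lowpass_taps())
difference = [a - b for a, b in zip(original, filtered)]

# Measure away from the file edges, over a whole number of cycles of both tones.
audible_residue = tone_level(difference, 5000.0, 400, 960)
ultrasonic_part = tone_level(difference, 30000.0, 400, 960)
print("5 kHz left in difference:", audible_residue)
print("30 kHz left in difference:", ultrasonic_part)
```

With the delay removed, the difference signal in this sketch is essentially just the 30 kHz tone, which is the property that makes a difference file usable as a hearing (or playback chain) test.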
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-11 01:15:43
My results were:

As my hearing extends only to 19.5KHz at best, it is not surprising I could not hear anything with this difference file, which commences at about 21KHz. If I did hear anything it had to be caused by aliasing or other distortion effects.

The reason I tried ABXing against KikeG's files |1 or |3 rather than the original unfiltered file, was that |1 and |3 had already filtered out frequencies near 48KHz, thus removing the complication of potential aliasing on playback.  The differences I heard would have been for other reasons such as phase changes within the audible part of the spectrum or, conceivably, intermodulation or other effects from the high amplitude high frequency audio, these types of effects being mentioned above by 2Bdecided.

But if nobody with extended high frequency hearing listens to some of these files, this thread may remain inconclusive.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-11 08:37:01
My results were:

  • DAC of old audio video receiver:  nothing audible unless volume above a particular setting (presumably non-linearity commenced in the amplifier at that point)

As far as I know, static nonlinearity (found in amplifiers) spawns only higher harmonics (integral multiples of base frequency), not frequencies lower than the base one. Perhaps the device has poorly designed power supply and you hear some interference from power grid or air.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-11 08:46:29
As far as I know, static nonlinearity (found in amplifiers) spawns only higher harmonics (integral multiples of base frequency), not frequencies lower than the base one. Perhaps the device has poorly designed power supply and you hear some interference from power grid or air.
There are lots of distortion mechanisms in an amplifier that can introduce lower frequencies from higher ones. These effects are generally smaller than harmonic distortion, but are still common. Intermod products, for one example, exist on both sides of the "carrier". Other nonlinear transfer functions (like the one sided exponential of a diode) also have this effect.
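
To make the intermodulation point concrete, here is a minimal sketch with made-up numbers (it does not model any particular amplifier). Two tones at 21 kHz and 23 kHz, both above most adults' hearing, are passed through a hypothetical transfer function y = x + 0.1·x². The squared term produces a difference product at 23 − 21 = 2 kHz, squarely in the audible range, even though the input contains nothing there.

```python
import math

FS = 96000                   # sample rate, Hz (assumed for this sketch)
N = 9600                     # 0.1 s, so every tone of interest completes whole cycles
F1, F2 = 21000.0, 23000.0    # two ultrasonic test tones


def tone_level(signal, freq):
    """Amplitude of the component at `freq`, by correlating with a sine/cosine pair."""
    re = sum(s * math.cos(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


clean = [0.5 * math.sin(2 * math.pi * F1 * n / FS)
         + 0.5 * math.sin(2 * math.pi * F2 * n / FS) for n in range(N)]

# Hypothetical nonlinearity: a small second-order term, as from an asymmetric transfer curve.
distorted = [x + 0.1 * x * x for x in clean]

diff_freq = F2 - F1          # 2 kHz difference product
level_clean = tone_level(clean, diff_freq)
level_distorted = tone_level(distorted, diff_freq)
print("2 kHz level before nonlinearity:", level_clean)
print("2 kHz level after nonlinearity: ", level_distorted)
```

For these amplitudes the 2 kHz product works out analytically to 0.1 × 0.5² = 0.025, about 26 dB below each input tone, which would be plainly audible as a low-pitched tone rather than as "very, very high pitched" noise.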
Title: Resampling down to 44.1KHz
Post by: lvqcl on 2008-05-11 09:04:34
MLXXX, can you record this 'click' with, say, soundcard of your HTPC? It would be interesting to look at its spectrum.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-11 09:56:23
As we do not appear to have any posters with extended high-frequency hearing, I created a simulation for myself.  I changed the sample rate of KikeG's |2 and |4 files to 88.2KHz.  Note there was no resampling: merely the header of the wave file was altered. [This can be done in cooledit when viewing a file: select edit > adjust sample rate.]

This meant that KikeG's file |4 instead of lowpassing up to a nominal 21KHz, lowpassed up to a nominal 91.875% of 21KHz, or 19294Hz.

Kike described the characteristics as follows:
[blockquote]The lowpass takes 2.5 KHz approx. For example, in case of the 21 KHz lowpass, it is -0.1 dB at 19950 Hz and -90 dB at 22600 Hz.[/blockquote]

In view of the reduction in playback speed to 88.2KHz, the lowpass would be 0.1dB down at 18330Hz and 90dB down at 20764Hz.

I was able to ABX KikeG's |2 and |4 (with difficulty), at this lower playback speed.  Seeing as how my upper limit of hearing is about 19.5KHz, this exercise might have been roughly equivalent to a person listening at original speed, whose upper limit of hearing was 21.25Khz  (19.5KHz x 96/88.2).
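
The arithmetic behind this header-relabelling trick can be checked with a short sketch. The function name and constants are just for illustration; the figures are the ones quoted above. Everything in the file, including the pitch of the triangle itself, is scaled by the same 88.2/96 ratio.

```python
ORIGINAL_RATE = 96000.0     # Hz, rate the files were produced at
RELABELLED_RATE = 88200.0   # Hz, rate written into the header without resampling


def scale_frequency(freq_hz, old_rate=ORIGINAL_RATE, new_rate=RELABELLED_RATE):
    """Frequency at which content originally at `freq_hz` plays back after relabelling."""
    return freq_hz * new_rate / old_rate


ratio = RELABELLED_RATE / ORIGINAL_RATE
nominal_cutoff = scale_frequency(21000.0)     # nominal 21 kHz lowpass
passband_edge = scale_frequency(19950.0)      # the -0.1 dB point
stopband_edge = scale_frequency(22600.0)      # the -90 dB point

# Going the other way: a 19.5 kHz hearing limit at the slowed rate corresponds
# to this limit for a listener hearing the file at its original rate.
equivalent_limit = 19500.0 * ORIGINAL_RATE / RELABELLED_RATE

print(f"speed ratio      : {ratio:.5f}")
print(f"nominal cutoff   : {nominal_cutoff:.2f} Hz")
print(f"-0.1 dB point    : {passband_edge:.2f} Hz")
print(f"-90 dB point     : {stopband_edge:.2f} Hz")
print(f"equivalent limit : {equivalent_limit:.0f} Hz")
```

The results agree with the rounded figures in the post to within a few tens of hertz.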

MLXXX, can you record this 'click' with, say, soundcard of your HTPC? It would be interesting to look at its spectrum.

I have looked into this a bit more and I've found the amplifier was being driven to clipping.  The spectrum is not all that interesting.  It's spread across the whole range 0 to 48KHz with peaks around 22KHz.  I made this recording with the Audigy 4 module both sending the difference file as 96KHz SPDIF to the AVR, and recording the audio output of the AVR appearing at the headphone socket:- [attachment now deleted to increase upload capacity]

Indeed, if I cannot find a high quality downsampling algorithm for converting the 96/24 recording to 44.1Khz, without introducing audible diminishment or artifacts as a result of the digital filtering settings, I cannot proceed further.


According to infinitewave (http://src.infinitewave.ca/), SSRC (http://otachan.com/foo_dsp_ssrc_057.7z) which is freely available as a plugin for Foobar2000 seems pretty much perfect to me. In most graphs it comes pretty close to the white 'ideal' line. I use it to convert 5.1 channel 24bit/96kHz DVD-A rips. Why not give it a try?


SoleBastard,
Sorry I overlooked your post until now.  SSRC indeed has an enviably steep cut-off.  However, I see that in the infinitewave sweep graph for it, there are some artifacts.

R8brain free is probably at the other extreme: wonderfully free of artifacts, but with a  mild rolloff (commencing at just over 18KHz).

This illustrates the tension in digital filtering between quantity (a wide passband bandwidth) and quality (minimal spurious responses).
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-12 10:12:11
This illustrates the tension in digital filtering between quantity (a wide passband bandwidth) and quality (minimal spurious responses).
In truth, it illustrates the nature of the universe we inhabit. You can only know the frequency perfectly with infinite time. You can only know the time perfectly with infinite frequency.

Limiting the time spreads the data in the frequency domain; limiting the frequency spreads the data in the time domain.
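
As a hedged aside, the trade-off described above is usually written as the Fourier (Gabor) time-bandwidth limit. The symbols are introduced here only for illustration: \sigma_t is the RMS duration of a signal and \sigma_f its RMS bandwidth in hertz, both measured on the signal's energy distribution.

```latex
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}
```

Equality holds only for a Gaussian pulse. A filter with a very narrow transition band therefore needs a long impulse response, whichever domain it is built in.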


"Digital" or "analogue" has nothing to do with it. It's the uncertainty principle (well, not the uncertainty principle, but something like it).

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-13 14:17:20
SoleBastard,
Sorry I overlooked your post until now.  SSRC indeed has an enviably steep cut-off.  However, I see that in the infinitewave sweep graph for it, there are some artifacts.

R8brain free is probably at the other extreme: wonderfully free of artifacts, but with a  mild rolloff (commencing at just over 18KHz).

This illustrates the tension in digital filtering between quantity (a wide passband bandwidth) and quality (minimal spurious responses).

SSRC seems to do an excellent job at downsampling, the aliasing artifacts are at -120 to -130 dB which is perfectly suitable for 16-bit target depth. The stopband attenuation may be a little insufficient for 24-bit targets, however (read http://en.wikipedia.org/wiki/Sample_rate_conversion (http://en.wikipedia.org/wiki/Sample_rate_conversion) for explanation on the "higher than the quantization noise" required attenuation).
EDIT: Well, perhaps the stopband attenuation is also a variable in SSRC and if a 24-bit target was chosen, the algorithm would attenuate even more... Someone needs to check.
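
The "below the quantization noise" argument can be checked with the standard rule of thumb for an ideal N-bit quantiser driven by a full-scale sine, SNR ≈ 6.02·N + 1.76 dB. The sketch below applies it to the artifact levels quoted above; the function names are made up for the example, and real converters and dither will shift the numbers by a few dB.

```python
def ideal_snr_db(bits):
    """Theoretical SNR of an ideal N-bit quantiser for a full-scale sine, in dB."""
    return 6.02 * bits + 1.76


def artifacts_below_noise(artifact_level_db, bits):
    """True if artifacts at `artifact_level_db` (dB re full scale) sit under the noise floor."""
    return artifact_level_db < -ideal_snr_db(bits)


for bits in (16, 24):
    print(f"{bits}-bit noise floor: about -{ideal_snr_db(bits):.1f} dB")

# Artifact levels read off the SSRC graphs, as quoted in the post.
ok_for_16_bit = artifacts_below_noise(-120.0, 16)
ok_for_24_bit = artifacts_below_noise(-130.0, 24)
print("-120 dB artifacts hidden at 16 bits:", ok_for_16_bit)
print("-130 dB artifacts hidden at 24 bits:", ok_for_24_bit)
```

On these figures the artifacts sit more than 20 dB under the 16-bit floor but roughly 16 dB above the theoretical 24-bit floor, which matches the caution expressed above.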
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-14 14:34:07
[Martel, I have not tried to find out exactly how SSRC performs.]

Based on the contents of this thread up till now, I'd be inclined to prefer a 48KHz sampling rate over 44.1Khz as it gives more margin for error, with only a relatively slight (less than 10%) increase in raw file size.

[A range of 48KHz sound cards could be used for the playback and the precise characteristics of the filter would not be all that critical.  Similarly the recording could be made with a range of recording devices, without undue concern about the filter characteristics.]

But there is another concern that is sometimes raised, beyond mere frequency response.  It is a concern about relative timing and phase.

Is it good enough to shoehorn everything into a strict timing regimen of say 48000 samples a second, if some waveforms are slightly out of phase with each other, as captured by different microphones? 

Arguably if 96KHz is used, any natural or artificial reverberation can be richer as the instantaneous wave cancellations are subtly recorded and reproduced without the constraint of a time structure (e.g. the volume level of different recorded tracks could be changed when creating a new mix and this could generate a whole new set of complex phase additions and cancellations, arguably more complex than if 48KHz had been used when recording).

Put another way, if an analogue source is captured simultaneously at 48KHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th sec after the other.  In such a case, will the played back sound be perceptibly different in an A B comparison?  This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th second.  At 25 degrees Celsius, sound travels at about 346m/s.  In 1/96000 sec, it would travel about 3.6mm, or a bit over a third of a centimetre.
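
For anyone who wants to vary the numbers, the distance arithmetic in the paragraph above is a one-liner. The speed of sound is the figure given in the post; the variable names are only illustrative.

```python
SPEED_OF_SOUND = 346.0      # m/s at about 25 degrees Celsius, as stated in the post
SAMPLE_RATE = 48000.0       # Hz

# Worst-case sampling offset between two unsynchronised cards: half a sample period.
offset_s = 1.0 / (2.0 * SAMPLE_RATE)          # equals 1/96000 s
distance_mm = SPEED_OF_SOUND * offset_s * 1000.0

print(f"offset  : {offset_s * 1e6:.2f} microseconds")
print(f"distance: {distance_mm:.2f} mm")
```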

A similar small difference due to sampling phase could also apply if downsampling a 192KHz recording to 48KHz.  There will be 4 samples at 192KHz for every 1 at 48KHz.  What if a 192KHz recording has 2 samples shaved off the start of it?  If it is then converted to 48KHz it will give a slightly different result compared with a version that has not been shaved being converted to 48KHz.  Subtraction of the two conversions will leave a small residue.  But will the two conversions sound different to the ear in an A-B comparison?

Even if they do sound different, is this not comparable with the difference we experience if we move our head back by a third of a centimetre [not when listening to headphones].  A practically negligible difference?

Are there any situations where it could make a material difference to the listening experience if the sound is captured at 48KHz and not, say, 96KHz?

_______________________

* Even with oversampling, there is subsequent decimation/averaging.  After all of the processing, there exists but one sample value per channel, for each arbitrarily selected period of 1/48000 sec.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-14 14:54:11
To me, it sounds like you still haven't read the relevant threads in the FAQ. The subject of timing issues, or rather the lack of them, is quite well covered.


Also, IIRC there were cheap converters that did left then right, but I think we're talking decades ago, and they were quite rare. AFAIK no one is using them now in anything like a high quality application.

You can check for this fault quite trivially by recording or playing back the same thing on both channels. Impulses are an ideal test signal.


Even a sub-sample interchannel delay would cause audible high frequency loss if the output was combined to mono, which is one good reason why they are avoided. The other is that there is little reason to introduce them!
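
As a rough check on the size of that loss: if one channel is a copy of the other delayed by τ, the mono sum (L + R) / 2 has gain |cos(π·f·τ)| at frequency f. The sketch below evaluates it for a half-sample delay at 48 kHz (τ = 1/96000 s), a value chosen to match the figure discussed earlier in the thread rather than any measured converter.

```python
import math


def mono_sum_gain_db(freq_hz, delay_s):
    """Level of (L + R) / 2 relative to L alone when R is L delayed by `delay_s`."""
    gain = abs(math.cos(math.pi * freq_hz * delay_s))
    return 20.0 * math.log10(gain)


half_sample = 1.0 / 96000.0     # half a sample period at 48 kHz

loss_1k = mono_sum_gain_db(1000.0, half_sample)
loss_10k = mono_sum_gain_db(10000.0, half_sample)
loss_20k = mono_sum_gain_db(20000.0, half_sample)

for freq, loss in ((1000, loss_1k), (10000, loss_10k), (20000, loss_20k)):
    print(f"{freq:>6} Hz: {loss:6.2f} dB")
```

The loss is negligible at 1 kHz and grows to roughly 2 dB at 20 kHz, a gentle treble rolloff that appears only when the two channels are summed.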

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-14 15:36:17
Put another way, if an analogue source is captured simultaneously at 48KHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th sec after the other.  In such a case, will the played back sound be perceptibly different in an A B comparison?  This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th second.  At 25 degrees Celsius, sound travels at about 346m/s.  In 1/96000 sec, it would travel about 3.6mm, or a bit over a third of a centimetre.

If the two soundcards have clock frequencies that differ by only 0.001% (10 ppm) then the phase shift between them will reach 1/96000 second after only one second and will increase by this amount every second.
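
A quick sketch of that drift arithmetic, using the 10 ppm figure from the post (the function name is made up for the example):

```python
def drift_seconds(ppm_difference, elapsed_s):
    """Accumulated timing offset between two clocks that differ by `ppm_difference` ppm."""
    return ppm_difference * 1e-6 * elapsed_s


half_sample = 1.0 / 96000.0                   # about 10.4 microseconds
drift_after_1s = drift_seconds(10.0, 1.0)     # 10 microseconds
drift_after_1min = drift_seconds(10.0, 60.0)

# Time needed for the drift to reach exactly half a 48 kHz sample period.
time_to_half_sample = half_sample / (10.0 * 1e-6)

print(f"drift after 1 s   : {drift_after_1s * 1e6:.1f} microseconds")
print(f"drift after 1 min : {drift_after_1min * 1e6:.0f} microseconds")
print(f"time to reach 1/96000 s: {time_to_half_sample:.3f} s")
```

After a minute the two recordings are already about 29 samples apart at 48 kHz, so the fixed sub-sample offset is the least of the problems with unsynchronised cards.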
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-14 15:52:06
2Bdecided, I have looked through the FAQs but there is not a lot that seems conclusive.  Several times in my searches I came across this interesting report by yourself from 5 years ago, which seems quite relevant to the current topic:-


[blockquote]
... The next day, while the demo was being run for the Nth time, I was at the back of the room talking with someone. Suddenly, I heard a difference as the source switched. I was surprised, having failed to hear a difference the previous day listening in the sweet spot. I listened as it switched again, and heard it switch back – ah ha, it must have just gone analogue / digital / analogue. I kept listening – I couldn’t hear the difference next time it switched.

I went to the middle/back of the room, and listened through the next demo. Without being told, I could pick out 44.1 and 48kHz. The difference was more obvious back from the sweet spot than in the sweet spot itself. More importantly, the difference wasn’t what I (or the other people who failed to hear it) had been listening for. It didn’t make any difference to the frequency response at all, or to the clarity of the high frequencies.

What 44.1kHz and 48kHz did do was to make the sound slightly less realistic, like the difference between a good and bad CD player. If the lower sampling rate had any defined “quality” it was a glassy kind of sound – I’d heard that word associated with CD before and thought it was complete rubbish – but now I actually heard the difference, I understood exactly what people had meant.

The change from 44.1 or 48kHz to analogue to 96kHz slightly increased the depth of the sound stage. I’d been listening to the amazing demo 1 for 2 days, so it was hardly an impressive difference, but it was still there.

If you’re counting, that’s only two blind detections – once when I wasn’t even listening, and again when I went back to the middle of the room to check – I confirmed which had been which with Kevin afterwards – “The next to last one was 40something, wasn’t it?”


You can say many things about this. You could say it was just luck, but I don’t think it was – I wasn’t even listening for the difference because (having listened the previous day) I didn’t think there was one to hear! You can say that I was hearing sonic deficiencies in the equipment. Well, maybe. That may be what the whole 44.1/96k debate is based on. All I can say is that, if there are sonic deficiencies in this equipment (I think the dCS boxes are around 5k each, and are used in many recording studios) then there isn’t much hope for the rest of us!

What you could say, with some justification, is that the “character” of 44.1 was more obvious outside the sweet spot, so maybe it’s not such a big issue. That’s probably true – except that maybe I was just listening for the wrong thing when I was “in the sweet spot”. Maybe I had to stop listening to the Hi-Fi, and start listening to the music and the performance to hear what was happening.

What is significant is that the 44.1kHz version wasn’t just different from the 96k and analogue version, it was worse. As the analogue was the master, any difference would be bad news, but for it to be subjectively worse makes matters even, well, worse!

I was upset to think how much recorded music only exists as a 44.1kHz or 48kHz sampled digital master tape. I discussed the subjective imperfections (the improved depth and realism of the 96k version) with Kevin, and he agreed. He was surprised that I’d noticed it that day, but couldn’t even hear anything wrong with 32kHz the previous day! I asked him what he heard with 16-bit (we’d been using 24-bit all along) and DSD. He said 16-bit was even worse – it made the whole sound “grungy”, and that DSD sounded nice, but added its own signature. “You can tell when you’re playing DSD through this system – the room heats up” he said – I looked at the huge amps, and could believe it.

One thing I should note: I didn’t think the analogue master was particularly good quality. It was a gorgeous recording, but it had obvious flaws – e.g. background noise, and some audible edits. Also, I didn’t hear any difference between analogue, 96k and 192k. I can’t explain why 44.1kHz and 48kHz sounded worse, but they did. No one responsible for the demo had any reason to rig the results, and I played with enough of the equipment to know that everything was above board and fair, even though some of the cables we used might not have met with audiophile approval. ...
[/blockquote]

If the two soundcards have clock frequencies that differ by only 0.001% (10 ppm) then the phase shift between them will reach 1/96000 second after only one second and will increase by this amount every second.

I think this goes towards explaining why it is good practice to have a master synchronising signal, if more than one sound card is used for a recording.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-14 15:57:13
2Bdecided is right - you need to do some background reading. I will try to answer your questions as best I can.

But there is another concern that is sometimes raised, beyond mere frequency response.  It is a concern about relative timing and phase.
If the waves are slightly out of phase with each other, they will be captured slightly out of phase. The ability to distinguish two phases in a sampled waveform is not directly limited by the sample rate - the SNR comes into play, too. This has been covered before in a number of threads. A recent thread on time resolution in PCM has all the answers.
Arguably if 96KHz is used, any natural or artificial reverberation can be richer as the instantaneous wave cancellations are subtly recorded and reproduced without the constraint of a time structure (e.g. the volume level of different recorded tracks could be changed when creating a new mix and this could generate a whole new set of complex phase additions and cancellations, arguably more complex than if 48KHz had been used when recording).
You could argue that, but you would be wrong. No matter how "complex", "rich" or "nuanced" a signal is, it can still be described by its bandwidth and SNR.
Put another way, if an analogue source is captured simultaneously at 48KHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th sec after the other.
Yeah, and probably will be. Quartz clocks suck at long term stability - so you are going to be sampling at different instants. It's not a bad assumption that, given two arbitrary clocks at 96kHz the difference between them will be distributed evenly across 1/96000th of a second.
In such a case, will the played back sound be perceptibly different in an A B comparison?
No, because the output of reconstruction will be the same in both cases, given a bandlimited signal. Kotelnikov's original paper (one of the first in the field) actually discusses this, and it can be proven without difficulty. There is a small theoretical problem with the turn on condition (the beginning of time), but this can be ignored in audio.
This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th second.  At 25 degrees Celsius, sound travels at about 346m/s.  In 1/96000 sec, it would travel about 3.6mm, or a bit over a third of a centimetre.
Back in the mists of time, some radar signal processing was done with things called "acoustic delay lines" which worked in exactly this way. It worked amazingly well, for the time.

A similar small difference due to sampling phase could also apply if downsampling a 192KHz recording to 48KHz.  There will be 4 samples at 192KHz for every 1 at 48KHz.  What if a 192KHz recording has 2 samples shaved off the start of it?  If it is then converted to 48KHz it will give a slightly different result compared with a version that has not been shaved being converted to 48KHz.  Subtraction of the two conversions will leave a small residue.  But will the two conversions sound different to the ear in an A-B comparison?
Blindly subtracting one digital signal from another isn't a good idea for just this reason. The two downsampled versions will differ by a "group delay", which you can correct digitally, in analogue, or by moving your speakers back a couple of centimeters. After reconstruction, the two signals will be identical. There can be a slight difference at turn on, but after that they'll be the same.
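
A minimal sketch of the group delay point, under assumptions that are not from the thread: a made-up test signal of 1, 7 and 19 kHz tones (so it repeats exactly every 48 samples at 48 kHz), two hypothetical cards sampling 1/96000 s apart, and a DFT phase rotation standing in for the "correct it digitally" step. Subtracting the raw sample sequences leaves a large residue; delaying the second capture by half a sample should reduce the residue to rounding error.

```python
import cmath
import math

FS = 48000.0   # sample rate of both hypothetical cards, in Hz
N = 48         # 1 ms of audio: one full period of the test signal below


def signal(t):
    """Made-up bandlimited test signal: tones at 1, 7 and 19 kHz."""
    return (math.sin(2 * math.pi * 1000 * t)
            + 0.5 * math.sin(2 * math.pi * 7000 * t + 0.3)
            + 0.25 * math.cos(2 * math.pi * 19000 * t))


def dft(x):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


def delay(x, d):
    """Delay a periodic, bandlimited sequence by d samples (d may be fractional)."""
    n = len(x)
    spectrum = dft(x)
    out = []
    for i in range(n):
        total = 0j
        for k in range(n):
            # Treat the upper half of the bins as negative frequencies.
            signed_k = k if k < n // 2 else k - n
            total += spectrum[k] * cmath.exp(2j * math.pi * signed_k * (i - d) / n)
        out.append((total / n).real)
    return out


card_a = [signal(i / FS) for i in range(N)]               # samples at t = i/FS
card_b = [signal(i / FS + 1 / 96000) for i in range(N)]   # same rate, 1/96000 s later

raw_residue = max(abs(a - b) for a, b in zip(card_a, card_b))
aligned_residue = max(abs(a - b) for a, b in zip(card_a, delay(card_b, 0.5)))

print(raw_residue)       # large: the stored sample values really do differ
print(aligned_residue)   # tiny: both captures describe the same waveform
```

The example sidesteps the turn-on condition by using a periodic signal; a real recording has a beginning, which is where the small initial difference mentioned above comes from.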

Even if they do sound different, is this not comparable with the difference we experience if we move our head back by a third of a centimetre [not when listening to headphones].  A practically negligible difference?

Yes. Take the function f(x) = cos(x) u(x), where u(x) is zero for negative x and 1 for positive x. Start sampling at time zero, and at time 0+1/96000. When you have those samples, reconstruct the original wave. Notice that they will be different at the beginning. After this turn-on period they will be the same. Due to the antialiasing filter, this example is a little more subtle than that, even - but they will still be different as the whole process has to be causal. Does it matter in the real world? No.

Are there any situations where it could make a material difference to the listening experience if the sound is captured at 48KHz and not, say, 96KHz?
In an ideal world, no. With real hardware, I don't know.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-14 16:17:53
2Bdecided, I have looked through the FAQs but there is not a lot that seems conclusive.  Several times in my searches I came across this interesting report by yourself from 5 years ago, which seems quite relevant to the current topic:-
Oh, I stand by that report (though I agree with the criticisms in the same thread).

It was subsequent reading, research, and experiments that cleared up (for me) most of the issues that you are working through. They are non-issues (at least in theory).


There are only two slightly credible explanations: human ears don't quite work in the way we think, or the well known and understood imperfections in real equipment combine together to create audible differences.

There is an even more relevant point: no one has ABXed CD vs anything else, except by using seriously faulty equipment or by turning the volume up so high on near-silent passages that "normal" recordings would deafen you.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-15 02:51:26
Thanks cabbagerat; your specific explanations in response to my post are appreciated.

There are only two slightly credible explanations: human ears don't quite work in the way we think, or the well known and understood imperfections in real equipment combine together to create audible differences.

There is an even more relevant point: no one has ABXed CD vs anything else, except by using seriously faulty equipment or by turning the volume up so high on near-silent passages that "normal" recordings would deafen you.

It's relatively easy equipment-wise to test 24 bits against a dither to 16 bits because you have exactly the same timing of the samples, and can use the same sound card for playback, operating with the same filter; whether reproducing 24 bits, 16 bits dithered, or a truncation to 16 bits.  [I have done this myself with my own equipment at home.]

It's much harder to compare 96KHz as against 44.1KHz, and any differences that were heard could be ascribed to deficiencies in the equipment.  I assume that is how you might now primarily explain that report of your own listening experience in 2003, at different sample rates.

But I wonder whether there are any recent tests with highly evolved equipment that have concentrated on the 44.1 vs 48 vs 96+ issue with audio clips designed to highlight differences.

I could imagine that if six violinists played in front of a microphone each and the sound was mixed in analogue the result would be quite complex.  Alternatively, if each of the six sources were separately converted to digital at just 44.1KHz and mixed digitally with the other violins each at 44.1KHz, the result seems likely to be different, compared with sampling each at say 96KHz and mixing; even if the final mixdown of the 96KHz sources were at 44.1KHz.

People may ask 'why bother to use a separate ADC for each microphone?': just mix in an analogue mixer.  Well as technology advances, ADCs are becoming quite cheap and it may be an attractive proposition to fit out a microphone with its own ADC (and perhaps some sort of wireless data link) and dispense with any analogue mixer.

There may be other recording situations that would be more demanding and have greater potential to be affected by phase differences.

If we really are sure that 96KHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96KHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-15 09:46:28
But I wonder whether there are any recent tests with highly evolved equipment that have concentrated on the 44.1 vs 48 vs 96+ issue with samples designed to highlight differences.

Well, those tests would merely confirm or refute the equipment's ability to play back those sample rates. I guess there are some tests of CD players versus SACD on some audiophile pages. Since the differences between the formats are theoretically negligible, the real difference should lie only in playback equipment quality (or different mastering of the CD and SACD versions, so beware).
I could imagine that if six violinists played in front of a microphone each and the sound was mixed in analogue the result would be quite complex.  If each of the six sources were separately converted to digital at just 44.1KHz and mixed digitally with the other violins each at 44.1KHz, the result seems likely to be different, compared with sampling each at say 96KHz and mixing; even if the final mixdown is to 44.1KHz.

I think there's no theoretical reason why it should be different. Theoretically, you should be able to do filtering, analog-to-digital conversion, resampling and mixing in arbitrary order and get the same result. Practically, there is a preferred  order of those since equipment is not ideal (linear, unlimited dynamic range etc.) and the effort is to minimize the overall distortion. Just look at the resampling results of those software resamplers. It is all about lowpass filtering and most of the resamplers fail at that utterly. It is problematic to properly design an analogue antialiasing filter for a 44kHz ADC, so a 96kHz one is a much better choice.
Of course we could ask 'why bother to use a separate ADC for each microphone?': just mix in an analogue mixer.

Because it is not practical to have a million ADCs in a studio and have to mix a million different tracks in software.
If we really are sure that 96KHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96KHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?

96kHz ADCs are less likely to be plagued by their analog antialiasing filter, which they need to include. You may (relatively) easily design something like the SSRC's lowpass in software but it is virtually impossible using an analogue circuit.
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-05-15 10:55:07
If we really are sure that 96KHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96KHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?
On a recording budget the difference between using 44.1 and 96 kHz (or higher) is really negligible these days. Since there seems to be no evidence that using 44.1 gives better results, there is very little reason not to use 96 kHz or higher as a production format.
There seems to be anecdotal evidence that some plug-ins perform (sound) better at a 96 kHz rate. A possible explanation is that the code has been optimized for that rate and not for 44.1. This "shouldn't" be a reason to record at 96, but it's probably the most practical workflow.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-15 11:34:28
I could imagine that if six violinists played in front of a microphone each and the sound was mixed in analogue the result would be quite complex.  Alternatively, if each of the six sources were separately converted to digital at just 44.1KHz and mixed digitally with the other violins each at 44.1KHz, the result seems likely to be different, compared with sampling each at say 96KHz and mixing; even if the final mixdown of the 96KHz sources were at 44.1KHz.
Let's think this through. Firstly, nothing samples at 44.1kHz in the 21st century - ADCs are always oversampled. So what you have is at least 352.8kHz resampled to 44.1kHz, vs at least 384kHz resampled to 96kHz resampled to 44.1kHz. The mixing is not the only (or even the main) difference here. It's bad experimental practice to introduce multiple variables: You should compare sample rates and associated resampling, or mixing - not both at once.


Here is a comparison which at least has analogue vs digital mixing (and the inevitable circuit differences) as the only variable:

Situation 1 = 6 ADCs, 96kHz, resample to 44.1kHz, mix signals
Situation 2 = mix signals, 1 ADC, 96kHz, resample to 44.1kHz

The problem with this experiment in practice is that the digital gains could be matched perfectly, whereas the analogue gains could not. Still, let us forget that for a moment. Let us assume we can do a perfect summation in both digital and analogue, use unity gain for each, and not clip. Let us make the equations easier by simply having two violin players yielding two microphone feeds, x and y. Let us denote the function of the ADC and the resampling by f. Let us denote simple summation by +.

Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)


Then the question becomes simple, because the very definition of a linear system (in this case, system f) is that these two situations yield an identical result for any value of x and y. In reality, we would put limits on x and y and say that the system was linear within these limits (no use considering levels that would blow up the equipment!).

So, if x and y are sensible voltages from real microphones, is f a linear system? Let's pull it apart and check each part in turn, since a concatenation of linear systems is by definition also linear.

ADC:
0. the buffer amplifier might(!) be linear
1. low pass filtering is linear
2. straight quantisation is not linear - so we won't use that!
2a. dithered quantisation is still not linear, but breaks down into a linear-on-average system, and a noise source
3. Nyquist sampling is linear, but that assumes a perfect filter
3a. non Nyquist sampling creates aliases - however, this is linear distortion, so is still linear
Resampling: conceptualised as a resample up to a common multiple, filtering, and decimation to the desired rate
4. adding zero samples to pad the sample rate to the desired one is linear
5. low pass filtering is linear
6. throwing away samples is linear

The only part which may be mathematically non linear is the dithered quantisation, and that can be arbitrarily good based on the bit depth - which you already seem unconcerned by.
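
The "linear-on-average" remark can be illustrated with a small sketch. The TPDF dither (two uniform random values summed), the 0.3 LSB input level and the trial count are assumptions chosen for illustration, not details from the list above.

```python
import math
import random


def quantise(value):
    """Round to the nearest integer step (1 LSB)."""
    return math.floor(value + 0.5)


def dithered_quantise(value, rng):
    """Add triangular (TPDF) dither spanning +/-1 LSB, then round."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return quantise(value + dither)


rng = random.Random(1234)   # fixed seed so the run is repeatable
level = 0.3                 # a constant input of 0.3 LSB
trials = 100000

plain = quantise(level)
dithered_mean = sum(dithered_quantise(level, rng) for _ in range(trials)) / trials

print(plain)           # 0: without dither the 0.3 LSB level is simply lost
print(dithered_mean)   # close to 0.3: the level survives on average, plus noise
```

Each individual dithered output is still a whole number of steps, so the quantiser itself is not linear; it is the average behaviour that tracks the input, with the error showing up as noise rather than distortion.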


To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC. All these superfine details that you are imagining are perfectly captured (to within the parameters of the system, namely bandwidth and noise floor) - whichever way around you do it.
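
The linearity argument can also be checked numerically. This is only a sketch: the 2/3 resampling ratio, the 31-tap Hann-windowed sinc filter and the random stand-in signals are assumptions chosen to keep the example small, and quantisation is left out entirely.

```python
import math
import random


def lowpass_taps(num_taps, cutoff):
    """Hann-windowed sinc lowpass (num_taps odd); cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - mid
        ideal = 2 * cutoff if m == 0 else math.sin(2 * math.pi * cutoff * m) / (math.pi * m)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def resample(x, up, down, taps):
    """Zero-stuff by `up`, lowpass filter, then keep every `down`-th sample."""
    stuffed = []
    for s in x:
        stuffed.append(s)
        stuffed.extend([0.0] * (up - 1))
    filtered = []
    for i in range(len(stuffed)):
        acc = 0.0
        for j, tap in enumerate(taps):
            if 0 <= i - j < len(stuffed):
                acc += tap * stuffed[i - j]
        filtered.append(up * acc)   # restore the level lost to zero-stuffing
    return filtered[::down]


rng = random.Random(42)
x = [rng.uniform(-1, 1) for _ in range(200)]   # stand-in for microphone feed x
y = [rng.uniform(-1, 1) for _ in range(200)]   # stand-in for microphone feed y
taps = lowpass_taps(31, 0.15)


def f(feed):
    """The whole (idealised, unquantised) resampling chain."""
    return resample(feed, 2, 3, taps)


process_then_mix = [a + b for a, b in zip(f(x), f(y))]   # f(x) + f(y)
mix_then_process = f([a + b for a, b in zip(x, y)])      # f(x + y)
worst = max(abs(a - b) for a, b in zip(process_then_mix, mix_then_process))
print(worst)   # rounding error only
```

The inputs here are deliberately not bandlimited: linearity does not depend on what the signals contain, only on every stage in the chain being linear.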

Non linearities (e.g. that first buffer amplifier) would break this - but they'd also introduce signals that weren't supposed to be there anyway! Depending on where in the chain you introduced non-linearities, either version could be closer to the "correct" version.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-05-15 11:37:03
96kHz ADCs are less likely to be plagued by their analog antialiasing filter, which they need to include. You may (relatively) easily design something like the SSRC's lowpass in software but it is virtually impossible using an analogue circuit.
That's why almost all modern ADCs use oversampling and digital filtering. I think it's the need for low latency that restricts the complexity of digital filtering in recording equipment.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-16 08:48:44
96kHz ADCs are less likely to be plagued by their analog antialiasing filter, which they need to include. You may (relatively) easily design something like the SSRC's lowpass in software but it is virtually impossible using an analogue circuit.
That's why almost all modern ADCs use oversampling and digital filtering. I think it's the need for low latency that restricts the complexity of digital filtering in recording equipment.

Oh, sorry, I completely forgot that they are mostly based on delta-sigma. I must have been outside the audio territory for far too long. 
But I guess the claim about (lowpass) filtering quality and its impact still holds, be it digital or analogue.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-18 16:29:51
No, because the output of reconstruction will be the same in both cases, given a bandlimited signal. Kotelnikov's original paper (one of the first in the field) actually discusses this, and it can be proven without difficulty. There is a small theoretical problem with the turn on condition (the beginning of time), but this can be ignored in audio.

I've noticed in several other threads in other forums that when a "What about different phases?" question is raised, it is dealt with by reference to steady waveforms and Nyquist.  The argument goes that you can represent waveforms accurately with sampling at twice the maximum frequency of the Fourier series for a particular source.  The question is not dealt with of the quality of representing interactions between waveforms from independent sources with continuously varying phase relationships.  (I imagine I would not be in a position to understand a detailed mathematical explanation anyway!)  Perhaps my query does seek to explore the "turn on condition".

To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC.


I do not understand the beginning of the explanation as these formulae appear to anticipate the conclusion reached:-

[blockquote]Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)[/blockquote]

They seem to be declarations that a sampled output resolves to the same thing as an analogue output, for bandlimited input. 

A 96KHz extract

I have always found combined strings a good test for audio equipment.  I have come across a recording of an orchestra playing The Earth Overture by Kosuke Yamashita.

The format is 7.1 channel 96KHz 24-bit linear PCM.  (The Blu-ray reference disc has been released by Q-TEC.)

The audio quality is very good.  I found that when I converted a short extract to 48KHz with Audition 3, the quality was reduced slightly (at least as played back by my AVR). In contrast, many other recordings I have experimented with have revealed no apparent (to me) audible differences when downsampled to 48KHz.

The 48KHz version is not quite as smooth sounding.  I find this noticeable in the harmony between the string sections.  With the 96KHz version, the sounds blend such that the strings taking the lower part are less noticeable.  I'll upload a 9 second extract in this post if possible.

Now I imagine 2Bdecided and many others will assume my playback equipment is responsible for the difference, and that is distinctly possible; but it is also possible that a conversion to 48KHz of this particular recording will impair it.

ABXing was not easy.  Loudspeakers revealed the differences (not my headphones).  Here are my results:

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/18 22:33:06

File A: C:\Users\Public\earthsong_9seconds.wav
File B: C:\Users\Public\earthsong_9secondsAuditionConvertedto48KHz.wav

22:33:06 : Test started.
22:35:11 : 01/01  50.0%
23:01:19 : 02/02  25.0%
23:02:28 : 03/03  12.5%
23:03:18 : 04/04  6.3%
23:03:37 : 05/05  3.1%
23:03:44 : Test finished.

----------
Total: 5/5 (3.1%)
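
The percentages in the log are consistent with the probability of scoring at least that well by guessing alone. A short sketch of that calculation, assuming each trial is an independent 50/50 guess (the function name is hypothetical, not part of foo_abx):

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


print(abx_p_value(5, 5))   # 0.03125, shown as 3.1% in the log
print(abx_p_value(4, 5))   # 0.1875, what a single miss would have given
```

With only five trials there is little margin: one wrong answer would move the result from about 3% to about 19%.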
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-18 19:25:05
A 44 kHz digital waveform PERFECTLY describes ANY signal (or mixture of signals), including phase, from 0 to 22049 Hz, if you do not consider distortion caused by finite number of amplitude quantization steps.
Just looking at the waveform, you might get suspicious about accuracy at frequencies near the Nyquist one, since the signal hardly gets 3-4 samples per period. Try zooming in the waveform in Cool Edit up to the sub-sample accuracy. There you will see some interpolated points between actual samples. These are calculated solely by upsampling. No information is lost, you may recalculate the "missing" samples any time. This "upsampling" also happens naturally in DAC upon conversion to continuous-time domain.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-19 08:33:55
I've noticed in several other threads in other forums that when a "What about different phases?" question is raised, it is dealt with by reference to steady waveforms and Nyquist.  The argument goes that you can represent waveforms accurately with sampling at twice the maximum frequency of the Fourier series for a particular source.  The question is not dealt with of the quality of representing interactions between waveforms from independent sources with continuously varying phase relationships.  (I imagine I would not be in a position to understand a detailed mathematical explanation anyway!)  Perhaps my query does seek to explore the "turn on condition".
You need to read some of the background theory, because I am not sure I can explain this clearly in a forum post. Essentially, the idea is that the sum of two bandlimited signals is a bandlimited signal. Therefore, in an ideal (no quantization, no clipping) system, if x would be properly sampled, and y would be properly sampled, then x+y will be properly sampled. With clipping and quantization, this becomes a little more grey, because (as detailed in 2Bdecided's post) we can't really assume the system is linear any more - but it's probably close enough. But the matter remains, there are no bandlimited signals whose "continuously varying phase relationships" cannot be captured by a sampled system - within the limits of the system SNR. It might seem logical that there are, but there really aren't.

As for the turn on condition - this is the question of, if your first discrete sample is sample x[0] of x(0), then what do you assume x[-1] to be during the reconstruction process? There is a mathematically correct way of doing it, and the way it's done in real systems.

To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC.


I do not understand the beginning of the explanation as these formulae appear to anticipate the conclusion reached:-

[blockquote]Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)[/blockquote]

They seem to be declarations that a sampled output resolves to the same thing as an analogue output, for bandlimited input. 
Yes, as 2Bdecided said in his (excellent) post - the process is for the most part linear. If f(x) is a linear function - then f(x+y) = f(x)+f(y) and f(ax) = af(x) for constant a. The post goes on to develop an argument why the sampling process can reasonably be considered to be linear - hence these relationships hold. Obviously this only holds up to clipping, and above the noise floor - but is a fair enough assumption about *reasonable* signals.

Please read his post again.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-19 12:06:54
Rereading the post leaves me with the same impression.  The conclusion of 2Bdecided's (excellent) post appears to flow from the mathematical basis it establishes at the beginning.

I note that in the analogue domain the sources to be mixed are not as severely bandlimited as they end up being when converted to the digital domain (assuming use of microphones that respond to frequencies exceeding 22050Hz, and assuming the use of a nominal digital sampling rate of 44.1KHz).

This difference between the bandwidths of the analogue and digital mixing processes must, I presume, be contemplated in the equations used at the beginning of the presentation, and must be considered to have no ultimate impact.

Try zooming in the waveform in Cool Edit up to the sub-sample accuracy.
With the particular sample clip, the 96Khz and 48KHz waveforms (at a given elapsed time from the start of the clip) often differ dramatically, presumably as there is so much content above 24KHz in the 96KHz version.

But I can see that if a continuous high frequency sine wave not far below the Nyquist limit were being sampled one could verify performance near the Nyquist limit by inspection of the Cool Edit produced waveform graphs, and this would be an interesting exercise.  The waveform would approximate a sine wave, possibly with a bit of phase delay introduced by digital filtering.  I guess the phase delay could be observed by generating a waveform at 10.5Khz with a weak 2nd harmonic and observing the [average] displacement of the zero crossing of the 21Khz component relative to the zero crossing of the fundamental, though I've never tried this.


[Will upload my sample clip if possible within the next 24 hours.]
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-19 16:10:26
Let me see if I can provide an analog-domain equivalent to what we are discussing (and somebody correct me if I'm wrong).

Let's say that you start with some waveform, and then you add a 22.05 kHz sine wave to it. Now lowpass the result to 22049 Hz.

You will now have one of two things. Either the original waveform had no content above 22049 Hz, in which case you have back the original waveform, no matter how complex it was; or else the original waveform had content above 22049 Hz, in which case you now have intermodulation products between the original waveform and the 22.05 kHz sine wave.

When you translate this to A/D conversion followed by D/A conversion and bandwidth limiting the result is exactly the same except for clipping and quantization.


Apparently this only applies if you are multiplying by a 22050 Hz sine wave.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-19 16:56:54
Let me see if I can provide an analog-domain equivalent to what we are discussing (and somebody correct me if I'm wrong).

Let's say that you start with some waveform, and then you add a 22.05 kHz sine wave to it. Now lowpass the result to 22049 Hz.

You will now have one of two things. Either the original waveform had no content above 22049 Hz, in which case you have back the original waveform, no matter how complex it was; or else the original waveform had content above 22049 Hz, in which case you now have intermodulation products between the original waveform and the 22.05 kHz sine wave.
Why would you have intermodulation products? Is this analogue circuit broken or something?

As long as everything is working, and you choose a sensible filter (let's say 20kHz) you won't know whether you added a 22.05kHz sine wave before filtering, or not. It won't interact within anything, and it'll be gone after you filter.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: greynol on 2008-05-19 17:06:37
Key word here is product.

Simply summing two signals will not result in intermodulation.
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-19 17:08:05
I could be wrong about this, but I thought that when you sum two frequencies the waveform is the same as if you had the sum and the difference of the two frequencies, but when you filter out the sum of the frequencies then you are left with the difference, which is an intermodulation product.
Title: Resampling down to 44.1KHz
Post by: greynol on 2008-05-19 17:18:21
You have to multiply the two signals or subject them to some other non-linear process during the summation in order to get sum and difference frequencies.
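
A quick numerical sketch of the difference between summing and multiplying. The 19 kHz and 20 kHz tones and the 96 kHz sample rate are assumptions picked for illustration; the helper simply correlates the signal against one frequency at a time.

```python
import cmath
import math

FS = 96000    # sample rate in Hz (an assumption for this example)
N = 960       # 10 ms, so every multiple of 100 Hz falls exactly on a DFT bin


def amplitude_at(x, freq):
    """Amplitude of the sinusoidal component of x at freq (Hz)."""
    acc = sum(x[n] * cmath.exp(-2j * math.pi * freq * n / FS) for n in range(len(x)))
    return 2 * abs(acc) / len(x)


tone_a = [math.sin(2 * math.pi * 19000 * n / FS) for n in range(N)]
tone_b = [math.sin(2 * math.pi * 20000 * n / FS) for n in range(N)]

summed = [a + b for a, b in zip(tone_a, tone_b)]
multiplied = [a * b for a, b in zip(tone_a, tone_b)]

# Columns: frequency, amplitude in the sum, amplitude in the product.
for freq in (1000, 19000, 20000, 39000):
    print(freq, round(amplitude_at(summed, freq), 6), round(amplitude_at(multiplied, freq), 6))
```

The sum should show energy only at 19 kHz and 20 kHz, even though its envelope visibly beats at 1 kHz. Only the product contains actual components at the difference (1 kHz) and sum (39 kHz) frequencies.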
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-19 17:25:48
Sorry, post corrected.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-19 18:11:26
A 44 kHz digital waveform PERFECTLY describes ANY signal (or mixture of signals), including phase, from 0 to 22049 Hz, if you do not consider distortion caused by finite number of amplitude quantization steps.
Just looking at the waveform, you might get suspicious about accuracy at frequencies near the Nyquist one, since the signal hardly gets 3-4 samples per period. Try zooming in the waveform in Cool Edit up to the sub-sample accuracy. There you will see some interpolated points between actual samples. These are calculated solely by upsampling. No information is lost, you may recalculate the "missing" samples any time. This "upsampling" also happens naturally in DAC upon conversion to continuous-time domain.

I do get suspicious when I look at a digital mixdown of 19KHz and 20Khz sinewaves that were created at 44.1KHz.  There are so few sample points and yet as you say cooledit manages to create a realistic graphical interpolation (with this relatively simple waveform). 

In contrast, when I look at 19KHz and 20KHz sinewaves created at 96KHz and mixed digitally in cooledit, there are so many more sample points in the mixdown that sophisticated interpolation would not be necessary: you could simply join the dots with the most basic form of integration (a resistor and capacitor).  The undulations in overall amplitude at a rate of 1KHz appear to be relatively smooth, at this higher sampling rate.  I could readily imagine this undulating signal surviving, despite the addition of other high frequency signals into the digital mix each needing to be 'interpolated'.
Title: Resampling down to 44.1KHz
Post by: greynol on 2008-05-19 18:25:49
Reconstruction using a sinc pulse at every sample is perfect (ignoring quantization error and possible distortion at the edges) so long as the original signal is BW limited to half the sample rate.  I am pretty sure this is exactly what cool edit and adobe audition are doing with their graphical representation.  The software isn't Spice; it doesn't care about resistors and capacitors.

This is all that needs to be said.  The number of sample points used is extraneous and therefore irrelevant.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-20 08:38:10
I do get suspicious when I look at a digital mixdown of 19KHz and 20Khz sinewaves that were created at 44.1KHz.  There are so few sample points and yet as you say cooledit manages to create a realistic graphical interpolation (with this relatively simple waveform).
There's really no reason to get suspicious, as there is EXACTLY ONE WAY to fill in the missing samples; there's NO ambiguity. You insert an arbitrary number of null samples between the actual samples, then apply a digital lowpass filter which eliminates any frequencies at and above the original Nyquist frequency. Well, the results may vary depending on the filter design quality, but the principle is the same. If you have top quality filters, you are able to almost perfectly reconstruct any signal present in a 44.1kHz digital waveform when going into the continuous-time domain (analogue signal). And this holds vice versa as well (going from analogue to digital), as pointed out in my previous post.
In contrast, when I look at 19KHz and 20KHz sinewaves created at 96KHz and mixed digitally in cooledit, there are so many more sample points in the mixdown that sophisticated interpolation would not be necessary: you could simply join the dots with the most basic form of integration (a resistor and capacitor).  The undulations in overall amplitude at a rate of 1KHz appear to be relatively smooth, at this higher sampling rate.  I could readily imagine this undulating signal surviving, despite the addition of other high frequency signals into the digital mix each needing to be 'interpolated'.

There is no "sophisticated" interpolation involved. I do not call lowpass filtering a sophisticated method. Well, perhaps the filter design itself might be "sophisticated" but the reconstruction process is not.
The samples that are present in the 96kHz wave and not in the 44.1kHz one are simply redundant and bring no additional information at all since they can be easily (and almost perfectly, considering the digital filtering limits) recalculated.
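
A sketch of recalculating the "missing" points for the 19 kHz plus 20 kHz mix at 44.1 kHz. Two assumptions to flag: the example uses a periodic sinc (Dirichlet) kernel instead of a practical finite lowpass filter, which works here only because this particular mix repeats exactly every 441 samples, and it says nothing about how Cool Edit or any real DAC is actually implemented.

```python
import math

FS = 44100.0   # sample rate in Hz
N = 441        # 10 ms: exactly 190 cycles of 19 kHz and 200 cycles of 20 kHz


def signal(t):
    """The 'analogue' original: equal parts 19 kHz and 20 kHz."""
    return (0.5 * math.sin(2 * math.pi * 19000 * t)
            + 0.5 * math.sin(2 * math.pi * 20000 * t))


def periodic_sinc(u, n):
    """Sinc pulse for a signal repeating every n samples (n odd, |u| < n); u is in samples."""
    denom = n * math.sin(math.pi * u / n)
    if abs(denom) < 1e-12:
        return 1.0   # limit value at the centre of the pulse
    return math.sin(math.pi * u) / denom


def reconstruct(samples, t):
    """Value at time t (seconds): one sinc pulse per stored sample, all added up."""
    u = t * FS
    return sum(s * periodic_sinc(u - i, len(samples)) for i, s in enumerate(samples))


samples = [signal(i / FS) for i in range(N)]

# Compare the reconstruction with the original at points between the stored samples.
worst = 0.0
for i in range(0, N, 7):
    for frac in (0.25, 0.5, 0.75):
        t = (i + frac) / FS
        worst = max(worst, abs(reconstruct(samples, t) - signal(t)))
print(worst)   # rounding error only
```

Barely more than two samples per cycle is enough here because nothing is being guessed: the in-between values are fully determined by the stored samples and the bandlimit.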
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-20 12:55:35
To me, it sounds like you still haven't read the relevant threads in the FAQ.
...the subject of there being very few sample points per cycle of a high frequency waveform is covered well in them.


This thread is like deja vu!

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-20 23:51:37
So that's one positive ABX result traced to faulty/poor equipment. I just need to convince MLXXX to look in the same direction, and we might get a sane conclusion to this discussion.

Just for the record, I find this type of wording mildly offensive, despite the habitual 'Cheers' tag.

[Will upload my sample clip if possible within the next 24 hours.]
There is an unresolved issue over whether the particular audio clip would meet Forum guidelines.  It may be that I will be unable to upload the extract as a test clip.  In that case, I guess I'll have to try to find another one that at least prima facie sounds different at 48KHz rather than 96KHz.  If it can be established the difference is simply due to deficiencies in the playback chain then so be it.  However that might still be a significant result if playback equipment generally available cannot play back well at 48KHz, despite theory indicating 48KHz should be sufficient.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-21 08:00:15
However that might still be a significant result if playback equipment generally available cannot play back well at 48KHz, despite theory indicating 48KHz should be sufficient.
A more likely conclusion is that the equipment works fine at 48kHz, and adds additional distortion when playing material with content above 20kHz. While it would be difficult to rate which one works "better", good recordings of the sounds as played will answer the question of which is more accurate. It wouldn't surprise me if the 48kHz version were more accurate.
Title: Resampling down to 44.1KHz
Post by: greynol on 2008-05-21 08:43:52
If we're talking sound cards, the problem is with playback at 44.1kHz whether it be through the analog out or the digital out.

Does Creative Labs even make a card that doesn't re-sample when fed a 44.1kHz signal???

Is it me or is this thread becoming increasingly tedious?
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-21 09:46:47
This thread started with the downsampling to 44.1KHz question.  But if downsampling to even 48KHz is a problem, 44.1KHz would be even more so.

If anyone can upload a clip of 96KHz audio that is apparently impaired when downsampled even to 48KHz, that might momentarily rescue this thread from tedium, for some participants anyway. 
Title: Resampling down to 44.1KHz
Post by: SebastianG on 2008-05-21 10:14:24
I do get suspicious when I look at a digital mixdown of 19KHz and 20Khz sinewaves that were created at 44.1KHz.  There are so few sample points and yet as you say cooledit manages to create a realistic graphical interpolation (with this relatively simple waveform).

In contrast, when I look at 19KHz and 20KHz sinewaves created at 96KHz and mixed digitally in cooledit, there are so many more sample points in the mixdown that sophisticated interpolation would not be necessary

Sounds like you still think that downmixing makes reconstruction somewhat harder. As 2B already pointed out all of the following operations are linear:
(1) sampling
(2) mixing
(3) reconstruction
It follows that
Code: [Select]
reconstruct(sample(x)) + reconstruct(sample(y))
= reconstruct(sample(x) + sample(y))
= reconstruct(sample(x + y))
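
The chain of equalities above can be checked numerically. What follows is a minimal sketch that was not part of the original post: the 44.1 kHz rate, the 512-sample window, the 19 kHz and 20 kHz test tones and the helper names are all assumptions chosen for illustration. Because the sinc sum is truncated, the reconstruction is only approximate, but it is still a linear operation, so the three expressions agree to within floating-point rounding.

```python
import math

FS = 44100.0  # sample rate in Hz (assumed for illustration)
N = 512       # number of samples kept; finite, so reconstruction is approximate


def sample(signal, n_samples=N, fs=FS):
    """Evaluate a continuous-time signal at the sampling instants n / fs."""
    return [signal(n / fs) for n in range(n_samples)]


def sinc(x):
    """Normalised sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t, fs=FS):
    """Truncated Whittaker-Shannon (sinc) interpolation evaluated at time t."""
    return sum(s * sinc(t * fs - n) for n, s in enumerate(samples))


def x(t):
    return 0.3 * math.sin(2 * math.pi * 19000 * t)  # 19 kHz tone


def y(t):
    return 0.3 * math.sin(2 * math.pi * 20000 * t)  # 20 kHz tone


def mix(t):
    return x(t) + y(t)  # analogue-domain mix


t = 255.37 / FS  # an instant that falls between two sample points

a = reconstruct(sample(x), t) + reconstruct(sample(y), t)
b = reconstruct([p + q for p, q in zip(sample(x), sample(y))], t)
c = reconstruct(sample(mix), t)

print(a, b, c)  # the three values differ only by rounding error
```

The point of the sketch is that mixing before or after sampling, or after reconstruction, gives the same result, so a digital mixdown is no harder to reconstruct than its individual components.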


[at a higher sampling rate] you could simply join the dots

So? Relevance? Seriously. Grab a good DSP book that explains sampling and reconstruction. Everything that's been said here has been said many, many times before.

Quote
This thread started with the downsampling to 44.1KHz question. But if downsampling to even 48KHz is a problem, 44.1KHz would be even more so.

There's no problem with downsampling to 44.1 kHz or 48 kHz (in theory). Reconstruction is also not a problem (in theory). There are simply soundcards out there that manage to screw up reconstruction. That's about it.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-21 10:28:21
SebastianG, thanks for taking the time to restate this, with precision.  However to me it was a side issue.  I only mentioned it in response to a post of Martel's [#64].  I may at some stage in my life try to immerse myself in the mathematics that you and others obviously understand so well.

At this point I would like to cut to the chase and ascertain whether there exists a section of a recording of music that people claim is impaired when downsampled to even 48KHz.

If no-one can provide such a clip, and if theory explains why this is so, then I can rest easy when purchasing material that is at 48KHz rather than 96KHz.  And I could make my own recordings of musical performances with confidence at only 48KHz.
Title: Resampling down to 44.1KHz
Post by: SebastianG on 2008-05-21 11:32:18
At this point I would like to cut to the chase and ascertain whether there exists a section of a recording of music that people claim is impaired when downsampled to even 48KHz.

If no-one can provide such a clip, and if theory explains why this is so, then I can rest easy when purchasing material that is at 48KHz rather than 96KHz.  And I could make my own recordings of musical performances with confidence at only 48KHz.

By "impaired" you probably meant "perceptually different".

I can't think of any reason why it should be perceptually different given a good reconstruction of both versions (48kHz versus 96kHz) simply because our human ears don't pick up ultrasonics -- at least to the best of our knowledge. This is fairly easy to test with pure tones (sine oscillator). But when people try to verify this with "normal music" instead, possible reconstruction errors (aliasing or nonlinear distortions which lead to intermodulation) might make ultrasonic frequencies indirectly audible by polluting the audible spectrum. So, it's very likely that when people succeed in ABXing 48kHz versus 96kHz, something's wrong with the whole reconstruction process (from digital to air pressure). Then, the reconstructed 48kHz version could be even closer to the "original" in terms of perception.
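
The intermodulation mechanism described above can be illustrated with a small simulation. This is a hedged sketch, not a model of any real soundcard or tweeter: the 25 kHz and 26 kHz tones, the 192 kHz rate and the second-order term `k2 = 0.1` are assumptions picked so that the arithmetic is easy to follow. Two purely ultrasonic tones pass through a slightly nonlinear stage, and a difference tone appears at 1 kHz, well inside the audible band.

```python
import cmath
import math

FS = 192000  # sample rate in Hz (assumed)
N = 1920     # 10 ms of signal, so every tone below sits on an exact 100 Hz bin


def tone_amplitude(samples, freq_hz, fs=FS):
    """Amplitude of one sinusoidal component, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / fs)
              for n, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


def two_tones(f1_hz, f2_hz, amp=0.5):
    """Sum of two equal-amplitude sine tones."""
    return [amp * (math.sin(2 * math.pi * f1_hz * n / FS)
                   + math.sin(2 * math.pi * f2_hz * n / FS))
            for n in range(N)]


def nonlinear(samples, k2=0.1):
    """Hypothetical playback stage with a small second-order distortion term."""
    return [s + k2 * s * s for s in samples]


clean = two_tones(25000, 26000)  # ultrasonic content only
distorted = nonlinear(clean)

clean_1k = tone_amplitude(clean, 1000)
distorted_1k = tone_amplitude(distorted, 1000)

print("1 kHz component, linear chain:    %.6f" % clean_1k)
print("1 kHz component, nonlinear chain: %.6f" % distorted_1k)
```

With these assumed values the difference tone has amplitude `k2 * amp**2`, which is what the single-bin measurement recovers. A 48 kHz version of the same material would contain neither ultrasonic tone, so the nonlinear stage would have nothing to intermodulate.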

IIRC, this is what 2B has said already.

Cheers,
SG
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-21 11:43:12

So that's one positive ABX result traced to faulty/poor equipment. I just need to convince MLXXX to look in the same direction, and we might get a sane conclusion to this discussion.

Just for the record, I find this type of wording mildly offensive, despite the habitual 'Cheers' tag.
MLXXX, there have been posts where I have been too harsh with you. I apologise. Please let me try to explain where I am coming from, and where my frustration at this discussion comes from!

A lot of people over the years have arrived at Hydrogenaudio, stated they have almost no knowledge or understanding of the subject, yet they feel sure that they have discovered some problem with some aspect of audio that has been missed by the entire audio industry.

The icing on the cake is that the "problem" is probably due to faulty hardware or software, but rather than investigating this possibility to rule it in or out, they prefer to embark on a discussion which implies "every mathematician and engineer who ever proved how this works was an idiot" or more simply "Nyquist was wrong", while saying they have no interest in acquiring the knowledge and understanding that they lack.


That's the offensive part - the "I know nothing about this, but I'm sure all these engineers and mathematicians were wrong". The people probably don't know enough about it to realise that's what they're implying - but that's exactly what they're implying. That's what's offensive - "engineering, science, maths, theory - pah - load of junk - the output of my Creative soundcard proves it's all wrong!". Actually, stated like that, it's not offensive, just funny.

You can find such threads in the FAQ; I hoped you'd see the parallels between them, and your own.



The thing that worries me the most is the kind of rubbish that plagues boards like Audio Asylum, where any problems that might exist in audio will never be solved because there's no acceptance of the basic science behind it. I really don't want to see that kind of thing at Hydrogenaudio.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-21 12:37:32
@MLXXX: You seem to want someone to prove to you that there is never a problem when downsampling 96 kHz to 48 kHz/44.1 kHz, but such a thing cannot be proven. It is only possible to prove that something does exist, not that it doesn't exist.

However, consider this. You have come to the HydrogenAudio forum, the place where all of the top experts in the field post regularly. The combined experience of these folks is hundreds if not thousands of years. So far nobody has come forward to say that they know of a case where downsampling resulted in an audible difference where it was not eventually shown to be a hardware problem. Can't you accept this as sufficient evidence that your worries are not justified?
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-21 12:58:23
2B, yes I can understand that some of my posts may have irritated you and others for the reason that they may have created the impression that I thought that sophisticated mathematics can readily be disproven with some simple tests in a domestic environment with unsophisticated equipment.  That is not where I am coming from.

I do not for a moment question classic Nyquist Shannon concepts.  However it is not readily apparent to me how those concepts apply precisely to the human listening experience. It seems to be accepted that the upper frequency continuous sinewave response limit of the human ear (up to about 20KHz) is the relevant bandwidth limit.  I have to accept on blind faith that that is all that is relevant and sufficient for the human listening experience. 


I've recently spent many hours reading quite a few threads on the sampling rate topic and I would have to say the HA threads are way above the average standard for myself as a reader with an interest in the science (even if I do not fully understand it) as well as broader subjective comments.

Significantly, in no threads have I seen any upload of a file with a higher sample rate such as 96KHz claimed to be reduced in perceptible quality if played back after downsampling to a lower rate such as 48Khz.

That seems extraordinary to me.  If 96KHz can sound better as is claimed in so many threads, where is the concrete illustration of the claim???

The absence of such uploads to me significantly weakens the credibility of those who claim superiority of 96KHz per se.  [I exclude matters such as the fact that certain DSP operations may be more easily and accurately implemented with some software at 96KHz, e.g. a graphic equalizer function.]

However, consider this. You have come to the HydrogenAudio forum, the place where all of the top experts in the field post regularly. The combined experience of these folks is hundreds if not thousands of years. So far nobody has come forward to say that they know of a case where downsampling resulted in an audible difference where it was not eventually shown to be a hardware problem. Can't you accept this as sufficient evidence that your worries are not justified?

I have only just read this post of yours pdq, so am adding the following as an edit.

Indeed I think this is the thing.  If no-one can identify an instance where 96Khz sampled music sounds superior to 48Khz to the human ear, that really is damning for the 96KHz proponents.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-21 13:36:50
2B, yes I can understand that some of my posts may have irritated you and others for the reason that they may have created the impression that I thought that sophisticated mathematics can readily be disproven with some simple tests in a domestic environment with unsophisticated equipment.  That is not where I am coming from.

I do not for a moment question classic Nyquist Shannon concepts.  However it is not readily apparent to me how those concepts apply precisely to the human listening experience. It seems to be accepted that the upper frequency continuous sinewave response limit of the human ear (up to about 20KHz) is the relevant bandwidth limit.  I have to accept on blind faith that that is all that is relevant and sufficient for the human listening experience. 


I've recently spent many hours reading quite a few threads on the sampling rate topic and I would have to say the HA threads are way above the average standard for myself as a reader with an interest in the science (even if I do not fully understand it) as well as broader subjective comments.

Significantly, in no threads have I seen any upload of a file with a higher sample rate such as 96KHz claimed to be reduced in perceptible quality if played back after downsampling to a lower rate such as 48Khz.

That seems extraordinary to me.  If 96KHz can sound better as is claimed in so many threads, where is the concrete illustration of the claim???

The absence of such uploads to me significantly weakens the credibility of those who claim superiority of 96KHz per se.  [I exclude matters such as the fact that certain DSP operations may be more easily and accurately implemented with some software at 96KHz, e.g. a graphic equalizer function.]


However, consider this. You have come to the HydrogenAudio forum, the place where all of the top experts in the field post regularly. The combined experience of these folks is hundreds if not thousands of years. So far nobody has come forward to say that they know of a case where downsampling resulted in an audible difference where it was not eventually shown to be a hardware problem. Can't you accept this as sufficient evidence that your worries are not justified?

I have only just read this post of yours pdq, so am adding the following as an edit.

Indeed I think this is the thing.  If no-one can identify an instance where 96Khz sampled music sounds superior to 48Khz to the human ear, that really is damning for the 96KHz proponents.

Please, let this end already... 
Someone might claim superiority of 96 kHz over 44.1 kHz, but in reality this is mostly NOT based upon the capabilities of the format itself, only upon a lame implementation of the playback chain, unfounded rumors, or a general feeling that GREATER = BETTER. There is absolutely no guarantee that 96 kHz equipment playing 96 kHz material will sound better than 44.1 kHz equipment. There is, perhaps, just a higher probability that equipment playing 44.1 kHz content will screw something up because of poor/cheap playback chain design. So by going 96 kHz you are more likely to avoid (audible) issues caused by poor filter design.
If you do not trust your hardware, please go ahead and convert everything to 96kHz using SSRC, so you may find peace having "enough" samples per signal period.
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-05-21 13:53:02
If no-one can identify an instance where 96Khz sampled music sounds superior to 48Khz to the human ear, that really is damning for the 96KHz proponents.
I applaud your persistence, especially in this forum full of sceptics. Let's continue the search for a killer sample where the difference is obvious. I would be happy to record and host some (free) samples up to 24/192 kHz. Any suggestions ?
However, consider this. You have come to the HydrogenAudio forum, the place where all of the top experts in the field post regularly. The combined experience of these folks is hundreds if not thousands of years. So far nobody has come forward to say that they know of a case where downsampling resulted in an audible difference where it was not eventually shown to be a hardware problem. Can't you accept this as sufficient evidence that your worries are not justified?
Mind you, not "all of the top experts" are HA members. I find this kind of reasoning rather deceiving and even intimidating. Can't we just encourage curious people like MLXXX to perform tests and discuss the best ways to do so ? Thousands of audio professionals are moving to hi-res audio. They could all be wrong and wasting money and bandwidth. It can also be a motivation to search for (not necessarily perceptual) reasons why they prefer hi-res audio.
Title: Resampling down to 44.1KHz
Post by: pdq on 2008-05-21 13:57:56
It seems to be accepted that the upper frequency continuous sinewave response limit of the human ear (up to about 20KHz) is the relevant bandwidth limit.  I have to accept on blind faith that that is all that is relevant and sufficient for the human listening experience.

Just one more correction and then I think we can lay this topic to rest.

The relevant bandwidth limit is not the ability to hear continuous sinewaves, unless one is in the habit of listening to high frequency sinewaves. The ability to hear high frequencies in real music, even very synthetic music, is significantly lower. Being able to hear the difference after music has been lowpassed at about 16 to 17 kHz is actually quite rare, although I think there have been some verified cases. Recently someone with admittedly very unusual hearing claimed to hear much higher, but I don't recall that this was ever verified.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-21 14:52:08
Let's continue the search for a killer sample where the difference is obvious. I would be happy to record and host some (free) samples up to 24/192 kHz. Any suggestions ?
Thx for your kind remarks.

I suspect that even with a killer sample, the effect might not be all that obvious.

The only suggestion I have and it is one that would only apply where a large number of string players were available (and perhaps playing at a very high standard!) is a recording made with extended range microphone(s) of the violin section of an orchestra.*

As an easier alternative, perhaps people who have in their possession some high definition recordings [recent era; 96Khz+] might be inclined to downsample one or two tracks to 48Khz [44.1Khz could be  problematic for other reasons as has been mentioned in this thread] and compare the listening experience to the original sample rate.

There was one website I encountered (Mytek Digital) which had samples of different analogue to digital converters operating at the same sampling rate (192KHz) that had apparently processed the same performance of music.  The website invited visitors to compare the digital versions.  I could hear differences between the ADCs (which rather surprised me), but I could not hear any differences from converting the sampling from the various ADCs down to 48Khz using Audition 3.  The music genre was jazz.

___________

* In the recording I referred to in post #63, the harmony between string sections was sweeter and more fluid on my AVR at 96KHz than at 48KHz.  The effect was very subtle and very possibly due to hardware issues, but of a handful of high definition recordings I have evaluated it is the only one where I found I could hear a difference.  Subjectively it was similar to the difference between 24 bits and 24 bits truncated to 16 bits, i.e. a very subtle difference to do with the smoothness of the sound.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-21 15:19:27
Mind you, not "all of the top experts" are HA members.
Clearly not, but I think you'd be amazed at some of the people who are (anonymously). You can catch more of them on various mailing lists, should you want to.

Quote
I find this kind of reasoning rather deceiving and even intimidating. Can't we just encourage curious people like MLXXX to perform tests and discuss the best ways to do so ? Thousands of audio professionals are moving to hi-res audio. They could all be wrong and wasting money and bandwidth. It can also be a motivation to search for (not necessarily perceptual) reasons why they prefer hi-res audio.
I don't discourage the investigation - investigation is good. There are, however, clear caveats we can't ignore - otherwise it's a pretty meaningless investigation.

As has been said, we've done that part to death now, so I shall shut up on that.

I think the other samples on PCABX are a good place to start.
http://64.41.69.21/technical/sample_rates/index.htm (http://64.41.69.21/technical/sample_rates/index.htm)
Pity there isn't a closely mic'd trumpet.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-05-21 16:40:19
SebastianG, thanks for taking the time to restate this, with precision.  However to me it was a side issue.  I only mentioned it in response to a post of Martel's [#64].  I may at some stage in my life try to immerse myself in the mathematics that you and others obviously understand so well.

At this point I would like to cut to the chase and ascertain whether there exists a section of a recording of music that people claim is impaired when downsampled to even 48KHz.

If no-one can provide such a clip, and if theory explains why this is so, then I can rest easy when purchasing material that is at 48KHz rather than 96KHz.  And I could make my own recordings of musical performances with confidence at only 48KHz.



For recording, why not just split the diff and record at 88.2/24bit?  That's an even multiple SR of 44.1, computationally a snap if you need to downsample to CD rate.  And it's well above even the ~60kHz 'safety' rate proposed by Lavry and others for surmounting any real or theoretical problems with suboptimal antialias and anti-image filters.  At 88.2 you should have no pangs of anxiety (even though I think it's way overkill).
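
To make the "even multiple" remark concrete, the conversion ratio to 44.1 kHz can be reduced to lowest terms. This is a small illustrative sketch; the function name `resample_ratio` is made up for the example. A ratio of 1/2 means the converter only has to low-pass and drop every second sample, whereas 96 kHz needs a rational resampler (interpolate by the numerator, decimate by the denominator).

```python
from fractions import Fraction


def resample_ratio(source_hz, target_hz):
    """Target/source rate as a fraction in lowest terms (interpolate / decimate)."""
    return Fraction(target_hz, source_hz)


for source in (88200, 96000, 176400, 192000):
    ratio = resample_ratio(source, 44100)
    print("%6d Hz -> 44100 Hz: interpolate by %d, decimate by %d"
          % (source, ratio.numerator, ratio.denominator))
```

Either case is handled routinely by decent resampling software; the integer ratio is simply the cheaper of the two.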


Quote
I do not for a moment question classic Nyquist Shannon concepts. However it is not readily apparent to me how those concepts apply precisely to the human listening experience. It seems to be accepted that the upper frequency continuous sinewave response limit of the human ear (up to about 20KHz) is the relevant bandwidth limit. I have to accept on blind faith that that is all that is relevant and sufficient for the human listening experience.



First, my impression is that understanding the maths behind DSP is really the only way to *truly* understand what's going on (which I do not  claim I do).  As you see some aspects of DSP really are counterintuitive on their face....like the 'few samples at high frequencies' thing.

Second, every attempt so far to argue for the physiological need for higher sample rates in order to produce realistic audio founders at the blind test stage.  Thus proponents have to resort to arguments like:  it's a hypersonic effect that is only detectable by brain imaging! (though the 'effect' curiously seems to last much longer than the stimulus, and requires custom made playback gear) or, some musical instruments have lots of energy above 20kHz! (and some visible light sources have lots of energy in the UV or infrared ranges...so?) or, what about bone conduction?! (what about it? it's a vibration effect that requires the source to be very close to the body).  The only argument with any solid foundation is: 44.1 puts the onus on engineers to make their brickwall filters very good indeed, or to use oversampling, because at 44.1 the cutoff frequency (22.05) is so close to the audible limit.  So shoddy implementation at recording or playback could lead to audible artifacts.
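
A rough way to put numbers on "very good indeed" is Kaiser's rule of thumb for the order of a windowed FIR low-pass filter, N ≈ (A − 7.95) / (2.285 Δω), where A is the stopband attenuation in dB and Δω is the transition width in radians per sample. The sketch below is an illustration only: the 20 kHz passband edge, the stopband starting at Nyquist and the 96 dB target are assumptions, and the estimate applies to one particular digital filter design rather than to the analogue filters in any actual converter.

```python
import math


def kaiser_fir_order(fs_hz, passband_hz, stopband_hz, atten_db):
    """Kaiser's estimate of the FIR order needed for a given transition band."""
    delta_omega = 2.0 * math.pi * (stopband_hz - passband_hz) / fs_hz
    return math.ceil((atten_db - 7.95) / (2.285 * delta_omega))


# Assumed targets: flat to 20 kHz, 96 dB down by the Nyquist frequency.
order_44k1 = kaiser_fir_order(44100, 20000, 22050, 96.0)
order_96k = kaiser_fir_order(96000, 20000, 48000, 96.0)

print("44.1 kHz, 20 kHz -> 22.05 kHz transition: order ~", order_44k1)
print("96 kHz,   20 kHz -> 48 kHz transition:    order ~", order_96k)
```

Under these assumptions the 44.1 kHz filter needs several times as many taps as the 96 kHz one. That is trivial for modern digital hardware, which is why oversampling converters moved the steep filtering into the digital domain, but it shows why a careless implementation has less margin at 44.1 kHz.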

You don't have to accept on 'blind faith' that the ear's passband for sounds transmitted through air extends 'only' up to the mid-20's at very best,  these numbers weren't pulled from thin air, there is a scientific literature on psychoacoustics and the physiology of audition dating back a century.

Quote
The absence of such uploads to me significantly weakens the credibility of those who claim superiority of 96KHz per se


Well, no kidding!      I don't know where you get the impression that '96 kHz per se is superior' is the consensus on HA.org.  I'd say it's quite the opposite.  Of course, once we travel beyond the confines of 'the village' here, and out into the woods of other 'audiophile' forums, then we start to see claims that have more foundation in belief than evidence. 
Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-05-21 17:08:12

Let's continue the search for a killer sample where the difference is obvious. I would be happy to record and host some (free) samples up to 24/192 kHz. Any suggestions ?
Thx for your kind remarks.


As an easier alternative, perhaps people who have in their possession some high definition recordings [recent era; 96Khz+] might be inclined to downsample one or two tracks to 48Khz [44.1Khz could be  problematic for other reasons as has been mentioned in this thread] and compare the listening experience to the original sample rate.



Here is a site that claims to offer the same sample recorded in 96/24 and 44/16. 


http://www.soundkeeperrecordings.com/format.htm (http://www.soundkeeperrecordings.com/format.htm)


(FWIW, the engineer, Barry Diament, spouts a considerable amount of audio woo on Hoffman's board, so take that under advisement)
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-21 17:23:58
You don't have to accept on 'blind faith' that the ear's passband for sounds transmitted through air extends 'only' up to the mid-20's at very best,  these numbers weren't pulled from thin air, there is a scientific literature on psychoacoustics and the physiology of audition dating back a century.

Thx for your various comments krabapple.  On this particular aspect, the point I was trying to make is that although it can be said based on decades of testing that the human ear has a bandwidth of around 20KHz when tested with continuous tones, I am obliged to accept on blind faith that that is all that is required as the bandwidth of a digital reconstruction process.

To many people, the two bandwidths are equivalent and no further analysis is necessary.

I am hesitant as real life audio sources can start and stop abruptly and asynchronously.  We have a perception of the direction of a sound source as well as its pitch and tonal quality.  All of this strikes me as very complex.  It is not clear to me (but has to be accepted with blind faith) that because our ears can only hear a continuous tone up to about 20KHz, a bandwidth of 20KHz in the electronics is sufficient for recording and reproducing music.
Title: Resampling down to 44.1KHz
Post by: sld on 2008-05-21 18:27:34
Don't you think that it is futile to try to force a point home without backing it up with the necessary objective testing data as well as some semblance of knowledge of the mathematics behind longitudinal waveforms and the human ear's response to them?
Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-05-22 05:16:18

You don't have to accept on 'blind faith' that the ear's passband for sounds transmitted through air extends 'only' up to the mid-20's at very best,  these numbers weren't pulled from thin air, there is a scientific literature on psychoacoustics and the physiology of audition dating back a century.

Thx for your various comments krabapple.  On this particular aspect, the point I was trying to make is that although it can be said based on decades of testing that the human ear has a bandwidth of around 20KHz when tested with continuous tones, I am obliged to accept on blind faith that that is all that is required as the bandwidth of a digital reconstruction process.

To many people, the two bandwidths are equivalent and no further analysis is necessary.


Well, if anything, 'test tones' can be *more* useful for discriminating differences than complex samples like music, where psychoacoustic masking effects kick in.


Quote
I am hesitant as real life audio sources can start and stop abruptly and asynchronously.  We have a perception of the direction of a sound source as well as its pitch and tonal quality.  All of this strikes me as very complex.  It is not clear to me (but has to be accepted with blind faith) that because our ears can only hear a continuous tone up to about 20KHz, a bandwidth of 20KHz in the electronics is sufficient for recording and reproducing music.


Again, the documented upper limit of hearing is more like 24, not 20, kHz, but this is for children and exceptional adults.  Typically adult hearing's increasingly degraded from 16 kHz on up, and even at best in our youth, we are always more sensitive to some ranges than others.  That's the way our hearing works, and it makes good evolutionary sense for us to be more sensitive to midrange (speech, vocalization) than to what bats hear. By contrast, the delivered response of CD audio is essentially *flat* to about 20.

The 'it strikes me as complex' is close to an argument from personal incredulity, and the answer is: more reading about how digital audio *works*.  The arguments about transients ('abrupt starts and stops'), phase ('asynchronicity') and directionality have been done to death and tend to devolve back to one side refusing to accept the science and the maths 'on faith', though they haven't really grasped the science and the maths in the first place.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-22 11:49:54
Thx for your various comments krabapple.  On this particular aspect, the point I was trying to make is that although it can be said based on decades of testing that the human ear has a bandwidth of around 20KHz when tested with continuous tones, I am obliged to accept on blind faith that that is all that is required as the bandwidth of a digital reconstruction process.

To many people, the two bandwidths are equivalent and no further analysis is necessary.

I am hesitant as real life audio sources can start and stop abruptly and asynchronously.  We have a perception of the direction of a sound source as well as its pitch and tonal quality.  All of this strikes me as very complex.  It is not clear to me (but has to be accepted with blind faith) that because our ears can only hear a continuous tone up to about 20KHz, a bandwidth of 20KHz in the electronics is sufficient for recording and reproducing music.
You're doing it again - you're assuming that, in the entire history of psychoacoustics, no one tried low pass filtering impulses to check the limit that way; no one tried manipulating interaural level, time, and frequency to determine the effects; and no one tried recording audio at high bandwidth, and checked for the audibility of various low pass filters.

krabapple put it succinctly...
Quote
The 'it strikes me as complex' is close to an argument from personal incredulity, and the answer is: more reading about how digital audio *works*.

...though you're moving into psychoacoustics now. A very fascinating field. Try these:

http://www.amazon.co.uk/Introduction-Psych...e/dp/0125056281 (http://www.amazon.co.uk/Introduction-Psychology-Hearing-Brian-Moore/dp/0125056281)
(the £5 link is a bargain!)
http://www.amazon.co.uk/Psychoacoustics-Mo...595/ref=ed_oe_h (http://www.amazon.co.uk/Psychoacoustics-Models-Springer-Information-Sciences/dp/3540231595/ref=ed_oe_h)
http://www.amazon.co.uk/Hearing-Handbook-P...2980&sr=1-1 (http://www.amazon.co.uk/Hearing-Handbook-Perception-Cognition-Second/dp/0125056265/ref=sr_1_1?ie=UTF8&s=books&qid=1211452980&sr=1-1)

These are probably superseded by more modern publications - it's 10 years since I read them.

Finally, this is far away from what you're looking for, but it's so great at explaining the in-band limits of human hearing wrt audio coding that I had to include it...

http://www.ece.rochester.edu/~gsharma/SPS_...AudioCoding.pdf (http://www.ece.rochester.edu/~gsharma/SPS_Rochester/presentations/JohnstonPerceptualAudioCoding.pdf)

Cheers,
David.

P.S. EDIT: Sampling Theory
http://groups.google.com/group/comp.dsp/ms...hl=en&fwc=1 (http://groups.google.com/group/comp.dsp/msg/e9b6488aef1e2580?hl=en&fwc=1)
Forget the maths if you want - the conclusions themselves are interesting. You'll find some of them quoted in this thread. Doubting them is about as useful as doubting that 2+2=4.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-22 15:24:03

However that might still be a significant result if playback equipment generally available cannot play back well at 48KHz, despite theory indicating 48KHz should be sufficient.
A more likely conclusion is that the equipment works fine at 48kHz, and adds additional distortion when playing material with content above 20kHz. While it would be difficult to rate which one works "better", good recordings of the sounds as played will answer the question of which is more accurate. It wouldn't surprise me if the 48kHz version were more accurate.

Thanks Cabbagerat.  It seems that even if a particular file did seem to sound better when played at one sample rate than another, using a particular playback chain, there could be any number of possible reasons for that outcome.

You're doing it again - you're assuming that, in the entire history of psychoacoustics, no one tried low pass filtering impulses to check the limit that way; no one tried manipulating interaural level, time, and frequency to determine the effects; and no one tried recording audio at high bandwidth, and checked for the audibility of various low pass filters.

2B, I have for years assumed such tests would have been done.

_____________________


Perhaps this thread has reached a natural end, unless there are any actual audio clips at around 96KHz or more that have been identified (and can be linked to, or uploaded ) that appear to sound better at the higher sampling rate than when downsampled to 48Khz. Though if anyone has the courage to identify such a clip, they should be ready for their claim to be challenged!

Thanks again for the various helpful comments,
MLXXX
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-23 08:08:34
Perhaps this thread has reached a natural end, unless there are any actual audio clips at around 96KHz or more that have been identified (and can be linked to, or uploaded ) that appear to sound better at the higher sampling rate than when downsampled to 48Khz. Though if anyone has the courage to identify such a clip, they should be ready for their claim to be challenged!

Man, this is not about courage, this would be about breaking the human body limits! 
And, please, rule out the sampling rate from your considerations, a 96kHz waveform lowpassed at 24 kHz bears exactly the same information as the same waveform downsampled to 48 kHz (if downsampled ideally).
If I were you, I would first "investigate" the possibility of identifying a 24 kHz lowpass filter. Try and start a new thread.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-25 16:10:05
If I were you, I would first "investigate" the possibility of identifying a 24 kHz lowpass filter. Try and start a new thread.
Martel, I've already spent a lot of time in this thread and what I am about to write is relevant to what has gone before.

1st test:
File 1: Audition 3 used to generate a tone of 8333Hz at -20dB @ 192KHz sample rate (single channel).
File 2: Audition 3 used to generate a third harmonic of 8333Hz at -20dB (i.e. a tone at 24999Hz)@ 192KHz.
File 3 created (single channel): file 1 + file 2, @ 192KHz
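
For anyone without Audition, the three files can be approximated in a few lines. This is only a sketch of the idea, not the procedure used above: it assumes "-20dB" means a peak amplitude of 0.1 relative to full scale and that both tones start in phase, and the names `tone`, `file1`, `file2` and `file3` are illustrative.

```python
import math

SAMPLE_RATE = 192_000   # Hz, as in the test files
LEVEL_DB = -20.0        # assumed to mean peak level relative to full scale
F0 = 8333.0             # fundamental, Hz
F3 = 3 * F0             # third harmonic, 24999 Hz

def tone(freq_hz, n_samples, level_db=LEVEL_DB, rate=SAMPLE_RATE):
    """Return n_samples of a sine at freq_hz with peak amplitude set by level_db."""
    amp = 10 ** (level_db / 20)
    return [amp * math.sin(2 * math.pi * freq_hz * n / rate) for n in range(n_samples)]

n = SAMPLE_RATE // 10                           # 0.1 s of audio
file1 = tone(F0, n)                             # 8333 Hz alone
file2 = tone(F3, n)                             # 24999 Hz alone
file3 = [a + b for a, b in zip(file1, file2)]   # sum of the two tones

peak1 = max(abs(s) for s in file1)
peak3 = max(abs(s) for s in file3)
print(f"peak of file 1: {peak1:.4f}, peak of file 3: {peak3:.4f}")
```

With these assumptions the summed file peaks noticeably higher than either tone alone, which is worth keeping in mind when matching levels for an ABX.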

When I attempted to ABX file 1 against file 3, a problem arose: the tweeter, in trying to handle the 24999Hz tone, was not able to reproduce the 8333Hz tone at full amplitude.

[With a microphone at 1m from the tweeter and using an oscilloscope connected to the output of the analogue mixer, the peak to peak voltage was slightly less when playing file 3 compared with file 1.  The waveform shape was different as well.]

After temporarily reducing the amplitude of file 1 by a small amount, the files still sounded different when an ABX was attempted. 

However I was concerned that the tweeter might be creating spurious effects, so I changed the experimental setup.

2nd test:
Stereo file A created with file 1 (8333Hz) as the left channel and file 2 (24999Hz) as the right channel. 
Stereo file B created with file 1 (8333Hz) as the left channel and zero signal for the right channel.

Playback volume of the left speaker was tested with the microphone 1 metre in front of the tweeter and feeding the oscilloscope.  Amplitude of the waveform from the left speaker remained constant whether or not the right channel was playing, i.e. whether file A or file B was played.

At a reasonable listening distance, A and B sounded different (file A seemed louder and a little richer).

I was concerned that the separation of the speakers was so great it was creating a sound field full of peaks and troughs.  The wavelength of 8333Hz is only a little over 4 centimetres.
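
The wavelength figure is easy to check. A minimal sketch, assuming a speed of sound of 343 m/s (dry air at about 20 °C); the function name is illustrative:

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed value for room-temperature air

def wavelength_cm(freq_hz, c=SPEED_OF_SOUND):
    """Wavelength in centimetres of a tone at freq_hz."""
    return 100.0 * c / freq_hz

fundamental_cm = wavelength_cm(8333.0)    # a little over 4 cm
harmonic_cm = wavelength_cm(24999.0)      # one third of that
print(f"8333 Hz: {fundamental_cm:.2f} cm, 24999 Hz: {harmonic_cm:.2f} cm")
```

At these wavelengths, moving an ear or microphone by a centimetre or two is enough to go from a peak to a trough.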

3rd test:
In the interest of science, I moved the front left and front right home theatre speaker enclosures so their sides were touching, and played files A and B in an endless loop.

Even at a distance of 8 meters on axis from the speakers there were very noticeable nodes in the sound field.  As in test 2, file A seemed louder and a little richer.  This was clearcut.  (However it was important not to move as the loop played.)

As a type of control, I created a file 3L, which had the contents of file 3 in the right channel, and nothing in the left channel.  When this was played, the sound field was full of nodes.  This was to be expected.  Our living room is not an anechoic chamber.

Also, with the speakers adjacent, I positioned the microphone about 1.5 metres away and observed the oscilloscope.  The waveform was not perfect but it was very different to a sine wave when one speaker was reproducing the 8333Hz tone and the other the 3rd harmonic.

Conclusions:

Although the third harmonic of a tone at 8333Hz cannot be heard when played by itself (i.e. as a tone of 24999Hz) by adult human beings, it can have an impact on the human listening experience, when the fundamental frequency is also being reproduced by a loudspeaker system in a home environment.

If the 24999Hz tone is absent, the listening experience can be different.  Subjectively (for me) it is slightly less rich.  Also I found that when the harmonic was present, I perceived the pitch as sounding slightly sharper if my ears were fresh, but flatter if I had been ABXing for a while.  [This certainly didn't assist the ABX process!]

The effect was subtle.

Some audio cannot be downsampled to 44.1KHz, or even 48KHz,  without affecting the perceived sound.

Equipment used:

Software: Audition 3, Cooledit 2, foobar
AVR driven from PC with coaxial SPDIF at 192Khz.
Medium price hi-fi speakers [Magnat "Vintage 350", rated 20Hz - 35KHz]
Rode NT1-A microphone
Behringer analogue mixer
Dated oscilloscope

*******************

As it is late, I will not attempt to upload any of the test files.  They are quite easy to generate using cooledit or audition, anyway. [Edit: Stereo test files are now at post #105.]

I imagine that these results are no surprise to many readers, but will surprise some others.

How this type of experiment relates to the proposition that an audio bandwidth of around 20KHz is sufficient for the human listening experience I will leave to others to comment on, if they so wish.

I note that by sending the third harmonic through a separate amplifier, I avoided the issue of intermodulation distortion in the amplifier and the speakers [though not any possible IMD in my own hearing].  I listened at what I'd term a 'moderate' level, certainly not a loud level for listening to music.  The 24999Hz waveform when displayed on the oscilloscope looked quite smooth (a sinusoid) when only it was being played.  Similarly when only the 8333Hz waveform was played, there was a smooth sinusoid.  However when the combined waveform was played through one speaker [or when played with two adjacent speakers each taking a separate frequency], the shape of the waveform altered on the oscilloscope, and the quality of the sound changed slightly for my ears.


ABX results:

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/26 00:20:21

File A: \\star8\shareddocs\sineplus3rdharmonic\8333inleft&8333_3rdharmonicinright@192.wav
File B: \\star8\shareddocs\sineplus3rdharmonic\8333inleftnothinginright@192.wav

00:20:21 : Test started.
00:20:40 : 01/01  50.0%
00:20:52 : 02/02  25.0%
00:23:49 : 03/03  12.5%
00:43:17 : 04/04  6.3%
00:43:57 : 05/05  3.1%
00:44:12 : Test finished.

----------
Total: 5/5 (3.1%)
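
The percentage in the report is the chance of doing at least this well by guessing. A minimal sketch of that calculation, assuming each trial is an independent 50/50 guess; the function name is illustrative and not part of foo_abx:

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

p = abx_p_value(5, 5)
print(f"5/5 by chance: {100 * p:.1f}%")
```

Five trials is a short run; a longer one would make a lucky streak far less likely.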
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-25 18:16:38
It would be better to get some NATURAL sound samples and try to ABX a 24 kHz lowpass applied to them. If the third harmonic had the same amplitude as the base frequency, then it is a very unnatural sound and unlike anything you're going to listen to. The tweeters are usually not dimensioned to take such amplitudes, so be cautious!
Also, try doing a similar test you did with headphones to rule out environment interaction.

If a 24 kHz lowpass degrades your general listening experience, then you can make a clear conclusion - 48 kHz (and less) sampling rate is not enough for you.
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-05-25 19:13:44
It would be better to get some NATURAL sound samples and try to ABX a 24 kHz lowpass applied to them. If the third harmonic had the same amplitude as the base frequency, then it is a very unnatural sound and unlike anything you're going to listen to. The tweeters are usually not dimensioned to take such amplitudes, so be cautious!
Also, try doing a similar test you did with headphones to rule out environment interaction.
Well, yeah - real music would be better, but test signals do tell you something interesting - that it's possible to hear the difference in some cases. This makes the search for other (possibly real music) cases more interesting. I agree that a headphone test would be interesting.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-26 04:06:54
It would be better to get some NATURAL sound samples and try to ABX a 24 kHz lowpass applied to them. If the third harmonic had the same amplitude as the base frequency, then it is a very unnatural sound and unlike anything you're going to listen to.


The problem with that approach is that it would probably be inconclusive.  Loudspeakers differ in their responses to complex waveforms.  A filter to separate out the above 24Khz components from the below 24Khz components would be problematic to implement if an attempt were made to separately amplify and reproduce the above 24KHz components.

The test I did was quite clinical, in creating the harmonic separate to the fundamental.  I note that some natural instruments do create strong odd order harmonics.

I did actually find an orchestral recording that sounded better to my ears at its 96KHz sample rate on my system, and I made reference to it earlier in this thread.  I did not receive permission to upload it, but even if I had been able to upload it, I'm sure opinions would have varied as to why the sound was different.

The differences involved are very small.  I feel I have to move on to other things rather than carry out further amateur 'investigations' into the effects of differing sample rates, with my domestic equipment.  I feel satisfied that for many recordings, 48KHz is quite adequate when tested practically, and I note that this is underpinned by a body of expert opinion that 48KHz is quite adequate in theory; as for example expressed in this thread.

One aspect is that studio microphone design may not attach much importance to frequencies much above 20KHz, as such frequencies have been considered of little relevance, and not worth the design effort, and cost of manufacture.

Cheers.
Title: Resampling down to 44.1KHz
Post by: greynol on 2008-05-26 06:29:26
I did actually find an orchestral recording that sounded better to my ears at its 96KHz sample rate on my system, and I made reference to it earlier in this thread.  I did not receive permission to upload it, but even if I had been able to upload it, I'm sure opinions would have varied as to why the sound was different.

You don't need permission to upload.  Maybe you've exceeded your allotment, in which case delete some of your older stuff or use a third party provider.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-26 13:48:39
Thanks greynol.

A 96KHz extract

I have always found combined strings a good test for audio equipment.  I have come across a recording of an orchestra playing The Earth Overture by Kosuke Yamashita.

The format is 7.1 channel 96KHz 24-bit linear PCM.  (The Blu-ray reference disc has been released by Q-TEC.)

The audio quality is very good.  I found that when I converted a short extract to 48KHz with Audition 3, the quality was reduced slightly (at least as played back by my AVR). In contrast, many other recordings I have experimented with have revealed no apparent (to me) audible differences when downsampled to 48KHz.

The 48KHz version is not quite as smooth sounding.  I find this noticeable in the harmony between the string sections.  With the 96KHz version, the sounds blend such that the strings taking the lower part are less noticeable.  I'll upload a 9 second extract in this post if possible.

And now it is possible:
[attachment=4515:attachment]
[attachment=4522:attachment]

There will no doubt be arguments that any differences are due to the playback equipment and indeed that may be so.  On my equipment I prefer the unconverted version, but the difference is extremely slight, and only noticeable at all if my ears are fresh. (Note: to keep the upload to a manageable file size, only front left and front right of the 7.1 channels have been extracted for this test file.)

I agree that a headphone test would be interesting.

*if sending 8333Hz to one ear and switching 24999Hz on and off for the other ear, there was no change I could perceive, whether the 24999 was on or off.
*if sending the 24999Hz to the same ear as the 8333Hz, the sound volume seemed slightly less and the sound quality was different when the 24999 signal was on.


If anyone wants to experiment with the 8333Hz and 24999Hz tones, be mindful that 8333Hz is an irritating high frequency and is likely to lead to a headache for anyone in earshot, if ABXing is attempted.  The playback volume should be kept at a modest setting.  The 24999Hz tone will not be audible by itself but will stress any tweeter that is attempting to reproduce it - another reason to avoid a high playback volume.

Here are test files in stereo:
[attachment=4519:attachment]
[attachment=4520:attachment]
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-05-26 22:41:36
There will no doubt be arguments that any differences are due to the playback equipment and indeed that may be so.
When I re-convert your 48kHz sample to 96kHz and subtract it from the original, there is almost exclusively energy above 22kHz. The part of the signal in the (presumed) audible band below 22kHz is noise at about 2 LSB@24bit level. This will most likely be swamped by the DAC's internal noise during playback. It seems that if you hear differences between the 48 and 96kHz version, it's caused by the >22kHz part of the signal.
Would you be able to record your speaker's output with a microphone? Needless to say that both need to have sufficient bandwidth for this. It would be interesting to analyse the signal that is hitting your ear.
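
To put the "2 LSB@24bit" residual in perspective, it can be converted to a level relative to full scale. A small sketch, assuming signed samples whose full-scale peak is 2^(bits-1); the helper name is illustrative:

```python
import math

def lsb_to_dbfs(lsb_count, bits):
    """Level in dB relative to full scale of a peak of `lsb_count` LSBs."""
    full_scale = 2 ** (bits - 1)   # peak magnitude of a signed `bits`-bit sample
    return 20 * math.log10(lsb_count / full_scale)

residual_dbfs = lsb_to_dbfs(2, 24)
print(f"2 LSB at 24 bits is about {residual_dbfs:.1f} dBFS")
```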
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-27 02:36:55
My equipment is not suitable for measuring high frequency performance.  The Rode microphone has very low noise but is limited in its response above 20KHz.  I measured the response of the speaker and microphone combination to a sine wave at 25KHz.  It was 15dB down compared with 8333Hz.  (Also, the oscilloscope revealed a sinusoidal response at 25KHz even if the waveform input was changed from a sine wave to a triangular wave!)

If the human listening experience is different when frequencies above 20KHz are allowed to pass through the recording and reproduction chain, this may be because of indirect effects, e.g. instantaneous changes in standing wave patterns in the listening room that an ‘acoustically athletic’ system is able to stimulate rapidly and with fine detail.  I am only speculating. We may perceive the higher frequencies (and perhaps the faster response time to change) indirectly because of the effect on the amplitude and apparent timbre of frequencies we can hear.

As I have indicated, I can perceive a strong third harmonic of an 8333Hz tone indirectly by its effect on the tonal quality I hear.  Of course I cannot hear the third harmonic (24999Hz) if it is presented as a continuous tone, without the fundamental.

By the way, with the particular orchestral sample I have uploaded, I sometimes hear the 96Khz version as a little cleaner (less muddied) than the 48KHz version.  The difference is slight.

I'd be interested in whether others perceive a difference when ABXing (at moderate volume!) the stereo files I provided at the end of post #105.
Title: Resampling down to 44.1KHz
Post by: SebastianG on 2008-05-27 09:20:06
If the human listening experience is different when frequencies above 20KHz are allowed to pass through the recording and reproduction chain, this may be because of indirect effects

say "nonlinear".

Remember the linearity property?
f is linear => f(x) + f(y) = f(x + y)
Consider f to be your sound reproduction system. 'x' would be the 8 kHz tone and 'y' the 24 kHz tone.
Since you can differentiate between f(x) and f(x+y) the system 'f' might be nonlinear.
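
The property can be illustrated numerically. The sketch below uses a made-up system `f(v) = v + k*v*v` (the square term and `k = 0.1` are assumptions chosen only for illustration, not a model of any real amplifier, speaker or ear) and checks what is left over when f(x) + f(y) is subtracted from f(x + y).

```python
import math

RATE = 192_000
N = 192_000   # one second, so every integer frequency fits a whole number of cycles

def tone(freq, amp=0.1):
    return [amp * math.sin(2 * math.pi * freq * n / RATE) for n in range(N)]

def f(v, k):
    """Hypothetical system: a linear term plus a small square term scaled by k."""
    return v + k * v * v

def amplitude_at(signal, freq):
    """Amplitude of the component at `freq`, computed as a single DFT bin."""
    re = sum(s * math.cos(2 * math.pi * freq * n / RATE) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / RATE) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

x = tone(8333)
y = tone(24999)

def additivity_residual(k):
    """f(x + y) - (f(x) + f(y)) sample by sample; all zeros when f is linear."""
    return [f(a + b, k) - (f(a, k) + f(b, k)) for a, b in zip(x, y)]

linear_residual = additivity_residual(0.0)
nonlinear_residual = additivity_residual(0.1)
difference_tone = amplitude_at(nonlinear_residual, 24999 - 8333)
print(f"residual at 16666 Hz with k=0.1: {difference_tone:.6f}")
```

With `k = 0` the residual vanishes. With `k = 0.1` it contains a component at 24999 - 8333 = 16666 Hz, i.e. something inside the audible band that neither tone would produce on its own.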

e.g. instantaneous changes in standing wave patterns in the listening room

huh?!

We may perceive the higher frequencies [...] indirectly because of the effect on the amplitude and apparent timbre of frequencies we can hear. As I have indicated, I can hear a strong third harmonic of an 8333Hz tone indirectly by its effect on the tonal quality I perceive.  Of course I cannot hear the third harmonic (24999Hz) if it is presented as a continuous tone, without the fundamental.

Be careful with the wording here. There's no indication that you can hear the third harmonic. It's probably only its effect on the recording/playback chain. A nonlinear chain would explain the change in amplitude of the fundamental frequency you experienced. This would be an artefact, an imperfection of the chain.

The interesting question is now: Is your playback chain really to blame? If yes, this would mean that what you hear is not really what it's supposed to sound like. You could try to test this by recording the sound you're hearing and analyzing whether the third harmonic somehow affects the audible spectrum measurably.

In the presumably unlikely case your playback system is not distorting the sound you might be on to something. I guess this would be an indication for nonlinear effects in the human auditory system that cause intermodulation or something. This is something that I can't completely rule out but is unlikely and to my knowledge nobody has reported such findings.

Cheers,
SG
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-27 10:10:48
In my 2nd and 3rd tests in post #100, the 24999Hz tone was amplified in a separate channel in the Audio Video Receiver and sent to its own speaker. 

I was nevertheless concerned that the power drain of reproducing the 24999 tone in addition to the 8333 tone might be slightly affecting the performance of the channel carrying the 8333Hz tone,  so in a supplementary test I'll label test 4, I played the stereo test files on an older AVR for the harmonic and a portable cassette recorder for the fundamental.  (I used my Audigy 4 hub to decode the 192KHz SPDIF, and used its line level outputs to feed the independent amplifiers.)

I could still hear a difference when 24999Hz started coming from a hi-fi speaker driven by the old AVR, provided 8333Hz was coming from the portable cassette player.

I conclude from this that either or both of my ears are in fact slightly non-linear for a tone of 8333Hz that has a 24999Hz tone piggy-backed on top.  If others can try out the stereo test files I have provided, they may find that they too have slightly non-linear hearing.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-27 12:47:55
So you say that:
1) If you play back just the 25kHz sine, you hear nothing
2) If you play the 8333 Hz sine in addition, you hear something
3) Now, if you turn off the 25kHz one (which you're not able to hear), you clearly hear SOMETHING ELSE?!!

This is kind of weird.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-05-27 13:03:12
Test 4 is interesting - thank you MLXXX.

It doesn't work for me, but maybe it'll work for someone.

Of course, the Audigy can still come under suspicion, but I have no idea how realistic such an explanation would be. You could re-capture the output and examine it to remove any possible suspicion.

FWIW The definitive test is with two separate signal generators, feeding two separate amplifiers and speakers.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-27 14:29:08
So you say that:
1) If you play back just the 25kHz sine, you hear nothing
2) If you play the 8333 Hz sine in addition, you hear something
3) Now, if you turn off the 25kHz one (which you're not able to hear), you clearly hear SOMETHING ELSE?!!

This is kind of weird.

Yes but the 24999 Hz was strictly synchronised with the 8333 Hz.  On the oscilloscope, the signal from the microphone had a very different waveshape when the 8333 and 24999 from the separate hi-fi speakers combined (the mic was at about 1.5m).  It was not the perfect waveshape that cooledit displays for a fundamental plus third harmonic.  This was, I presume, because the microphone (and to a degree the tweeter) would have struggled to operate at 25KHz.

There may have been a small intermodulation product at 16666Hz (the difference frequency, and the 2nd harmonic) causing the different timbre for my hearing, but I really don't know how non-linearity works with human hearing.

For my ears, the effect was slight, but ABXable.

It doesn't work for me, but maybe it'll work for someone.

Indeed.  Thanks for trying it out.

______________________

EDIT:
Using Google and the search words 'non-linear' and 'ear' I've found a dozen or so relevant sites, though many sources are only fully downloadable for a fee.  Here are a couple of weblinks that refer to difference tones that can be heard because of the non-linearity of human hearing:-

[blockquote]http://www.isvr.soton.ac.uk/SPCG/Tutorial/...-difference.htm (http://www.isvr.soton.ac.uk/SPCG/Tutorial/Tutorial/Tutorial_files/Web-hearing-difference.htm) - note: the embedded test file on this webpage does not appear to be available

http://www.mp3-tech.org/programmer/docs/no...man_hearing.pdf (http://www.mp3-tech.org/programmer/docs/non-linear_human_hearing.pdf) - a more technical article[/blockquote]
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-27 18:01:13
Yes but the 24999 Hz was strictly synchronised with the 8333 Hz.  On the oscilloscope, the signal from the microphone had a very different waveshape when the 8333 and 24999 from the separate hi-fi speakers combined (the mic was at about 1.5m).
It might be strictly synchronized on the soundcard output, but as soon as those signals come from two places far apart, there will always be a phase-shift field. The 8333 Hz signal should have a wavelength of about 4 centimeters (1/8333 s * 330 m/s), so, theoretically (the loudspeaker being an ideal point source of sound), it spawns a concentric spherical field (any sphere centered on the loudspeaker should exhibit the same signal phase). The same holds for the 25 kHz sine coming from the other speaker, but its wavelength is one-third as long (so the imaginary spherical field is more "dense"). By walking around with the microphone, the shape of the recorded wave should vary a lot, depending on the relative phase of the signals at the microphone position. Moreover, wall reflections may actually reinforce or damp each of the signals and make the sound field even more complicated.
You would have to test that in an anechoic chamber but I doubt you have any at your disposal.
Borrow some good headphones and try listening to the mixed signal (both signals in both channels) versus just 8333. If you don't hear the difference like you did with the loudspeakers, you may rule out your ear nonlinearity.
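
To give a feel for how quickly the relative phase changes with position, here is a rough free-field sketch. It assumes two ideal point sources, no reflections and a speed of sound of 343 m/s; the speaker and microphone coordinates are made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def path_difference(mic_xy, speaker_a_xy, speaker_b_xy):
    """How much farther (in metres) the mic is from speaker B than from speaker A."""
    return math.dist(mic_xy, speaker_b_xy) - math.dist(mic_xy, speaker_a_xy)

def harmonic_phase_shift_deg(path_difference_m, harmonic_hz=24999.0, c=SPEED_OF_SOUND):
    """Phase shift (0-360 degrees) of the harmonic caused by its extra path length."""
    cycles = harmonic_hz * path_difference_m / c
    return (360.0 * cycles) % 360.0

# Hypothetical layout: speakers 25 cm apart, mic 1.5 m in front of them.
speaker_a = (-0.125, 0.0)   # plays 8333 Hz
speaker_b = (0.125, 0.0)    # plays 24999 Hz

on_axis_deg = harmonic_phase_shift_deg(path_difference((0.0, 1.5), speaker_a, speaker_b))
off_axis_deg = harmonic_phase_shift_deg(path_difference((0.05, 1.5), speaker_a, speaker_b))
print(f"on axis: {on_axis_deg:.0f} deg, 5 cm off axis: {off_axis_deg:.0f} deg")
```

In this idealised layout a sideways move of only a few centimetres changes the harmonic's phase by a large fraction of a cycle, so the summed waveform seen on the oscilloscope depends strongly on where the microphone sits.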
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-05-28 08:49:35
Martel, I think I have already established that non-linearity applies to my hearing, by using separate loudspeakers and two separate power amplifiers (see my reference to test 4 in my response to SebastianG a few posts above).

If I remain perfectly still whilst the stereo test files are playing end to end, the quality and volume of the sound I hear varies slightly when the 24999Hz wave (from speaker 2) starts.  There appear to be only two explanations for that:
[blockquote](a) I can hear the 24999Hz from speaker 2 directly in its own right (this surely must be discounted as I cannot hear the 24999Hz when it is played in isolation);
(b) I hear the 24999Hz indirectly because of a slight non-linearity in my hearing (a recognised characteristic of human hearing) that allows my ear's response to speaker 1 to be influenced by the sound coming from speaker 2.[/blockquote]

After reading the internet material on the non-linearity of human hearing, I do not find the experimental outcome surprising.  However I would emphasise that the effect was very slight.
Title: Resampling down to 44.1KHz
Post by: Martel on 2008-05-28 09:37:44
After reading the internet material on the non-linearity of human hearing, I do not find the experimental outcome surprising.  However I would emphasise that the effect was very slight.
Do not trust everything you read. The fact that the paper was presented at a conference doesn't mean much. Do not hesitate to confirm the claims using headphones to rule out environment influences. I would also advise analyzing the signal coming out of the DAC directly with the oscilloscope, just to be sure it was as intended.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-06-09 08:45:37
Martel, I did some tests with simultaneous tones around 22KHz and 23Khz.  No difference frequency (e.g. 1KHz) was audible if separate speakers were used.

This contrasts with tones much lower down such as 2500Hz and 3000Hz which for my ears produce a 500Hz difference if I listen carefully, even with a separate speaker for each tone.  [There are many webpages that refer to this phenomenon of a difference frequency being audible at these lower test frequencies.]

What the explanation is for 24999 affecting the hearing of 8333 using separate loudspeakers I cannot say.  Since making my previous post in this thread, I've tested using an entirely separate tone generator and found it is not necessary for the frequency to be exact for the presence of the higher frequency to be detectable.

For my ears, it is not so much a case of 'hearing' the higher tone, as being conscious of it.  The effect on hearing can be a little unpleasant, and not dissimilar to having a cold.  Anyway I am now looking at another aspect: time resolution affecting the downsampling process.

_____________________________________________

MLXXX's 96Khz 'repeating clicks' test file

96/24 is very topical in the home theatre community.  Sound cards are only just now coming on to the market that will enable high-definition sound from Blu-ray or HD DVD to be played back on a Home Theatre PC through HDMI.

"How 'useful' is this?", we might ask, if it cannot be shown that 96Khz is superior to 48Khz.

I thought it relevant to enquire into the ability of human hearing to detect slight differences in timing as between the sound reaching one ear, and the other.  I discovered articles on interaural timing differences, and interaural level differences.  Some articles referred to the capacity to detect differences of tens of microseconds, whereas others referred to differences in the order of milliseconds.

With my own stereo speakers, I discovered I could detect a difference of one sampling interval at 48Khz, i.e. 20.83μS (microseconds), when listening to a click coming from both speakers, if the click was advanced or retarded by one sampling interval. 

With difficulty I could even detect a difference of advancement or retardation of one sampling interval at 96Khz, i.e. 10.42μS.
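
The two intervals, and the difference in path length they correspond to, can be worked out as follows. A minimal sketch; the 343 m/s speed of sound is an assumption and the function names are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def sample_interval_us(rate_hz):
    """Duration of one sampling interval in microseconds."""
    return 1e6 / rate_hz

def path_difference_mm(delay_us, c=SPEED_OF_SOUND):
    """Extra path length in millimetres that delays sound by delay_us microseconds."""
    return c * delay_us * 1e-6 * 1e3

interval_48k = sample_interval_us(48_000)
interval_96k = sample_interval_us(96_000)
shift_96k_mm = path_difference_mm(interval_96k)
print(f"48k: {interval_48k:.2f} us, 96k: {interval_96k:.2f} us "
      f"(about {shift_96k_mm:.1f} mm of path difference)")
```

One sample at 96 kHz is equivalent to one speaker being only a few millimetres closer, which is a reason to keep the head very still during such tests.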

'Aha!', I thought. Now I will be able to create a test to demonstrate that 48KHz is inadequate.  So I ran the 96KHz test file containing the stereo click through a resampling conversion (using Adobe Audition) to 48Khz and played the click back, expecting not to be able to hear the difference of one sampling position between the left and right channels.  But  I was wrong.  It was audible, possibly even more audible than in the original 96KHz file!

How was this possible?  Well, on examining the downsampled file I saw that Adobe Audition had flattened out the waveform if it was displaced by one sampling position.  Cooledit did the same.  So did Audacity. (The resampling algorithm used by N-track did not produce as noticeable a difference.)

The click sounded different at 48Khz than at 96Khz but it depended on whether the click occupied an odd sample number or an even sample number in the 96KHz file, as to how different the click at 48KHz sounded!

It is geekish to listen to single sample clicks.  To make this phenomenon more accessible, I decided to create them in a regular rhythm so that they could be heard as a continuous tone.  I found it easier to write my own code rather than attempt to use the tone generators in Cooledit etc.  I chose to repeat the click every 30 samples at 96Khz, and to alternate the polarity.  This was equivalent to a 1600Hz square wave (with a very low duty cycle).

Knowing that readers of HA tend to be [very] keen on proof, I supply the following code that will generate a test file at 96KHz in three bursts, using GNU Octave as the software.  The middle burst is displaced one sample position compared with the first and third bursts:

Code:
% Creates a 96KHz test file of repeating single-sample clicks in three bursts.
% The middle burst is offset by one sample compared with the other two.  The
% offset can make a difference when subsequently downsampling to 48KHz.
totalsamples = 150000;
c = zeros(totalsamples, 1);
% One 60-sample cycle: a +0.5 click at sample 29 and a -0.5 click at sample 59
% (a click every 30 samples, alternating in polarity).
d = zeros(60, 1);
d(29) = 0.5;
d(59) = -0.5;
% round() keeps these values as doubles; with int16() the product 60*e in the
% loop below could saturate at 32767 in Octave versions that give integer ranges.
wavespace = round(totalsamples / 60);
tranche2 = round(wavespace / 3);
tranche3 = round(wavespace * 2 / 3);
offset = 0;
for e = 1:wavespace - 2
  if (e == tranche2)
    offset = 1;      % middle burst: shift by one sample
  elseif (e == tranche3)
    offset = 0;      % third burst: back to the original alignment
  endif
  f = (60 * e) + offset;
  c(f + (1:60)) = d; % copy one cycle into place
endfor
% The next section adds fadeins and fadeouts.
for z = 1:totalsamples
  v = c(z);
  % To disable the optional sine wave, add a percentage sign at the start of the next line
  v = v + .05 * sin(1.995 * pi * z / 60);
  if (z == 1) || (z == 50000) || (z == 100000)
    fade = 0;        % start of a burst: silence, then fade in
  elseif (z == 46000) || (z == 96000) || (z == 146000)
    fade = -4000;    % end of a burst: fade out, then silence
  endif
  fade = fade + 1;
  fd = abs(fade);
  if (fd < 2000)
    v = 0;
  elseif (fd < 4000)
    v = v * (fd - 2000) / 2000;
  endif
  c(z) = v;
endfor
% Older Octave releases used wavwrite(); recent ones provide audiowrite() instead.
audiowrite('1samplewidthat96KHz---10.42microseconds---clickswithrepetitionrateof1600Hz---Optionalsinewaveat1595Hz.wav', c, 96000, 'BitsPerSample', 16);
disp('***Completed***')


[I find QtOctave convenient as a graphical user interface.  It is available as a free all-inclusive zip download, as a binary for Win32.  The code above runs in about 30 seconds on a 3.0 GHz single core PC using Windows XP.]

For those preferring to download the file direct, here it is as a 96Khz 16-bit wav file:
[attachment=4552:attachment]

Here it is resampled to 48Khz 16-bit (no dither) using Audition at the maximum quality setting:
[attachment=4553:attachment]

There is a wavering quality because of the inclusion of a low level sine wave at approximately 1595Hz. This interacts with the clicks.  The clicks come every 30 samples, and last for 1 sample period, and alternate in polarity.  I find the sine wave acts as a reference level for the ear making it easier to pick that the middle burst is different.  (If desired, in the code, the sine wave can be disabled by using the comment character '%').

You will probably find that with file 2, the middle burst is softer and has a different tonal quality.

For those who like a graphical presentation, here is how Audition displays the files after the conversion to 48KHz:
[attachment=4554:attachment]

The top right-hand graph is without any sine wave added (simply a click every 30 sampling periods at 96KHz, resampled to 48KHz).  The drop in level according to the Audition output meters is about 3.5dB for the middle burst.

So what do I conclude?  I conclude that if a brief click is captured at 96KHz, and then downsampled to 48Khz there may be a change in the tonal quality and/or apparent volume.  An electronic instrument could produce such sounds continuously, just as the Octave software has done.

I would conjecture, though I haven't tried this out on anything other than files of repeating clicks, that if downsampling a sound file from 96Khz to 48Khz, there can be a different result [both in the arithmetic values, and how they sound when reconstructed] if one sample of silence is added at the beginning, prior to commencement of the resampling process.  This would presumably only be the case for brief clicks or other transient content.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-06-09 11:48:54
You won't be surprised to hear that...

a) I can't hear any difference, and
b) I still think there's some other explanation


If you take your 48kHz file, and resample it back to 96kHz, you'll find there's almost no difference in the waveform of the centre burst compared with the first and last bursts.

I'd be interested to know if you hear a difference between the three bursts in that "re-sampled back to 96kHz" version.

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-06-09 12:03:02
For those who like a graphical presentation, here is how Audition displays the files after the conversion to 48kHz:
[attachment=4554:attachment]
The top right-hand graph is without any sine wave added (simply a click every 30 sampling periods at 96kHz, resampled to 48kHz).  The drop in level according to the Audition output meters is about 3.5dB for the middle burst.
Apparently Audition displays the sample values, which isn't an accurate representation of the analog output of the DAC. iZotope RX has an option to display (in red) the reconstructed analog waveform.
[attachment=4555:attachment]
As you can see the difference is gone now. Looking at waveforms can be very instructive, but never forget that they are just an approximation of the DAC's output.
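The gap between sample values and the reconstructed waveform can be reproduced in a few lines. This sketch assumes an ideal band-limited pulse (a sinc) and ideal Whittaker-Shannon reconstruction truncated to a finite window. It is not how Audition or iZotope RX draw their displays, and the function names are made up for the example.

```python
import math


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sampled_pulse(offset, half_width=200):
    """Samples of a band-limited pulse whose true peak of 1.0 sits at
    time `offset`, measured in sample periods."""
    return {n: sinc(n - offset) for n in range(-half_width, half_width + 1)}


def reconstruct(samples, t):
    """Whittaker-Shannon interpolation: waveform value at time t."""
    return sum(v * sinc(t - n) for n, v in samples.items())


aligned = sampled_pulse(0.0)   # peak falls exactly on a sample
shifted = sampled_pulse(0.5)   # peak falls midway between two samples

largest_sample_aligned = max(abs(v) for v in aligned.values())
largest_sample_shifted = max(abs(v) for v in shifted.values())
reconstructed_peak = reconstruct(shifted, 0.5)

print(largest_sample_aligned, largest_sample_shifted, reconstructed_peak)
```

For the shifted pulse the largest stored sample is 2/pi of the true peak (about 3.9dB down), yet interpolating between the samples brings the value at the peak position back to very nearly 1.0. A display that only joins the sample values would show a drop that the reconstructed waveform does not have.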
Title: Resampling down to 44.1KHz
Post by: SebastianG on 2008-06-09 12:38:23
[...] I am now looking at another aspect: time resolution affecting the downsampling process.

btw: I'm sure you'll find plenty of "time resolution" threads on HA.org. Basically people who don't understand the sampling theorem start these kinds of discussions.

With difficulty I could even detect a difference of advancement or retardation of one sampling interval at 96kHz, i.e. 10.42µs.

'Aha!', I thought. Now I will be able to create a test to demonstrate that 48kHz is inadequate.

Where's the connection? It sounds like you think sub-sample delays can't be represented in a discrete-time signal. (A related question in many "time resolution" threads.)
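A quick way to see that a sub-sample delay survives sampling: the sketch below samples a 1600Hz sine at 48kHz after delaying it by one 96kHz sampling interval (about 10.42µs, i.e. half a sample at 48kHz), then recovers the delay from the samples alone. The tone frequency, the one-period analysis window and the variable names are illustrative assumptions, and a single steady sine is of course a much simpler case than a click.

```python
import math

FS = 48000           # sampling rate, Hz
F = 1600.0           # test tone, Hz (exactly 30 samples per period at 48 kHz)
DELAY = 1 / 96000    # half a sample period at 48 kHz, about 10.42 microseconds
N = 30               # analyse exactly one period

# Samples of the delayed sine
x = [math.sin(2 * math.pi * F * (n / FS - DELAY)) for n in range(N)]

# Correlate with undelayed sine and cosine references (a one-bin DFT)
in_phase = sum(x[n] * math.sin(2 * math.pi * F * n / FS) for n in range(N))
quadrature = sum(x[n] * math.cos(2 * math.pi * F * n / FS) for n in range(N))

# sin(a - phi) = sin(a)cos(phi) - cos(a)sin(phi), so the phase lag is:
phase_lag = math.atan2(-quadrature, in_phase)
estimated_delay = phase_lag / (2 * math.pi * F)

print(estimated_delay * 1e6, "microseconds")
```

The delay is far smaller than the 20.83µs spacing between samples, but it shows up as a phase shift spread across all the sample values, which is why it can be read back.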

How was this possible?  Well, on examining the downsampled file I saw that Adobe Audition had flattened out the waveform if it was displaced by one sampling position.  Cooledit did the same.  So did Audacity. (The resampling algorithm used by N-track did not produce as noticeable a difference.)

Does it matter how it looks? I'm sure that Cooledit and Audacity (when zoomed out) only show max and min samples which doesn't tell you the actual max/min of the reconstructed waveform.

[...] I chose to repeat the click every 30 samples at 96kHz, and to alternate the polarity.  This was equivalent to a 1600Hz square wave (with a very low duty cycle).

It's not equivalent (in terms of the harmonics' magnitudes) to a square wave unless you feed a leaky integrator with this.

[...] (no dither) [...]

Why not?

There is a wavering quality because of the inclusion of a low level sine wave at approximately 1595Hz.

What is "wavering quality"?
Let me recap: Pulses at 96000 Hz sampling frequency, alternating polarity, placed 30 samples apart.
This certainly leads to a waveform period of 60 samples => fundamental frequency = 96000/60 = 1600 Hz.
This kind of periodic waveform contains the frequencies 1600 Hz, 1600*3 Hz, 1600*5 Hz, ..., 1600*29 Hz, all with the same amplitude (unlike a square wave, whose harmonics fall off as 1/n). When properly downsampled to 48 kHz it should reduce to the frequencies 1600 Hz, 1600*3 Hz, ..., 1600*13 Hz. The anti-alias filter might attenuate the last one (1600*13 = 20800 Hz) slightly.
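The harmonic content stated above can be checked numerically. The sketch below takes the DFT of one 60-sample period (+1 at sample 0, -1 at sample 30) in plain Python. The helper name dft_magnitude and the unit click height are assumptions for illustration, and the hard 24000 Hz limit stands in for an ideal anti-alias filter.

```python
import cmath

FS = 96000
PERIOD = 60  # two clicks of opposite polarity, 30 samples apart

one_period = [0.0] * PERIOD
one_period[0] = 1.0
one_period[30] = -1.0


def dft_magnitude(x, k):
    """Magnitude of DFT bin k of the sequence x."""
    n_total = len(x)
    return abs(sum(v * cmath.exp(-2j * cmath.pi * k * n / n_total)
                   for n, v in enumerate(x)))


# Bin k corresponds to k * 1600 Hz; look at everything below 48 kHz
magnitudes = {k * FS // PERIOD: round(dft_magnitude(one_period, k), 6)
              for k in range(1, PERIOD // 2)}

present = [f for f, m in magnitudes.items() if m > 0]
survive_at_48k = [f for f in present if f < 24000]

print(present)
print(survive_at_48k)
```

Only the odd multiples of 1600 Hz are non-zero and they all have the same magnitude. Seven of them (1600 Hz up to 20800 Hz) lie below 24 kHz and so remain after downsampling to 48 kHz.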

I find the sine wave acts as a reference level for the ear making it easier to pick that the middle burst is different.

Is it different apart from the way it looks?

You will probably find that with file 2, the middle burst is softer and has a different tonal quality.

What would your explanation be?

Cheers,
SG
Title: Resampling down to 44.1KHz
Post by: cabbagerat on 2008-06-09 14:27:36
MLXXX, can you please record the output of your sound card when playing these samples? Using a microphone to record what is playing through your speakers would be best, but recording the output of the DAC with a loopback cable at 96kHz would probably do as well.


[...] I am now looking at another aspect: time resolution affecting the downsampling process.

btw: I'm sure you'll find plenty of "time resolution" threads on HA.org. Basically people who don't understand the sampling theorem start these kinds of discussions.

To save you the effort of searching, this (http://www.hydrogenaudio.org/forums/index.php?showtopic=49043) is one of the more recent, and more complete threads. Woodinville (amongst others) covers the time resolution thing really well in some posts to that thread.
Title: Resampling down to 44.1KHz
Post by: 2Bdecided on 2008-06-09 15:46:36
To save you the effort of searching, this (http://www.hydrogenaudio.org/forums/index.php?showtopic=49043) is one of the more recent, and more complete threads.
Are you trying to torture us?!

Cheers,
David.
Title: Resampling down to 44.1KHz
Post by: MLXXX on 2008-06-09 20:43:15
... If you take your 48kHz file, and resample it back to 96kHz, you'll find there's almost no difference in the waveform of the centre burst compared with the first and last bursts.

I'd be interested to know if you hear a difference between the three bursts in that "re-sampled back to 96kHz" version. ...
I think so, but I will have to do a proper ABX test and report results.


Apparently Audition displays the sample values, which isn't an accurate representation of the analog output of the DAC. iZotope RX has an option to display (in red) the reconstructed analog waveform. As you can see the difference is gone now. Looking at waveforms can be very instructive, but never forget that they are just an approximation of the DAC's output.
I think Audition does an approximation to a reconstruction.  In contrast, Audacity makes no attempt at reconstruction.  (This evening, I downloaded a trial of iZotope RX but wasn't able to see how to activate an advanced reconstruction display.)

Using an oscilloscope to view the analogue output, there was no drop in apparent level when the samples were offset, so indeed the Audition reconstruction (similar to Cooledit) must only go some distance towards indicating the actual reconstruction in the hardware DAC.



Where's the connection? It sounds like you think sub-sample delays can't be represented in a discrete-time signal. (A related question in many "time resolution" threads.)
Yes, I was doubtful, but am beginning to understand the concept of reconstruction.

There is a wavering quality because of the inclusion of a low level sine wave at approximately 1595Hz.
What is "wavering quality"?
Simply the beat at approximately 5Hz between the 1600Hz click fundamental and the 1595Hz (approx) sine wave.  [For some tests this beat is an unnecessary distraction.  I have been doing further testing with the sine wave off.]

You will probably find that with file 2, the middle burst is softer and has a different tonal quality.
What would your explanation be?
Am still investigating.  There may be a slight fault somewhere in the signal processing.  Certainly the audible differences are very small.

You may be referring to the fact that suggestion can play a surprisingly powerful role in hearing perception.  I recognize that.  It is one reason ABX testing is so important, even when we think we can hear a difference. 



To save you the effort of searching, this (http://www.hydrogenaudio.org/forums/index.php?showtopic=49043) is one of the more recent, and more complete threads. Woodinville (amongst others) covers the time resolution thing really well in some posts to that thread.
Very relevant indeed, that 18-month-old thread. Thanks, cabbagerat.  I'm halfway through it and I think I am beginning to understand how subsample timing is effectively encoded in, and recoverable from, wav files.
Title: Resampling down to 44.1KHz
Post by: AndyH-ha on 2008-06-10 04:03:21
I am pretty sure that the CoolEdit/Audition code produces the theoretically perfect reconstruction display. There is no reference to any particular hardware, you get the same display no matter what soundcard you use. There was some extended discussion of this on the Audiomaster’s forum, should you care to look for it.
http://www.audiomastersforum.net/
Title: Resampling down to 44.1KHz
Post by: Kees de Visser on 2008-06-10 06:40:30
I downloaded a trial of iZotope RX but wasn't able to see how to activate an advanced reconstruction display.
I'm using the Mac version, but presume the PC version to be similar. Go to Preferences, click the Misc tab and enable the option "Show analog waveform".
Quotes from the iZotope RX user manual:
Quote
When digital audio is played back, it gets converted to analog. The peak values in the analog waveform can be larger than the peaks in the digital waveform, leading to "analog clipping" which can be problematic in some cases. When "show analog waveform" is enabled, RX will compute an analog waveform in the background. Any peaks will be highlighted in red on top of the existing digital waveform.
In the Misc tab there's also a "Waveform interpolation order" setting, ranging from 0 to 64.
Quote
If you zoom into the waveform so that individual samples become visible, RX will display an upsampled analog waveform as well as the individual digital samples. The interpolation order controls the quality of upsampling. Higher values yield more accurate analog waveforms at the expense of CPU usage.
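The manual's remark that analog peaks can be larger than the digital peaks has a simple textbook case: a sine at a quarter of the sampling rate whose samples all land 45 degrees away from the crests. The sketch below is only that idealised case, with variable names chosen for illustration. It says nothing about how RX itself computes its display.

```python
import math

# A sine at fs/4, sampled with a 45-degree phase offset, so that no
# sample ever lands on a crest of the underlying waveform.
samples = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(8)]

sample_peak = max(abs(s) for s in samples)   # largest stored value
true_peak = 1.0                              # amplitude of the underlying sine
overshoot_db = 20 * math.log10(true_peak / sample_peak)

print(sample_peak, overshoot_db)
```

Every stored value has magnitude of about 0.707, so if such a file were normalised until its samples just touched full scale, the reconstructed waveform would exceed full scale by about 3dB.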
Title: Resampling down to 44.1KHz
Post by: krabapple on 2008-06-10 07:16:41
I am pretty sure that the CoolEdit/Audition code produces the theoretically perfect reconstruction display. There is no reference to any particular hardware, you get the same display no matter what soundcard you use. There was some extended discussion of this on the Audiomaster’s forum, should you care to look for it.
http://www.audiomastersforum.net/


In corroboration of this, I have compared waveforms of rips to those of ADC recordings (M-Audio 2496) of the same track, using Audition, and they look and measure very nearly the same once peak levels are matched.