Hydrogenaudio Forums

Hydrogenaudio Forum => Listening Tests => Topic started by: 2Bdecided on 2010-07-16 17:07:42

Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-16 17:07:42
Here:

http://www.aes.org/events/128/papers/?ID=2252

Quote
P18-6 Sampling Rate Discrimination: 44.1 kHz vs. 88.2 kHz—Amandine Pras, Catherine Guastavino, McGill University - Montreal, Quebec, Canada
It is currently common practice for sound engineers to record digital music using high-resolution formats, and then down sample the files to 44.1 kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1 kHz and 88.2 kHz with the same analog chain and type of AD-converter. Sixteen expert listeners were asked to compare 3 versions (44.1 kHz, 88.2 kHz, and the 88.2 kHz version down-sampled to 44.1 kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz.
Convention Paper 8101


Was anyone at the presentation? Has anyone bought the paper?

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Dologan on 2010-07-16 18:18:49
Interesting. The fact that the discrimination was mostly limited to the downsampled version would seem to indicate that any audibility issues are with the downsampling procedure rather than the sampling rate itself. There is, however, the case of the orchestral music...

Searching for the paper I also found that the same authors have published a test comparing CD-quality vs MP3:

Quote
Subjective Evaluation of MP3 Compression for Different Musical Genres

Mp3 compression is commonly used to reduce the size of digital music files but introduces a number of potentially audible artifacts, especially at low bitrates. We investigated whether listeners prefer CD quality to mp3 files at various bitrates (96 kb/s to 320 kb/s), and whether this preference is affected by musical genre. Thirteen trained listeners completed an A/B comparison task judging CD quality and compressed files. Listeners significantly preferred CD quality to mp3 files up to 192 kb/s for all musical genres. In addition, we observed a significant effect of expertise (sound engineers vs. musicians) and musical genres (electric vs. acoustic music).


Again, frustratingly little information given about the details of the test...
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2010-07-16 19:33:30
http://coltrane.music.mcgill.ca/MAQ/experiments contains the PowerPoint presentation of the MP3-vs.-CD talk.

(Edit after reading: quite interesting results, but nothing new really, and the results for musicians should be treated with care since there were only 4 of them.)

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-16 19:45:01
Here:

http://www.aes.org/events/128/papers/?ID=2252

Quote
P18-6 Sampling Rate Discrimination: 44.1 kHz vs. 88.2 kHz—Amandine Pras, Catherine Guastavino, McGill University - Montreal, Quebec, Canada
It is currently common practice for sound engineers to record digital music using high-resolution formats, and then down sample the files to 44.1 kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1 kHz and 88.2 kHz with the same analog chain and type of AD-converter. Sixteen expert listeners were asked to compare 3 versions (44.1 kHz, 88.2 kHz, and the 88.2 kHz version down-sampled to 44.1 kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz.
Convention Paper 8101


Was anyone at the presentation? Has anyone bought the paper?


Some friends from here went to the convention and we discussed a number of the papers last Saturday. 

The problems I see with the tests are that they were done rather awkwardly and with the DACs running at different sample rates. Recording with parallel ADCs running at different sample rates introduces more potential for artifacts.

It is easy to show that recording and playing at different sample rates sounds different if the equipment has artifacts that are associated with different sample rates, which is not uncommon.

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.
Title: 44.1 vs 88.2 ABX report at AES
Post by: pbelkner on 2010-07-16 20:01:06
The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Dologan on 2010-07-16 20:03:55
The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.
Title: 44.1 vs 88.2 ABX report at AES
Post by: pbelkner on 2010-07-16 20:19:58
but would avoid any potential issues with the playback hardware.

That's exactly what I mean. Probably you always have some kind of playback hardware (including their issues) ... 
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2010-07-16 20:25:55
Quote
Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz.

So this basically means it's more important to make the DAC high-resolution than to make the ADC high-resolution. Interesting. So pbelkner's question is legitimate. Maybe a 88.2-kHz DAC tends to reproduce a 44.1-kHz recording more faithfully than a 44.1-kHz DAC? Assuming a good upsampling algorithm, of course.

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: Dologan on 2010-07-16 20:30:40
http://coltrane.music.mcgill.ca/MAQ/experiments contains the PowerPoint presentation of the MP3-vs.-CD talk.

Hey, thanks for the link. I tried finding something, but failed.

I guess I'll post a few of the details and observations here so that people don't need to go to the PDF to check them:

- Trained listeners were 9 sound engineers and 4 musicians of age ~28 (SD 5.6 yr)
- LAME used, unknown version. 96, 128, 192, 256 and 320. Alas, no VBR it seems. Pity.
- Genres: Pop, Metal/Rock, "Contemporary", Classical, Opera. <10sec excerpt.
- HQ speaker setup (not headphones). Wonder if it would have made much difference?
- 150 randomized trials (per excerpt? overall? not clear). Pairwise A/B. Testers asked to "prefer" one or the other and then the overall % tested for statistical significance.
- For 256 and 320 preference was 50/50 (so not significant). For 192, 128 and 96 it was 60/40, 75/25 and 80/20, respectively (significant).
- Sound engineers were more likely to prefer the higher quality version than musicians. Electric genres (pop/metal) were more frequently preferred in their HQ version than acoustic ones.
- Order of problems cited in decreasing frequency were: high freq artefacts, general distortion, transient artefacts, stereo image, dynamic range, reverb, background noise.
- No correlation between listening habits and performance.
- In the conclusion it is stated that trained listeners can not discriminate between CD quality and mp3 compression  at 256-320 kb/s, while expert listeners could. Not sure who these "expert listeners" are supposed to be, but probably the test subjects from a referenced paper (Sutherland 2007?) who reportedly could do so even at 320.

I suppose that the results aren't very surprising, really. If VBR had been used, I imagine the threshold bitrate would likely end up being around those obtained by V2, supporting its status as "best value" setting.
Title: 44.1 vs 88.2 ABX report at AES
Post by: benski on 2010-07-16 20:47:34
Yes, exactly.  It is cheaper to design a 192kHz DAC than 44.1kHz DAC because the reconstruction filter can be a lower order.  If you compared two theoretically perfect DACs, 192 vs 44.1, perfect upsampling would make no audible difference.  But at the consumer-end of the market, a 192kHz DAC is likely to have better audio performance characteristics than a 44.1kHz DAC.  How much impact this has in practice is fairly small, and will only decrease as technology becomes cheaper.

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.
Title: 44.1 vs 88.2 ABX report at AES
Post by: pbelkner on 2010-07-16 20:59:49
So this basically means it's more important to make the DAC high-resolution than to make the ADC high-resolution. Interesting. So pbelkner's question is legitimate. Maybe a 88.2-kHz DAC tends to reproduce a 44.1-kHz recording more faithfully than a 44.1-kHz DAC? Assuming a good upsampling algorithm, of course.

My preference is to up-sample 44.1 kHz to 88.2 kHz and 48 kHz to 96 kHz, respectively, depending on the original sample rate, i.e. no fixed rate for up-sampling but an integer multiple of the input sample rate.

My preferred algorithms are the ones provided by the SoX library (http://sox.sourceforge.net/).
Title: 44.1 vs 88.2 ABX report at AES
Post by: krabapple on 2010-07-16 22:23:58
1)  I suggest this thread be focused on the 44.1 vs 88.2 paper.  A separate thread can be about the MP3 study.

2) I bought the paper.  Here's a paraphrase of the methods and results. Note that the test signals were recorded by the authors.


subjects : 15 male, 1 female, all having at least 3 yrs of sound engineering experience, six being pros, ten being students. All but one were musically trained.

 
equipment:  the recording microphones (a pair of Sennheiser MKH 8020) had a FR of  10Hz-60kHz.  Two stereo feeds from the mic preamp (Millennia HV-3D) to two Micstasy ADCs, one set to 44.1/24 the other to 88.2/24 ; then the 44.1/24 digital signal was recorded (at 44.1) on a Sound Devices 744T portable recorder, while the 88.2 output was recorded on a MacBook Pro at 88.2 using Logic Studio software.  The recording diagram also shows that the 44.1 ADC used its internal clock, while the 88.2 ADC's master clock was a Mutec .


test signals:  five musical/instrumental (orchestra, classical guitar, cymbals, voice, violin) recordings by the authors, from live performances at McGill that took place in several halls/rooms with varying dimensions & acoustics.  For use in tests, 5-8 sec excerpts were used, with no processing except fade in/out via Pyramix 6 software.  Care was taken to make the fades the same on the 44.1 and 88.2 examples.  The 88.2 excerpts were also then downsampled to 44.1 via Pyramix software, so there were three sets of signals: native 44.1, native 88.2, and downsampled 88.2-->44.1.


playback:  5 blocks corresponding to the 5 excerpts, 12 trials per block (i.e. all pairwise combinations of the three versions, each presented 4 times, twice in each of the two presentation orders).  Randomized ABX protocol was used.  Listening occurred in an ITU standard room (the Critical Listening Lab of the CIRMMT, Montreal).  Playback hardware was an RME Fireface 800 DAC, a Grace m906 monitor controller, and a Classe CA-5200 stereo amp, feeding a pair of B&W 802D loudspeakers (FR 70Hz-33kHz).  The authors picked the Fireface because "it was the only converter that allowed us to switch sample rates between 44.1 and 88.2 in a respectable amount of time."  To avoid clipping, a 750ms switching interval was employed, set in the user interface, which was Cycling '74's Max/MSP/Jitter software package.  (I'm not quite clear from this how ABX switching was done, though I'm guessing the UI allows it?)  All playback was at 24 bits.
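
For anyone trying to picture the block structure described above, here is a minimal sketch of how one 12-trial block could be generated. The version labels and the function name are made up for illustration, and it assumes X is drawn at random on each trial, which the paraphrase above doesn't actually state.

```python
import itertools
import random

# Hypothetical labels for the three versions of one excerpt.
VERSIONS = ["native_44.1", "native_88.2", "downsampled_44.1"]

def build_block(rng):
    """Return 12 ABX trials: 3 pairs x 2 presentation orders x 2 repeats."""
    trials = []
    for a, b in itertools.combinations(VERSIONS, 2):
        for first, second in ((a, b), (b, a)):
            for _ in range(2):
                # Assumption: the hidden X is chosen at random per trial.
                x = rng.choice((first, second))
                trials.append({"A": first, "B": second, "X": x})
    rng.shuffle(trials)  # randomized presentation within the block
    return trials

block = build_block(random.Random(0))
print(len(block), "trials in one block")
```

With 5 excerpts that gives 5 x 12 = 60 trials per listener, 20 for each of the three format comparisons.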


Results:  Considering cumulative binomial test results ( i.e., for all comparisons of all excerpts),  3/16 individuals achieved significant results (p < 0.05, 2-tailed )  but 'they significantly selected the wrong answer'  (?!).  The other 13 didn't perform better than chance, either individually or as a group. 
But for these  13 subjects as a group, IF one considers each format comparison separately (rather than combining all comparisons)  significant results were observed for 88.2 vs downsampled  (p = .04, 2-tailed) .  For the same group a 'tendency' was observed for 88.2 vs native 44.1 (p = 0.1) .  No significant diff between native 44.1 and downsampled 44.1 (p = .2, I guess this doesn't constitute a 'tendency'). 
Results for the 13 subjects (grouped) were also re-analysed by musical type. Significant: Orchestral excerpt  for 88.2 vs native 44.1 (p=.02).  Classical Guitar and Voice excerpts for 88.2 vs downsampled (p = .004, p= .04). Not significant for any excerpt:  44.1 vs downsampled 44.1

The results of the three subjects who significantly picked the wrong answer were also analyzed further, on a by-format basis (presumably the results of the three subjects were grouped).  Significant:  88.2 vs native 44.1 (p = 0.02); native 44.1 vs downsampled 44.1 (p = 0.02). Not significant: 88.2 vs native 44.1 (p = .15).  On a musical type basis, significant:  Violin excerpt for 88.2 vs 44.1 (p = .006); Guitar and Violin excerpts, 44.1 vs downsampled 44.1 (p = .02 , p = .006).  (I presume these stats are all wrt  RIGHT answers, not WRONG answers, in contrast to the cumulative results, but it's not clear to me if that's true).

Collapsed results of all 16 subjects showed a significant difference for 88.2 vs native 44.1 Orchestral excerpt (p = .01).
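
As a rough illustration of how a two-tailed binomial test can flag listeners who pick the *wrong* answer, here is a small sketch using only the standard library. The function name and the 39-of-60 score are hypothetical; the paper's actual per-subject scores aren't given in the paraphrase above, and the authors' exact procedure may differ.

```python
from math import comb

def binomial_two_tailed(correct, trials, p=0.5):
    """Exact two-tailed binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under pure guessing.
    """
    pmf = [comb(trials, k) * p**k * (1 - p)**(trials - k)
           for k in range(trials + 1)]
    observed = pmf[correct]
    # Small tolerance so that symmetric outcomes are counted despite rounding.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))

# Hypothetical listener: 60 trials overall (5 blocks x 12 trials).
print(binomial_two_tailed(39, 60))  # 39 right
print(binomial_two_tailed(21, 60))  # 39 wrong: same two-tailed p-value
```

Because the test is two-tailed, a listener who is consistently wrong is as far from chance as one who is consistently right, which is presumably how three subjects could be "significant" while selecting the wrong answer.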


All subjects reported (in post-test questionnaires) that it was a very demanding test and they had lots of doubts about their choices. 


I won't summarize the discussion, where the authors try to explain some of the implications and curiosa of the test results.  You should pay yer $20 for that!
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-16 22:25:45
Why upsample? 99.9% of all DACs oversample anyway.

I also don't understand why a 192kHz DAC is supposedly cheaper to build. It is cheaper to build a good sounding 96kHz ADC than a 44.1kHz one, since the latter needs brickwall filtering. But oversampling DACs have quite relaxed filtering requirements already. So why should a 192kHz version be cheaper to build than a 44.1kHz version?

PS

Thank you, krabapple! The data itself shows not a proof but a 'tendency' that the summary is quite fishy...  For anyone claiming that p=0.1 is a 'tendency' that could not be called a baseless offense, I guess.

Have they documented the exact method of downsampling?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Alex B on 2010-07-16 22:33:34
I think the least incorrect way to compare 88.2 kHz and 44.1 kHz would be to record at a higher sample rate, e.g. 176.4 kHz, downsample to 88.2 kHz & 44.1 kHz, and then upsample both versions to yet another sample rate before the listening test, for instance to 192 kHz. This would apply two sample rate conversions to each test sample.
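
To make the conversion chain in this proposal concrete, here is a small sketch that works out the exact interpolate/decimate ratio for each step. It only does the arithmetic; it is not a resampler, and the function name is made up for illustration.

```python
from fractions import Fraction

def resample_ratio(src_hz, dst_hz):
    """Return the exact ratio L/M such that dst = src * L / M."""
    return Fraction(dst_hz, src_hz)

steps = [
    (176400, 88200),   # downsample by an integer factor
    (176400, 44100),   # downsample by an integer factor
    (88200, 192000),   # non-integer upsample
    (44100, 192000),   # non-integer upsample
    (44100, 88200),    # plain 2x upsample, for comparison
]

for src, dst in steps:
    r = resample_ratio(src, dst)
    print(f"{src} -> {dst}: interpolate by {r.numerator}, "
          f"decimate by {r.denominator}")
```

The two downsampling steps are simple integer ratios (1/2 and 1/4), while going to 192 kHz needs 320/147 and 640/147. Whether that matters audibly with a good resampler is exactly what is being debated in this thread.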
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-16 23:21:19
The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.


Agreed. Some audiophiles will start a controversy at this point because they believe that upsampling improves sound quality, even though the music actually remains the same and is merely represented by more samples. Much equipment has been sold based on this belief.  In fact, if you do a proper job of upsampling, nothing changes in the end when you convert it back to audio.

There have been many improper jobs of resampling that have led to misleading audible differences. However, we have long (over a decade) had good resampling software such as Cool Edit Pro and Audition.

The types of issues I'm specifically trying to avoid are situations where there is some quirk in the playback hardware that is audible at one sample rate, but not the other.

If you are doing ABX that involves comparing two files that are at two different sample rates, switching between them can produce, and has produced, transients that are peculiar to one sample rate or the other. For example, if you are switching from A to X and X is B and has a different sample rate than A, the transition may sound different than switching from A to X where X is A and so the sample rate is the same.

Another example might be where the audio interface has slightly different response into a long cable at different sample rates. These sorts of things aren't intentional, don't show up on spec sheets, and may not show up in typical use, but once you set up an ABX testing environment, there you have it!
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-16 23:30:33
I think the least incorrect way to compare 88.2 KHz and 44.1 KHz would be to record at a higher sample rate, e.g. 176.4 kHz, down sample to 88.2 kHz & 44.1 kHz, and then up sample both versions to a yet another sample rate before the listening test, for instance to 192 KHz. This would apply two sample rate conversions to each test sample.


If that floats your boat then it's not worth arguing about.

However, good re sampling, particularly with 24 bit data, can be, and often is, very transparent as long as the sample rates are high enough, e.g. >= 44.1 KHz.

There's no need to balance the number of re sampling steps within reason.

Using 96 KHz as your upper sampling frequency is just as good as 88 KHz and vice versa.

There are audiophile myths about re sampling, involving some perceived need for sample rates that are integer multiples of each other. With good re sampling hardware or software, it is all moot.

Heck even Creative Labs eventually fixed up the re sampling in their SB Live! so that its re sampling was OK, at least for a few passes.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Dologan on 2010-07-16 23:40:08
I think the least incorrect way to compare 88.2 kHz and 44.1 kHz would be to record at a higher sample rate, e.g. 176.4 kHz, downsample to 88.2 kHz & 44.1 kHz, and then upsample both versions to yet another sample rate before the listening test, for instance to 192 kHz. This would apply two sample rate conversions to each test sample.

I believe it would be preferable to minimize the number of unnecessary resamplings in such a test, but in this case I think it might be better to upsample the lower rate one to the higher one (a process which, according to my understanding, should be quite straightforward and transparent, at least for integer multiples) in the interest of keeping the playback hardware constant, for the reasons Arnold described. I don't know much about the internal workings of DACs, but I would be concerned that having one switch quickly and repeatedly between sampling rates as expected in such a test might push them into behaving in a non-ideal and unintended manner, especially if I'm not sure how other components might respond either. Keeping the possible sources of unintended or unpredictable variation at a minimum is an important goal in experiment design, and if you upsample something, at least you can know precisely what went on.


BTW, could some moderator perhaps split the post about the MP3 vs CD test please?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-16 23:47:25
Why upsample? 99.9% of all DACs oversample anyway.


Upsampling and oversampling differ, at least in terms of purpose and implementation.

Quote
I also don't understand why a 192kHz DAC is supposedly cheaper to build. It is cheaper to build a good sounding 96kHz ADC than a 44.1kHz one, since the latter needs brickwall filtering.


Actually, they both need and get brickwall filtering. Some high sample rate DACs have a slow drop in response above 20 KHz, to like maybe 6 dB down at Nyquist. Then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)

Chip speed and real estate are so cheap that people will put features like 192 KHz sample rate into the chip to help make sure it gets bought no matter what. A chip that does not work well at 44.1 is crap in any design engineer's book because that is the bread and butter. If you make it run at 192 then maybe it will sell a few more < $200 surround receivers to ignorant Joe six-pack types who think that anything with bigger numbers is better.

Quote
But oversampling DACs have quite relaxed filtering requirements already. So why should a 192kHz version be cheaper to build than a 44.1kHz version?


In fact what has happened is that 192/24 DAC chips have been under $1.00 in production quantities for years. I've found them in $50 DVD players.  The circuit board space they sit on costs about as much as the chip if not more. You no longer pay a significant premium for > 44 KHz  audio DACs. If you pay a premium, you pay it for improved dynamic range and low distortion. The premium you pay for magnificent performance is far less than it was even 5 years ago.  There is stuff out there with 120 dB converters for less than $200.
Title: 44.1 vs 88.2 ABX report at AES
Post by: benski on 2010-07-17 03:12:39
Actually, they both need and get brickwall filtering. Some high sample rate DACs have a slow drop in response above 20 KHz, to like maybe 6 dB down at Nyquist. Then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)


That's not the way it works.  The reconstruction filter is necessarily analog.  A 192kHz DAC doesn't need 90dB of rejection at 20kHz.  It only needs it at 96kHz!  On a DAC with that sampling rate, the filter is very gentle.  It may hit the -3dB point at 20-25kHz and gently roll down to -90dB at 96kHz.  But the 44.1kHz sampling rate DAC needs to roll down from 20kHz to 22.05 kHz - that's steep!
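
A back-of-the-envelope sketch of the point about steepness: it compares the width of the transition band (20 kHz up to Nyquist) at the two sample rates and the average slope needed to reach a given rejection by Nyquist. The function names are hypothetical, the 90 dB figure is taken from the posts above, and this is a crude average slope, not a filter design; it also ignores oversampling, which moves much of this work into the digital domain.

```python
from math import log2

def transition_octaves(passband_hz, sample_rate_hz):
    """Width of the band between the passband edge and Nyquist, in octaves."""
    nyquist_hz = sample_rate_hz / 2
    return log2(nyquist_hz / passband_hz)

def average_slope_db_per_octave(passband_hz, sample_rate_hz, rejection_db=90.0):
    """Average slope needed to reach rejection_db by the Nyquist frequency."""
    return rejection_db / transition_octaves(passband_hz, sample_rate_hz)

for rate in (44100, 192000):
    octaves = transition_octaves(20000, rate)
    slope = average_slope_db_per_octave(20000, rate)
    print(f"{rate} Hz: {octaves:.2f} octaves of transition band, "
          f"about {slope:.0f} dB/octave on average")
```

At 44.1 kHz the transition band is roughly 0.14 octave wide, against roughly 2.26 octaves at 192 kHz, so the average slope differs by more than a factor of ten.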
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-17 06:39:02
Actually, they both need and get brickwall filtering. Some high sample rate DACs have a slow drop in response above 20 KHz, to like maybe 6 dB down at Nyquist. Then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)


That's not the way it works.


You've denied quite a bit, surely it's not *all* wrong.

Quote
The reconstruction filter is necessarily analog.


Yes, but it is designed for the oversampling frequency. This puts it octaves above the nyquist frequency for the wave being converted.

Quote
A 192khz DAC doesn't need 90dB of rejection at 20kHz.  It only needs it at 96kHz!


I never meant to say that and on review, I didn't say that. I said "They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist." Note, I said nyquist, not 20 KHz. What I did say about 20 KHz is  "Some high sample rate DACs have a slow drop in response above 20 KHz, to like maybe 6 dB down at Nyquist."  Now what I said was a bit in error about the "6 dB down at Nyquist", what I really meant is "6 dB down just a bit below Nyquist".


Quote
On a DAC with that sampling rate, the filter is very gentle.  It may hit the -3dB point at 20-25kHz and gently roll down to -90dB at 96kHz.


It may do a lot of things (like the Wolfson 8741), including exactly what I said.

Quote
But the 44.1kHz sampling rate DAC needs to roll down from 20kHz to 22.05 kHz - that's steep!


Actually, a number of AKM audio DACs are only 6 dB down at Nyquist (22.05 KHz for 44.1 SR). The steep part of the slope is above Nyquist.
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-17 09:09:19
Benski, I just saw that you did not actually write "analog" filter. So the oversampling issue I have brought up does not make sense regarding costs. Oversampling severely limits costs for the analog part of the reconstruction filter, but you were talking about the required complexity of the digital side. And indeed a 192 kHz only DAC (no 44.1 kHz support) could have a much simpler digital filter. The reason I didn't get it was that I did not consider digital filtering "expensive" these days. Still, the misunderstanding was on my side. I hope I could clear that up.
Title: 44.1 vs 88.2 ABX report at AES
Post by: AndyH-ha on 2010-07-17 09:55:45
It seems that what I wrote in the first part of my post here has been addressed since I read the thread, but the second part still might bear consideration.

I’m not following the discussions about brick-wall filters since I am under the impression that the majority of modern audio converters are some version of delta-sigma, which does not use such filters. Actual sample rates are high, 64X and 128X being typical, with digital decimation being used to achieve the final sample rate and prevent aliasing. The front-end analogue filters are described as “rather mild,” far short of what would be required to prevent aliasing in a non-oversampling design.

My main reference is copyright 2000 (Ken Pohlmann’s Principles of Digital Audio, 4th edition), so something might have changed in this regard recently, but I’ve never seen anything about it. Is the discussion about brick-wall filters aimed at a special market or do I have some basic misunderstanding on this topic?

I would also like to throw something more into the discussion. Maybe explicit recognition of it could lead to some way to account for it in the testing. I’ve written about it in two earlier threads. Possibly some of the links to screen shots still exist in one of those.

I generated a sweep frequency signal at 88.2kHz, covering the entire available frequency range. Playing it out the DAC of one soundcard and into the ADC of another, I found that I got back something very close to what was initially generated in CoolEdit (except with a SB Live).

Then, playing it at 88.2 and recording at 44.1kHz, I found a definite alias image. For the first couple kHz below the Nyquist Limit, when recording at 44.1kHz, the image was quite strong. If I turned up the resolution in CoolEdit’s Spectral View, I could see the alias trace almost to zero Hz. Resampling the generated sweep tone in CoolEdit from 88.2 to 44.1 produced none of the aliasing.

I tested several reputable sound cards. All were the same (except the SB, which was much worse). When I initially posted these results in another forum there was a lot of noise about my failures until a couple of people with more expensive professional converters duplicated my results. This happened again later in another forum. While this is still only a small sample of all soundcards, I suspect they all do the same.

As discussed earlier in HA, I don’t think this is a significant audio defect. Certainly I cannot hear any distortion that may be produced, especially at 20kHz and above. Very little music has much content at the higher frequencies necessary to produce such aliasing anyway, and any such hf is likely to be rather low in intensity.

Still, this does make for a real difference in recording at higher sample rates vs. 44.1kHz. Aliasing cannot be removed from the recording. Since the alias image is only very strong (relative to the input signal) for a couple of kHz below the Nyquist Limit (meaning about 42kHz to 44kHz at a sample rate of 88.2kHz), even the little distortion that might be present in 44.1kHz recordings will be absent at 2X that, and above.
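
For readers who want to see where an alias image lands, here is a minimal sketch of the standard folding calculation, assuming no anti-alias filtering at all (a real converter attenuates the input to some degree, which is what the sweep test above is probing). The function name is made up for illustration.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz shows up after sampling at fs_hz,
    assuming nothing is filtered out beforehand."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)

# Sweep content just above the 22.05 kHz Nyquist limit of a 44.1 kHz recording
# folds back to just below it; higher content folds further down.
for tone in (10000, 23000, 30000, 44000):
    print(tone, "Hz ->", alias_frequency(tone, 44100), "Hz")
```

This matches the shape of the observation: a sweep that keeps rising past Nyquist produces an image that runs back down towards zero Hz.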
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-17 10:49:34
AndyH-ha, that might be on purpose. For two lowpass filters of identical computational complexity, with A being allowed to inject more imaging above f than B, ceteris paribus, A has potentially higher passband performance (0-f Hz) than B.
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-17 16:39:48
I think the 88.2 vs. downsampled 44.1 test had at least some significance. Krabapple, have they documented how downsampling has been done exactly?
Title: 44.1 vs 88.2 ABX report at AES
Post by: greynol on 2010-07-17 18:33:52
Quote
The reconstruction filter is necessarily analog.

Yes, but it is designed for the oversampling frequency. This puts it octaves above the nyquist frequency for the wave being converted.

Are you trying to say that the analog lowpass filter at the end of the DAC on my old 8x oversampling CD player has a corner frequency at somewhere around 160 kHz?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-17 20:35:58
Quote
The reconstruction filter is necessarily analog.

Yes, but it is designed for the oversampling frequency. This puts it octaves above the Nyquist frequency for the wave being converted.

Are you trying to say that the analog low pass filter at the end of the DAC on my old 8x oversampling CD player has a corner frequency at somewhere around 160 Hz?


Not at all, unless there was a typo above and you meant 160 KHz, not the 160 Hz we see abpve. 160 KHz is a probable number, if a little high.

What was the Nyquist frequency for the sample rate of the digital data that your old CD player reconstructed - 44.1 KHz, right?

What would *octaves* above 44.1 KHz be?  Well at least 160 KHz  (2 octaves).

You said 8x, right? So the oversampling frequency is about 320 KHz, and the corner frequency of the low slope analog reconstruction filter for 320 KHz might be in the 60 KHz - 160 KHz range.

These days the analog reconstruction filter is implemented right on the DAC chip, but in times past (maybe 5 years ago) it was outboard. Its corner frequency could be calculated from schematics in the application notes. Sometimes the AN stated it explicitly. For example, the AD1853 AN shows an analog reconstruction filter with a -3 dB point of 75 kHz and a third order Gaussian characteristic. The AD1853 AN also shows some very interesting broadband plots of the response of the digital filter that provides the brick wall filtering.

AD 1853 AN (http://www.analog.com/static/imported-files/data_sheets/AD1853.pdf)
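For reference, the exact figures behind the 8x example can be worked out in a few lines. This is only an illustrative sketch: it assumes an ideal digital oversampling filter that removes everything between the original Nyquist frequency and the first image around the oversampled rate, and the names below are made up for the example, not taken from any data sheet.

```python
import math

BASE_RATE_HZ = 44_100   # CD sample rate
OVERSAMPLING = 8        # the "8x" player discussed above

# Rate at which the DAC actually runs after digital oversampling.
OVERSAMPLED_RATE_HZ = BASE_RATE_HZ * OVERSAMPLING

# Highest frequency the CD data can carry.
BASE_NYQUIST_HZ = BASE_RATE_HZ / 2

# Assumption: the digital filter is ideal, so the lowest surviving image
# sits just below the oversampled rate.
LOWEST_IMAGE_HZ = OVERSAMPLED_RATE_HZ - BASE_NYQUIST_HZ


def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves separating two frequencies."""
    return math.log2(f_high_hz / f_low_hz)


GAP_OCTAVES = octaves_between(BASE_NYQUIST_HZ, LOWEST_IMAGE_HZ)

print(f"oversampled rate : {OVERSAMPLED_RATE_HZ / 1000:.1f} kHz")
print(f"audio band ends  : {BASE_NYQUIST_HZ / 1000:.2f} kHz")
print(f"first image from : {LOWEST_IMAGE_HZ / 1000:.2f} kHz")
print(f"gap for the analog filter: {GAP_OCTAVES:.2f} octaves")
```

Under those assumptions, 8 x 44.1 kHz comes to 352.8 kHz, and the analog filter's corner can sit anywhere in a gap almost four octaves wide. Real digital filters leave some residue in that gap, so where a given design puts the corner is a choice the data sheet has to answer.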
Title: 44.1 vs 88.2 ABX report at AES
Post by: greynol on 2010-07-17 20:48:46
160 KHz is a probable number, if a little high.

How can it be a little high when it is already lower than what you specify as to how these things are done?

So please, tell us, what is the point in a CD player having a lowpass at 160 kHz when the audio will never be above 22.05 kHz?

Sorry Arnold, but I don't think you know what you're talking about.
Title: 44.1 vs 88.2 ABX report at AES
Post by: AndyH-ha on 2010-07-17 22:02:43
I also may not know what I'm talking about, but that high frequency cutoff seems quite in line with what the whole purpose of oversampling is about -- moving the alias image way, way up above the audio band so it is easy to filter off. Since 30kHz, for example, is already well above the audio band, 160kHz could be considered a bit higher than necessary to adequately preserve the audio band.

Quote
AndyH-ha, that might be on purpose. For two lowpass filters of identical computational complexity, with A being allowed to inject more imaging above f than B, ceteris paribus, A has potentially higher passband performance (0-f Hz) than B

I have no idea if it is a design tradeoff or an intrinsic, unavoidable aspect of sigma-delta. I've never seen it mentioned in print. My point is that it is a real difference in results between sampling at 44.1kHz vs sampling at 88.2kHz or 96kHz (although the actual sample taking is at 2.82MHz vs 5.64MHz).

It may be true, as Arnold stated, that there is no difference in final results between downsampling to 44.1kHz from 88.2kHz vs from 96kHz, but the process is surely different. It takes much longer for CoolEdit to downsample from 96kHz.
Title: 44.1 vs 88.2 ABX report at AES
Post by: greynol on 2010-07-17 22:13:34
The reason for oversampling in old CD players is as you said, to push the images far enough out so that one can get good results using a much simpler and cheaper reconstruction filter.  Where is the corner frequency of this filter?  Far closer to 22 kHz than 160 kHz.  If Arnold takes issue with this then I'll direct him to speak with all the professors I had at the university where I learned about digital communications and signal processing.  Although I've forgotten much since I earned my BS, I do remember this quite vividly.

I noticed that you mentioned aliasing.  It's an issue when it comes to initial sampling and down-sampling, not with reconstruction.  Aliasing is when images overlap; it is not the name of the image.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-17 23:04:25
160 KHz is a probable number, if a little high.

How can it be a little high when it is already lower than what you specify as to how these things are done?


?????????????

If I said something like that you'll have to quote it to me. I surely had no such intentions.

Quote
So please, tell us, what is the point in a CD player having a low pass at 160 kHz when the audio will never be above 22.05 kHz?


The corner frequency of the analog low pass is set high so that it doesn't cause losses in the pass band. It is usually a relatively simple filter with gentle roll-off.

The analog low pass relates to the oversampling frequency, not the sampling frequency of the data from the media. 

If you take the time to read the reference I've already cited, they show how this works with detailed spectral analysis.

You must have missed the link the first time I posted it - here it is again: second link to detailed technical data for an oversampling DAC including spectral analysis (http://www.analog.com/static/imported-files/data_sheets/AD1853.pdf)
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-17 23:14:50
The reason for oversampling in old CD players is as you said, to push the images far enough out so that one can get good results using a much simpler and cheaper reconstruction filter.  Where is the corner frequency of this filter?  Far closer to 22 kHz than 160 kHz.  If Arnold takes issue with this then I'll direct him to speak with all the professors I had at the university where I learned about digital communications and signal processing.  [...] to initial sampling and down-sampling, not with reconstruction.


I don't get all the smoke and fire about profs and BS.

Before you posted this stuff I had already provided a reference that explicitly says that the corner frequency of the analog reconstruction filter of a certain high quality, fairly recent DAC is 65 kHz. This will work for a 44 kHz DAC, and can work for a 96 kHz DAC that is not intended for instrumentation. It is probably inappropriate for a 192 kHz DAC.

Here's the reference for the third time:

AD 1853 DAC data sheet (http://www.analog.com/static/imported-files/data_sheets/AD1853.pdf)

The specific note about the corner frequency of the analog filter is on page 15.  Figures 10, 14 and 15 show the wideband output of the device, and the spurious responses that the analog filter targets.  They aren't much.


Title: 44.1 vs 88.2 ABX report at AES
Post by: greynol on 2010-07-17 23:27:27
I'm taking issue with what I thought was a blanket statement about oversampling and filtering, though in your response to benski, I did see you restated what you said which included qualifiers, which I overlooked.

As it concerns my CD player and your notion that the corner frequency of the lowpass is around 160 kHz, the smoke and fire isn't coming from my profs.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-18 01:52:51
I'm taking issue with what I thought was a blanket statement about oversampling and filtering, though in your response to benski, I did see you restated what you said which included qualifiers, which I overlooked.

As it concerns my CD player and your notion that the corner frequency of the lowpass is around 160 kHz, the smoke and fire isn't coming from my profs.


Here's another data sheet, this time for the AD 1955:

AD 1955 data sheet (http://www.analog.com/static/imported-files/data_sheets/AD1955.pdf)

Figure 7 page 18 shows an analog filter -3 dB point of 100 KHz.

BTW, your original number for the corner frequency of this filter was 160 Hz...
Title: 44.1 vs 88.2 ABX report at AES
Post by: amandinepras on 2010-07-19 00:52:50
Thanks all for your interest in our paper,
I received an invitation from hydrogenaudio to provide further details on our work.
Thus I will do my best to answer a few questions I could extract from the discussion.

- We used Pyramix 6.0 for down-sampling, as this software is currently used by a lot of audio professionals who produce HD recordings.
- Regarding the statistics, the "p" we provided for the results refers to the probability that we got the result by chance. Traditionally for this kind of test (here an ABX), researchers consider that if p<.05, the result is not obtained by chance (as the probability is below 5%), thus participants could discriminate. If .05<p<.1, it may be that the result was not obtained by chance but it's not for sure, that's what is called "a tendency".
If the test was easy, we would not need statistics, as participants would have almost 100% of good answers. But this test was extremely challenging for the expert listeners, implying a lot of errors even if some of them could perceive some differences between formats in specific cases (musical excerpt, type of format comparison).
- There is no proof that upsampling doesn't introduce artifacts.
- Regarding our choice of format comparison and technical chain, our purpose was to investigate perceptual differences between 88.2 and 44.1 in "real-life" use of the equipment, thus taking into consideration what happens in music production and release: in a few cases, music is produced and released in high-resolution (thus playback in high-resolution); in more cases, music is produced in high-resolution and then down-sampled to 44.1 for commercial release (thus playback in 44.1); in a lot of cases, music is produced and released in 44.1 (thus playback in 44.1).
We used the Fireface DAC as it was the only one that allowed us to switch sample rate with a reasonable delay for the test (less than 1sec.). I wish we could use a better one. However, the Fireface is still pretty good compared to most playback systems people use in their house.

I am a sound engineer myself and started working in research as a part time job 3 years ago.  I was glad to work on the high-resolution project as I have heard a lot of discussions in studios and during my sound recording studies on the topic. My main question was if it was worth working in High-Res when the project was to be released in 44.1.
This AES paper is the first publication for this study and provides a few answers, maybe not enough for most of us. There will be more stuff coming up. And maybe other labs will work on that topic too, as there are A LOT of tests to be done.

Bottom line, although the topic is interesting, especially these days when Blu-ray Pure Audio is being defined, never forget that differences between formats, ADC, DAC,... remain extremely subtle compared to differences between miking techniques, room acoustics, and of course musicians and their instruments!
Best,
Amandine
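As an illustration of the statistics described in this post, the p-value for an ABX score can be computed from the binomial distribution: it is the probability of getting at least that many correct answers by pure guessing. The sketch below is not from the paper; the function name and the trial counts (16 and 20) are hypothetical and chosen only to show one "significant" result and one "tendency".

```python
from math import comb


def abx_p_value(correct, trials, chance=0.5):
    """One-tailed binomial p-value: probability of scoring at least
    `correct` out of `trials` when every answer is a coin flip."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical scores, not data from the study.
for correct, trials in [(12, 16), (14, 20), (10, 20)]:
    p = abx_p_value(correct, trials)
    if p < 0.05:
        verdict = "significant (p < .05)"
    elif p < 0.1:
        verdict = "tendency (.05 < p < .1)"
    else:
        verdict = "consistent with guessing"
    print(f"{correct}/{trials}: p = {p:.4f} -> {verdict}")
```

With these made-up numbers, 12 of 16 falls under the 5% threshold, 14 of 20 lands in the "tendency" range, and 10 of 20 is indistinguishable from guessing.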
Title: 44.1 vs 88.2 ABX report at AES
Post by: greynol on 2010-07-19 00:56:45
Here's another data sheet, this time for the AD 1955:

I see your point, though I don't think it's the one used in my CD player.

BTW, your original number for the corner frequency of this filter was 160 Hz...

...and you typed  "abpve", though I didn't decide to make much of it.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 03:35:17
- There is no proof that upsampling doesn't introduce artifacts.


By stating the question as a negative hypothesis, you've automatically made the proof that you have demanded exceedingly difficult or impossible.

Rather, I'll try to restate the problem in a fair and balanced way: Is it possible to upsample without introducing audible artifacts?

(1) If you believe in the effectiveness of traditional audio measurements and known audible thresholds for the things they measure, it is easy to show that there are upsamplers whose added noise and distortion are well below established audible thresholds. I'm talking about a safety factor of several orders of magnitude.

(2) If you believe that there are other competent people who know how to do sensitive listening tests, then we have the results of numerous listening tests that show that upsampling can be totally undetectable.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 11:35:38
- We used Pyramix 6.0 for down-sampling, as this software is currently used by a lot of audio professionals who produce HD recordings.


You might want to compare this product's capabilities along these lines by checking out the results posted at

Infinitewave's SRC technical tests (http://src.infinitewave.ca/)

Many of us use Adobe Audition, and its clean processing is one reason why. Audition and a number of other products appear to at least match Pyramix 6 in all areas tested, and beat it in several areas by up to 30 dB.  On balance I don't see any flaws in the Pyramix that would necessarily invalidate results obtained by using it. The Infinitewave tests put a number of highly regarded products in an even poorer light.

IME the tools that audio professionals use can have a lot to do with hype and tradition. Perhaps the most widely used and highly regarded DAW software of all time has been Pro Tools, but at times in the past, it had pretty substandard performance.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 11:48:49
We used the Fireface DAC as it was the only one that allowed us to switch sample rate with a reasonable delay for the test (less than 1sec.). I wish we could use a better one. However, the Fireface is still pretty good compared to most playback systems people use in their house.


It's not clear which Fireface DAC that you used as there are several models. I searched around on the web for some technical tests. I found this test of the Fireface 800:

Link to Fireface 800 technical test (http://www.grandmasteraudio.com/ms_matrix.htm)

These results don't look particularly impressive to me. They appear to be representative of hi rez digital players at the lower end of the market.

I've done many similar tests with this audio interface:

Lynx Two 24/192 technical tests (http://audio.rightmark.org/test/lynx-two-b-32192.html)


Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-19 12:19:27
Link to Fireface 800 technical test (http://www.grandmasteraudio.com/ms_matrix.htm)

These results don't look particularly impressive to me. They appear to be representative of hi rez digital players at the lower end of the market.


Let's not criticize where no criticism is due.

In any other context, you would have (rightfully) smashed TOS8, ABX, and your contradicting experience into the face of anyone who claimed that a DAC with the capabilities of the Fireface would not be sufficient for testing and comparing real world music.

The reasons, which have been brought forward for the choice of the Fireface, especially fast sample rate switching and real world setup similarity, are sensible. They wouldn't be, if "real world" had been an excuse for a sub-standard device, but it isn't by any means.

By stating the question as a negative hypothesis, you've automatically made the proof that you have demanded exceedingly difficult or impossible.

Rather, I'll try to restate the problem in a fair and balanced way: Is it possible to upsample without introducing audible artifacts?


Completely agree.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-07-19 12:31:57
Thanks for joining the discussion, Amandine.

Your work is very interesting. But I think that it is necessary to run all the raw data through a global analysis instead of picking seemingly significant results among all possible comparisons.

I see that you calculated the p values for many different cases. Basically, for the 16 listeners as individuals, then for all of them as a group, then for 3 of them, then for 13 of them, in each case for 3 formats times 5 samples, and that you also included two-tailed results in addition to one-tailed results.

It gives a total of [(16 x 5) + (16 x 3) + (1 + 1 + 1) x 3 x 5] x 2 = 346 possible p-values, out of which you got 12 significant ones. However, out of 346 p-values, we should expect on average 346 / 20 = 17.3 of them to be significant by chance, that is, false positives!

In reality, it is more complicated, because the 346 p-values are not independent at all. The values for the 13-listener group are close to the ones for the 16-listener group. The values by format for the 3-listener group are also conditioned by the fact that this group was selected according to their individual p-values for all formats, etc.

But it is right that results by format, genre, and listener should be evaluated independently. The trick is to use a suitable analysis that takes all these variables into account at once, but is not prone to false-positive picking.

I don't know what method should be used in this case, but I'm sure that some forum members, more knowledgeable than me in statistics, can help.


Edit : correction of calculus.
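The count above can be reproduced in a few lines, as a rough sketch. The breakdown follows the post; the 5% threshold is the one used in the paper. The Bonferroni line at the end is only one very conservative correction, added here as an assumption for illustration, and it does not account for the dependence between the tests mentioned above.

```python
LISTENERS = 16
EXCERPTS = 5
FORMATS = 3
GROUPS = 3   # all 16 listeners, the 3-listener group, the 13-listener group
TAILS = 2    # one-tailed and two-tailed results
ALPHA = 0.05

# Individual p-values: per listener by excerpt, and per listener by format.
individual_tests = LISTENERS * EXCERPTS + LISTENERS * FORMATS

# Group p-values: each group, for every format x excerpt combination.
group_tests = GROUPS * FORMATS * EXCERPTS

n_tests = (individual_tests + group_tests) * TAILS

# If every null hypothesis were true, this many results would still
# come out "significant" on average.
expected_false_positives = n_tests * ALPHA

# Assumption: Bonferroni correction, one simple (and conservative) option.
bonferroni_alpha = ALPHA / n_tests

print(f"number of p-values        : {n_tests}")
print(f"expected false positives  : {expected_false_positives:.1f}")
print(f"Bonferroni per-test alpha : {bonferroni_alpha:.6f}")
```

Twelve significant results against roughly seventeen expected by chance is the core of the concern; which correction is actually appropriate for this design is left open here, as in the post.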
Title: 44.1 vs 88.2 ABX report at AES
Post by: krabapple on 2010-07-19 14:08:15
We used the Fireface DAC as it was the only one that allowed us to switch sample rate with a reasonable delay for the test (less than 1sec.). I wish we could use a better one. However, the Fireface is still pretty good compared to most playback systems people use in their house.


It's not clear which Fireface DAC that you used



Arny, it's named right there in my long summary post of the article, earlier in this thread:


"RME Fireface 800 DAC"


I also named the downsampling software (which someone else asked about)


amandine:
"there is no proof that upsampling doesn't introduce artifacts".  There is reasonable inference to be made that upsampling can be *audibly transparent* , if not measurably perfect, based on what it does, how it's implemented,  and how hearing works.  What is known about those things does not go out the window just because  we have no rigorous published test of that exact proposition to prove it.  There's lots of more or less unlikely things for which there is no 'proof' in that sense, including the famous teacup orbiting the sun.


The burden on groups like yours is to demonstrate that reality confounds such reasonable models (and to replace them with a better reasonable model).  My main issue with your paper is that your strongest evidence appears to be a statistically unlikely pattern of *incorrect* answers achieved by four subjects, on certain test signals only.  I am also troubled by the weird, inconsistent pattern of results for, e.g., 44.1 native vs 44.1 downsampled.  And finally, it's not clear to me how switching was actually done.  I get that it was via the Fireface, but did someone manually switch back and forth between SRs, or was this done via software?  (The main effect I'd expect of slow switching times would be to *decrease* sensitivity to difference, so in itself this doesn't cast doubt on the 'positive' results achieved.  Too, one has to be very, very, very careful of subtly audible switching artifacts correlated with A and B, though again I would expect this to result in more *true positives* than you saw.)

As you say, clearly more work is in order.  Your idea (in the discussion section) of focusing more on reverb tails is interesting, but I would expect that to make manifest differences in *bit depth*, more than SR.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 14:33:15
Link to Fireface 800 technical test (http://www.grandmasteraudio.com/ms_matrix.htm)

These results don't look particularly impressive to me. They appear to be representative of hi rez digital players at the lower end of the market.


Let's not criticize where no criticism is due.

In any other context, you would have (rightfully) smashed TOS8, ABX, and your contradicting experience into the face of anyone who claimed that a DAC with the capabilities of the Fireface would not be sufficient for testing and comparing real world music.


Talking with hi-rez proponents is difficult because for openers, they flout TOS8.  Their work inherently criticizes the idea that conventional measures and criteria are sufficient. They must disrespect the work of the careful experimenters that have gone before them.

When dealing with them I am sometimes moved to pity for the technological equivalent of dead horses. ;-)

In short, it's hard to deal with hi rez proponents on the grounds of science and reason as we understand them.

So, searching about for some common ground, I noticed the claim that the performance of the Fireface 800 (which I have been criticized for not discerning even though the evidence I presented about it was 100% on target) was characteristic of high resolution music players. My point was that it isn't. Check some relevant Stereophile test reports and see what I mean.


Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-19 16:28:22
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 19:11:02
I don't see how criticizing the use of the Fireface helps here.


Doesn't anybody else see a problem with using an audio interface with A-weighted SNR of only 89 dB in an attempt to prove that the CD format with *unweighted* SNR of more like 96 dB sounds worse?

If I were designing a test to show the inadequacies of the CD format, I would think that the test system should have a SNR that is much better (at least 10 dB better) than that of the CD format. Otherwise, the audio interface in the test system is the weakest link, not the so-called "Low rez" CD format.

There are reasons why I own audio interfaces with 110 and 116 dB SNR...
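For context, the "more like 96 dB" figure for the CD format follows from the word length alone. A minimal sketch, using the standard textbook formulas for an ideal N-bit quantizer (the function names are just for this example):

```python
import math


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2**bits)


def sine_sqnr_db(bits):
    """Signal-to-quantization-noise ratio of a full-scale sine wave
    through an ideal quantizer (the familiar 6.02*N + 1.76 dB rule)."""
    return dynamic_range_db(bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(
        f"{bits} bits: {dynamic_range_db(bits):.2f} dB dynamic range, "
        f"{sine_sqnr_db(bits):.2f} dB sine SQNR"
    )
```

These are unweighted, idealized numbers; they say nothing about sample rate, and an A-weighted measurement of real hardware is not directly comparable to them.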


Title: 44.1 vs 88.2 ABX report at AES
Post by: Juha on 2010-07-19 19:25:11
Quote
Doesn't anybody else see a problem with using an audio interface with A-weighted SNR of only 89 dB in an attempt to prove that the CD format with *unweighted* SNR of more like 96 dB sounds worse?


What makes you trust that those RMAA results you linked for the FF800 are valid?


Juha
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-19 20:22:53
The manufacturer's specs (http://www.rme-audio.de/en_products_fireface_800.php#5) for the FF 800. Unweighted SNR is claimed to be <-100 dB, might be lower in 44.1 kHz mode, though.

There are reasons why I own audio interfaces with 110 and 116 dB SNR...


Do you expect even more significant differences between high rez and low rez with better gear? Do you expect that the music material used had anywhere close to -100 dB SNR? No & no. I smell much more desire for pissing match than reason, let alone any contribution to clear up why there might be a reported difference against all orthodox knowledge about the thresholds of human hearing.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Soap on 2010-07-19 20:24:50
I don't see how criticizing the use of the Fireface helps here.


Doesn't anybody else see a problem with using an audio interface with A-weighted SNR of only 89 dB in an attempt to prove that the CD format with *unweighted* SNR of more like 96 dB sounds worse?

If I were designing a test to show the inadequacies of the CD format, I would think that the test system should have a SNR that is much better (at least 10 dB better) than that of the CD format. Otherwise, the audio interface in the test system is the weakest link, not the so-called "Low rez" CD format.

There are reasons why I own audio interfaces with 110 and 116 dB SNR...


I was under the impression this was about 44.1 vs 88.2, not 16 bit vs 24 bit.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-19 21:26:36
The manufacturer's specs (http://www.rme-audio.de/en_products_fireface_800.php#5) for the FF 800. Unweighted SNR is claimed to be <-100 dB, might be lower in 44.1 kHz mode, though.

There are reasons why I own audio interfaces with 110 and 116 dB SNR...


Do you expect even more significant differences between high rez and low rez with better gear?


In terms of measured differences, I would expect more significant differences between high rez and low rez with better gear.  I would hope that my views on the adequacy of 44/16 as a distribution format for listening to music are well known. That means that I don't expect any audible differences to be found if the 44/16 stuff is up to snuff.


Quote
Do you expect that the music material used had anywhere close to -100 dB SNR?


Of course not!  To get a reliable outcome not only does the music have to have a > 100 dB SNR, so does the listening room and the entire reproduction chain. Then there are the problems of the ear itself.

There are commercial recordings kicking around that have dynamic range on the order of 85 dB, but that is it. To listen to them in a normal listening room with, say, a 45 dB SPL noise level, the peaks would have to be about 130 dB SPL, which is more than enough to cause a significant threshold shift in the listener's ears.

Quote
No & no.


Of course, but we're dealing with people who don't know that they are on mission impossible and don't believe us when we suggest that they are.  The only way they are going to change their minds is if they prove it to themselves. As long as their experiment has holes big enough for me to drive a truck through, their hope can always spring eternal. I'm just trying to help them do the cleanest experiment they can from their own viewpoint.

Quote
I smell much more desire for pissing match than reason, let alone any contribution to clear up why there might be a reported difference against all orthodox knowledge about the thresholds of human hearing.


I think you need to step back a second and look at the facts that are before you. When I was doing experiments like these I did exactly what I'm trying to suggest to these folks. I used audio interfaces that had 105, 110 and 116 dB dynamic range. Do I need to post the S/N of my LynxTWO to prove that I still have it?  If I'm the person you seem to want to libel me as being, why did I do that (at my own personal expense)? Hint: it had nothing to do with piss. It was all about a search for truth.

IMO part of a good clean job of studying a situation may include overkilling it from a quality standpoint, even when "You know better".

When I did these experiments back about 9-10 years back, I didn't know all of the reasons why it couldn't work that I know today. Once I had the results before me, I gathered relevant facts to confirm or deny the result that I had before me. Then theory matched results and I could confidently move on.

Title: 44.1 vs 88.2 ABX report at AES
Post by: Soap on 2010-07-19 23:07:20
Doesn't anybody else see a problem with using an audio interface with A-weighted SNR of only 89 dB in an attempt to prove that the CD format with *unweighted* SNR of more like 96 dB sounds worse?

I still don't see this apparent attack justified.

How does the interface's A-weighted SNR of only 89 dB factor into the ability to distinguish 44.1 from 88.2 material?

I thought this was not a 16 v 24 test but a 44.1 v 88.2 test.

Title: 44.1 vs 88.2 ABX report at AES
Post by: Cavaille on 2010-07-20 08:40:03
Forgive me for interrupting this discussion about the Fireface 800 (which seems to be a decent interface - or does it contain another matter?!?) but I have a question.

Arnie, you suggested to improve this test by upsampling the downsampled material to the samplerate the original had. I very much see the logic in your suggestion. However, may it be that brick wall filtering could introduce audible obstacles into the signal that are unwanted? I'm referring to the thread "Audibility of 20kHz brick wall filtering" (http://www.hydrogenaudio.org/forums/index.php?showtopic=68524). So far (only three people have participated, including me) it seems that brick wall filtering may be audible. Further tests by several people are required however. And it is my impression that both downsampling & upsampling use brick wall filtering to avoid aliasing artifacts for downsampling and imaging products for upsampling. Is that assumption correct?

So, if I'm using brick wall filtering two times, wouldn't that be even more audible? Or am I getting this wrong?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-20 13:13:35
Arnie, you suggested to improve this test by upsampling the downsampled material to the samplerate the original had. I very much see the logic in your suggestion. However, may it be that brick wall filtering could introduce audible obstacles into the signal that are unwanted?


I think that the audibility of brick wall filtering in the downsampling is the actual object of the test.

Quote
I'm referring to the thread "Audibility of 20kHz brick wall filtering" (http://www.hydrogenaudio.org/forums/index.php?showtopic=68524). So far (only three people have participated, including me) it seems that brick wall filtering may be audible. Further tests by several people are required however.


If I understand that test properly, it has a serious limitation - the program material being used is impulses, not real world music. Even impulsive sounds in music fall far short of the extreme spectral content of a steady stream of impulses. Listening to impulses is about as much fun as listening to white noise.

Quote
And it is my impression that both downsampling & upsampling use brick wall filtering to avoid aliasing artifacts for downsampling and imaging products for upsampling. Is that assumption correct?


It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses brick wall filtering at the Nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.

Quote
So, if I'm using brick wall filtering two times, wouldn't that be even more audible? Or am I getting this wrong?


I think that the corner frequency of the brick walls is highly significant. I don't think that anybody disagrees with the idea that in general, the higher the better. The only questions I'm aware of are how high, and what phase response is required for sonic transparency.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-20 19:30:50
It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses brick wall filtering at the Nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.


I was fooling around with upsampling and found some stuff that varied from my previous understanding.

In CEP 2.0  the upsampling function has an option for using pre/post filtering or not. Furthermore, the program breaks resampling down into steps and gives progress messages.

With the pre/post filtering, upsampling is followed by filtering.

I tried upsampling 21 KHz @ 0 dB @ 44.1/32  to 96/32.  Without pre/post filtering there was a spurious response at 23.090 KHz about 120 dB down. With pre/post filtering there was no spurious response of any kind, just a noise floor 160 dB down. 

Neither the -160 dB noise floor nor the spurious response could ever imaginably be heard in a relevant listening test. However, the fact that it disappeared after post filtering is pretty good evidence that the post filtering is a brick wall filter at 22.05 kHz.

So upsampling 44.1 KHz sampled information could result in yet another brick wall filter being applied at 22.05 KHz. Any potentially audible artifacts would thus be increased.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-07-20 22:36:39
I tried upsampling 21 KHz @ 0 dB @ 44.1/32  to 96/32.  Without pre/post filtering there was a spurious response at 23.090 KHz about 120 dB down.


Hello,
I didn't follow all the discussion about the FireFace, but one thing is sure : if all you got was a -120 dB alias at 23.09 kHz, there was some kind of filtering involved. What CEP calls upsampling can involve any kind of algorithm that, mathematically, involve more or less filtering. Then the program proposes another filtering process than cleans the result.

Pure resampling from 44100 Hz to 96000 Hz without filtering of any kind would consist of inserting 319 null samples between every original sample, then picking one sample out of 147 in the result.

It would give a pulse train that would feature a flood of harmonics.

Thus, don't draw any conclusion from the hypothesis that no filtering would only introduce a quiet 23 kHz alias. You just applied filtering without knowing it.
SoundForge 4.5 worked the same way : there was resampling, with 4 quality levels, then optional filtering. But in reality, the 4 quality levels were actually 4 filters, from weak to strong.
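
As a rough illustration of the unfiltered resampling described above, here is a minimal Python sketch. The function names are made up for this example, and this is deliberately the "wrong" way to resample (no interpolation filter at all); it is not what CEP or SoundForge actually do.

```python
from math import gcd

def resample_ratio(src_rate, dst_rate):
    """Return (up, down) so that dst_rate / src_rate == up / down in lowest terms."""
    g = gcd(src_rate, dst_rate)
    return dst_rate // g, src_rate // g

def naive_resample(samples, src_rate, dst_rate):
    """Zero-stuff by `up`, then keep one sample out of `down`. No filtering."""
    up, down = resample_ratio(src_rate, dst_rate)
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0] * (up - 1))  # up - 1 null samples after each original
    return stuffed[::down]

up, down = resample_ratio(44100, 96000)
print(up - 1, down)                             # 319 null samples, keep 1 in 147
print(naive_resample([1, 2, 3], 44100, 96000))  # almost all zeros: a pulse train
```

Because hardly any of the kept positions coincide with an original sample, the output is mostly zeros, which is the pulse train with its flood of harmonics.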
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2010-07-20 22:53:50
Neither the -160 dB noise floor nor the spurious response could ever imaginably be heard in a relevant listening test. However, the fact that it disappeared after post filtering is pretty good evidence that the post filtering is a brick wall filter at 22.05 KHz.

Yes, from my understanding, it is. Though I don't see the point of doing it in two steps. Judging from the impulse responses with-vs-without-pre-post-filter at http://src.infinitewave.ca (http://src.infinitewave.ca) [and my own experiments] the post filter just increases the effective filter length = steepness. A steeper interpolation (anti-imaging) filter with [more stop-band attenuation and] its -6 dB frequency a little below the original Nyquist frequency would do the same job.

Chris

[ ]: update
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-20 23:57:10
I tried upsampling 21 KHz @ 0 dB @ 44.1/32  to 96/32.  Without pre/post filtering there was a spurious response at 23.090 KHz about 120 dB down.


...if all you got was a -120 dB alias at 23.09 kHz, there was some kind of filtering involved.


I see your point. The frequency is right for being an image. 22.05 - 21.0 = 1.05. 22.05 + 1.05 = 23.10 - close enough to 23.09 given that the analysis tool was a  16k point FFT. If I use a larger FFT, I get almost exactly 23.10
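
The arithmetic above can be checked with a small Python sketch. It assumes the "16k point FFT" means 16384 points and that the analysis was run on the 96 kHz output; both are assumptions, and the function names are only for illustration.

```python
def first_image_hz(tone_hz, src_rate_hz):
    """First image of a tone, mirrored around the source Nyquist frequency."""
    nyquist = src_rate_hz / 2
    return nyquist + (nyquist - tone_hz)

def fft_bin_width_hz(sample_rate_hz, fft_size):
    """Spacing between adjacent FFT bins."""
    return sample_rate_hz / fft_size

image = first_image_hz(21000, 44100)       # 23100.0 Hz
bin_width = fft_bin_width_hz(96000, 16384) # 5.859375 Hz (assumed FFT setup)
print(image, bin_width, (image - 23090) / bin_width)
```

Under those assumptions the 10 Hz gap between 23.09 and 23.10 kHz is less than two bins, which is plausible for a peak read off a spectrum display.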

Quote
What CEP calls upsampling can involve any kind of algorithm that, mathematically, involves more or less filtering. Then the program proposes another filtering process that cleans the result.


So it would seem.

Quote
Pure resampling from 44100 Hz to 96000 Hz without filtering of any kind would consist of inserting 319 null samples between every original sample, then picking one sample out of 147 in the result.

It would give a pulse train that would feature a flood of harmonics.

Thus, don't draw any conclusion from the hypothesis that no filtering would only introduce a quiet 23 kHz alias. You just applied filtering without knowing it.
Sound Forge 4.5 worked the same way : there was resampling, with 4 quality levels, then optional filtering. But in reality, the 4 quality levels were actually 4 filters, from weak to strong.


Got it. There is at least one layer of brick wall filtering in the CEP2.1 upsampling, and you have the option of adding one more. The brick wall filters are at the Nyquist frequency of the source data.

In the end, you get a very clean job of upsampling, even without the "post filtering".
Title: 44.1 vs 88.2 ABX report at AES
Post by: Notat on 2010-07-21 04:08:37
- Regarding our choice of format comparison and technical chain, our purpose was to investigate perceptive differences between 88.2 vs. 44.1 in "real-life" use of the equipment, thus by taking into consideration what happens in music production and release.

Not surprising that you'd have marginally detectable differences when both stuff at the recording and playback sides were changed for the different trials. If I understand the results (as summarized by krabapple), the higher sample rate was not reliably identified as sounding better.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-07-21 12:16:35
The strange thing is that there are ABX results with two-tailed statistics. I don't have the article, but it rather seems that these particular listeners mistook A and B with such consistency that they got a score like 2 right answers out of 12, rather than preferring the low resolution version. That's the only way I can interpret "two-tailed ABX results" with listeners "significantly selecting the wrong answer".
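
To make the one-tailed vs two-tailed distinction concrete, here is a minimal Python sketch for a 50/50 guessing model. The 2-out-of-12 score is only the hypothetical example from the paragraph above, not a figure from the paper, and the function names are invented for illustration.

```python
from math import comb

def tail_at_least(k, n):
    """P(X >= k) for n fair-coin trials."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def tail_at_most(k, n):
    """P(X <= k) for n fair-coin trials."""
    return sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n

def two_tailed(k, n):
    """Doubled smaller tail, capped at 1 (valid because p = 0.5 is symmetric)."""
    return min(1.0, 2 * min(tail_at_least(k, n), tail_at_most(k, n)))

print(tail_at_least(2, 12))  # one-tailed: about 0.997, nowhere near significant
print(two_tailed(2, 12))     # two-tailed: about 0.039, "significantly wrong"
```

So a listener who is consistently wrong only shows up as significant if the analysis also looks at the lower tail.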
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-21 13:13:02
I see that you calculated the p values for many different cases. Basically, for the 16 listeners as individuals, then for all of them as a group, then for 3 of them, then for 13 of them, in each case for 3 formats times 5 samples, and that you also included two-tailed results in addition to one-tailed results.

It gives a total of [(16 x 5) + (16 x 3) + (1 + 1 + 1) x 3 x 5] x 2 = 346 possible p-values, out of which you got 12 significant ones. However, out of 346 p-values, we should expect on average 346 / 20 = 17.3 of them to be significant by chance, that is, false positives!
I spotted that too. I didn't do the maths to get the numbers, but it was quite apparent that the paper looked for so many possible results that, with a 5% probability of each one occurring at random, it would have been amazing if a positive result hadn't been found.

Or, to put it more simply, unless I've misunderstood the stats, there were a couple of people who seemed gifted in giving the wrong answer consistently, and everything else was basically random.
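
The count quoted above can be reproduced in a few lines of Python. The sketch treats the 346 tests as independent, which they are not (the same answers are sliced several ways), so the numbers are only a rough guide. The Bonferroni threshold at the end is one common, conservative correction, shown for scale; nobody in the thread or the paper is known to have used it.

```python
individual = 16 * 5 + 16 * 3               # per-listener tests
grouped = (1 + 1 + 1) * 3 * 5              # all / 3 outliers / other 13, x 3 formats x 5 excerpts
total_tests = (individual + grouped) * 2   # one-tailed and two-tailed

alpha = 0.05
expected_false_positives = total_tests * alpha      # about 17.3
p_at_least_one = 1 - (1 - alpha) ** total_tests     # assumes independence
bonferroni_alpha = alpha / total_tests              # per-test threshold after correction

print(total_tests, expected_false_positives, p_at_least_one, bonferroni_alpha)
```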

Quote
The trick is to use a suited analysis that takes into account all these variables at once, but is not prone to false positive picking.

I don't know what method should be used in this case, but I'm sure that some forum members, more knowledgeable than me in statistics, can help.
Hope so - but we don't have the raw data, do we?

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-21 13:20:22
- We used Pyramix 6.0 for down-sampling, as this software is currently used by a lot of audio professionals who produce HD recordings.


You might want to compare this product's capabilities along these lines by checking out the results posted at

Infinitewave's SRC technical tests (http://src.infinitewave.ca/)
Good grief. It's hardly state-of-the-art! Thanks for pointing that out Arny.

Quote
On balance I don't see any flaws in the Pyramix that would necessarily invalidate results obtained by using it.
I don't know. Check out the passband. There seems to be a ~0.1dB error across most of it. Admittedly it's a linear frequency plot, so it's not "across most of it" as we hear things, but still - you'd want to level match down to 0.1dB wouldn't you? A signal with most energy in the 6-16kHz region (unlikely!) would be reduced by ~0.1dB by this device. That's not really good enough IMO to give robust ABX results. Especially when it's arguably not the actual thing under test - it could easily amplify an otherwise inaudible fault in the thing under test.

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-21 13:22:53
Why upsample? 99.9% of all DACs oversample anyway.


Upsampling and oversampling differ, at least in terms of purpose and implementation.

Quote
I also don't understand why a 192kHz DAC is supposedly cheaper to build. It is cheaper to build a good sounding 96kHz ADC than a 44.1kHz one, since the latter needs brickwall filtering.


Actually, they both need and get brickwall filtering. Some high sample rate DACs have a slow drop in response above 20 KHz, to like maybe 6 dB down at Nyquist. Then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)
I don't agree.

1. It doesn't need to be brick wall if you have a potential 76kHz transition band
2. 6dB down and then a brick wall is different from just a brick wall. Any ringing due to the brick wall will be 6dB down!

I'm not claiming any of this is audible, but it's all real and measurable.

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-21 13:25:33
I´m referring to the thread "Audibility of 20kHz brick wall filtering" (http://www.hydrogenaudio.org/forums/index.php?showtopic=68524). So far (only three people have participated, including me) it seems that brick wall filtering may be audible. Further tests by several people are required however.


If I understand that test properly, it has a serious limitation - the program material being used is impulses, not real world music. Even impulsive sounds in music far fall short of the extreme spectral content of a steady stream of impulses. Listening to impulses is about as much fun as listening to white noise.
No, it's real music. Well, it's Limehouse street blues played in a New York jazz club - whether you count that as real music or not is up to you. There's just one impulse at the end of the file as a check. You don't have to listen to that part if you don't want to.

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Kees de Visser on 2010-07-23 09:26:50
- We used Pyramix 6.0 for down-sampling, as this software is currently used by a lot of audio professionals who produce HD recordings.
You might want to compare this product's capabilities along these lines by checking out the results posted at Infinitewave's SRC technical tests (http://src.infinitewave.ca/)
Good grief. It's hardly state-of-the-art!
Pyramix is a great DAW but I agree that their SRC isn't amongst the best. AFAIK many Pyramix users (like me) use external SRC applications like Weiss Saracon or iZotope.
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-23 10:55:10
The fact, that the study's authors have registered here, but eventually did not really contribute much more than 'hello', might suggest that they cannot clear up the raised statistical concerns. If we assume that they do not deliver anything further for said reason, could the claimed significance be dismissed?
Title: 44.1 vs 88.2 ABX report at AES
Post by: krabapple on 2010-07-23 19:20:25
Sheesh, it's only been four days.  They may actually have other responsibilities that take priority over monitoring HA.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-07-24 13:58:54
could the claimed significance be dismissed?


My analysis was based on the very partial summary posted here. I can't tell anything for sure unless I buy the article and read it entirely.
Title: 44.1 vs 88.2 ABX report at AES
Post by: googlebot on 2010-07-26 19:43:57
In dubio pro reo is generally a good principle. But when you look at the data by krabapple, I think it is fair to say that, even if one cannot be 100% sure, doubt prevails by a good margin. My gut tells me that we won't see the study's authors shedding more light on this. Time will tell.
Title: 44.1 vs 88.2 ABX report at AES
Post by: hciman77 on 2010-07-26 19:48:56
I read the full paper and I think there may be some transitive errors here.

It looks like in some cases listeners can detect a difference between 88.2 and 44.1 native, but not between 44.1 down and 44.1 native, and not between 88.2 and 44.1 down for the same material. This implies that 44.1 native lacks something but 88.2 to 44.1 retains what was lost in the 44.1 native. This does not appear to make sense since both 44.1 native and 44.1 down have the same limits (give or take dithering concerns) - I am confused by this. ???
Title: 44.1 vs 88.2 ABX report at AES
Post by: hciman77 on 2010-07-26 21:05:13
In dubio pro reo is generally a good principle. But when you look at the data by krabapple, I think it is fair to say that, even if one cannot be 100% sure, doubt prevails by a good margin. My gut tells me that we won't see the study's authors shedding more light on this. Time will tell.


If we had the raw data we could run our own analyses. One thing still puzzles me: the three who showed a significant ability to detect a difference were fantastically unable to correctly match A to X or B to X as required. When I run DBTs in foobar I can do worse than chance, but I never do so badly that foobar decides that I was not guessing after all. I just did a faked 4/12 run and my probability of guessing was 93% ??? When I redid it and got 0/12, my guessing probability went up to 100% ???
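
The 93% and 100% figures are what a one-sided binomial tail gives, so a plausible reading is that the ABX tool only asks "how likely is a score at least this good by guessing?". A minimal Python sketch under that assumption (the function name is invented here and is not the tool's actual code):

```python
from math import comb

def prob_guessing(correct, trials):
    """P(at least `correct` right out of `trials`) when every answer is a coin flip."""
    return sum(comb(trials, i) for i in range(correct, trials + 1)) / 2 ** trials

for correct in (0, 4, 10, 12):
    print(correct, "of 12:", round(100 * prob_guessing(correct, 12), 1), "%")
```

With this one-sided definition a score of 0/12 can never look significant, however unlikely it is; flagging it would need a two-tailed test.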
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-27 13:16:01
I read the full paper and I think there may be some transitive errors here.

It looks like in some cases listeners can detect a difference between 88.2 and 44.1 native, but not between 44.1 down and 44.1 native, and not between 88.2 and 44.1 down for the same material. This implies that 44.1 native lacks something but 88.2 to 44.1 retains what was lost in the 44.1 native. This does not appear to make sense since both 44.1 native and 44.1 down have the same limits (give or take dithering concerns) - I am confused by this. ???



When you have confusion like this, the problems are usually procedural. I don't know if it was the statistics or the technical details, and it could be both. It appears that projects involving so-called "hi rez" audio  are popular senior year/graduate thesis projects. The problem has been studied for at least a decade without conclusive results that satisfied enough people to put an end to this sort of thing.

Right now, it seems clear that projects like this are a great way to show how sighted evaluations create strong perceptions that become subtle or non-existent when sufficient experimental rigor is added.

IME there is nothing different about hearing as compared to any other human performance issue. Do enough clean trials and do your statistics right and the results are asymptotic to the same result, over and over again.  Historically, the asymptote in this realm is that high rez past the CD format either doesn't matter or it matters very little.

We need to be chasing the big, slow, meaty rabbits that hop about all over the place like micing and speakers and rooms; not the skinny fast rabbits that only come out in vanishing numbers at night.
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-27 22:45:26
We need to be chasing the big, slow, meaty rabbits that hop about all over the place like micing and speakers and rooms; not the skinny fast rabbits that only come out in vanishing numbers at night.
I like that analogy.

The problem (for the industry) is that it's very easy to create and sell you something to "solve" the "problems" caused by the skinny fast rabbits. I mean, how hard is it to handle twice as much data, twice as fast? Wait a couple of years, and most of the problem is solved by default. Whereas micing, speakers and rooms? That would need real research. It's audio FFS - who wants to throw money at that?

(with all due credit to those companies who are out there solving real problems.)

Cheers,
David.

Title: 44.1 vs 88.2 ABX report at AES
Post by: hciman77 on 2010-07-27 22:56:03
The strange thing is that there are ABX results with two-tailed statistics. I don't have the article, but it rather seems that these particular listeners mistook A and B with such consistency that they got a score like 2 right answers out of 12, rather than preferring the low resolution version. That's the only way I can interpret "two-tailed ABX results" with listeners "significantly selecting the wrong answer".


In the case of the 3 outliers, when tested with violin samples (44.1 native or 88.2 downsampled to 44.1), there were 12 trials (4 per person). Of those 12 trials there was a total of precisely 1 correct answer (identifying A = X or B = X), i.e. 1 person got 1 out of 4 correct and the other 2 got 0/4 correct. However, with only 4 trials per sample pair this is not a big stretch.

Separating the 3 outliers out does make a big difference to the overall results, it always increases the overall score, in at least one case pushing it above the threshold.

Interestingly, perhaps, in some cases the difference between 88.2 and 44.1 native can be detected but not 88.2 vs 88.2 downsampled; in other cases the reverse is true.

They set the threshold at 63% for trials with an n of 52; this seems a bit low, but stats is not my forte. In toto, all samples, all modes, the correct hit rate seems to be between 53% and 55%.
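
For what it is worth, a 63% threshold at n = 52 is what a plain one-tailed binomial test at p < 0.05 gives, with no correction for the number of comparisons. A minimal Python sketch, assuming that is how the threshold was derived (the paper's actual procedure is not known here, and the function names are illustrative):

```python
from math import comb

def p_at_least(k, n):
    """P(X >= k) for n fair-coin trials."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def min_correct(n, alpha=0.05):
    """Smallest number of correct answers whose one-tailed p-value is <= alpha."""
    for k in range(n + 1):
        if p_at_least(k, n) <= alpha:
            return k
    return None

k = min_correct(52)
print(k, round(100 * k / 52, 1))  # 33 correct, about 63.5 %
```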

But I would love to see the raw data...

Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-07-28 11:39:21
This does not appear to make sense since both 44.1 native and 44.1 down have the same limits (give or take dithering concerns) - I am confused by this. ???


If the statistics were significant, it would make sense, because downsampling 88.2 to 44.1, you can use perfect digital antialias filters, while recording directly at 44.1, you have to use analog antialias filters, in order for your signal to be lowpassed before it reaches the ADC.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-28 13:52:22
This does not appear to make sense since both 44.1 native and 44.1 down have the same limits (give or take dithering concerns) - I am confused by this. ???


If the statistics were significant, it would make sense, because downsampling 88.2 to 44.1, you can use perfect digital antialias filters, while recording directly at 44.1, you have to use analog antialias filters, in order for your signal to be lowpassed before it reaches the ADC.


Modern ADCs do have analog anti-aliasing filters, but they are relatively simple and operate at ultrasonic frequencies. The brick wall that is right up against the audio band is digital and therefore the overall performance can be very similar to what you get if you record at a higher sample rate and downsample in the digital domain. Note that there can be considerable technical variation in the details of how the digital filtering is implemented, whether in the ADC or applied later on.
Title: 44.1 vs 88.2 ABX report at AES
Post by: hciman77 on 2010-07-28 15:18:47
Thanks all for your interest in our paper,
I received an invitation from hydrogenaudio to provide further details on our work.
Thus I will do my best to answer a few questions I could extract from the discussion.

- We used Pyramix 6.0 for down-sampling, as this software is currently used by a lot of audio professionals who produce HD recordings.
- Regarding the statistics, the "p" we provided for the results refers to the probability that we got the result by chance. Traditionally for this kind of test (here an ABX), researchers consider that if p<.05, the result is not obtained by chance (as the probability is below 5%), thus participants could discriminate. If .05<p<.1, it may be that the result was not obtained by chance but it's not for sure, that's what is called "a tendency".
If the test was easy, we would not need statistics, as participants would have almost 100% of good answers. But this test was extremely challenging for the expert listeners, implying a lot of errors even if some of them could perceive some differences between formats in specific cases (musical excerpt, type of format comparison).
- There is no proof that upsampling doesn't introduce artifacts.
- Regarding our choice of format comparison and technical chain, our purpose was to investigate perceptive differences between 88.2 vs. 44.1 in "real-life" use of the equipment, thus by taking into consideration what happens in music production and release: in a few cases, music is produced and released in high-resolution (thus playback in high-resolution); in more cases, music is produced in high-resolution and then down-sampled to 44.1 for commercial release (thus playback in 44.1); in a lot of cases, music is produced and released in 44.1 (thus playback in 44.1).
We used the Fireface DAC as it was the only one that allowed us to switch sample rate with a reasonable delay for the test (less than 1sec.). I wish we could use a better one. However, the Fireface is still pretty good compared to most playback systems people use in their house.

I am a sound engineer myself and started working in research as a part time job 3 years ago.  I was glad to work on the high-resolution project as I have heard a lot of discussions in studios and during my sound recording studies on the topic. My main question was if it was worth working in High-Res when the project was to be released in 44.1.
This AES paper is the first publication for this study and provides a few answers, maybe not enough for most of us. There will be more stuff coming up. And maybe other labs will work on that topic too as there are A LOT of tests to be done.

Bottom line, although the topic is interesting, mainly these days when the Blu-ray Pure Audio is to be defined, never forget that differences between formats, ADC, DAC,... remain extremely subtle compared to differences between miking techniques, room acoustics, and of course musicians and their instruments!
Best,
Amandine


Many thanks for your reply, is there any chance we could get access to the raw data ?
Title: 44.1 vs 88.2 ABX report at AES
Post by: AndyH-ha on 2010-07-28 22:42:35
Modern ADCs do have analog anti-aliasing filters, but they are relatively simple and operate at ultrasonic frequencies. The brick wall that is right up against the audio band is digital and therefore the overall performance can be very similar to what you get if you record at a higher sample rate and downsample in the digital domain. Note that there can be considerable technical variation in the details of how the digital filtering is implemented, whether in the ADC or applied later on.


The part about the final filtering being digital seems right, as far as I understand from reading, but based on my experiments, and those of a few others, the result of recording at 44.1 is never like that from recording at a 88.2 or 96 and downsampling with good software, as I pointed out earlier in this thread and in at least two others in HA (based on results using test tones, the only way to actually observe the final product). Do you have evidence that some soundcards really do better?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-29 13:04:08
[quote author=AndyH-ha link=msg=715977 date=1280353355]
Modern ADCs do have analog anti-aliasing filters, but they are relatively simple and operate at ultrasonic frequencies. The brick wall that is right up against the audio band is digital and therefore the overall performance can be very similar to what you get if you record at a higher sample rate and downsample in the digital domain. Note that there can be considerable technical variation in the details of how the digital filtering is implemented, whether in the ADC or applied later on.


The part about the final filtering being digital seems right, as far as I understand from reading, but based on my experiments, and those of a few others, the result of recording at 44.1 is never like that from recording at a 88.2 or 96 and downsampling with good software, as I pointed out earlier in this thread and in at least two others in HA (based on results using test tones, the only way to actually observe the final product). Do you have evidence that some soundcards really do better?
[/quote]

Virtually every audio interface is different from all the rest at some level of detail. So, the question is not whether they are different in any way but rather whether the differences are significant.

These days most audio products are sold without complete or even representative specifications and technical tests are rare compared to the size of the marketplace.

One of the areas in which I have observed possibly significant differences among audio interfaces is high frequency nonlinear distortion.

In general, audio interfaces aren't significantly nonlinear above their normal passband.

It has long been observed that in audio, "The wider you open the windows, the more dirt blows in".
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-07-29 22:11:46
[quote author=AndyH-ha link=msg=715977 date=1280353355]The part about the final filtering being digital seems right, as far as I understand from reading, but based on my experiments, and those of a few others, the result of recording at 44.1 is never like that from recording at a 88.2 or 96 and downsampling with good software, as I pointed out earlier in this thread and in at least two others in HA (based on results using test tones, the only way to actually observe the final product). Do you have evidence that some soundcards really do better?[/quote]I agree Andy - I've never seen an A>D (or D>A) that comes close to achieving the kind of truly brick wall filtering that you get in Cool Edit Pro's resampling.

Whether this matters at all is another question, but it's an easily measurable difference.

I take Arny's point that they're all different (+ many are programmable), but none seem to make the effort to include a several thousand tap FIR filter.

Cheers,
David.

Title: 44.1 vs 88.2 ABX report at AES
Post by: SebastianG on 2010-07-30 09:55:42
but none seem to make the effort to include a several thousand tap FIR filter.

Right. And I don't see a real need for those in A/D or D/A. As a rule of thumb, for a filter with a transition band width of W (kilo)Hertz you need an impulse response of length L (milli)seconds with L*W = 12 (for about 100 dB stopband rejection). So, with fs=44 kHz and a transition band width of 2 kHz this comes down to 6 milliseconds. Of course, if you can tolerate a bit of aliasing within 20-22 kHz or some imaging during interpolation within 22-24 kHz you can halve that to 3 milliseconds. That's rather short in comparison to what Cool Edit is doing by default.
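
The rule of thumb above is easy to turn into numbers. A minimal Python sketch; the constant 12 (for roughly 100 dB stopband rejection) is taken from the post, while the function names and the conversion to a tap count at 44.1 kHz are added here for illustration.

```python
def impulse_length_ms(transition_khz, product=12.0):
    """Rule of thumb: length (ms) * transition width (kHz) ~= 12 for ~100 dB rejection."""
    return product / transition_khz

def taps_at(length_ms, sample_rate_hz):
    """Approximate FIR tap count for an impulse response of the given length."""
    return round(length_ms * sample_rate_hz / 1000)

for width_khz in (2.0, 4.0):
    length = impulse_length_ms(width_khz)
    print(width_khz, "kHz ->", length, "ms ->", taps_at(length, 44100), "taps at 44.1 kHz")
```

A 2 kHz transition band comes out at 6 ms, or a few hundred taps at 44.1 kHz, well short of "several thousand".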

Cheers!
SG
Title: 44.1 vs 88.2 ABX report at AES
Post by: lvqcl on 2010-07-30 11:42:10
Audition 1.5: 44.1 -> 96 kHz resampling, Quality = 999, Pre/Post Filter = off.

Impulse response length = 542 samples = 5.6 ms.
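
The conversion is simple enough to check in Python. The second function applies SebastianG's L*W = 12 rule of thumb from the previous post in reverse; whether that rule really fits this particular filter is an assumption, since its stopband rejection is not stated.

```python
def samples_to_ms(n_samples, sample_rate_hz):
    """Duration of n_samples at the given sample rate, in milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz

def implied_transition_khz(length_ms, product=12.0):
    """Transition width the rule of thumb would associate with this length (assumption)."""
    return product / length_ms

length = samples_to_ms(542, 96000)
print(round(length, 2), "ms,", round(implied_transition_khz(length), 2), "kHz")
```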
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-07-30 12:26:28
I take Arny's point that they're all different (+ many are programmable), but none seem to make the effort to include a several thousand tap FIR filter.


That's because DACs are usually designed by engineers with some notion of costs and benefits.  Some of the best converter chips seem to sell for under $7 (single unit!) and that's still too low to allow including a 2 GHz processor on the same chip.

I can still remember when a first rate converter chip cost more than $20!


Title: 44.1 vs 88.2 ABX report at AES
Post by: WernerO on 2010-08-02 08:19:48
It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses  brick wall filtering at the nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.

I think that the frequency of the corner frequency of the brick walls is highly significant. I don't think that anybody disagrees with the idea that in general, the higher the better. The only questions I'm aware of are how high, and what phase response is required for sonic transparency.


Your understanding is wrong. In fact so fundamentally wrong that I urge you to re-assess your complete knowledge of signal theory and of the sampling theorem.

Up- and oversampling still are both terms for the same mathematical process, and if that process is executed with the intent to obey the sampling theorem (i.e. contrary to doing funny stuff for the sake of it), then it will include a brickwall filter at half the original sampling rate. It simply has to, as this constitutes the bulk of that signal's reconstruction.

When oversampling the goal is not to raise the cutoff frequency of the brickwall (reconstruction) filter. That cutoff has to remain at half the original rate, as otherwise the first images are allowed to creep out. The goal, at least for a DAC, is to make most of the reconstructor in the digital domain (i.e. cheap, steep, linear, and linear-phase), with only the remainder in the analogue domain (there to suppress the images of the oversampled signal), indeed potentially at a higher cut-off frequency and with a shallow slope (i.e. cheap and simple).
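
A minimal Python sketch of the 2x oversampling just described: zero-stuff, then low-pass at half the original sampling rate. The function names, the Blackman window and the 63-tap length are assumptions chosen for illustration; real DAC chips use their own (often multi-stage) filter designs.

```python
import math

def halfband_taps(half_len=31):
    """Blackman-windowed sinc low-pass, cutoff at the ORIGINAL Nyquist frequency
    (a quarter of the doubled rate), with gain 2 to make up for zero-stuffing."""
    taps = []
    for n in range(-half_len, half_len + 1):
        x = n / 2.0
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = (0.42 + 0.5 * math.cos(math.pi * n / half_len)
                  + 0.08 * math.cos(2 * math.pi * n / half_len))
        taps.append(sinc * window)
    return taps

def oversample_2x(samples, taps):
    """Insert one zero after each sample, then convolve with the low-pass."""
    stuffed = []
    for s in samples:
        stuffed.extend([s, 0.0])
    half = len(taps) // 2
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            k = i + half - j              # filter centred on output sample i
            if 0 <= k < len(stuffed):
                acc += h * stuffed[k]
        out.append(acc)
    return out

x = [0.3, -0.7, 1.0, 0.2, -0.1, 0.5]
y = oversample_2x(x, halfband_taps())
print([round(v, 3) for v in y[::2]])      # the original samples pass through
```

The original samples survive unchanged at the even positions and the filter only fills in the new samples between them, which is the reconstruction job mentioned above; moving the cutoff higher would let the first images through.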


It is similar in ADCs, where the modulator runs at several MHz and hard aliasing can be avoided with a simple analogue filter cutting in at a couple of 100kHz (which does not mean that designing analogue front-ends for today's DS ADCs is simple). After the conversion in the low-bit modulator the signal is then noise-shaped and decimated, which (ignoring the noise shaping) consists of brickwall anti-alias filtering at the target Nyquist frequency.

In this sense running a Delta-Sigma ADC at 44.1kHz is the same as running it at 88.2kHz followed by off-line downsampling. The only difference is in the implementation of the two (actually three!) anti-alias filters involved, where the off-line solution can be of arbitrarily high quality, while the hardware solution often is not that good at all. Indeed, most commercial ADC chips use half-band AA filters and thus allow some aliasing to happen in the, say, 20-22kHz band.




Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2010-08-02 11:56:07
It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses  brick wall filtering at the nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.

I think that the frequency of the corner frequency of the brick walls is highly significant. I don't think that anybody disagrees with the idea that in general, the higher the better. The only questions I'm aware of are how high, and what phase response is required for sonic transparency.


Your understanding is wrong. In fact so fundamentally wrong that I urge you to re-assess your complete knowledge of signal theory and of the sampling theorem.


Right and this was all corrected a week ago. Read on...
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2010-08-09 21:41:12
I finally found time to read the entire paper. It's quite well written in my opinion, but there are three points I'd like to add to the discussion by krabapple and hciman77.


Which leaves us with: a difference between different recording sampling rates (and different clocks) was heard on the Orchestra item with a significant level of confidence. Which is interesting. The authors themselves speculate that this might be due to more detailed reproduction of transients in case of high-resolution sampling rates, which seems to be in line with our own 20-kHz brick wall test (http://www.hydrogenaudio.org/forums/index.php?showtopic=68524) and which leads me to the question:

Amandine, would you mind sharing the low- and high-resolution Orchestra excerpt?

Comments welcome

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2010-08-09 22:44:06
I think before you pull out one positive result and say it's interesting, you must go back to this post...
http://www.hydrogenaudio.org/forums/index....st&p=714650 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=82264&view=findpost&p=714650)
...which shows how amazing it would be to have no positive results, considering how many different ways the results are combined.

I'm really not an expert, but it seems that the statistical analysis isn't sufficient.

Another way to make the paper more satisfactory would be to re-do the apparently "good" combinations in isolation, using a pre-defined number of trials. No picking and choosing in this "second round".

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-08-09 23:33:26
I finally got the article. Actually, they say something about considering the "false alarm rate" (false positives) when picking positive results.

They refer to Boley and Lester, "Statistical Analysis of ABX results Using Signal Detection Theory" (AES 127th convention), and to Macmillan and Creelman, "Detection theory : a user's guide", University Press Cambridge, 1991, without giving further details.

I don't have these references.

Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2010-08-10 01:37:56
I think before you pull out one positive result and say it's interesting, you must go back to this post...
http://www.hydrogenaudio.org/forums/index....st&p=714650 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=82264&view=findpost&p=714650)
...which shows how amazing it would be to have no positive results, considering how many different ways the results are combined.

I saw that but didn't quite get the message. Does it mean that the supposedly significant results (p = .01 etc.) are in fact not significant, i.e. the calculation of the level of confidence is wrong in the paper? At least the separation by musical excerpt sounds reasonable to me.

Quote
Another way to make the paper more satisfactory would be to re-do the apparently "good" combinations in isolation, using a pre-defined number of trials. No picking and choosing in this "second round".

That's essentially what I'm aiming at by asking Amandine for the items. Then we can try to repeat the experiment here. But it's worth adding that for the paper, the number of trials was also fixed in advance (each possible combination of sampling rate configurations was listened to four times for each musical excerpt).

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: Pio2001 on 2010-08-10 12:33:33
The unknown thing is the origin of the p values. If they are associated with the individual ABX tests, like the ones given by ABX software, then nothing is significant once this many comparisons are made, because p = 0.05 means, by definition, that such a success occurs by chance one time out of 20 on average, p = 0.01 one time out of 100, and so on.

If, on the other hand, some kind of "signal detection" algorithm was applied to all comparisons, including the subsampling of the listeners, in accordance with references [1] and [4] in the paper, and if the p values result from this signal detection technique, that might work.
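
To put a number on the multiple-comparisons concern, here is a minimal Python sketch. It assumes the comparisons are statistically independent and all use the same per-test threshold; the count of 20 comparisons is purely illustrative and is not taken from the paper. The function names are made up for this example.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n = 20  # hypothetical number of comparisons, for illustration only
    print(f"chance of at least one false positive: {familywise_error_rate(0.05, n):.3f}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(0.05, n):.4f}")
```

Under those assumptions, 20 tests at p = 0.05 give roughly a 64% chance of at least one "significant" result from guessing alone, which is why an isolated positive result needs a corrected threshold or a replication.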
Title: 44.1 vs 88.2 ABX report at AES
Post by: WernerO on 2010-08-11 13:51:58
The 88.2 excerpts were also then downsampled to 44.1 via Pyramix software,


I don't know which version of Pyramix they used, but somehow 6.2.3 leaves me less than impressed:

(http://src.infinitewave.ca/images/Sweep/Pyramix.png)

(http://src.infinitewave.ca/images/Tone/Pyramix.png)


This is a missed opportunity, as a software SRC offers the chance to improve numerically on the performance of the 44.1kHz anti-alias filter of the average ADC chip.
Title: 44.1 vs 88.2 ABX report at AES
Post by: lrossouw on 2010-09-08 09:39:38
Did they test for difference (check whether you can tell X=A or X=B, i.e. ABX) or for preference (prefer A or B) in the 44 vs 88 testing?  I don't have the paper for this, but according to the slides for the mp3 vs CD test, they tested for preference there:  http://www.music.mcgill.ca/~hockman/docume...ntation2009.pdf (http://www.music.mcgill.ca/~hockman/documents/Pras_presentation2009.pdf)

If they repeated that method here then it could explain how people got it consistently wrong and also why a two-tailed value needs to be used.  The "wrong" answer is to prefer the lower quality version, and if you consistently choose the lower quality version it means there is a difference but you prefer the lower quality version.  I.e. if you consistently pick one you can tell a difference, but it may not be good or bad.

However, if they want to look at the statistics of sub-groups, the sub-groups need to be chosen on prior information.  They shouldn't choose sub-groups based on the results of the tests.  So they shouldn't look at stats on the 3 guys whose results were different and then try to choose a tail value for them, as that introduces bias.
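
As a rough illustration of the one-tailed versus two-tailed distinction, the sketch below computes both p values for a run of forced-choice trials under pure guessing (probability 0.5 per trial). The 16-trial figures in the usage example are hypothetical and not taken from the paper.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_one_tailed(correct: int, trials: int) -> float:
    """P(at least `correct` right answers) under pure guessing."""
    return sum(binom_pmf(k, trials) for k in range(correct, trials + 1))


def p_two_tailed(correct: int, trials: int) -> float:
    """P(a score at least as far from trials/2, in either direction)."""
    distance = abs(correct - trials / 2)
    return sum(
        binom_pmf(k, trials)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= distance
    )


if __name__ == "__main__":
    # Hypothetical listener who is "wrong" 14 times out of 16.
    print(f"one-tailed: {p_one_tailed(2, 16):.4f}")
    print(f"two-tailed: {p_two_tailed(2, 16):.4f}")
```

A listener who scores 2 out of 16 looks unremarkable on a one-tailed test but is as unlikely under guessing as one who scores 14 out of 16, which is the case a two-tailed value is meant to catch. The tail still has to be chosen before looking at the data.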
Title: 44.1 vs 88.2 ABX report at AES
Post by: mzil on 2012-07-22 02:47:52
2) I bought the paper.  Here's a paraphrase of the methods and results. Note that the test signals were recorded by the authors...

equipment:  the recording microphones (a pair of Sennheiser MKH 8020) had a FR of  10Hz-60kHz.  Two stereo feeds from the mic preamp (Millennia HV-3D) to two Micstasy ADCs, one set to 44.1/24 the other to 88.2/24 ; then the 44.1/24 digital signal was recorded (at 44.1) on a Sound Devices 744T portable recorder, while the 88.2 output was recorded on a MacBook Pro at 88.2 using Logic Studio software.  The recording diagram also shows that the 44.1 ADC used its internal clock, while the 88.2 ADC's master clock was a Mutec .


test signals:  five musical/instrumental  (orchestra, classical guitar, cymbals , voice, violin) recordings by the authors, from live performances ...
[bold texting added by me]

Krabapple, do you still have the paper? [And are you still subscribed to this old thread and seeing this question, I wonder?] Do the authors make any mention of level matching (using instrumentation) for the stage I have indicated above in bold text?

They rather oddly decided to use live music as their test source, and not a high resolution recording as Arnold Krueger correctly mentions they "should have" (that is then manipulated to create the different competing signals), but what assurance do we have that the input stages of the two Micstasy ADCs (having variable levels of gain with both manual and "auto" modes, as I understand it), successfully recorded the two analog signals at exactly the same, precise level in the digital domain? If one was a fraction of a dB different than the other, that could be the difference listeners heard, right there!  Furthermore, even if both machines' inputs were set to the exact same attenuation values, do we know for a fact that simply selecting a different sampling rate won't in itself alter the actual level of the digital signal, by a small amount?

I often see people in the subjective evaluation world naively assume that there's no need to introduce level matching when comparing, say, the output of two CD players, because "the spec sheets say they both have a fixed, 2.0V output", but in truth they often do vary (slightly) when measured using instrumentation, and that small level change could easily be the difference they are hearing (but often mistakenly attribute it to being one of "quality", not simply level). I wonder if maybe this study suffered from the same sort of flaw [an assumption that levels are matched by default, yet they aren't]?

edit to add: Of course the playback chain would also need to be tested for exactly matched output level. Just like CD players vary in output, despite almost all claiming "2.0V", DACs also may vary slightly depending on the sampling rate they receive.
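
A level check of this kind can be sketched in a few lines of Python. This is only an illustration using synthetic sample lists; a real check would load the two recorded files (and measure the playback chain with instrumentation), and the 0.2 dB offset below is an invented example value.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(a, b):
    """RMS level of `a` relative to `b`, in dB (positive means `a` is louder)."""
    return 20.0 * math.log10(rms(a) / rms(b))


if __name__ == "__main__":
    fs = 44100
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
    louder = [s * 10 ** (0.2 / 20) for s in tone]  # same tone, 0.2 dB hotter
    print(f"{level_difference_db(louder, tone):+.2f} dB")
```

Because RMS level does not depend on the sampling rate, the same comparison could in principle be applied to a 44.1 kHz file and an 88.2 kHz file of the same performance.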
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2014-03-08 00:21:05
Sorry for bumping this thread, but I thought the following might fit in nicely:

At the next AES Convention in Berlin, some researchers from Tokyo present a paper on the results of a "double-blind A/B comparison listening test" which supposedly revealed statistically significant differences between PCM (192 kHz, 24-bit) and DSD (2.8 and 5.6 MHz).

http://www.aes.org/events/136/papers/?displayall (http://www.aes.org/events/136/papers/?displayall)

P1-2 Subjective Evaluation of High Resolution Recordings in PCM and DSD Audio Formats—Atsushi Marui, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Kazuhiko Endo, TEAC Corporation - Tokyo, Japan; Erisa Sato, TEAC Corporation - Tokyo, Japan

Abstract:

High-resolution audio production and consumption are increasing attraction supported by releases of the relatively affordable audio recorders from multiple manufacturers and broader bandwidth of the Internet. However, differences in audio quality between high-resolution audio formats are still not well known, especially between the different audio formats available for the audio recorders. In order to evaluate the differences between subjective impression of the sounds recorded using high resolution audio formats, three audio formats—PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz)—recorded with multiple studio-quality audio recorders were evaluated in a double-blind A/B comparison listening test. Six sound programs evaluated by forty-six participants on eight attributes revealed statistically significant differences between PCM and DSD but not between the two sampling frequencies (2.8 MHz and 5.6 MHz) of DSD.

Convention Paper 9019

Is anyone here planning to attend the conference and the presentation?

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: Mach-X on 2014-03-08 07:08:44
I am going to loosely quote a well known poster here: "all this fuss about what happens at 22.05 kHz when 99% of the world has no issue with 16 kHz brickwalled lossy encoding". Arnie?
Title: 44.1 vs 88.2 ABX report at AES
Post by: bandpass on 2014-03-08 08:06:19
TEAC -- slight COI perhaps?

Guessing their method falls short of M&M - the gold standard.
Title: 44.1 vs 88.2 ABX report at AES
Post by: [JAZ] on 2014-03-08 17:38:15
...recorded with multiple studio-quality audio recorders ...


So... were they evaluating the formats, or the ADC's on those recorders?

(And that is assuming that when they say "Six sound programs" they mean six sound samples.  Not so sure what the "eight attributes" part really means... were they answering "which one sounds fuller?" type questions? That would not be an ABX in any way)
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2014-03-10 11:58:34
I am going to loosely quote a well known poster here "all this fuss about what happens at 22.05 kHz when 99% of the world has no issue with 16 kHz brickwalled lossy encoding". Arnie?


If I didn't say it, I'll take credit for it anyway, because it is relevant and factual. ;-)

The classic trap related to high sample rate tests is monitoring equipment with IM.

Title: 44.1 vs 88.2 ABX report at AES
Post by: WernerO on 2014-03-11 07:04:15
Perhaps.

Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2014-03-11 12:44:01
Perhaps.

Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).


Pyramix 6.2.3

http://src.infinitewave.ca/ (http://src.infinitewave.ca/)

(http://src.infinitewave.ca/images/Sweep/Pyramix.png)

Yeccch!
Title: 44.1 vs 88.2 ABX report at AES
Post by: Kees de Visser on 2014-03-11 13:28:03
Yeccch!
Are you sure ? I agree it's not state of the art anymore (Pyramix v.6 is an old version btw) and for this kind of testing there's no excuse for not using the best SRC available, even for DSD to 24/192.
The purple colour means the distortion products are around -120dBFS. You posted recently in another thread:
Seems to completely miss the point is that when measured performance is sufficiently high (e.g. the 100 dB rule) subjective tests are a complete and total waste of time.
I'd be interested to see more test details.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Wombat on 2014-03-11 14:51:51
Are you sure ? I agree it's not state of the art anymore (Pyramix v.6 is an old version btw) and for this kind of testing there's no excuse for not using the best SRC available, even for DSD to 24/192.
The purple colour means the distortion products are around -120dBFS.

Looking at the passband behaviour, I really wonder what is happening there. I have no clue about the math, but none of the SRCs known to work correctly has this weird pattern.
This may hint at a problem with the v.6 resampler that we can't pin down exactly from the 3 graphs offered.

(http://src.infinitewave.ca/images/Passband/Pyramix.png)
Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2014-03-11 22:55:53
Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).

This might be a dumb question, but: How do you know this? Did you review the paper?

Btw, happy (belated) 35th birthday to the Compact Disc (http://www.ieeeghn.org/wiki/index.php/Milestones:Compact_Disc_Audio_Player,_1979)! That format was way ahead of its time, I would say (notice I say format, not implementations thereof).

Apparently Sony also presented some CD prototype at an AES convention 35 years ago today.

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: WernerO on 2014-03-12 06:35:57
This might be a dumb question, but: How do you know this? Did you review the paper?


I read the paper. It states they used Pyramix 6. It does not state explicitly they used it for SRC. Hence the 'apparent' in my posting.

As others already said, there is no excuse for not using a fully blameless SRC in this sort of exercise.

Title: 44.1 vs 88.2 ABX report at AES
Post by: C.R.Helmrich on 2014-03-12 08:26:19
I read the paper.

But I thought the paper hasn't been published yet... Are we talking about the same paper? The one for the upcoming AES?

Chris
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2014-03-12 12:13:36
Yeccch!
Are you sure ? I agree it's not state of the art anymore (Pyramix v.6 is an old version btw) and for this kind of testing there's no excuse for not using the best SRC available, even for DSD to 24/192.
The purple colour means the distortion products are around -120dBFS. You posted recently in another thread:
Seems to completely miss the point is that when measured performance is sufficiently high (e.g. the 100 dB rule) subjective tests are a complete and total waste of time.
I'd be interested to see more test details.



Yes, I spoke prematurely without understanding the true meaning of the color scale.

As you say, the test results show a bunch of inaudible artifacts, even though there are plenty of resamplers that are far cleaner.

The artifacts look to me like evidence of a process that was not adequately dithered.
Title: 44.1 vs 88.2 ABX report at AES
Post by: WernerO on 2014-03-13 07:08:52
Sorry, I referred to the original paper, in the subject title. Seems like I have been spinning around slowly.
Title: 44.1 vs 88.2 ABX report at AES
Post by: CherylJosie on 2014-07-11 19:42:55
Excuse me for beating this topic to death again? I just want to be clear. Feel free to ignore this comment if it is too long for you.

I am not sure anyone ever addresses sample rate vs. dynamic range independently of each other, and comprehensively, or even discusses the way that audio production might have theoretically distinct need for these two distinct benefits at each step.

I see that omission as a major flaw in the entire audio industry. Instead we see audio implemented with increasingly wide bandwidth and dynamic range as technology enables each to expand, all the way along the production chain from initial capture to final playback, while enthusiasts and detractors battle it out in the popular media.

This trend seems mindless to me. These two separate factors in commercial audio formats have historically been linked to the bleeding edge of technology with no regard for the actual theoretical limitations of the benefits of each variable in the audio realm.

Even more alarming is the trend in video standards. 'Nuf said.

Maybe the audio formats need to implement both bandwidth and dynamic range to the limits of technology in order to preserve economies of scale for instrumentation applications that might also need that format. Does that mean every application of the format must maximize both bandwidth and dynamic range simultaneously to their technical limits, including applications intended only for end-of-chain human consumption?

It is a simple matter to include in any media format specification a full matrix of bandwidth and dynamic range such that the flexibility of storage is maximized. Computer media standards are independent of data (except for specialized audio/video formats that computers are also compatible with) and already incorporate a full matrix of audio sample rate and audio dynamic range. The matrix of bandwidth and dynamic range is limited only by the computer's supporting hardware and software.

If a library of commercially purchased physical media is not similarly optimized it is wasteful and/or inadequate.

At the very least, the preservation of a standardized range of data formats on commercial media allows for the trading off of performance versus program length. Surely this is an advantage with little to no inherent cost penalty?

I guess now that the possibilities have been opened up for discussion it is time to narrow down the range of selections that make theoretical sense for audio production and distribution. I will start with bandwidth since that is the easiest to conceptualize.

Human hearing has inherent physical limitations to its bandwidth. These limitations are implicitly known and understood after decades of testing with sensitive instrumentation. The limitations are tested from inaudibility up to beyond the threshold of pain in the laboratory equivalent of a soundproof high power home theater.

People suffered in the service of science when those limitations were first characterized. Any discussion of enhanced bandwidth for audio begs the question if it is worth the pain, because the only way any human can perceive the bandwidth that extends into the threshold of pain is if it is accompanied by suffering.

Who would reasonably go there, knowing that the experience will not only be painful, but also that their own hearing would be permanently compromised by repeated or perhaps even a single exposure to audible energy in such a band?

The question becomes even more poignant if we take into account the fact that most of the human population has not even a transient opportunity to experience the full spectrum of such joys, since average hearing falls short of optimal hearing. At some point, the bounds of reason are violated.

My own theoretical understanding of the state of the art in audio is that the bandwidth of human hearing is the only relevant limitation to the bandwidth of commercially released digital audio recordings. Either 44.1 kHz or 48 kHz oversampling converters already exceed the bandwidth of human hearing. Commercial formats that include ultrasonic bandwidth add nothing of value to audio.

Ultrasonic bandwidth is interesting for instrumentation, but when present in the home or professional audio realm, ultrasonics merely generate intermodulated difference frequencies that shift down to audio frequencies in the output of an analog power amplifier.

The addition of audible distortion is not a benefit to audio, period. No manufacturer is reasonably going to develop a wide-bandwidth instrumentation-quality audio power amplifier to drive your super-tweeters simply so that Fido can hear undistorted dog whistles at the delivery deadline for your next beer from the ice bucket. Ultrasonics have no place in human audio. Only dogs will benefit from ultrasonics. Is Fido worth the wasted resources?
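
The intermodulation mechanism can be illustrated with a toy model. The sketch below is not a model of any real amplifier: it assumes a purely hypothetical second-order nonlinearity, y = x + k2·x², and two invented ultrasonic tones at 24 kHz and 27 kHz. For two cosines of amplitude a, the squared term contributes a component of amplitude k2·a² at the difference frequency, here 3 kHz.

```python
import cmath
import math


def tone_amplitude(signal, freq_hz, fs_hz):
    """Amplitude of the component at freq_hz (single-bin DFT; needs whole cycles)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def two_tone(f1, f2, fs_hz, n, amp=0.5):
    """Sum of two equal-amplitude cosines."""
    return [amp * (math.cos(2 * math.pi * f1 * i / fs_hz)
                   + math.cos(2 * math.pi * f2 * i / fs_hz)) for i in range(n)]


def nonlinear_stage(signal, k2=0.1):
    """Hypothetical amplifier model: y = x + k2 * x^2."""
    return [x + k2 * x * x for x in signal]


if __name__ == "__main__":
    fs, n = 192_000, 1920  # 10 ms window holding whole cycles of every tone involved
    x = two_tone(24_000, 27_000, fs, n)
    y = nonlinear_stage(x)
    print(f"3 kHz in input:  {tone_amplitude(x, 3_000, fs):.4f}")
    print(f"3 kHz in output: {tone_amplitude(y, 3_000, fs):.4f}")
```

With the default values the input has no 3 kHz content, while the output has a 3 kHz component of about 0.025 (k2·a² = 0.1 × 0.25), squarely in the audible band even though both source tones are ultrasonic.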

I know of no jitter advantage or any other audible advantage that is even a weak function of bandwidth.

All further remarks are completely independent of the question of the benefits of ultrasonic bandwidth on either a theoretical or a practical level and they must be addressed independently of ultrasonic bandwidth.

Now I wish to address dynamic range. Here again, the question asked must be separated into two components: the production process of a commercial recording versus the final delivery format.

Dynamic range is a fancy term for signal-to-noise ratio. The more noise in each individual signal, the more noise in the final product and the less dynamic range it will have. At some point called the noise floor, the relevant signal content becomes buried in the noise and at that point the signal is either perceptibly contaminated or completely inaudible.

It is important that all audible components of a program should be preserved at every stage in the production regardless of their relative energy content. They must all be maximally preserved in the production process because it is impossible to know in advance which components will ultimately be present, enhanced, or filtered out in the final product.

When the final product is complete, some components will be audible and others will not. At that point, the inaudible components may be safely discarded for economy in the final product while losing nothing at all of value to the listener. In some if not all final mixes, the final s/n of the final product is probably below that of even 'low-res' media anyway due to the combination and modification of all the input sources, some of which are probably low-res and/or noisy initially anyway.

Maybe someone who samples commercial recordings for incorporation into derivative works might benefit from the extra dynamic range of high-res final product. Short of the sampling artist and possibly instrumentation applications, no one else will benefit from dynamic range that exceeds the limits of human hearing in a final product that is distributed on audio-only media.

There is one possible exception of the sound algorithms in receivers that synthesize surround channels from a program. Any benefit of more bits is probably inaudible at that late processing stage anyway. (Is it? Who cares?)

Even if one wants to experience levels beyond the threshold of pain to marginally increase the audible bandwidth, such person will experience immediate and possibly permanent reduction in acuity at the threshold of audibility way down at the other end of the range of hearing, making such excessive dynamic range useless for the purposes of actual listening.

Linking bandwidth and dynamic range as if they are the same thing is a theoretical mistake in the audio realm. They are not the same thing. Incorporating more of either one into the final commercial product with no regard for the limitations of hearing, technology, or increasingly scarce natural resources is folly.

The same independence applies to dynamic range of the final delivery format, taken distinctly from the dynamic range of the entire production chain up to (but not including) the removal of the unnecessary LSBs in the final product. These are also two distinct problems, with theoretically independent factors driving their distinct best practices.


High definition audio media format standards seem to universally incorporate both bandwidth and dynamic range enhancements with no regard to the theoretical and practical limitations of the analog hardware they attach to, let alone the bounds of human hearing or economic reason. They seem to be driven entirely by the bleeding edge of technology on all fronts and cost be damned, as if the final target of such media could somehow incorporate instrumentation hardware.

I cannot speak for anyone else but if I ever need to design instrumentation that depends on an audio storage format I will surely use a simple drive that works with a simple computer and not an audiophile-grade over-priced over-performing toy media format emanating from a black box with a remote control.

So much for my theoretical understanding. I guess I will pause at my understanding for a while and see if gored sacred cows fly at me from catapults. Anyone?


Regarding the study itself, I was obviously confused by the comments when viewed in light of my theoretical understanding that differs so obviously and frustratingly from current 'best practices'.

The first thing that confused me is, what question exactly were the researchers asking and did it make sense to ask it theoretically or should it have maybe been something else? The second is, did they answer any questions, raise any new questions, or was the whole study poorly designed?

Is it live or is it Memorex? This would be the case where we want to know if material sampled, mixed, processed, and delivered at high resolution sounds better, or if the final down-conversion to low-res is audibly transparent. This is perhaps the easiest question to answer with commercially available formats, so perhaps that is why it was asked.

IMO 'because it is there' is a better reason to do anything than 'because it was easy'. Converters are known to perform beyond the limits of hearing and the standard way to prove that is with instrumentation that also functions beyond the limits of hearing, otherwise there is no way to prove that what is being heard is a faithful reproduction because consensus relies on scientific testing.

The assumption of course is that the only difference between high-res product and low-res product is the final down-conversion. The study guaranteed this by design, using a single high-res source to generate a low-res copy?

Unfortunately, this is not the current practice of media distributors. The CD quality format is processed for broadcast to automobiles where dynamic range is useless so the recordings are compressed to make them sound better over the road noise. This enhances the marketability of artists.

The high resolution products are a combination of stereo and multichannel recordings with widely varying formats and their application is strictly for home theater or audiophile systems that have a demand based on the performance of the media rather than the performance of the artist.

So the study asked a question about the state of media that does not exist in practice, it completely neglected to isolate bandwidth from dynamic range, and it did not even address the finer aspects of the production process to see if current best practices as designed by audio production engineers are adequate and reasonable.

Multichannel audio was completely ignored. Is MPEG etc. acoustically transparent in commercial media also/not compared to HD-MA?

The study failed to ask any relevant questions at all, except to verify hardware adherence to the theoretically inaudible difference between a high resolution oversampling DAC, a low resolution oversampling DAC, and a low resolution oversampling ADC. It apparently did so in a room that was not soundproofed, with listeners who had no hearing tests of any type, using a converter with audibly compromised performance (as verified in specifications and/or instrumentation testing of the device itself), and operating it in modes that are not necessarily optimal for the entire audio production and distribution chain.

Testing converters for their technical accuracy with human hearing is one thing, but doing it incorrectly with compromised hardware and uncharacterized humans in noisy rooms is something else entirely IMO.

The study also used a low-res source for comparison to a high-res source. I am not sure what the point of that low-res source is. What question does that answer? Is it intended to verify the necessity for more resolution in the production of a final product than in the delivery format of the final product? If so, how does this particular test accomplish that, given that the benefits are strictly in signal-to-noise ratio and that they may only be realized by applying mixing and processing to many discrete signals in a total production of a new product?


It seems obvious that extra bandwidth in commercial audio formats is stupid. Likewise extra dynamic range. I assert on a theoretical level so my assertion relies on an entire body of pre-existing science. Unless I have made a gross technical error somewhere, my assertions cannot be theoretically debated.

It seems to me that the correct A/D conversion mode for studio recording is 48 kHz/24 bits and the correct delivery format for D/A conversion is 48 kHz/16 bits. This places the actual theoretical performance at each stage of the audio production firmly within the bounds of theoretically reasonable necessity as well as relaxing the design space for frequency response of the actual implementations to the broad side of a barn including even eliminating the largely redundant sample frequency with the most demanding analog filter requirements. It also allows for the retention of all the additional high-bandwidth modes in hardware and media via firmware, so that all instrumentation converter applications remain within reach of design engineers.
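
For reference, the textbook figures behind such a proposal can be computed directly. This sketch uses the standard idealized results: the Nyquist bandwidth is half the sample rate, and an ideal N-bit quantizer driven by a full-scale sine has a signal-to-noise ratio of about 6.02·N + 1.76 dB. Real converters fall short of these ideals, and dither and noise shaping change the picture, so treat the output as an upper bound rather than a measurement.

```python
import math


def nyquist_bandwidth_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2.0


def ideal_quantization_snr_db(bits: int) -> float:
    """SNR of an ideal N-bit quantizer for a full-scale sine: ~6.02*N + 1.76 dB."""
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    print(f"48 kHz bandwidth: {nyquist_bandwidth_hz(48_000):.0f} Hz")
    for bits in (16, 24):
        print(f"{bits}-bit ideal SNR: {ideal_quantization_snr_db(bits):.1f} dB")
```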


It is important to know the question being asked by the study and the theoretical basis for the question in the first place. How can we intelligently discuss the study without first framing its questions in our (my? your?) current best understanding of reality?

The worst part of the whole question in the first place is how unnecessary the confusion really is, given that design engineers make this sort of trade-off evaluation every time they design a product. It is a cut-and-dried algorithmic process to evaluate human hearing and to develop instrumentation to accomplish the evaluation with. It has been done. It is over.

Just let a company advertise a product with 'only' audio bandwidth compared to one with 'true high definition' ultrasonic bandwidth that degrades the audio performance. See which one is deemed the 'inferior' product. It is a numbers game. Guys, didn't you get the memo? Size is not everything.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Thad E Ginathom on 2014-07-11 21:55:14
I am not sure anyone ever addresses sample rate vs. dynamic range independently of each other, and comprehensively, or even discusses the way that audio production might have theoretically distinct need for these two distinct benefits at each step.



Yes, often, frequently, in depth and in detail. Does that invalidate the rest of your post?
Title: 44.1 vs 88.2 ABX report at AES
Post by: Mach-X on 2014-07-12 03:57:38
This could all be avoided if a CD sized, laser read analog format was created. Didn't laserdiscs use analog like an LP but read by laser? Then the only argument would be the best way to rip it..?
Title: 44.1 vs 88.2 ABX report at AES
Post by: splice on 2014-07-12 07:09:20
This could all be avoided if a CD sized, laser read analog format was created. Didn't laserdiscs use analog like an LP but read by laser? Then the only argument would be the best way to rip it..?

Laserdiscs used PWM (Pulse Width Modulation) for video and FM audio. (Later discs had digital audio.) They weren't losslessly rippable.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Mach-X on 2014-07-12 20:14:19
Well of course they weren't rippable, nobody had laserdisc drives in their PCs back then.
My point was about the feasibility of a strictly analog disc format read by a fine laser such as Blu-ray. Heck, they could even do it DualDisc style with analog on one side, CDDA on the other, then put it in an oversized cardboard LP cover. Then everybody would be happy.
Title: 44.1 vs 88.2 ABX report at AES
Post by: CherylJosie on 2014-07-14 01:41:29
I am not sure anyone ever addresses sample rate vs. dynamic range independently of each other, and comprehensively, or even discusses the way that audio production might have theoretically distinct need for these two distinct benefits at each step.



Yes, often, frequently, in depth and in detail. Does that invalidate the rest of your post?


For all the discussion that is allegedly going on about splitting bandwidth from dynamic range and applying both intelligently to consumer products, there seems to be no one at home in the technical standards committees with a voice in the matter.

I guess that was who I meant by 'anyone'; anyone that might actually have been able to put a stop to this madness before the train left the station.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Arnold B. Krueger on 2014-07-15 14:47:46
[quote author=Mach-X link=msg=869481 date=1405133858]This could all be avoided if a CD sized, laser read analog format was created. Didn't laserdiscs use analog like an LP but read by laser? Then the only argument would be the best way to rip it..?


Laserdiscs used PWM (Pulse Width Modulation) for video and FM audio. (Later discs had digital audio.) They weren't losslessly rippable.
[/quote]

http://en.wikipedia.org/wiki/LaserDisc#Recordable_formats (http://en.wikipedia.org/wiki/LaserDisc#Recordable_formats)

"The two FM audio channels occupied the disc spectrum at 2.3 and 2.8 MHz on NTSC formatted discs and each channel had a 100 kHz FM deviation. The FM audio carrier frequencies were chosen to minimize their visibility in the video image, so that even with a poorly mastered disc, audio carrier beats in the video will be at least −35 dB down, and thus, invisible. Due to the frequencies chosen, the 2.8 MHz audio carrier (Right Channel) and the lower edge of the chroma signal are very close together and if filters are not carefully set during mastering, there can be interference between the two. In addition, high audio levels combined with high chroma levels can cause mutual interference, leading to beats becoming visible in highly saturated areas of the image. To help deal with this, Pioneer decided to implement the CX Noise Reduction System on the analog tracks. By reducing the dynamic range and peak levels of the audio signals stored on the disc, filtering requirements were relaxed and visible beats greatly reduced or eliminated. The CX system gives a total NR effect of 20 dB, but in the interest of better compatibility for non-decoded playback, Pioneer reduced this to only 14 dB of noise reduction (the RCA CED system used the "original" 20 dB CX system). This also relaxed calibration tolerances in players and helped reduce audible pumping if the CX decoder was not calibrated correctly."
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2014-07-15 15:08:31
[quote author=Mach-X link=msg=869481 date=1405133858]This could all be avoided if a CD sized, laser read analog format was created.[/quote]Why would you want an analogue format? Encasing a reflective layer in plastic and reading it with a laser doesn't make the system perfect - ask any laser disc owner. Scratches would be audible. Pressing blemishes could be audible. Error correction would be impossible and imperfect error concealment would be the best you could do. Using FM helps in many ways, but brings its own problems.

"Ripping" them would be a digitisation process, not a 1:1 bit-perfect transfer process. Each "rip" would be different, just like each transfer of a vinyl LP is different. There's no way of accessing exactly what was pressed - such is the curse of analogue. That's why digital is a blessing!

btw, a (non-laser) analogue successor to vinyl records is...
http://en.wikipedia.org/wiki/Capacitance_Electronic_Disc (http://en.wikipedia.org/wiki/Capacitance_Electronic_Disc)
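
On the point above that every "rip" of an analogue disc would differ while a digital copy is bit-identical, here is a minimal Python sketch. It rests on a toy assumption: each playback adds its own independent Gaussian noise (standing in for blemishes, tracking error and converter noise) before an ideal 16-bit quantiser. The function names, the 1 kHz test tone and the noise level are all made up for illustration, not taken from any real ripping tool.

```python
import hashlib
import math
import random


def analogue_capture(num_samples, noise_rms, rng):
    """Digitise one playback of the same 1 kHz tone to 16-bit integers.

    Each call draws fresh noise, so no two captures see the same signal
    (a toy model of analogue playback, assumed for illustration only).
    """
    samples = []
    for n in range(num_samples):
        clean = 0.5 * math.sin(2 * math.pi * 1000 * n / 44100)
        noisy = clean + rng.gauss(0.0, noise_rms)
        clipped = max(-1.0, min(1.0, noisy))
        samples.append(round(clipped * 32767))
    return samples


def digest(samples):
    """SHA-256 of the samples packed as little-endian signed 16-bit PCM."""
    raw = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)
    return hashlib.sha256(raw).hexdigest()


rng = random.Random(0)

# Two "rips" of the same analogue pressing: same music, different noise.
rip_a = analogue_capture(44100, 1e-3, rng)
rip_b = analogue_capture(44100, 1e-3, rng)

# A digital copy just duplicates the stored sample values.
digital_master = rip_a
digital_copy = list(digital_master)

analogue_rips_match = digest(rip_a) == digest(rip_b)
digital_copies_match = digest(digital_master) == digest(digital_copy)
print("analogue rips identical:", analogue_rips_match)
print("digital copies identical:", digital_copies_match)
```

In this model the two simulated analogue transfers hash differently, while the digital copy hashes the same as its source. That difference is why a bit-perfect rip can be verified and an analogue transfer cannot.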

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2014-07-15 15:15:26
Quote
Even more alarming is the trend in video standards. 'Nuf said.
Given a suitable screen size and viewing distance, a human with normal visual acuity can see the benefits of more pixels. Wider colour gamut, higher dynamic range, and higher frame rate all bring easily visible benefits. I don't know why any of this should be "alarming". It's a much better use of technology than higher audio sample rates.

Quote
It seems to me that the correct A/D conversion mode for studio recording is 48KHz/24 bits and the correct delivery format for D/A conversion is 48KHz/16 bits.
You are preaching to the choir here.
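
For anyone who wants the arithmetic behind the 24-bit vs 16-bit and 48 kHz figures, a rough Python sketch follows. It assumes an ideal, undithered uniform quantiser driven by a full-scale sine, for which the textbook signal-to-quantisation-noise ratio is about 6.02 dB per bit plus 1.76 dB. Real converters fall short of this, and the function names are just for illustration.

```python
import math


def sine_sqnr_db(bits):
    """Theoretical SQNR (dB) of a full-scale sine through an ideal N-bit quantiser."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


for bits in (16, 24):
    print(f"{bits} bits: about {sine_sqnr_db(bits):.1f} dB")

print(f"48 kHz sampling: bandwidth up to {nyquist_hz(48000):.0f} Hz")
```

Word length sets the noise floor and sample rate sets the bandwidth, so the two can be chosen independently for recording and for delivery.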

Cheers,
David.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Moni on 2014-07-15 15:24:41
Quote
I am not sure anyone ever addresses sample rate vs. dynamic range independently of each other, and comprehensively, or even discusses the way that audio production might have theoretically distinct need for these two distinct benefits at each step.

Quote
Yes, often, frequently, in depth and in detail. Does that invalidate the rest of your post?

What might be more helpful is naming or linking a few.
Title: 44.1 vs 88.2 ABX report at AES
Post by: Mach-X on 2014-07-15 16:06:35
Quote
Why would you want an analogue format? Encasing a reflective layer in plastic and reading it with a laser doesn't make the system perfect - ask any laser disc owner. Scratches would be audible. Pressing blemishes could be audible. Error correction would be impossible and imperfect error concealment would be the best you could do. Using FM helps in many ways, but brings its own problems.

"Ripping" them would be a digitisation process, not a 1:1 bit-perfect transfer process. Each "rip" would be different, just like each transfer of a vinyl LP is different. There's no way of accessing exactly what was pressed - such is the curse of analogue. That's why digital is a blessing!

How is any of this different from a vinyl record? You left out the part where I suggested it could be done DualDisc style: analog on one side for the nutjobs enthusiasts, digital CDDA on the other for the rest of us who want ease of use and 1:1 ripping.

My whole point was mostly in jest, but now I think it could be really neat. Think about the ABX possibilities! Finally, assuming both sides used the same mastering, the argument could be settled once and for all!
Title: 44.1 vs 88.2 ABX report at AES
Post by: 2Bdecided on 2014-07-15 16:31:12
[quote author=Mach-X link=msg=869683 date=1405436795]How is any of this different from a vinyl record?[/quote]It's not. That's why it's a bad idea!

Quote
My whole point was mostly in jest
OK, I thought you were being really serious.

Cheers,
David.