Topic: Successful ABX of TPDF white dither vs. noise-shaping at normal listen (Read 35201 times)

Successful ABX of TPDF white dither vs. noise-shaping at normal listen

Reply #50
I've posted a clip on the parallel upload thread (index.php?showtopic=74658) of one of the softest passages in my music collection, the beginning of the development section in the first movement of Beethoven's Symphony #3 "Eroica". This is the Minnesota Orchestra again. It's a BIS hybrid SACD, which means that the CD version is presumably mastered according to current best practices, and also that I can do informal comparisons with a high-resolution version (originally recorded in 24/44.1). Reviewers faulted the conductor for making this passage so soft. To me it sounds superb in SACD stereo, and the CD version is still pretty good, though a little less clean and with less sense of ambience. Looking at a spectrogram, I find that, contrary to my expectations, it is noise-shaped. The shaped noise completely obliterates any feature above 15 kHz.


The only reason the features appear to be obliterated is that they are so small in amplitude. There is program material with strong content at 15 kHz and up; this just isn't it.

Furthermore, musical sounds that appear to be obliterated in a spectrogram may be heard as long as they are not too far below the noise level.

Quote
I am curious to know whether I could possibly perceive anything above 15 kHz in this soft a context.


That's what the Fletcher-Munson curves are for: they give you a worst-case analysis.

Here's how you use them in this case.

Play the music at your preferred listening level, and set the 0 dB point of your spectrogram accordingly. Let's say that 0 dB on the spectrogram of the track corresponds to 95 dB SPL at that listening level.

Then -120 dB on the spectrogram (what I get for music around 15 kHz before the shaped noise cuts in) would correspond to -25 dB SPL.

Looking at the Fletcher-Munson curves, I see that the threshold of hearing above 15 kHz is something like 10 dB SPL. IOW, the music above 15 kHz is about 35 dB below the threshold of hearing. Nobody hears that!
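
The arithmetic above can be written out as a small sketch. The 95 dB SPL calibration, the -120 dB spectrogram reading, and the 10 dB SPL threshold are the figures from this post; the function and constant names are made up for illustration.

```python
# Worked version of the estimate above; all numbers come from the post.
FULL_SCALE_SPL_DB = 95.0      # assumed calibration: 0 dB on the spectrogram = 95 dB SPL
THRESHOLD_15K_SPL_DB = 10.0   # rough threshold of hearing above 15 kHz, per the post


def spectrogram_to_spl(level_db, full_scale_spl_db=FULL_SCALE_SPL_DB):
    """Convert a spectrogram reading (dB relative to full scale) to dB SPL."""
    return full_scale_spl_db + level_db


def margin_below_threshold(level_db, threshold_spl_db=THRESHOLD_15K_SPL_DB):
    """Positive result: the component sits that many dB below the threshold."""
    return threshold_spl_db - spectrogram_to_spl(level_db)


music_spl = spectrogram_to_spl(-120.0)     # -25.0 dB SPL
margin = margin_below_threshold(-120.0)    # 35.0 dB below threshold
print(music_spl, margin)
```

Changing the calibration figure shows how much louder the playback would have to be before the margin shrinks to zero.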




Reply #51
IOW, the music above 15 kHz is about 35 dB below the threshold of hearing. Nobody hears that!


This sounds plausible.  However, I would still be more comfortable with this result if I were allowed to fail an ABX using the 24-bit source with and without 15 kHz lowpass.  Of course, we would have to agree on a maximum volume setting I was allowed to use, otherwise the test would be meaningless.

Reply #52
IOW, the music above 15 kHz is about 35 dB below the threshold of hearing. Nobody hears that!


This sounds plausible.  However, I would still be more comfortable with this result if I were allowed to fail an ABX using the 24-bit source with and without 15 kHz lowpass. 


What stops you?

Quote
Of course, we would have to agree on a maximum volume setting I was allowed to use, otherwise the test would be meaningless.


When you're trying to hear things with music that are 35 dB below the threshold of audibility for pure tones, you'd have to go way crazy with levels for it to make a difference. All I ask is that you use a level setting that you would use for the entire piece of music.


I used Fletcher and Munson curves for my example, but Fletcher and Munson's curves don't include masking, and masking is a BIG desensitizer.

Because they don't consider masking, the Fletcher and Munson curves are generally highly optimistic. They'd make you believe that you could hear things that you actually can't.

Reply #53
IOW, the music above 15 kHz is about 35 dB below the threshold of hearing. Nobody hears that!


This sounds plausible.  However, I would still be more comfortable with this result if I were allowed to fail an ABX using the 24-bit source with and without 15 kHz lowpass. 


What stops you?

I don't have the 24-bit file, and I don't know how to get it or generate an equivalent program from the SACD layer.
Quote
Quote
Of course, we would have to agree on a maximum volume setting I was allowed to use, otherwise the test would be meaningless.


When you're trying to hear things with music that are 35 dB below the threshold of audibility for pure tones, you'd have to go way crazy with levels for it to make a difference. All I ask is that you use a level setting that you would use for the entire piece of music.


I used Fletcher and Munson curves for my example, but Fletcher and Munson's curves don't include masking, and masking is a BIG desensitizer.

Because they don't consider masking, the Fletcher and Munson curves are generally highly optimistic. They'd make you believe that you could hear things that you actually can't.


Is there a program that I could download that would map out my personal threshold-of-hearing levels for me?

Reply #54
I'd like to state my rationale for paying no attention to the predictions of psychoacoustic theory in the analysis of listening tests.

I have long been skeptical of the idea of using loudness curves for high frequencies.  They are mapped out using steady tones, but the main high-frequency content of what we listen to is transient and therefore useful mainly for information on the timing of events.  This is very different from the information we derive from the lower frequency content, which we resolve into pitches.  Thus when I encounter a high-frequency sine wave, my brain should be more inclined to either associate it with an object too small to be of concern, or else to assume that it's dealing with a malfunctioning sensor of some sort.  On the other hand, broadband transient events are more likely to be indicators of something worthy of attention.  Have there been studies of the relative audibility of broadband pulses compared to sine waves in the 10-20 kHz band?


Reply #56
I don't have the 24-bit file, and I don't know how to get it or generate an equivalent program from the SACD layer.
With respect, this sounds like you are comparing different sources - or assuming that the 16/44.1 version that you have was mastered in the same way as the 24/96 version. This may not be the case.

What we really need is a sample from the 24/96 version which would then be resampled to 24/44.1 and then shortened to 16/44.1 with various permutations of dither, shaped dither, no dither....
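
For anyone who wants to generate those permutations themselves, here is a minimal sketch of the word-length reduction step only (resampling is not shown). It is not what any mastering tool actually does: the function names are invented, the "shaped" mode is a plain first-order error-feedback loop rather than a psychoacoustically tuned filter, and the test tone is an arbitrary assumption.

```python
import math
import random


def tpdf(rng):
    """Triangular-PDF dither spanning +/-1 LSB (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


def requantize(samples, bits=16, mode="tpdf", seed=0):
    """Reduce float samples in [-1, 1) to `bits`-bit integers.

    mode: "none"   - plain rounding, no dither
          "tpdf"   - flat (white) TPDF dither
          "shaped" - TPDF dither plus first-order error feedback
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    lo, hi = -scale, scale - 1
    out = []
    err = 0.0  # previous total error, in LSBs
    for x in samples:
        v = x * scale
        if mode == "shaped":
            v -= err                      # feed the previous error back in
        d = 0.0 if mode == "none" else tpdf(rng)
        q = max(lo, min(hi, math.floor(v + d + 0.5)))
        err = q - v                       # error relative to the undithered target
        out.append(q)
    return out


# Very quiet 1 kHz tone (about 3 LSB peak at 16 bits), 44.1 kHz sample rate
sine = [1e-4 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]
plain = requantize(sine, mode="none")
flat = requantize(sine, mode="tpdf")
shaped = requantize(sine, mode="shaped")
```

Writing each result to a 16-bit WAV file and comparing spectrograms would show the flat noise floor of the TPDF version against the rising high-frequency noise of the shaped one.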

Reply #57
I don't have the 24-bit file, and I don't know how to get it or generate an equivalent program from the SACD layer.
With respect, this sounds like you are comparing different sources - or assuming that the 16/44.1 version that you have was mastered in the same way as the 24/96 version. This may not be the case.

What we really need is a sample from the 24/96 version which would then be resampled to 24/44.1 and then shortened to 16/44.1 with various permutations of dither, shaped dither, no dither....


I'm talking about BIS-SACD-1516, which was originally recorded in 24/44.1.  What I would like is a bit-for-bit copy of the original master, which has never been issued to the public.  A flat-dithered 20/48 version would also be acceptable.

Reply #58
I'd like to state my rationale for paying no attention to the predictions of psychoacoustic theory in the analysis of listening tests.

I have long been skeptical of the idea of using loudness curves for high frequencies.  They are mapped out using steady tones, but the main high-frequency content of what we listen to is transient and therefore useful mainly for information on the timing of events.  This is very different from the information we derive from the lower frequency content, which we resolve into pitches.  Thus when I encounter a high-frequency sine wave, my brain should be more inclined to either associate it with an object too small to be of concern, or else to assume that it's dealing with a malfunctioning sensor of some sort.  On the other hand, broadband transient events are more likely to be indicators of something worthy of attention.  Have there been studies of the relative audibility of broadband pulses compared to sine waves in the 10-20 kHz band?


A good question.

It's not a good reason for "paying no attention to the predictions of psychoacoustic theory", since there are plenty of psychoacoustic models that accurately predict this kind of phenomenon. Once you calculate the spectrum level that reaches the inner ear, and throw in some basic signal detection theory in the known bandwidths, you get close. Depending on the signal, masking could be even more significant.

However, if you're seriously interested in finding references, you could ask on the auditory list...
http://www.auditory.org/
...politely - the gods of psychoacoustics subscribe to that, so it's not a place for mortals who brag about "paying no attention to the predictions of psychoacoustic theory".

Cheers,
David.

Reply #59
I'd like to state my rationale for paying no attention to the predictions of psychoacoustic theory in the analysis of listening tests.

I have long been skeptical of the idea of using loudness curves for high frequencies.


If you proceed from that idea to thinking that loudness curves usually give a highly optimistic view of what's actually audible, then you would be right. The ear generally responds to the energy level of sounds, and the energy level of a pulse is always far lower than that of a sine wave of the same amplitude.
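
A rough numerical illustration of the energy point, assuming a 44.1 kHz sample rate, a 12 kHz tone, and ten single-sample clicks per second (all arbitrary choices). It compares only the average energy over one second and is not a model of audibility.

```python
import math

FS = 44100          # sample rate in Hz (assumed for illustration)
N = 44100           # one second of samples


def rms_db(samples):
    """RMS level in dB relative to a full-scale amplitude of 1.0."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_sq)


# Steady 12 kHz sine with peak amplitude 1.0
sine = [math.sin(2 * math.pi * 12000 * n / FS) for n in range(N)]

# Ten single-sample clicks per second, also with peak amplitude 1.0
pulses = [1.0 if n % (FS // 10) == 0 else 0.0 for n in range(N)]

print(round(rms_db(sine), 1))    # about -3.0
print(round(rms_db(pulses), 1))  # about -36.4
```

Both signals reach the same peak, yet the click train carries over 30 dB less average energy than the sine.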

Quote
They are mapped out using steady tones, but the main high-frequency content of what we listen to is transient


Where did you get that strange idea?


Quote
and therefore useful mainly for information on the timing of events.


Beyond a certain fairly conservative point, the ear is relatively insensitive to the timing of events, except as they might interfere with each other and thus add or subtract from each other.

For example, above a few kHz, the ear has very little sense of phase except as sounds add or subtract from each other.

Quote
This is very different from the information we derive from the lower frequency content, which we resolve into pitches.


I don't know what you're calling high frequencies, but the ear resolves pitch pretty well at higher frequencies.


Quote
Thus when I encounter a high-frequency sine wave, my brain should be more inclined to either associate it with an object too small to be of concern,


???????????????

Quote
or else to assume that it's dealing with a malfunctioning sensor of some sort. 


On the other hand, broadband transient events are more likely to be indicators of something worthy of attention.



Only insofar as broadband refers to bandwidth extension into the lower frequencies.

Quote
Have there been studies of the relative audibility of broadband pulses compared to sine waves in the 10-20 kHz band?


Yes. If a sine wave of a certain level in the 10-20 kHz band is hard to hear, the corresponding narrow pulses are nearly or outright impossible to hear. Most high-frequency pulses that we do hear, we hear only because they have a very high peak amplitude.

Sine waves are generally the easiest of all sounds to hear at a given amplitude.

The ear functions a lot like a spectrum analyzer with fairly broad bands and mediocre transient response.

Reply #60
Quote
They are mapped out using steady tones, but the main high-frequency content of what we listen to is transient


Where did you get that strange idea?



From an a priori information-theoretic analysis of the act of musical listening. Think of it this way: if you put a musical program through a 5 kHz lowpass, you lose 3/4 of the bandwidth but nowhere near 3/4 of the useful information, and none at all of what we normally understand as the pitch content of music. Therefore, if the top two octaves are to be of any use, they must be processed in the brain in such a way that the vast majority of the information is simply discarded. I maintain that periodic vibrations above 10 kHz do not provide humans with any useful information; they're just redundant harmonics for the most part. What if musicians could learn to use this bandwidth for processing of a totally different kind than ordinary processing? This skill would be a kind of sixth sense, unimaginable to people who haven't developed it. It's called the sense of rhythm. By the Heisenberg uncertainty principle, an event can be localized in time only in inverse proportion as it is delocalized in frequency-space; musicians need a sense of timing more precise than what you think is possible, and utilizing high frequencies for event timing is the only way to get that sense which is consistent with the mathematics.

Obviously, Mr. Krueger, you have no sense of rhythm.

Clarification:  By "main high-frequency content" I mean useful information content, not, say power content, in which sense the statement would obviously be false.
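
For reference, the signal-processing form of the uncertainty relation invoked above is usually written as the Gabor limit. The symbols are not from the original post: $\sigma_t$ and $\sigma_f$ are assumed here to be the RMS duration and RMS bandwidth of a signal, and the 1 ms figure is an arbitrary example.

```latex
% Time-frequency uncertainty (Gabor limit) for a signal with
% RMS duration \sigma_t and RMS bandwidth \sigma_f:
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}

% Example: localizing an event to \sigma_t = 1\,\mathrm{ms} requires
\sigma_f \;\ge\; \frac{1}{4\pi \cdot 10^{-3}\,\mathrm{s}} \;\approx\; 79.6\,\mathrm{Hz}
```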

Reply #61
Sorry about the tone of my last post.  I've been in brainstorming mode: when you have a phenomenon that seems impossible to explain by normal means, you think up all possible explanations without regard for conventional understanding, and reject ideas only when the evidence against them appears logically airtight.  Thus a dismissal of an idea because it is contrary to current understanding is offensive to somebody in that mode.  Again, my apologies for losing my cool.

I mapped out my equal loudness curve on the website provided; thank you for the link.  Unfortunately it only has one data point in the band where noise-shaping energy is concentrated, so its relevance to the subject of this thread is distinctly limited.  Hopefully this is a fairly normal curve.

30 Hz     -12
45 Hz     -18
60 Hz     -21
90 Hz     -27
125 Hz    -30
187 Hz    -39
250 Hz    -45
375 Hz    -51
500 Hz    -57
750 Hz    -60
1 kHz     -63
1.5 kHz   -60
2 kHz     -60
3 kHz     -57
4 kHz     -54
6 kHz     -51
8 kHz     -51
12 kHz    -48
16 kHz    -3

Thank you for the pointer to the Auditory list.  I had never heard of it.  How do you access it?  The link gave me a Bad Request (Invalid Hostname) error on my browser.

Reply #62
Thank you for the pointer to the Auditory list.  I had never heard of it.  How do you access it?  The link gave me a Bad Request (Invalid Hostname) error on my browser.
The link works fine for me. I hadn't heard of the list either and have just subscribed to find out how far above my head the discussions are.

 

Reply #63
Quote
They are mapped out using steady tones, but the main high-frequency content of what we listen to is transient


Where did you get that strange idea?



From an a priori information-theoretic analysis of the act of musical listening. Think of it this way: if you put a musical program through a 5 kHz lowpass, you lose 3/4 of the bandwidth but nowhere near 3/4 of the useful information, and none at all of what we normally understand as the pitch content of music. Therefore, if the top two octaves are to be of any use, they must be processed in the brain in such a way that the vast majority of the information is simply discarded. I maintain that periodic vibrations above 10 kHz do not provide humans with any useful information; they're just redundant harmonics for the most part. What if musicians could learn to use this bandwidth for processing of a totally different kind than ordinary processing? This skill would be a kind of sixth sense, unimaginable to people who haven't developed it. It's called the sense of rhythm. By the Heisenberg uncertainty principle, an event can be localized in time only in inverse proportion as it is delocalized in frequency-space; musicians need a sense of timing more precise than what you think is possible, and utilizing high frequencies for event timing is the only way to get that sense which is consistent with the mathematics.


Your argument fails because the top octave is often of no use. With very many musical works, you can lowpass the music at, say, 10 kHz and virtually nothing is lost in terms of either tone or rhythm. Therefore your entire argument fails, because it is based on the idea that lowpassing music at 10 kHz substantially reduces the listener's ability to sense rhythm.

Quote
Obviously, Mr. Krueger, you have no sense of rhythm.


Your argument also fails to support or even remotely suggest this.

Quote
Clarification:  By "main high-frequency content" I mean useful information content, not, say power content, in which sense the statement would obviously be false.


Well, what is useful? One likely difference between you and me is that as a recordist (literally thousands of live performances recorded for hire) and live sound mixer (over 600 gigs), I am constantly applying various kinds of filters to music.

For example, people talk about the high-frequency content of cymbals, whether struck or brushed, but in either case the spectral energy usually peaks in the 3-7 kHz range. Brickwall-filter cymbals at 10 kHz and, while some of the sheen is gone, very little that one would use to determine rhythm or timing will be missing.
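
For anyone who wants to try the brickwall experiment, here is a minimal sketch of the filtering step using a plain DFT on a short synthetic frame. It is not a cymbal recording and not a production filter: the 5 kHz and 15 kHz test tones, the frame length, and the function names are all assumptions chosen so the code runs with the standard library alone; real material would call for an FFT library and overlapping frames.

```python
import cmath
import math

FS = 44100       # sample rate in Hz (assumed)
N = 441          # short frame so each DFT bin is exactly 100 Hz wide


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def brickwall_lowpass(x, cutoff_hz, fs=FS):
    """Zero every DFT bin whose frequency exceeds cutoff_hz."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        freq = min(k, n - k) * fs / n     # bin k and its mirror n-k share a frequency
        if freq > cutoff_hz:
            X[k] = 0
    return idft(X)


# Synthetic test frame: a 5 kHz tone plus a weaker 15 kHz tone
low = [math.sin(2 * math.pi * 5000 * t / FS) for t in range(N)]
high = [0.5 * math.sin(2 * math.pi * 15000 * t / FS) for t in range(N)]
mixed = [a + b for a, b in zip(low, high)]

filtered = brickwall_lowpass(mixed, 10000)
residual = max(abs(f - a) for f, a in zip(filtered, low))
print(residual)   # tiny: only the 5 kHz component survives
```

Running the same function over frames of a real recording and ABXing the result against the original would be the actual test of the claim.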