Results for 24bit/96KHz test

Reply #25
Quote
About volume, yes I use just the master volume on Revo control panel.  It's set on about 3/4.

Is this an analogue or digital control?

If it's like the typical M-audio controls, it's digital.

This means it will be scaling and re-dithering or truncating the signal. At a given bit-depth, over twice the dither or noise power will be falling into the audio band for 44.1kHz sampled material compared with 96kHz sampled material (since the dither noise or truncation artefacts will spread across the entire sampled bandwidth).

So, if the card dithers, there will be more noise at 44.1kHz than at 96kHz. If the card doesn't dither, then there'll be more distortion at 44.1kHz than at 96kHz, and that distortion will alias down, and hence be inharmonic.
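
The "over twice" figure can be checked with a quick calculation. This is only a sketch: it assumes the dither or truncation noise is spectrally flat from 0 Hz up to half the sampling rate, and it takes 22.05 kHz as the upper edge of the audio band. Both are simplifying assumptions, and noise-shaped dither would give different numbers.

```python
import math

def in_band_noise_fraction(sample_rate_hz, band_edge_hz=22050.0):
    """Fraction of flat (white) quantisation noise power that falls below band_edge_hz."""
    nyquist = sample_rate_hz / 2.0
    return min(band_edge_hz, nyquist) / nyquist

frac_44 = in_band_noise_fraction(44100)   # all of the noise is in the audio band
frac_96 = in_band_noise_fraction(96000)   # only the part below 22.05 kHz counts
ratio = frac_44 / frac_96
ratio_db = 10 * math.log10(ratio)

print(f"in-band fraction at 44.1 kHz: {frac_44:.3f}")
print(f"in-band fraction at 96 kHz:   {frac_96:.3f}")
print(f"power ratio: {ratio:.2f} ({ratio_db:.1f} dB)")
```

Under these assumptions the ratio is simply 96/44.1, a little over two, or roughly 3.4 dB more in-band noise power at 44.1kHz.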

Even if both samples are at 96kHz (as in most of these tests), a sample with energy above 22kHz may have a slightly different spectrum below 22kHz after truncation than a sample without anything above 22kHz.


This may or may not be an issue, but it seems it should be avoided.


Another interesting test will be to ABX dithered silence against a 20kHz high-pass filtered version.


Of course, the "picking" of ABX results needs to be clarified first.

Cheers,
David.

Reply #26
Please forgive my ignorance as a complete newbie to both ABX and statistics (and also I am aware that this is pushing slightly off-topic).  But I am interested in the thinking behind simply adding together the results of multiple ABX sessions.

As an example, say I run three sessions, each of which I decide in advance will be 11 steps.  And the results are 5/11, 5/11, 11/11.  If I add these results I get 21/33 which indicates that I could not differentiate between the samples reliably at all.  It is the same as I would get from 7/11, 7/11, 7/11.

But for me the first set of results does seem more interesting than the 7/11 set.  Perhaps I was tired the day I did the two 5/11 sessions.  Or was just getting used to what to listen out for.  The fact is that in one session I was able to identify the samples in every case.  I can see that this does not make things as clear cut as if I was able to ABX 11/11 first time.  So it makes sense to have to provide the results of every session.  But to me, adding them together seems to remove significance from the results.

I suppose one answer to this is to say that if I have now learnt to differentiate between the two samples then I can carry on doing more ABX sessions and adding the results and eventually if I keep getting 11/11 then the probability value will become low enough to show I can indeed differentiate them.  But I could imagine I would tire after a while and perhaps not be able to sustain this for the amount of sessions needed.

Is there any statistical way of taking into account the split of results over different sessions?  Or is my understanding of this simply invalid from a statistical point of view?

Reply #27
Well yes you pretty much sum it up perfectly in my view phwip.  I understand completely what you mean tigre, and I certainly wouldn't perform a whole bunch of tests to get one good result.  I do often mess around with the sample for a while first though, because it takes time to learn and concentrate.  It's a complete waste of time to start counting results when I still can't hear or even imagine what the difference is.  I'm quite confident with my results though, because every now and then I can hear the difference quite easily (for just a few listens at a time), and in this case I always guess correctly.  It's all the rest of the time that's tiring and tedious.

Tigre, I think I've had a few good results for 16 vs. 24bit, and there was one time that it seemed really easy, but that hasn't happened again... I'm much more interested in sampling rate anyway.  But I should keep testing both.  What do you mean by how high I can hear?  It sounds like frequency, but you are talking about 16 vs. 24bit??

2Bdecided, yes, I was just thinking about the volume control before.. I'm not really that keen to test on full volume because it's so loud. I wish I had a good amp....

Reply #28
phwip: imagine you throw a coin 10 times. The probability of getting 10 heads in a row is 1/2^10, so most likely this won't happen. But if you repeat this again and again, sooner or later you will get 10 heads in a row. It's too hard for me to explain in detail in English why adding the scores and calculating the p-value from the sum works, but in fact the "probability of reaching this result by guessing" is the same for 5/11, 5/11, 11/11 as for 3x 7/11 (besides, you would probably give up ABXing the same position and focusing on the same problem after having scored 5/11 twice).

Of course, if you ABX a different position and/or focus on a different problem, you don't have to add the scores of previous attempts (at least IMO; if you want to be uber-correct you'd have to).
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello
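
The pooling argument can be illustrated with a small calculation. This is a minimal sketch, assuming every trial is an independent 50/50 guess under the null hypothesis and that all sessions were committed to in advance; the function names are made up for this example.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def pooled(sessions):
    """Add up the (correct, trials) pairs of all sessions and test the total."""
    correct = sum(c for c, _ in sessions)
    trials = sum(n for _, n in sessions)
    return abx_p_value(correct, trials)

sessions_a = [(5, 11), (5, 11), (11, 11)]
sessions_b = [(7, 11), (7, 11), (7, 11)]

print("5/11, 5/11, 11/11 pooled:", pooled(sessions_a))
print("7/11, 7/11, 7/11 pooled: ", pooled(sessions_b))
print("11/11 taken on its own:  ", abx_p_value(11, 11))
```

Both pooled sets come out at 21/33, a p-value of about 0.08, while a lone 11/11 gives 1/2048. The lone figure is only meaningful if that session was not singled out after the fact.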

Reply #29
Quote
Well yes you pretty much sum it up perfectly in my view phwip.  I understand completely what you mean tigre, and I certainly wouldn't perform a whole bunch of tests to get one good result.  I do often mess around with the sample for a while first though, because it takes time to learn and concentrate.  It's a complete waste of time to start counting results when I still can't hear or even imagine what the difference is.  I'm quite confident with my results though, because every now and then I can hear the difference quite easily (for just a few listens at a time), and in this case I always guess correctly.  It's all the rest of the time that's tiring and tedious.

Sounds to me like (at least most of) your results are statistically valid.

Quote
But I should keep testing both.  What do you mean by how high I can hear?  It sounds like frequency, but you are talking about 16 vs. 24bit??

I'm talking about frequency. It would be interesting if you weren't able to hear the frequencies themselves (i.e. pure sine tones) but could hear a lowpass at the same frequency. This could be regarded as evidence for the theory that the ear's "amplifier" (there's been a thread with details about this, but I can't find it right now) can be triggered by frequencies that aren't audible themselves, so these frequencies would change the sound audibly.
OTOH, 2Bdecided's idea needs to be checked first before jumping to such a conclusion. Related to this: doesn't your soundcard have 2 sliders to control volume, one in the digital domain ("master volume" or similar, which will be bypassed by kernel streaming) and the other controlling the analog amplification after the DAC? (Maybe I'm confusing some things here.)

Reply #30
I hate statistics:

The probability to get 11 correct flips in a row is approx 0.05%.

The probability to get 11 correct flips in a row after max 30 tries is max 1%.
The probability to get 11 correct flips in a row after max 50 tries is max 2%.
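
These figures can be reproduced with a short script. It is a sketch under one assumption that the post does not state explicitly: "tries" is read as individual coin flips, and the "max" values are taken to be the simple upper bound obtained by counting every position where a run of 11 could start.

```python
def prob_run(n_flips, run_len):
    """Exact probability that n_flips fair guesses contain at least one run of run_len correct answers."""
    # state[k] = probability of currently being on a streak of k correct answers
    # without ever having completed a full run
    state = [0.0] * run_len
    state[0] = 1.0
    reached = 0.0
    for _ in range(n_flips):
        new = [0.0] * run_len
        for k, p in enumerate(state):
            new[0] += p / 2              # a wrong answer resets the streak
            if k + 1 == run_len:
                reached += p / 2         # a correct answer completes the run
            else:
                new[k + 1] += p / 2      # a correct answer extends the streak
        state = new
    return reached

def union_bound(n_flips, run_len):
    """Upper bound: number of possible starting positions times 1/2^run_len."""
    starts = max(n_flips - run_len + 1, 0)
    return starts / 2 ** run_len

for n in (11, 30, 50):
    print(n, f"exact {prob_run(n, 11):.4%}", f"bound {union_bound(n, 11):.4%}")
```

With this reading the bounds are 20/2048 and 40/2048, just under 1% and 2%, and the exact probabilities are roughly half of that.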

Reply #31
It is perfectly valid, and recommended, to train oneself before starting the real ABX test.
What must not be done is to decide after the test whether it counted. For example, if I'm not sure I will succeed, I try a round of 16 ABX trials and look at my score. If I get 16/16, I must not then say that I succeeded, because it was a training round! Otherwise I could throw out as many failures as I want, because "they were just training". If I succeed during training, I must do it again, for real, to get a valid result.

In short, it is necessary to decide before starting whether the result will be thrown out or kept. Then we must stick to what was decided. If we get 16/16 when we had decided to throw the result away, let's throw it away! If we get tired, can't concentrate, and get 5/16 when it was decided to keep the result, then keep it.
In the last case, the mistake was to go on with the test once tiredness set in. Once the difference can't be heard anymore, the test must be paused.
Only give answers about which you are certain. That's the key. When in doubt, don't click. Keep the current score, close the program, and get some rest.

Reply #32
Thanks Pio2001, that makes things much clearer for me.  While I understand the mathematics in Tigre's coin example, I do think that the simplistic solution of adding together all the results only makes sense there because we are analysing a simple scenario (tossing a coin), where there are unlikely to be external factors that influence the result and vary over time.

For audio ABXing there are many additional factors such as tiredness, boredom, familiarity with the sample, peripheral noise, etc, which may need to be taken into account.  However, I agree that if you are able to include or exclude sessions, as long as you decide before that session starts, and if you take breaks during a session if you feel it is necessary, these together should hopefully limit the effect of these other factors.

Reply #33
I've got a master volume, and also faders for left and right channels.  They all change the volume with Kernel Streaming...

For the low-pass, the problem is, you don't know if you're hearing the absence of high frequencies, or the effects of the filter itself. Right?

Reply #34
Thanks for the stats sshd..
Quote
The probability to get 11 correct flips in a row after max 50 tries is max 2%.

Well, then, even considering a hypothetical scenario where I had tried every sample 50 times in total, that's still a very low probability, once you consider how many of these results I've had..  I'm not at all ignorant of statistics, but it seems that most of this thread is an endless re-justification of results.  It's reasonable for people to query things they can't reproduce themselves, but I'm certain it's not the results that are at fault. 

The volume issue sounds like a much more likely cause.
Maybe I can listen with earplugs.. 

Reply #35
OK, I've been quite absent from here for some time (this is a very time-consuming hobby, and now I want to do other things), but I just want to add a few things, which are the reason for the last test samples I posted.

According to published specs, listen's headphones shouldn't have any significant response at the cutoff frequency of the lovely_lowpass2 sample (IIRC they are rated up to 21 KHz).

My attempt to explain this and the previous results is that ultrasonic information may be becoming audible due to intermodulation somewhere in the listening chain, causing very low-level products at audible frequencies, combined with what seem to be listen's exceptionally good low-level hearing abilities and his isolating DJ headphones. However, this is no more than the most reasonable explanation I can find for the results.

Also, according to published specs and my own measurements of the Revo soundcard, the noise floor of the lovely_dith2 sample should be well below what the card's hardware can resolve, all the more so given that he was not listening at full volume. Maybe that's the reason why you can't get a consistent ABX result here.

As a side note, M-Audio control panel attenuation is always performed digitally at 24-bit resolution or more, so quantization distortion should never be an issue here. A poorer effective dynamic range would be.

I don't know when I will write again; I've just laid out more information so that you all have more to think about.

Reply #36
Thanks for your answer, KikeG. I've been thinking about similar hardware-related problems as possible reasons before too, but for some reason I forgot to post about it.

So I've got 2 questions:

1. To listen: Would you be ready (and do you have the equipment) to do some loopback tests, i.e. connect your headphone amp to your soundcard's line-in (maybe it would be good to use a one-input-two-outputs adaptor to connect the headphones at the same time) and record some test signals to find out more about KikeG's ideas?

2. To KikeG, 2Bdecided: What kind of test signals would be good to detect intermodulation distortion (and other problems that could be caused by equipment)? Probably sine sweeps and single-click impulses (at different sampling rates) - but maybe something else additionally like combinations of several low and high frequency sine waves (or sweeps)?

Reply #37
When you add two sinusoids of different frequencies F1 and F2, if there is some intermodulation between them, then the frequencies F1-F2 and F1+F2 appear in addition.
You can choose 14 kHz and 18 kHz. A new tone should appear at 4 kHz.
The classic intermodulation experiment shows that this happens very easily even in high-end gear, but disappears as soon as you play one tone on one speaker and the other tone on the other speaker, letting the frequencies add up in the room.
Recording in a loopback configuration might not show it, since most of the distortion happens in the transducer (headphone or speaker). One should use a microphone to detect it.
That's why, when possible, ultrasonic experiments are run with super tweeters amplified separately and entirely dedicated to the ultrasonic content, so that no intermodulation occurs.
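
The 14 kHz + 18 kHz example can be simulated numerically. The sketch below is only an illustration: the nonlinearity y = x + 0.1*x^2 is a made-up stand-in for a weakly distorting transducer, and the 96 kHz sample rate and 0.1 s length are chosen so that every frequency of interest falls exactly on an analysis bin. It does not model any real headphone.

```python
import cmath
import math

FS = 96000            # sample rate in Hz (assumed for this illustration)
N = 9600              # 0.1 s of signal, so all test frequencies sit on exact 10 Hz bins
F1, F2 = 14000, 18000

def tone_amplitude(signal, freq_hz):
    """Amplitude of the component at freq_hz, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / FS) for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

# Linear sum of the two tones: no new frequencies are created
clean = [math.sin(2 * math.pi * F1 * n / FS) + math.sin(2 * math.pi * F2 * n / FS)
         for n in range(N)]

# Hypothetical weakly nonlinear stage; the 0.1 coefficient is an arbitrary choice
distorted = [x + 0.1 * x * x for x in clean]

for f in (F2 - F1, F1, F2, F1 + F2):
    print(f"{f:6d} Hz  clean {tone_amplitude(clean, f):.4f}  "
          f"distorted {tone_amplitude(distorted, f):.4f}")
```

The linear sum shows nothing at 4 kHz or 32 kHz, while the distorted version shows both difference and sum tones. This matches the point about playing each tone on its own speaker: adding in the air is linear, so no 4 kHz tone is generated there.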

Reply #38
Hi tigre.. I don't have a separate headphone amp, but I guess I can plug line-in to line-out if I don't monitor the signal..  or Pio, should I try to record my headphones with a mic?    Hmmm..
KikeG's ideas.. he's saying that the true 96KHz file actually sounds worse.. I think we speculated that on afterdawn forums, but there's no way of really knowing which sample I think is better because it's subjective.  If someone can tell me some tests to do I'll try, but I can't really think myself of where to start.

Reply #39
Well, if the difference frequencies appear mostly in the headphones, then it would be logical to say that either:
-the headphones are reproducing frequencies >29kHz, or
-there isn't any intermodulation distortion caused by >29kHz content.

 

Reply #40
What do you mean? If the 4 kHz frequency of our example is audible only in headphones, it means that only headphones suffer from intermodulation distortion, since the 4 kHz tone is the distortion.

Reply #41
Oh.
No, I was thinking of KikeG's speculation, not your 18-14kHz example..

Reply #42
Well.. I've been busy again, but I see I haven't missed anything 

I'm completely unclear what I need to do next.. perhaps my equipment is not suitable.. maybe we shouldn't be listening to hi-res formats at all with today's speakers.

I ran a loopback test, just for the sake of completeness:
RMAA: M-Audio Revolution, 32bit(float), 96kHz.
Frequency Response, dB:  +0.12, -0.06
Noise Level, dBA:        -93.9
Dynamic Range, dBA:       91.2
THD, %:                   0.0048
Intermodulation, %:       0.016
Stereo Crosstalk, dB:    -94.9


I also spent some time listening to Seaside Rendezvous (Queen), and The Thin Line (Queensrÿche).  Well, I wasn't very surprised that I couldn't hear any difference between 96 and 44.1kHz, because:
a) they are not exactly great sounding recordings, and
b) I could only get one channel off the dvd (right channel would only give me noise)
But then I thought, well, if my headphones are causing audible problems because of the high frequencies, why can't I notice it here? There certainly is a plethora of high frequencies in these two files. (Actually I'm suspicious that the top half of the frequencies are just a mirror image of the bottom half, hmm..  )
Hearing lovely_1 after these files was like listening to chocolate melting (dark chocolate) 

I was wondering, too.. In case I am just a lucky person, how many trials would be needed for me to have a fairly good chance of getting a total of say twenty 12/12s over the whole test period if I was just guessing.. maybe sshd could tell me?
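
A rough answer can be sketched as follows, assuming each session is an independent run of 12 trials answered by coin-flip guessing. The function names are invented for this example.

```python
P_PERFECT = 0.5 ** 12    # chance of scoring 12/12 in one session by guessing

def expected_sessions(n_perfect, p=P_PERFECT):
    """Average number of sessions a pure guesser needs to collect n_perfect perfect scores."""
    return n_perfect / p

def prob_at_least_one(n_sessions, p=P_PERFECT):
    """Chance that a pure guesser gets at least one perfect score in n_sessions sessions."""
    return 1 - (1 - p) ** n_sessions

print(expected_sessions(1))     # 4096.0
print(expected_sessions(20))    # 81920.0
print(prob_at_least_one(50))
```

So a guesser would need on the order of eighty thousand 12-trial sessions to expect twenty perfect scores, and even fifty sessions give only a little over a 1% chance of a single 12/12.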

Reply #43
I thought I should clarify this 'result picking' issue with some things I might have forgotten to mention earlier. 

My initial tests took me many hours.. I even spent a whole afternoon on the very first test, going away and coming back, making sure to choose only when I was certain.  There is no way I would do this 5 times and choose the best one, it would take days and drive me crazy..

Then there were my filtered files, which I was pleased to find much easier, but which turned out to be a waste of time..

Then KikeG's first batch of test files.. Well, I only tried these files once, except for the one I stuffed up.  When I realised why I was getting it wrong every time I reset and started again.  One or two of the files were very easy, but in total I spent several hours one evening working through these files.  Again, I'm really not interested in multiple attempts on this sort of time scale.

And when I said recently that I had a single attempt at the 29kHz low-pass file, I was intending to clarify that result, not imply that the other tests were any different..

Anyway, sorry if I have been a little irritable over this issue.  It seems I neglected to mention any of this (without realising), which made it hard for me to understand why there was so much agnosticism over the results.

So, I'm keen to do whatever tests are needed here (some more specific loopback tests?).. Or, if it's all going to be a waste of time, please somebody recommend me a good pair of headphones that are rated up to 48kHz and I can start over.

Reply #44
It's been a long time since I read this thread, but as far as I remember, this is the point. The main issue with high definition formats, after all, is the frequency response of the speakers / headphones.

The next time someone tries to sell me a DVD-A or an SACD player, I'll ask if the 100 kHz super speakers come with it.

Reply #45
So, quite sincerely, if my Sennheisers are not suitable for this test, what headphones should I buy?  Even if all of the above has been a waste of time, I would prefer to prove myself wrong than to just forget about it...

-listen

Reply #46
I don't know, but for me, it was not a waste of time. It led to interesting discussions. And now we know better the pitfalls that appear in this kind of listening test. It is interesting to note that Oohashi's experiment carefully avoided all these pitfalls (physically double-blind test, bi-amplification, listening tests with the ultrasonic content alone, microphone recordings of the ultrasonic content at the listening position, same lowpass filter for the lowpassed version and the full version...).
If only it could be repeated by an independent team...

Reply #47
Quote
I don't know, but for me, it was not a waste of time. It led to interesting discussions. And now we know better the pitfalls that appear in this kind of listening test. It is interesting to note that Oohashi's experiment carefully avoided all these pitfalls (physically double-blind test, bi-amplification, listening tests with the ultrasonic content alone, microphone recordings of the ultrasonic content at the listening position, same lowpass filter for the lowpassed version and the full version...).
If only it could be repeated by an independent team...

I am alerting you to the issue with Oohashi's experiment. First, he never achieved positive results of any significance in any standard, accepted listening test. His tests only found positive significance in unconscious brain activity, not in an actual audibility test. His test results in this regard are still suspect to me. Second, after this paper, NHK laboratories performed a controlled listening test in response:

NHK Laboratories Note No. 486, "Perceptual Discrimination between Musical Sounds with and without Very High Frequency Components", Toshiyuki Nishiguchi, Kimio Hamasaki, Masakazu Iwaki, and Akio Ando

http://www.nhk.or.jp/strl/publica/labnote/lab486.html

Third, Oohashi's attempts to criticize the original reference test from 1978 don't seem to be warranted, especially considering the later NHK test. Perhaps you should read the original peer-reviewed paper, which still stands as the JAES standard:

JAES, "Which Bandwidth Is Necessary for Optimal Sound Transmission?", G. Plenge, H. Jakubowski, and P. Schone

-Chris

Reply #48
Thank you for the link, WmAx. Very interesting.

In Oohashi's paper however, a high significance level was shown not only in the brain activity, but also in the subjective evaluation of sound quality by the subjects, as reported in table 2 of the version linked above.

This alone can't be considered as a scientific proof until the result is confirmed. It is just a "piece of proof". I was not aware that another team had reproduced the experiment and failed to achieve any positive result.

I wonder if the test of Oohashi et al. was flawed, or if there was something more in it that allowed it to succeed.

The first difference is the protocol: similar to ABC/HR in the one that failed, but without ranking the samples, just telling whether they are different; and an A-B-A playback followed by a binary quality evaluation in the one that succeeded (soft/hard, balanced/unbalanced, etc.).
The duration might have been different. In the successful test, the samples were always played for 30 seconds. The duration is not mentioned in the other test.
The material was different too. I'd like to perform a spectrum analysis of a raw gamelan recording (the instrument recorded in the test that succeeded, and that was not present in the one that failed), but with a short analysis window. The overall analysis doesn't show any special high frequency content that would be missing in the other test, but since the gamelan is a percussive instrument (on metal), I wonder if it is possible for the high frequency content to be concentrated during the attacks only. This way, it would be very powerful at some given times, but the average power over the whole sample would not faithfully represent the instantaneous HF level that is present during the attacks.
If that is the case, drawing a spectrogram with shorter analysis windows would show shorter but more powerful HF bursts, as long as the bursts are shorter than the window itself.
I tried with the only CD I have featuring gamelan (Akira soundtrack, track 4 - Tetsuo), but it didn't show such a behaviour. However, this movie soundtrack is heavily processed, and the gamelan sound might have been tampered with.

We could say also that a 10 ms analysis window (4096 samples in 44100 Hz) represents best the human hearing, but I don't think that it is a valid argument. Since we study the hypothesis of inaudible sounds possibly intermodulating in the audible range, the process is necessarily nonlinear, and the relevance of this 10 ms window might not stand in these conditions.
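
The effect of the analysis window length on a short HF burst can be shown with a synthetic signal. This is a sketch only: the 5 ms burst at 30 kHz, the 96 kHz sample rate and the two window lengths are arbitrary choices for illustration, not measurements of any gamelan recording. As a side calculation, 4096 samples at 44100 Hz last about 93 ms, whereas a 10 ms window at that rate is about 441 samples.

```python
import cmath
import math

FS = 96000                     # sample rate in Hz (assumed)
F_HF = 30000                   # frequency of the hypothetical ultrasonic burst
BURST = range(24000, 24480)    # 480 samples = 5 ms of full-scale tone

# Half a second of silence containing one short burst
signal = [math.sin(2 * math.pi * F_HF * n / FS) if n in BURST else 0.0
          for n in range(48000)]

def peak_amplitude(signal, window_len, freq_hz):
    """Largest amplitude at freq_hz seen in any non-overlapping rectangular window."""
    peak = 0.0
    for start in range(0, len(signal) - window_len + 1, window_len):
        acc = sum(signal[start + k] * cmath.exp(-2j * math.pi * freq_hz * k / FS)
                  for k in range(window_len))
        peak = max(peak, 2 * abs(acc) / window_len)
    return peak

peak_long = peak_amplitude(signal, 4096, F_HF)    # window much longer than the burst
peak_short = peak_amplitude(signal, 256, F_HF)    # window shorter than the burst
print(f"4096-sample window: {peak_long:.4f}")
print(f" 256-sample window: {peak_short:.4f}")
```

With the long window the burst is averaged over the whole window and reads as 480/4096 of its true amplitude (about 18.6 dB too low), while a window short enough to fit inside the burst reports the full level.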

Reply #49
A significant issue is that Oohashi was not able to achieve positive results with LCS compared to baseline. However, he was able to achieve positive results with FRS vs. HCS. This is not logical. I cannot conclude that his results have any validity in this circumstance. I suspect a distortion component is responsible. Perhaps a compounded IMD, created by the combining acoustic sources (individual speakers)?

I realize they used individual speakers in order to reduce IMD components. However, the main effect of this is to prevent IMD that results from transducer non-linearities, or from IMD/Doppler artifacts due to the simultaneous radiation of broadband content directly from the same moving diaphragm, since the direct pistonic behaviour will impede/react upon the various pressure variations across the distributed band. However, intermodulation artifacts are created from two discrete acoustic sources, too. In the same way, if Bob hums at one frequency and Jon hums at a slightly different one, audible modulations will result as these pressure waves combine/react. I wonder if the pre-recorded signals in this case, when re-assembled, created further IMD artifacts as compared to the low-passed-only signal.

Oohashi brings up the issue of intermodulation distortion, but does not proceed to actually measure the system to explain this illogical result. At least, he did not disclose such an investigation in this paper. If you are aware of a report detailing this specific issue of compounded IMD products, I would like to read it. I may be wrong on this account, but the lack of positive results with LCS only raises more questions. If the high frequency content is directly exciting ANYTHING in a human, then why were no positive results achievable when it was isolated? What did cause the positive results when HF was added to the high cut?

Addressing your comment:

"We could say also that a 10 ms analysis window (4096 samples in 44100 Hz) represents best the human hearing, but I don't think that it is a valid argument. Since we study the hypothesis of inaudible sounds possibly intermodulating in the audible range, the process is necessarily nonlinear, and the relevance of this 10 ms window might not stand in these conditions."

IF the standard 44.1kHz sample rate represents the human auditory range, then how can this be logical? If the original source has audible IMD components (I'm sure many do) as a result of inaudible and audible frequencies interacting, then the audible components/modulations will still reside within the audible band. These will be recorded faithfully, since the artifacts are created before recording. Maybe I did not understand you?

-Chris