
Topic: List of typical problems and shortcoming of "common" audio t (Read 12380 times)


Reply #25
But you haven't identified any flaws in audio testing.
You've identified flaws in people who you can't trust!

Just like you couldn't trust patients and doctors alike before blind testing was introduced in medical trials.

I just want you to focus on the following until it becomes clear to you (or ask for clarification, because this is central, and I hope we will agree):

What you are saying and summing up as "you can't trust people" is correct.
What you say right after that is that we already have a solution to "you can't trust people".

What I am saying is that "you can't trust people" is not the root problem, nor an interesting one: there are reasons that "you can't trust people".
These reasons (for instance, they can look at a spectrogram or forge the test log, etc.) all lead to the same consequence: "you can't trust people".

What YOU tell me is that what I am doing is useless because it all comes down to "you can't trust people".
What I am telling you is that acting directly on the reasons can lead to new audio tests, with new/other limitations and trade-offs.
But first we have to identify the main problems (like "you can't trust people"), their causes (they can look at a spectrogram or forge the test log, etc.) and the consequences (uncompelling evidence).

If I apply that to the medical field...
The original problem was exactly the same: they couldn't trust the doctors (not that they would say whether the drug was real or fake, but some would ask "are you sure there is absolutely no improvement since last week?" to a person with a real pill, while they would not insist with the persons given the fake ones).
In your current way of thinking (tell me if I am wrong), if I were to say "ok, so we don't trust doctors because 1- they tend to use different questions in each case, 2- they use different overall tones, 3- they tend to insist if the result does not go the expected way ("are you sure you really feel better?" vs "are you sure there is absolutely no improvement whatsoever?"), 4- some blatantly cheat, etc.", you would take my list, find counter-arguments and finally conclude, "well, in the end, it all comes down to 'we can't trust the doctors'".
However, there is some value in finding the causes of this lack of trust: by doing so, you can improve not only locally (by putting a nurse behind each doctor to check for flaws and prevent cheating, for instance) but globally ("interestingly enough, these four causes disappear if the doctors themselves do not know what they are giving"). And naturally, if you remove the causes, the consequence vanishes as well.

To get back to our case, having the causes (e.g. periodogram, fake log, etc.) gives more insight and more angles of attack than just saying that it all comes down to "we can't trust people".


Reply #26
What I am saying is that "you can't trust people" is not the root problem, nor an interesting one: there are reasons that "you can't trust people".
 

Yet the inability to trust people is the sole "problem" in most of your list, NOT problems with "audio testing"!


What I am telling you is that acting directly on the reasons can lead to new audio tests, with new/other limitations and trade-offs.


WHY DO WE NEED NEW TESTS?  The current tests work fine unless you're trying to argue with people not acting in good faith.

But first we have to identify the main problems (like "you can't trust people"), their causes (they can look at a spectrogram or forge the test log, etc.) and the consequences (uncompelling evidence).

The inability to trust people only leads to uncompelling [sic] evidence in internet fights.  It does not lead to problems in scientific research.

If I apply that to the medical field...
The original problem was exactly the same: they couldn't trust the doctors (not that they would say whether the drug was real or fake, but some would ask "are you sure there is absolutely no improvement since last week?" to a person with a real pill, while they would not insist with the persons given the fake ones).
In your current way of thinking (tell me if I am wrong), if I were to say "ok, so we don't trust doctors because 1- they tend to use different questions in each case, 2- they use different overall tones, 3- they tend to insist if the result does not go the expected way ("are you sure you really feel better?" vs "are you sure there is absolutely no improvement whatsoever?"), 4- some blatantly cheat, etc.", you would take my list, find counter-arguments and finally conclude, "well, in the end, it all comes down to 'we can't trust the doctors'".
However, there is some value in finding the causes of this lack of trust: by doing so, you can improve not only locally (by putting a nurse behind each doctor to check for flaws and prevent cheating, for instance) but globally ("interestingly enough, these four causes disappear if the doctors themselves do not know what they are giving"). And naturally, if you remove the causes, the consequence vanishes as well.


I think your medical analogy is off the rails.  Not only does it appear to be historically inaccurate, but it's not moving us anywhere useful.

To get back to our case, having the causes (e.g. periodogram, fake log, etc.) gives more insight and more angles of attack than just saying that it all comes down to "we can't trust people".


Again, where are these "problems" limiting serious study and not just internet fights?

Creature of habit.


Reply #27
If you want to learn about bad audio testing, you should go someplace else. You won't find it here.


Just an example (without consequence; I am not saying that it would actually be a good idea). Some "subjectivists" argue that long-term tests are required. I know a little bit of the literature regarding this, and I know that the brain actually does the reverse (it will "smooth out" differences in the samples you know well).

Yet, as long as the conditions of the test are known, would a "long term" ABX test be considered worse than a "short term" one? Yes, of course, there is the difference reported in the literature. Allowing this kind of test would confirm these findings (since the conditions would be known), and the people who prefer to trust these tests would be happy.

How would that be bad testing?


Reply #28
@Soap, if you don't see the value in this, then fine. But then please refrain from posting here. I'm asking for willing contributors to help me here. If you are not willing, please don't spend your time imposing your thoughts here.

I understood you. The fact that my house is burning is not a problem related to my house, but the house burning is the only problem my house has. So there was absolutely no need, in the first place, to have my fireplace, heaters, electrical system and devices checked, since it all comes down to my house burning, and that's the only thing I should try to solve. Or not, because houses will always catch fire. That's why I must apply the same solution as everyone else (living in a blockhouse without any furniture or electricity) and absolutely never ever dare to even think about other solutions or trade-offs. Understood.

And for the ones who were possibly surprised by my strong will to avoid arguments in this thread for the moment, this was the reason why. As far as I am concerned, I'll keep building my tree. If you are willing to contribute and also want to try and get a better picture of the state of audio testing today, feel free to contribute. However, be warned that I won't read anything other than list-like posts. Thanks for your contribution.


Reply #29
If you are willing to contribute and also want to try and get a better picture of the state of audio testing today,


You haven't started painting a picture of the state of audio testing today.  Do you not grok this?


Reply #30
@pelmazo: your crystal ball might be slightly broken, your comfort zone might be slightly bruised and I understand that you find that frustrating, but I really don't like your tone and won't tolerate any more of it. So quit freaking out and chill down. You can question, but the part about 'I see through you and your pathetic attempt at undermining my holy ABX' was unneeded.
You would see how wrong your accusations are just by looking at my previous comments in other threads (one about FLAC, particularly).
And don't you feel any shame, telling me that I am not "interested in any counterbalancing fact, i.e. the problems of alternatives to ABX", while I AM ASKING FOR EXACTLY THAT!!! Read again: if you assumed this was about the shortcomings of ABX, this is in your mind only! I'd LOVE to hear the shortcomings of the alternatives!!!


I was not freaking out, and I didn't accuse you of anything. I wrote what your attempt looked like to me. So I think it is up to you to chill down.

You should perhaps view the responses of several people here, not just me, as a sign that you had indeed not made yourself clear enough. I understand more clearly now what you want, but I still don't understand either what it is supposed to be good for, or how it can be useful and for whom. Particularly if you include non-problems that are only perceived by some to be a problem. That is going to include any amount of nonsense people can come up with, which is close to infinite. See your example with the old lady and the peas. Her kind of problem can be multiplied limitlessly. Want to test my imagination?

You are right that you didn't limit this to ABX testing, and I owe you an apology regarding this. Nevertheless, I still think that your attempt is ill-conceived and poorly justified. If you are doing this for some kind of scientific undertaking, that makes it worse.


Reply #31
And where do the sound samples used in the tests come from? Maybe they were corrupted by 128/32/16-bit conversions and other kinds of recalculation?


Reply #32
I'll start from scratch next week then.

What makes me really sad is that there seems to be strictly no attempt to really understand what I mean. I'm not saying that I made it easy. I understand it could have been better. I understand that what I'm doing is not familiar to you. However, I find it very tiring to try to explain further what I did, just to see it ditched because of the choice of a word in one place, or because someone makes one huge generalisation and refuses to hear any further adjustment or explanation.

I would have largely preferred that you told me what you did not understand and what you think you understood. After reading the whole conversation again, the main problem we had is that we had different definitions of "problem". Come on, words are tools to express a meaning. If I tell you that what I call a problem is not what you call a problem, you don't have to change your own definition, but at least you should try to understand what I said with the definition that I gave you, and suggest a better word. The second problem is that you seem not to understand that I want to take psychological issues into account. You can repeat as often as you want that this is not a problem with audio tests; if I tell you, I, the guy who created the thread, that I want to take them into account, then accept it as part of the request and don't ditch it because it does not correspond to your definition of a problem in audio tests. Again, if I had stayed silent and you didn't understand, that would be ok. But I spent too much time trying to refine and explain what I meant just to get "So what? This is not a problem in my dictionary".

What I meant about DBT in the medical field was not meant to be historically correct. I just wanted to stress that this is a non-obvious solution (or we would have started with it right away) to a psychological problem that had a great negative impact on results. You may consider that these psychological biases were a problem OF drug testing, or you may consider they are NOT even related... That's purely irrelevant: these biases had great effects on the results of drug tests, and something had to be done about it for people to trust the results. That's just about it.

And here we come to the last part... trust. In the end, science is all about trust. You observe, model, understand, expand and build knowledge. The scientific method has been developed and refined year after year to try to reach unbiased results. Why would you apply these methods? To be sure that you avoid the pitfalls that would transform your great measurements, models and results into pure crap. In other words, you want to trust and have confidence in your results.

But now, what's the use if people in general have no trust in what you did? You can try to convince them. Sometimes you will be successful and sometimes you won't. From there, you can just not give a crap about those people and their flawed beliefs, or you can at least try to understand them and try to make them understand. For instance (again, just an example, not something I want to do or defend), maybe most people would come to accept ABX tests and results with minor changes: letting them test over a month's span, or asking them about similar properties ("this sample has more profound bass, that one too") instead of directly asking "is it A or B?". That may sound minor to you, but it could make a lot of difference while keeping the same rigorous testing framework (or not, but that's typically the kind of thing I'll ask you about in the future, when the time for actual problem solving comes).

Before you object, let me make this clear. I, as the creator of this thread, consider that any psychological issue or bias that could undermine the trust of people in general in any audio test is worthy to appear here and to be (ultimately) discussed. Because, once again, a test that no one trusts, for valid or invalid reasons, is useless. And, once again, that is not to say that any bias based on invalid reasons will be addressed; but, to exaggerate a little bit, if tomorrow magazine X tells its readers that only software in the color red should be trusted, it basically costs nothing to make the software red, and it won't impair the results in any way. Yet that would bring more trust for free (or not... I personally wouldn't like that, but ultimately this has nothing to do with the quality of the device).


If that's not clear, please tell me. If you want to dismiss something then REALLY explain why.


Reply #33
The problem of trust only exists when trying to argue with unseen opponents on the internet.  In the lab one can tell if someone is cheating.

You're chasing problems which, I'll admit, I don't value, for they appear to be only problems of arguing with strangers, not problems researchers experience.

PLEASE, for the last time, tell me how these problems affect anything other than internet arguments with strangers you can't trust!


Reply #34
MMime
My question was probably still technical. It is logical to have a really good tool on hand, but that is not possible without quality samples. To test lossy codecs, the existing sample set is apparently accurate enough; it was made through some audio editor by converting to 32-bit float (I do not know this reliably). However, you have written that you want to expand the boundaries of your concept's application, and so I naturally became interested in the subtleties. But it is not important. I am here as a casual observer; you are free to do as you see fit.

 


Reply #35
I would have largely preferred that you told me what you did not understand and what you think you understood. After reading the whole conversation again, the main problem we had is that we had different definitions of "problem". Come on, words are tools to express a meaning. If I tell you that what I call a problem is not what you call a problem, you don't have to change your own definition, but at least you should try to understand what I said with the definition that I gave you, and suggest a better word. The second problem is that you seem not to understand that I want to take psychological issues into account. You can repeat as often as you want that this is not a problem with audio tests; if I tell you, I, the guy who created the thread, that I want to take them into account, then accept it as part of the request and don't ditch it because it does not correspond to your definition of a problem in audio tests. Again, if I had stayed silent and you didn't understand, that would be ok. But I spent too much time trying to refine and explain what I meant just to get "So what? This is not a problem in my dictionary".


Ok then, here's what I still don't understand: if you include psychological issues in your definition of the word "problem", that makes the list you are trying to put together potentially infinite. Do you realize this? How are you going to deal with it? What is the point of making the task so unwieldy?

As an example, if I told you that a potential problem of such a listening test was, that some people might want to have a dowser work on the test site before testing, to make sure that there are no negative earth rays that could hamper the test, would that be a welcome addition to the list, or would you think I'm trolling? If it should be welcome, where do you stop?

Quote
And here we come to the last part... trust. In the end, science is all about trust. You observe, model, understand, expand and build knowledge. The scientific method has been developed and refined year after year to try to reach unbiased results. Why would you apply these methods? To be sure that you avoid the pitfalls that would transform your great measurements, models and results into pure crap. In other words, you want to trust and have confidence in your results.

But now, what's the use if people in general have no trust in what you did? You can try to convince them. Sometimes you will be successful and sometimes you won't. From there, you can just not give a crap about those people and their flawed beliefs, or you can at least try to understand them and try to make them understand. For instance (again, just an example, not something I want to do or defend), maybe most people would come to accept ABX tests and results with minor changes: letting them test over a month's span, or asking them about similar properties ("this sample has more profound bass, that one too") instead of directly asking "is it A or B?". That may sound minor to you, but it could make a lot of difference while keeping the same rigorous testing framework (or not, but that's typically the kind of thing I'll ask you about in the future, when the time for actual problem solving comes).


Trying to increase your own trust in your findings is quite a different thing from trying to gain someone else's trust in them. I think we've all seen instances of people not trusting your result no matter what you say or do. Relativity, evolution, even the landing on the moon are still denied by many people despite overwhelming evidence. I doubt that anything can be done about this. Even if you understand perfectly why they reject the evidence, it doesn't help much. Those people exist in audio, too, and they appear here regularly and engage us in discussions.

Quote
Before you object, let me make this clear. I, as the creator of this thread, consider that any psychological issue or bias that could undermine the trust of people in general in any audio test is worthy to appear here and to be (ultimately) discussed. Because, once again, a test that no one trusts, for valid or invalid reasons, is useless. And, once again, that is not to say that any bias based on invalid reasons will be addressed; but, to exaggerate a little bit, if tomorrow magazine X tells its readers that only software in the color red should be trusted, it basically costs nothing to make the software red, and it won't impair the results in any way. Yet that would bring more trust for free (or not... I personally wouldn't like that, but ultimately this has nothing to do with the quality of the device).


The single most prominent reason that makes people distrust a listening test, as far as I have seen, is when the test yields the "wrong" result in their opinion. I have seen instances where people "discover" after the test that they were stressed during it, even though they had denied it before hearing the result. Those people are quite capable of inventing excuses after the test, and even of believing in them honestly and without malice. If you have a recipe against that, please tell. I don't know of any.

The case where no one trusts a test is quite an artificial one, by the way. You are usually going to encounter trust in some people and distrust in others. Therefore, you will have to consider in advance whose trust you are interested in. You can forget about winning the trust of everybody. So what is your goal? Do you seek the trust of scientifically minded, rational people, who are capable of understanding a test design and drawing their own conclusions? Or do you address the layman who hasn't got any idea how to conduct a good listening test? Or is your target the audiophile who distrusts controlled listening tests in principle, because of their habit of producing unwelcome results? Or who else?

Quote
If that's not clear, please tell me. If you want to dismiss something then REALLY explain why.


I'm not dismissing it. I just don't think what you are attempting makes much sense. That may be because of what you want, or because of how you explain it. I still don't know, or I misunderstand it.


Reply #36
Thanks Soap, that's clearer to me.

I am mainly thinking about two different things (these are only projects that I just began experimenting with; I can't guarantee that I will be able to work on them for a long time):

1) I want to provide a simple audio encoder with encoding guidance. Basically, you would answer a small series of questions (Where would you listen to this? With what?), perform some rough hearing tests, and it would suggest an encoder with sensible options. This tool would also provide a few functions to test the complete audio pipeline if possible. And, finally, since nothing really replaces the real thing, I would like to provide difference and preference testing. For differences, the ABX methodology is obvious to me. But not to everybody. I want people to be able to find something good enough for them in 30 seconds if that's all the time they want to invest. But at the same time, if they want to invest 2 months in it, I personally don't care; that's up to them. However, I don't want to provide any methodology that is not rigorous and scientifically accepted. Period. So no non-blind testing at all for instance (if they want that, they would have to do it themselves). But at the same time, I don't want to let down users who don't trust foobar ABX results: some of them don't trust these results for invalid but easily 'fixable' reasons (a workaround that does not impair the quality of the results is implementable). But that requires actually listening to people's gibberish and working with it (again, without introducing flaws; that's not the purpose). That's why there are invalid elements in my list: these are problems in the minds of some people, and while these problems are not real, a few of them could still be 'fixed'. This application could become a foobar plugin or be an open-source app; I don't know yet.

With that in mind, I could implement something, ask for comments here and in some other places, and randomly fix things (just like anyone would do). The advantage of building the tree I talked to you about is to see the cause-consequence relationships between what *I* consider as issues... E.g. "oh, so my users don't trust my software, why is that? Ah, there are 10 reasons... But 7 of them would already disappear if I only fixed this single issue. Is there a way to achieve that while keeping the scientific quality?". The other advantage is to identify trade-offs: "I can solve this whole bunch of problems either this way or that way". Of course you have to trust me for now, but this is why I do it this way.

2) In parallel, I want to explore what exactly is testable online (i.e. less sensitive to cheating, more trustworthy). I can see your reproving look... I'm not saying that the system would be cheat-proof. Just as you said: if you really want to trust results, do it in a lab where people are monitored. No argument there. However, there are ways to remove the urge to cheat, or to make cheating far more complex (in some specific cases), so while you cannot fully trust the results, there are things that could be tried, even to check how cheatable they are. For instance, I was talking about including hearing tests in the application. If the result of the hearing test is "you can hear up to 16803Hz", not only would that have been obtained without any uncertainty considerations, it would also encourage cheating. Now, if you say "you can hear the full range of what the best ears can" (not great, but the meaning is there), why would they cheat? They would get the advice: don't use a cutoff below 20kHz. At the same time, I would get 16803Hz in the logs... In the same field (audibility of high frequencies), I could provide files to ABX: one with the real HD data, the other at the same sample rate but with shaped noise above 20kHz. Hearing no difference would not prove that there is no difference between reasonable sample rates and "HD", but it would suggest that the same effect can be obtained by taking any CD, upsampling it and adding noise. The other positive effect is that differences due to aliasing in moronic setups *may* disappear. Etc.
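The "shaped noise above 20kHz" idea can be sketched in a few lines of numpy. This is an illustration only, under stated assumptions: a synthetic 1 kHz tone stands in for upsampled CD audio, and a crude FFT brick-wall stands in for proper noise shaping; all names and parameters are made up here, not from the post.

```python
import numpy as np

SR = 96_000   # illustrative "HD" sample rate
DUR = 1.0     # seconds

t = np.arange(int(SR * DUR)) / SR
band_limited = np.sin(2 * np.pi * 1_000 * t)   # stands in for upsampled CD audio

# Shape white noise so that only content above ~20 kHz remains
# (a crude FFT brick-wall; a real tool would use a proper filter):
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.01, len(t))
spectrum = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(len(t), 1 / SR)
spectrum[freqs < 20_000] = 0                   # zero everything below 20 kHz
ultrasonic_noise = np.fft.irfft(spectrum, len(t))

# Same sample rate, identical audible band, fake "HD" content on top:
hd_like = band_limited + ultrasonic_noise
```

The point of the sketch is that the two signals differ only above 20 kHz, so a spectrogram shows the "HD" signature while the audible band is untouched.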

In that case, having a tree (or even a simple list) of "issues" like "people can easily tell different sample rates apart in a spectrogram", "an ABX false positive can come from aliasing", or "people tend to do whatever possible to get a big number associated with their name" allows one to think about trying such things while avoiding common pitfalls at the same time (again, you have to trust me, but if you imagine chains of "problematic" things with their causes and consequences, that should be quite obvious).
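The first "issue" on that list, spotting a lossy encode's lowpass from its spectrum, is trivial to automate, which is exactly why it enables cheating in unsupervised tests. A hedged numpy sketch (synthetic tones stand in for real audio; the function name and the -60 dB floor are arbitrary choices made here):

```python
import numpy as np

def apparent_cutoff(signal: np.ndarray, sr: int, floor_db: float = -60.0) -> float:
    """Estimate the highest frequency with energy above a relative noise
    floor -- the kind of giveaway a spectrogram reveals about a lossy encode."""
    spec = np.abs(np.fft.rfft(signal))
    spec_db = 20 * np.log10(spec / (spec.max() + 1e-30) + 1e-30)
    freqs = np.fft.rfftfreq(len(signal), 1 / sr)
    active = freqs[spec_db > floor_db]
    return float(active.max()) if active.size else 0.0

sr = 44_100
t = np.arange(sr) / sr
full = sum(np.sin(2 * np.pi * f * t) for f in (440, 8_000, 18_000))
lossy_like = sum(np.sin(2 * np.pi * f * t) for f in (440, 8_000))  # "lowpassed"

print(apparent_cutoff(full, sr))        # 18000.0 for this synthetic signal
print(apparent_cutoff(lossy_like, sr))  # 8000.0
```

Anyone who can run this (or just open a spectrogram viewer) can identify the lowpassed file without listening, which is one concrete cause behind "you can't trust the test log".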


Reply #37
Ok then, here's what I still don't understand: If you include psychological issues in your definition of the word "problem", that makes the list you are trying to put together potentially infinite. Are you realizing this? How are you going to deal with it? What is the point of making the task so unwieldy?


In theory yes. In practice, you'll naturally stop. Once I make the first tree, you'll see that you won't reach infinity. That's also why I'll only let people add things to the tree for a limited amount of time.

Quote
Trying to increase your own trust in your findings is quite a different thing from trying to gain someone else's trust in them. I think we've all seen instances of people not trusting your result no matter what you say or do.

True, but I'm sure there are a few easy things that would bring non-hardcore scientists to trust and see the value of these methodologies. For some it would require a small explanation, for others you'd have to provide a red-colored theme, and for others there is absolutely nothing you can do.



Quote
The single most prominent reason that makes people distrust a listening test, as far as I have seen, is when the test yields the "wrong" result in their opinion. I have seen instances where people "discover" after the test that they were stressed during it, even though they had denied it before hearing the result.

I was under the same impression from what I've read. But see, the "stress": you can simply ditch it, saying it's a post-hoc justification for failure. But if you provide an environment that cannot in any way be thought of as stressful, that's a (small) victory. Now, with a tree, it would be easy to look at what "stressful" really means for a bunch of people... And I may very well figure out that providing a transcoding app with an ABX function to use as a "quick check" is considered far less stressful than a pure ABX program.

Quote
I'm not dismissing it. I just don't think what you are attempting makes much sense. That may be because of what you want, or because of how you explain it. I still don't know, or I misunderstand it.


Does it make more sense now? Do I need to explain something else? To go into more details?


Reply #38
1) I want to provide a simple audio encoder with encoding guidance. [...] in 30 seconds if that's all the time they want to invest.


No such ability is known.

You are literally asking for a shortcut through both the removal of unconscious bias via blind testing AND the reduction of uncertainty via multiple trials.

The discovery of a way to accomplish those goals in anything approaching 30 seconds would be Nobel worthy.
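The arithmetic behind "reduction of uncertainty through multiple trials" can be made concrete with the standard one-sided binomial calculation used for ABX scoring (a sketch added for illustration, not part of the original post; the function name is made up):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` ABX trials
    by guessing alone (one-sided binomial, p = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A perfect score on a handful of trials is still weak evidence:
print(abx_p_value(4, 4))    # 0.0625 -- pure guessing gets 4/4 about 6% of the time
print(abx_p_value(14, 16))  # ~0.002 -- more trials shrink the guessing odds
```

Since each trial takes time to listen to, no honest protocol gets the p-value low in 30 seconds, which is Soap's point.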


I don't want to provide any methodology that is not rigorous and scientifically accepted. Period. So no non-blind testing at all for instance (if they want that, they would have to do it themselves).


As I said.  30 seconds is orders of magnitude too short to accomplish your stated goals.

But at the same time, I don't want to let down users who don't trust foobar ABX results: some of them don't trust these results for invalid but easily 'fixable' reasons (a workaround that does not impair the quality of the results is implementable).


You cannot do this remotely. Full stop.

You cannot prevent cheating by a dedicated cheater. Even if you sent them testing hardware sealed in Lucite, they could still record the output of their speakers and cheat (the classic "analog hole"). Trying to prevent cheating by people whose behavior has no impact on you is a fool's errand. Learn to accept that which you cannot change.


With that in mind, I could implement something, ask for comments here and in some other places, and randomly fix things (just like anyone would do). The advantage of building the tree I talked to you about is to see the cause-consequence relationships between what *I* consider as issues... E.g. "oh, so my users don't trust my software, why is that? Ah, there are 10 reasons... But 7 of them would already disappear if I only fixed this single issue. Is there a way to achieve that while keeping the scientific quality?". The other advantage is to identify trade-offs: "I can solve this whole bunch of problems either this way or that way". Of course you have to trust me for now, but this is why I do it this way.


This is magic-seeking. Magic does not exist. You cannot prevent cheating by those using your software, and thinking that, if only you had the perfect logic diagram, those facts would be different is flawed thinking.

2) In parallel, I want to explore what exactly is testable online (i.e. less sensitive to cheating, more trustworthy). I can see your reproving look... I'm not saying that the system would be cheat-proof. Just as you said: if you really want to trust results, do it in a lab where people are monitored. No argument there. However, there are ways to remove the urge to cheat, or to make cheating far more complex (in some specific cases), so while you cannot fully trust the results, there are things that could be tried, even to check how cheatable they are.


How is this a second point and not literally a restatement of the first?  Regardless - analog hole.  You can't trust remote testers, end of story.  Stop banging your head against the wall.

For instance, I was talking about including hearing tests in the application. If the result of the hearing test is "you can hear up to 16803Hz", not only would that be reported without any uncertainty considerations, it would also encourage cheating. Now, if you say: "You can hear the full range of what the best ears can" (not great wording, but the meaning is there), why would they cheat? They would get advice: don't use a cutoff below 20kHz. At the same time, I would get 16803Hz in the logs... In the same field (audibility of high frequencies), I could provide files to ABX: one with the real HD data, the other at the same sample rate but with shaped noise above 20kHz. Seeing no difference would not prove that there is no difference between reasonable sample rates and "HD"; however, it would suggest that the same effect can be obtained by taking any CD, upsampling it and adding noise. The other positive effect is that differences due to aliasing in moronic setups *may* disappear. Etc.
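The comparison pair described in this paragraph could be sketched roughly as follows (an illustration only, not a finished tool: the 1 kHz sine stands in for real CD content, and the noise level, filter order and file layout are arbitrary choices of mine):

```python
# Rough sketch: a "CD-bandwidth" signal in a high-rate container,
# plus the same signal with shaped noise confined above 20 kHz.
import numpy as np
from scipy import signal

RATE = 96_000          # high-rate container, Hz
CUTOFF = 20_000        # audible-band limit, Hz
DURATION = 1.0         # seconds

t = np.arange(int(RATE * DURATION)) / RATE
band_limited = 0.5 * np.sin(2 * np.pi * 1000 * t)    # stand-in for CD content

# Shape white noise so that (almost) all of its energy sits above 20 kHz.
rng = np.random.default_rng(0)
noise = rng.standard_normal(len(t))
sos = signal.butter(8, CUTOFF, btype="highpass", fs=RATE, output="sos")
ultrasonic = signal.sosfilt(sos, noise) * 0.05       # arbitrary low level

with_noise = band_limited + ultrasonic               # the "fake HD" candidate

# Sanity check: the added component carries negligible sub-20 kHz energy.
f, psd = signal.welch(ultrasonic, fs=RATE, nperseg=4096)
below = psd[f < CUTOFF].sum()
above = psd[f >= CUTOFF].sum()
print(f"ultrasonic energy ratio (above/below 20 kHz): {above / below:.0f}")
```

Whether the ultrasonic component is audible at all is exactly what the ABX run would then test.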


Another restatement.  Let's try a different tack:  why are you personally invested in them not cheating?

In that case, having a tree (or even a simple list) of "issues" like "people can easily tell different sample rates apart in a spectrogram", "ABX false positives can come from aliasing" or "people tend to do whatever possible to get a big number associated with their name" merely allows one to think about trying such things while avoiding the common pitfalls at the same time (again, you have to trust me, but if you imagine chains of "problematic" things with their causes and consequences, that should be quite obvious).


The very people who motivate you to create a fool-proof test are the very people who have no interest in complying.  You can not make the horse drink.
Creature of habit.

List of typical problems and shortcoming of "common" audio t

Reply #39
In the same field (audibility of high frequencies), I could provide files to ABX: one with the real HD data, the other at the same sample rate but with shaped noise above 20kHz. Seeing no difference would not prove that there is no difference between reasonable sample rates and "HD"; however, it would suggest that the same effect can be obtained by taking any CD, upsampling it and adding noise. The other positive effect is that differences due to aliasing in moronic setups *may* disappear. Etc.

You have grand plans, it appears! Why not go straight for a jolt of electricity through the body? Noise is not a music signal; even if the subject can hear its presence, he may only realize it a month later. You have already drawn the wrong conclusions from this. Admit it: do you live in some totalitarian country?


Reply #40
I understand that what I'm doing is not familiar to you.

Actually you don't, or more accurately, can't. It's impossible for you to ever be cognizant of this, which makes it even funnier. 
Loudspeaker manufacturer


Reply #41
1) I want to provide a simple audio encoder with encoding guidance... in 30 seconds if they just want to invest that time.


No such ability is known.

You are literally asking for a shortcut through both the removal of unconscious bias via blind testing AND the reduction of uncertainty through multiple trials.

The discovery of a way to accomplish those goals in anything approaching 30 seconds would be Nobel worthy.
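The "reduction of uncertainty through multiple trials" mentioned above is plain binomial arithmetic; a minimal sketch (the helper name is mine, not from any post):

```python
# Illustrative only: why ABX needs multiple trials. Under the null
# hypothesis (the listener is guessing), each trial is a fair coin,
# so the chance of k or more correct answers out of n is a binomial tail.
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# One correct answer proves nothing; many do.
print(round(abx_p_value(1, 1), 4))    # 0.5  -> a single trial is a coin flip
print(round(abx_p_value(9, 10), 4))   # 0.0107
print(round(abx_p_value(14, 16), 4))  # 0.0021
```

Driving the guessing probability below the usual 5% threshold therefore takes on the order of a dozen trials, which is precisely why no 30-second procedure can deliver it.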


Once again, you are not reading. I say something not familiar to you, so you try to force it into something familiar. The problem is that by doing this you determine what I want from what you think rather than from what I said.

Can't you understand that the majority of the population outside of this forum considers reaching transparency with absolute certainty to be useless nitpicking? Mostly because they are not interested in general findings or academic research, and because all they want is to find something good enough for them. Where you see a successful attempt at differentiating two samples, they see a lunatic who looped over 200ms hundreds of times to look for a click, and who, even having found it, would still be wrong 1/3 of the time. They don't want to invest that much time for such small value. Without judgement here: can you understand that a 90% certainty that 99% of the songs will sound almost exactly as intended is enough for many people (as opposed to a 99% certainty that 99% of the songs will sound exactly as intended)?

On the other hand, I don't like to see people performing non-blind tests to settle on encoder options. I want, at least, to let these people do it properly. On top of that, some other people want to be sure to reach absolute transparency for their ears.

But as I warned, I won't tolerate any more mind reading instead of proper actual reading, so I'll stop here with you, as you seem incapable of asking for more details or explanations. The "you are literally asking..." stuff is a blatant example of that: where the fuck did I say that I wanted people to do in 30 seconds what takes hours for others to obtain? I stated the mere fact that some people will only want to invest 30 seconds, while other people will want to invest months. And I want to be as useful as possible to all of them within the given time. But "being as useful as possible" does not mean "provide the same information with the same guarantee".

Another blatant example of mind reading is all the stuff about "magical thinking" and cheating in reference to my first point. My first point is NOT about remote tests. It's about a local program to determine which codec to use for yourself. Even though I made a link in the second point (stating it as a possibility, not something to do) with the program of the first point, these serve two very different purposes. So all the cheating-related comments regarding the first point are only the result of your mind misinterpreting things, not of what I said. Read again...

Regarding the rest, all I read was "experimentation is futile, you absolutely won't find anything interesting, stay with the status quo please, as our system is perfect with the most perfect trade-off" (on which I insist: I'm glad DBT was introduced BEFORE you reached such enlightenment).

On these nice words, I'm going away.


Reply #42
Once again, you are not reading. I say something not familiar to you, so you try to force it into something familiar. The problem is that by doing this you determine what I want from what you think rather than from what I said.


Incorrect.  I believe I understand fully what you are saying and cut to the crux.  If I do not, how about a restatement?

Can't you understand that the majority of the population outside of this forum consider reaching transparency with absolute certainty as useless nitpicking?


Read what I said again.  I said nothing about that.  You're putting words in my mouth. 

If anything, your stated goal of testing the listeners/users of your encoding software for anything other than transparency is even sillier.  "Good enough" is a known quantity, and the difference between "good enough" and "transparent for all but the oddball samples" is so slender that there is little point chasing it.  What do you feel you're going to accomplish through a hearing test of your users?  Is lowering an encoder's lowpass by 1 kHz going to produce results either small enough or high-quality enough to make a difference in everyday life?

For if you aren't trying to give the users of your encoding software transparency, why should they be arsed to take a test or answer questions?  Where have common encoders and their default settings let users down?  What is the problem you feel needs solving?


The "you are literally asking..." stuff is a blatant example of that: where the fuck did I say that I wanted people to do in 30 seconds what takes hours for others to obtain? I stated the mere fact that some people will only want to invest 30 seconds, while other people will want to invest months. And I want to be as useful as possible to all of them within the given time. But "being as useful as possible" does not mean "provide the same information with the same guarantee".


Then instead of cursing the wind restate an example of what you hope to accomplish in 30 seconds and how it will be better than what we have now (sane defaults).

For, as I said "Good enough" is a known quantity and the line between "Good enough" and "corner case" can not be discovered in 30 seconds or through your software's questionnaire.

Still none of this addresses why you want to prevent "cheating."  A question which has been on the table for 24 hours now.


Regarding the rest, all I read was "experimentation is futile, you absolutely won't find anything interesting, stay with the status quo please as our system is perfect with the most perfect trade off" (to what I insist, I'm glad DBT has been introduced BEFORE you reached such an enlightenment).


I attack ideas.  You attack people.
Creature of habit.


Reply #43
Brainstorming is sometimes not such a bad idea, as long as more than one brain is involved. But, of course, only with the permission of the HA administration.


Reply #44
In theory yes. In practice, you'll naturally stop. Once I make the first tree, you'll see that you won't reach infinity. That's also why I'll only let people add things to the tree for a limited amount of time.

That'll make your list rather random. Is that good enough for you?

Quote
True, but I'm sure there are a few easy things that would bring non-hardcore scientists to trust and see the value of these methodologies. For some it would require a small explanation, for others you'd have to provide a red-colored theme, and for others still, there is absolutely nothing you can do.

I find it pretty hopeless to come up with a-priori solutions to the non-rational side of those problems. I don't know where you get your hope from. If you provide a red-colored theme, the next fellow will want a pink one. Why bother?

My own experience points in a completely different direction: You ought to try to come up with a test that can stand up to the rational objections. That's difficult enough already. Convince yourself first that you have something solid and credible. If you are confident, and can demonstrate and explain your considerations, you have a better chance of convincing others, if they are convincible at all. As with all statistical tests, absolute certainty (a term that you seem to be fascinated with) is out of reach anyway. It all comes down to raising the confidence level. If that doesn't help, trying to come up with the right "theme" is futile IMHO. When non-scientific people don't trust a test, what they usually need is more education about testing, and not a superficial change in the test to make them feel better. Or perhaps the test is indeed dubious, and their scepticism is warranted, after all, but that can only be clarified in a rational discussion.

I acknowledge that my stance is somewhat anti-marketing and contains a dose of scientific arrogance. I stand by that. There are things that need to be understood before they can be appreciated, so there's no substitute for learning.

Quote
I was under the same impression from what I've read.  But see, the "stress" objection: you can simply dismiss it as a post-hoc justification for failure. But if you provide an environment that cannot in any way be thought of as stressful, that's a (small) victory.

It would be a small victory if it were possible. In practice, the stress argument is a joker argument that can always be played, regardless of the details of the test, because stress needs no external cause. You can of course reject such excuses after the fact, and I would be with you, but that won't deter the others from playing this card. I have seen this happening before.

Quote
Does it make more sense now? Do I need to explain something else? To go into more details?

I think I understand. And I still think what you are trying is pointless.

And, no, this doesn't mean that "experimentation is futile, you absolutely won't find anything interesting, stay with the status quo please as our system is perfect with the most perfect trade off". You demonstrate that you are happy to commit the exact offense you accuse others of. I have seen nobody state that experimentation is futile, or that we have an already perfect system. Your exaggeration doesn't clarify, it makes you look offensive and abrasive. By all means experiment all you want, perhaps you come across a gold nugget. There's always a chance. But if you engage others, try not to waste their time.


Reply #45
Re: Post #40
And by the way, speaking seriously: if you add a constant electrical signal above 20 kHz, the transistors of an analog amplifier can shift from their standard class-AB operating mode into class A. In theory, the listener could then pick the noisy version as the more natural and correct one! (THD in class A < THD in class AB)