HydrogenAudio

Hydrogenaudio Forum => Listening Tests => Topic started by: Cavaille on 2014-08-17 12:30:09

Title: Issues with Blind-Testing Headphones and Speakers
Post by: Cavaille on 2014-08-17 12:30:09
New topic. Things become a bit more complex and controversial if you are attempting to level match things that fundamentally have markedly different frequency responses, such as speakers and headphones. Say, for example, one speaker has a stronger level of bass than the other one being tested. Do you level match them using pink noise? Do you level match them at 1 kHz? Do you instead level match them at 500 Hz? Do you use a narrow band of noise centered where the ear is most sensitive, around 3.5 to 4 kHz? Do you use A-weighting when you conduct any of these level matches? You will get very different results depending on which exact method you use when the products have certain frequency response deviations between them.


I always use a 1 kHz sine. Now I have doubts whether that is the best method. I should mention that I don't ABX speakers or amplifiers, just CD players, MD players and other sources. Should I use, I don't know, ReplayGain?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-17 19:25:29
New topic. Things become a bit more complex and controversial if you are attempting to level match things that fundamentally have markedly different frequency responses, such as speakers and headphones. Say, for example, one speaker has a stronger level of bass than the other one being tested. Do you level match them using pink noise? Do you level match them at 1 kHz? Do you instead level match them at 500 Hz? Do you use a narrow band of noise centered where the ear is most sensitive, around 3.5 to 4 kHz? Do you use A-weighting when you conduct any of these level matches? You will get very different results depending on which exact method you use when the products have certain frequency response deviations between them.


I always use a 1 kHz sine. Now I have doubts whether that is the best method. I should mention that I don't ABX speakers or amplifiers, just CD players, MD players and other sources. Should I use, I don't know, ReplayGain?

Since you seem to only A/B things that have fairly flat frequency responses, I don't think you have much to worry about. It is when they vary greatly, as do speakers and headphones, that any method becomes questionable in my mind. This doesn't sit well with some people, I've noticed (AES-published scientists, I mean), so they talk themselves into thinking their method is perfectly valid and beyond reproach, regardless of the specific comparison.
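To make the point about method dependence concrete, here is a minimal sketch. The two frequency responses are made-up illustrative values, not measurements of any real product, and the function names are hypothetical. It compares the gain you would apply to device B to "match" device A at a single spot frequency with the gain you would get from total pink-noise power (equal power per octave band).

```python
import math

# Hypothetical magnitude responses in dB at octave-spaced frequencies (Hz).
# Device A is flat; device B has extra bass and a broad peak around 4 kHz.
FREQS = [63, 125, 250, 500, 1000, 2000, 4000, 8000]
RESP_A = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
RESP_B = [6.0, 5.0, 3.0, 1.0, 0.0, 1.0, 4.0, 0.0]

def spot_offset(freq):
    """Gain (dB) to apply to B so that it matches A at one frequency."""
    i = FREQS.index(freq)
    return RESP_A[i] - RESP_B[i]

def band_power_db(resp):
    """Unweighted level for pink noise: equal power in each octave band."""
    mean_power = sum(10 ** (r / 10) for r in resp) / len(resp)
    return 10 * math.log10(mean_power)

def pink_offset():
    """Gain (dB) to apply to B so total pink-noise power matches A."""
    return band_power_db(RESP_A) - band_power_db(RESP_B)

for f in (500, 1000, 4000):
    print(f"match at {f} Hz: {spot_offset(f):+.1f} dB")
print(f"match on pink noise: {pink_offset():+.1f} dB")
```

With these assumed curves the four methods disagree by several dB, which is the whole problem: each one is a defensible "level match", yet they leave the listener with audibly different overall levels.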
Title: Issues with Blind-Testing Headphones and Speakers
Post by: Cavaille on 2014-08-17 23:19:40
Since you seem to only A/B things that have fairly flat frequency responses, I don't think you have much to worry about. It is when they vary greatly, as do speakers and headphones, that any method becomes questionable in my mind. This doesn't sit well with some people, I've noticed (AES-published scientists, I mean), so they talk themselves into thinking their method is perfectly valid and beyond reproach, regardless of the specific comparison.


I've always assumed that using a sine at 1 kHz would be sufficient for my needs, using exactly the reasoning you provided (flat frequency response). So I can sleep safe again.

I've got no idea about loudspeakers but I have a pretty good idea about headphones. Judging from the wildly differing frequency responses of several models, I assume it's close to impossible to level-match them using sines or noise. Wouldn't something akin to ReplayGain be a better idea there? Something that takes into account how we perceive music? Or does that defeat the purpose of a DBT because it already kind of "pre-selects" certain differences?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-18 02:02:54
I've got no idea about loudspeakers but I have a pretty good idea about headphones. Judging from the wildly differing frequency responses of several models, I assume it's close to impossible to level-match them using sines or noise. Wouldn't something akin to ReplayGain be a better idea there? Something that takes into account how we perceive music?

Whose weighting curve do you use though? There are many besides just A-weighting. Do you also install an eardrum probe mic under the headphone cushion to monitor the actual level at the ear drum so the weighting curve can vary depending on the volume the listener happens to choose? That varies too.


Notice even if using just one standard, ISO 226, the curve changes depending on the selected level at the ear.
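For reference, A-weighting is a single fixed curve, which is exactly why it cannot track the level-dependent contours of ISO 226. The sketch below uses the commonly published analog A-weighting formula; the function name is my own, and it is offered only to show how much the curve discounts bass relative to the 1 to 4 kHz region.

```python
import math

def a_weighting_db(f):
    """A-weighting gain in dB at frequency f (Hz), standard analog formula."""
    f2 = f * f
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    # The +2.00 dB term normalizes the curve to roughly 0 dB at 1 kHz.
    return 20 * math.log10(num / den) + 2.00

for f in (63, 100, 1000, 4000):
    print(f, "Hz:", round(a_weighting_db(f), 1), "dB")
```

Because the curve is the same at every playback level, a match made with it at one listening volume says nothing about how the two devices compare at another.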
Title: Issues with Blind-Testing Headphones and Speakers
Post by: Cavaille on 2014-08-18 15:21:26
I see. So there are just too many variables. Which, IMO, also means that one could influence a result towards a preferred outcome? Example: Harman might use this to discredit competitors during their loudspeaker tests by using weighting models better suited to their own products. Wrong?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-18 18:13:50
I'm of the mind that one should accept that level matching things like speakers and headphones is so tricky, problematic, and questionable, that "fair, level matched comparisons" simply can't be done. This doesn't stop researchers at Harman, etc. from publishing papers where they say "and we level matched" and nobody but me seems to blink an eye. "Our trained listeners preferred speaker A over B due to tonal balance. We level matched the two so that couldn't have influenced their selection." Um, now wait a minute. If you level matched them using A-weighting and speaker A has a broad peak at 4 kHz that speaker B doesn't, I'd hardly describe the two as being "level matched" using that method, because the very concept of what is "correct in all situations" is itself nebulous and controversial.


The original Toole & Olive work used B-weighting and pink noise, always monophonic. I don't recall them ever writing "We level matched the two so that couldn't have influenced their selection". For recent headphone preference work, Welti and Olive report (https://dl.dropboxusercontent.com/u/16343460/Relationship%20between%20Perception%20and%20Measurement%20of%20Headphone%20Sound%20Quality.key.pdf) that relative loudness was normalized as per ITU BS.1770: "Algorithms to measure audio programme loudness and true-peak audio level" (https://www.itu.int/rec/R-REC-BS.1770-3-201208-I/en) recommendations. Sean Olive posts to HA, as well as to his own blog (http://seanolive.blogspot.com/2013/04/the-relationship-between-perception-and.html), so people might want to pose questions to him directly rather than insinuate and/or come to premature conclusions here.
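As a rough illustration of what "normalized" means in practice: once each device has a measured programme loudness, the normalization step itself is just a gain offset to a common target. The sketch below assumes the loudness values already come from a BS.1770-style meter; the device names and numbers are hypothetical placeholders, and nothing here reflects how the cited study actually took its measurements.

```python
def normalization_gains(loudness_lufs, target=None):
    """Per-device gain (dB) that brings each measured loudness to one target.

    If no target is given, the quietest device is used, so every gain is
    zero or negative (attenuate only, never boost).
    """
    if target is None:
        target = min(loudness_lufs.values())
    return {name: target - value for name, value in loudness_lufs.items()}

# Hypothetical measured loudness values in LUFS, for illustration only.
measured = {"headphone_1": -21.5, "headphone_2": -23.0, "headphone_3": -19.0}
gains = normalization_gains(measured)
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f} dB")
```

The arithmetic is trivial; the debate in this thread is about the measurement that feeds it, that is, which weighting and which test signal produce the loudness figures in the first place.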
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-18 18:27:52
I've never thought of it that way, Cavaille, up until now at least, but I guess that is technically possible.

I have other problems with some of the Harman studies. Take for instance their, *ahem*, "blind" comparison of headphones. Although as I understand it they attached plastic handles to the headphones so listeners could adjust them for position and comfort without touching the headphones' main body, so as to not disclose the identity by finger touch alone [Good!], do people honestly think they wouldn't be able to tell by the feel on the head/ears when, for example, they were using the massive and triply heavy, circular ear cushioned Audeze (600g) compared to the markedly smaller, lighter (193g), and oval ear cushioned Bose, they tested?

I suspect you could ask any 10-year-old "Based on their appearance alone, which of these 6 headphones do you suspect costs the most?" and they'd go with the giant, round eared Audeze nine out of ten times.

The headphones were obscured from view during the actual testing, true, but how they feel pressing against the head and ears was not. Just because that is difficult/impossible to account for doesn't prove it isn't a potential problem which needs to be controlled.*

Sources for weight and shape:
http://www.bose.com/controller?url=/shop_o...rt_15/index.jsp (http://www.bose.com/controller?url=/shop_online/headphones/noise_cancelling_headphones/quietcomfort_15/index.jsp)
http://www.audeze.com/products/headphones/lcd-2 (http://www.audeze.com/products/headphones/lcd-2)

I'm confident Dr. Olive will see this momentarily and correct me if I'm wrong.

*[If I were testing people's headphone response curve preferences I would have simulated them electrically via outboard EQ and had the test subjects wear exactly the same pair of headphones for all comparisons. Not to say that the concept of whatever is "proper" level matching isn't still debatable and up for grabs using my unorthodox method.]
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-18 19:05:24
I'm of the mind that one should accept that level matching things like speakers and headphones is so tricky, problematic, and questionable, that "fair, level matched comparisons" simply can't be done. This doesn't stop researchers at Harman, etc. from publishing papers where they say "and we level matched" and nobody but me seems to blink an eye. "Our trained listeners preferred speaker A over B due to tonal balance. We level matched the two so that couldn't have influenced their selection." Um, now wait a minute. If you level matched them using A-weighting and speaker A has a broad peak at 4 kHz that speaker B doesn't, I'd hardly describe the two as being "level matched" using that method, because the very concept of what is "correct in all situations" is itself nebulous and controversial.


The original Toole & Olive work used B-weighting and pink noise, always monophonic. I don't recall them ever writing "We level matched the two so that couldn't have influenced their selection".

I was paraphrasing, of course, but the concept is implied by the use of the word "normalized" in one of their bullet points, from your first link:

"Relative loudness differences normalized (ITU-R 1770 )", at least for that study it is referring to. No?


Everyone agrees level matching is important. Where people differ is how to go about doing it or if it even can be done.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: Cavaille on 2014-08-18 21:04:23
The original Toole & Olive work used B-weighting and pink noise, always monophonic. I don't recall them ever writing "We level matched the two so that couldn't have influenced their selection". For recent headphone preference work, Welti and Olive report (https://dl.dropboxusercontent.com/u/16343460/Relationship%20between%20Perception%20and%20Measurement%20of%20Headphone%20Sound%20Quality.key.pdf) that relative loudness was normalized as per ITU BS.1770: "Algorithms to measure audio programme loudness and true-peak audio level" (https://www.itu.int/rec/R-REC-BS.1770-3-201208-I/en) recommendations. Sean Olive posts to HA, as well as to his own blog (http://seanolive.blogspot.com/2013/04/the-relationship-between-perception-and.html), so people might want to pose questions to him directly rather than insinuate and/or come to premature conclusions here.


I find a lot of the research Sean Olive has been doing extremely fascinating (in part because the results feel similar to my own experiences / preferences). However, he works for a company whose intent it is to sell products. Therefore, I don't think that careful questions regarding ethics are uncalled for. I don't think that their sole reason for doing this is the good of all mankind.

I have other problems with some of the Harman studies. Take for instance their, *ahem*, "blind" comparison of headphones. Although as I understand it they attached plastic handles to the headphones so listeners could adjust them for position and comfort without touching the headphones' main body, so as to not disclose the identity by finger touch alone [Good!], do people honestly think they wouldn't be able to tell by the feel on the head/ears when, for example, they were using the massive and triply heavy, circular ear cushioned Audeze (600g) compared to the markedly smaller, lighter (193g), and oval ear cushioned Bose, they tested?


If memory serves me right (and please, correct me in case I'm wrong)... wasn't there a study where frequency responses of certain headphones were applied to just one model in order to mimic their sonic signature? If so, that would exclude the need to change headphones.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-18 21:05:59
I've never thought of it that way, Cavaille, up until now at least, but I guess that is technically possible.

I have other problems with some of the Harman studies. Take for instance their, *ahem*, "blind" comparison of headphones. Although as I understand it they attached plastic handles to the headphones so listeners could adjust them for position and comfort without touching the headphones' main body, so as to not disclose the identity by finger touch alone [Good!], do people honestly think they wouldn't be able to tell by the feel on the head/ears when, for example, they were using the massive and triply heavy, circular ear cushioned Audeze (600g) compared to the markedly smaller, lighter (193g), and oval ear cushioned Bose, they tested?

I suspect you could ask any 10-year-old "Based on their appearance alone, which of these 6 headphones do you suspect costs the most?" and they'd go with the giant, round eared Audeze nine out of ten times.

The headphones were obscured from view during the actual testing, true, but how they feel pressing against the head and ears was not. Just because that is difficult/impossible to account for doesn't prove it isn't a potential problem which needs to be controlled.*

Sources for weight and shape:
http://www.bose.com/controller?url=/shop_o...rt_15/index.jsp (http://www.bose.com/controller?url=/shop_online/headphones/noise_cancelling_headphones/quietcomfort_15/index.jsp)
http://www.audeze.com/products/headphones/lcd-2 (http://www.audeze.com/products/headphones/lcd-2)



You're saying that the feel of the headphones could cause bias -- in which direction? IOW, which phones do you, or your 10-year-old, predict will perform best and worst in the listener self-report? Would you be able to predict their ranking (from best to worst)? And how would you square that with Olive's finding that an objectively 'neutral' performance correlated best with subjective preference?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-18 21:11:14
I find a lot of the research Sean Olive has been doing extremely fascinating (in part because the results feel similar to my own experiences / preferences). However, he works for a company whose intent it is to sell products.



He does now (he's also the president of the AES).  But the loudspeaker preference work with Toole was first undertaken when they worked at the National Research Council of Canada. 

And the 'winner and loser' brands/models are never identified in the published research.  Hard to see how this could be gaming the system when the reader doesn't know how well Harman/JBL's product (if any) did. 

And it would be churlish to dun Olive/Toole for *applying* the knowledge gained from such research.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-18 21:52:14
I find a lot of the research Sean Olive has been doing extremely fascinating (in part because the results feel similar to my own experiences / preferences). However, he works for a company whose intent it is to sell products. Therefore, I don't think that careful questions regarding ethics are uncalled for. I don't think that their sole reason for doing this is the good of all mankind.


They do research so that they can develop, sell/offer better products, and make that research publicly available. How is that unethical?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-18 22:42:25
You're saying that the feel of the headphones could cause bias -- in which direction? IOW, which phones do you, or your 10-year-old, predict will perform best and worst in the listener self-report?

I'm not saying a headphone's size and weight are the only things which would weigh in a person's decision, albeit not consciously, but the basic prejudice would flow like this:

Big and heavy electronic gizmos are the expensive ones = best quality ones.

Correct me if I'm wrong but weren't the Audeze, which are the heaviest and largest in the study, IIRC, also ranked as "the best/most accurate"?

I can give you an anecdotal example where a company used this well known bias.

In the late 80's/early 90's I was a Denon dealer and they were one of the top names in CD players back then. For some reason there was a top of the line unit we were throwing in the trash [because it had fallen from a shelf and had a bashed in corner, making it toast, for example]. Out of curiosity I dismantled the remains and discovered, hidden away from view and completely obscured by the main internal circuit board, a very thick steel plate, much heavier than almost all other brands' entire CD players; heck, it was even heavier than some of their own receivers. It wasn't attached to anything and served no electrical or heat dissipation purpose. I'm confident its sole purpose was to make the entire product's heft and "build quality" seem greater.

[I think it was the DCD-3300 or DCD-3000, IIRC, but I'm not 100% sure at this point.]

Plus think of how many forum reviews of amps and receivers we've all read where they felt it to be useful to describe the unit's heft, which they note with some pride, as if that should convey to us something about its quality.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-18 23:32:32
And the 'winner and loser' brands/models are never identified in the published research.  Hard to see how this could be gaming the system when the reader doesn't know how well Harman/JBL's product (if any) did.


Considering Harman uses the data in their marketing and advertising, at least sometimes, I'd hardly call it a secret though, even if it wasn't technically spelled out in the original paper:

Harman Marketing page (http://harmaninnovation.com/blog/the-science-and-marketing-of-sound-quality/)
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-19 00:21:55
You're saying that the feel of the headphones could cause bias -- in which direction? IOW, which phones do you, or your 10-year-old, predict will perform best and worst in the listener self-report?

I'm not saying a headphone's size and weight are the only things which would weigh in a person's decision, albeit not consciously, but the basic prejudice would flow like this:

Big and heavy electronic gizmos are the expensive ones = best quality ones.

Correct me if I'm wrong but weren't the Audeze, which are the heaviest and largest in the study, IIRC, also ranked as "the best/most accurate"?



I don't know, and neither do you. That's my point. But it's not just which is ranked best, it's also how they are ranked. By your hypothesis, they should rank from best to worst according to weight or 'feel'. Are they?


Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-19 00:34:35
And the 'winner and loser' brands/models are never identified in the published research.  Hard to see how this could be gaming the system when the reader doesn't know how well Harman/JBL's product (if any) did.


Considering Harman uses the data in their marketing and advertising, at least sometimes, I'd hardly call it a secret though, even if it wasn't technically spelled out in the original paper:

Harman Marketing page (http://harmaninnovation.com/blog/the-science-and-marketing-of-sound-quality/)


Again, why would they NOT use their research in *marketing* if the results favored their product? 

A correlation between good 'sound power' metrics and listener preference is what the 1980s CNRC and subsequent Harman 'academic' work reported; Harman incorporates this into their loudspeaker design so that by 2004, one of their comparatively low-cost Infinity loudspeakers bests the other three designs in a DBT involving both trained and untrained listeners. This is evidence of biased research? Or is it proof of concept?

Other speaker mfrs are free to design their products according to the CNRC guidelines.  And some have.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 00:35:09
By your hypothesis, they should rank from best to worst according to weight or 'feel'.

False. I never said that. I said it could partly influence a decision just as easily as doing a sighted study where the listeners were free to see what headphone they were wearing.

I'm not saying a headphone's size and weight are the only things which would weigh in a person's decision, albeit not consciously....
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-19 00:41:21
By your hypothesis, they should rank from best to worst according to weight or 'feel'.

False. I never said that. I said it could influence a decision just as easily as doing a sighted study where the listeners were free to see what headphone they were wearing.



That's not unreasonable, but it's stipulated (especially the 'as easily' part) rather than demonstrated.  Toole and Olive have studied, and published on,  factors influencing loudspeaker preference.  I'm not aware that Olive  et al. quantified the relative impact of headphone feel, though I haven't read all their papers on this subject.  Have you?

If it *has* had a similarly powerful effect as, say, price, then you really should be able to predict what the likely preference ranking trend is, when this variable is not controlled for.


I'm not saying a headphone's size and weight are the only things which would weigh in a person's decision, albeit not consciously....


Which is why I wrote  weight or 'feel'.  (I really don't know what else you could be referring to, beyond that, in a visually 'blinded' protocol)
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 01:16:53
If it *has* had a similarly powerful effect as, say, price, then you really should be able to predict what the likely preference ranking trend is, when this variable is not controlled for.

False. When you do sighted comparisons, or ones where the test subjects might be able to identify the headphones by, say, touch or feel, you never know if their (possibly subconscious) expectation bias plays almost no part in the decision process, a minor part, or a major part; all you know is it may have had some impact.

Olive/Welti seem to agree with me that touch/feel might be an influence, hence their decision to add small external handles to the phones so that the users' positioning for comfort and seal could be accomplished without their fingers identifying the brand they were wearing. They just overlooked how the different shape of the ear cushions and the product's overall weight might tip listeners off as to the actual IDs, or at least allow them to sense: "Wow, this one is heavier and larger than the rest, I wonder if that means it's pricey, too".
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 01:35:38
You're saying that the feel of the headphones could cause bias -- in which direction? IOW, which phones do you, or your 10-year-old, predict will perform best and worst in the listener self-report?

I'm not saying a headphone's size and weight are the only things which would weigh in a person's decision, albeit not consciously, but the basic prejudice would flow like this:

Big and heavy electronic gizmos are the expensive ones = best quality ones.

Correct me if I'm wrong but weren't the Audeze, which are the heaviest and largest in the study, IIRC, also ranked as "the best/most accurate"?



I don't know, and neither do you. 

"Olive and Welti didn't reveal the scores of the individual headphones, but having measured all of the headphones myself (except in the case of the LCD2, but I've measured the similar LCD3), it looks to me like the clearly preferred headphone was the Audeze LCD2" - Brent Butterworth, Sound and Vision Magazine http://www.soundandvision.com/content/bigg...udio-story-2012 (http://www.soundandvision.com/content/biggest-audio-story-2012)


"We admitted up front that comfort/tactile factors were not eliminated from the test, and in that sense the test wasn't blind. However, our listeners didn't know which brands and models headphones were being tested so unless they could recognize the specific brand/model by its weight/comfort alone, their judgments weren't influenced by brand, price,etc" -  S.Olive

http://www.hydrogenaud.io/forums/index.php?showtopic=100538 (http://www.hydrogenaud.io/forums/index.php?showtopic=100538) post #8

They were influenced by size and weight. Consumers, even audio-naïve ones like ten-year-olds, equate large and heavy with pricey/high quality and that potentially influenced them, even if it wasn't the only factor. I also suspect the listeners were shown what headphones they were going to compare beforehand, and could then later correlate that to the feel they experienced, although I don't know for sure.

Perhaps they also thought, "Gee, I bet that big one with the fancy wood trim and the big, round cushions will be a stand out!"  Just a thought.

From Jurassic Park:

Donald Gennaro: [Tim pops up wearing a pair of night vision goggles] Hey, where'd you find that?

Tim: In a box under my seat.

Donald Gennaro: Are they heavy?

Tim: Yeah.

Donald Gennaro: Then they're expensive, put 'em back.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-19 02:19:19
They were influenced by size and weight. Consumers, even audio-naïve ones like ten-year-olds, equate large and heavy with pricey/high quality and that potentially influenced them, even if it wasn't the only factor. I also suspect the listeners were shown what headphones they were going to compare beforehand, and could then later correlate that to the feel they experienced, although I don't know for sure.

Perhaps they also thought, "Gee, I bet that big one with the fancy wood trim and the big round cushions is a stand out!"  Just a thought.

Sure, you cannot eliminate the ear pads so there's always gonna be some influence on the comfort, but so will the freaking sound. If you tried to add weight to the headphones you'd probably get some problems with seal, fit etc.

I don't think they were shown the headphones beforehand. If they had known which headphone they were listening to, then why didn't they rank the AKG (Harman) K550 better? I mean why would they rank a Bose higher?
Also, the K550's a pretty heavy headphone too yet it was ranked below lighter ones... like the Bose one (which is 100g + weight of K550 cable lighter).


... and there's the correlation of the measurements. You would have a (stronger) point if the Audeze headphone measured worse than the other headphones, but it doesn't.
The Audeze has the lowest distortion, most consistent performance across different heads, good frequency response ...
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 02:27:40
They were influenced by size and weight. Consumers, even audio-naïve ones like ten-year-olds, equate large and heavy with pricey/high quality and that potentially influenced them, even if it wasn't the only factor. I also suspect the listeners were shown what headphones they were going to compare beforehand, and could then later correlate that to the feel they experienced, although I don't know for sure.

Perhaps they also thought, "Gee, I bet that big one with the fancy wood trim and the big round cushions is a stand out!"  Just a thought.

If they had known which headphone they were listening to, then why didn't they rank the AKG (Harman) K550 better?

Please link to this ranking you speak of.

Quote
You would have a (stronger) point if the Audeze headphone measured worse than the other headphones, but it doesn't.
I have no idea how you figure that.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-19 03:01:50
I have no idea how you figure that.

If weight had a strong influence then we would see a ranking that correlates with the weight, but instead we see a ranking that correlates with the measurements.

Also, they did much more research than just "blind"-testing 5 headphones for preference. They tested several equalization curves, equalization based on measurements of loudspeakers in their reference listening room, bass-treble balance, compared trained listeners' to kids' preferences, compared their new target curves with HD800, LCD2 ...

Given that for some of these tests a single headphone with different EQ curves applied was used, and those test results mirrored the preference of the initial 5 headphone test, you simply cannot argue that weight or pads had a significant effect.
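The "ranking should correlate with weight" argument can be checked with a rank correlation once per-headphone scores are available. Below is a minimal sketch of Spearman's rho for tie-free data. The numbers in it are arbitrary placeholders chosen only to exercise the function; they are NOT weights or scores from the Harman study or from any real headphones, and the function names are my own.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation coefficient for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Placeholder values for illustration only; not data from any study.
weight = [100, 200, 300, 400, 500]
preference = [3, 1, 4, 2, 5]
rho = spearman_rho(weight, preference)
print("rho =", rho)
```

Keep in mind that with only five or six headphones even a fairly large rho is weak evidence either way, so a check like this can rule out a dominant weight effect more easily than it can rule out a small one.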
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 04:28:57
If weight had a strong influence...

Never said "strong". I said, "it may have had some impact." [bold emphasis in original post I'm quoting]

Quote
If they had known which headphone they were listening to, then why didn't they rank the AKG (Harman) K550 better?

Still waiting for that ranking by brand you referred to, by the way.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-19 12:10:05
Never said "strong".

So? You implied how weight may have skewed the results. I explained what we should have seen if weight had had a significant influence.

The thing is that the mood of the listeners also had an influence, as did the air temperature, humidity ... but those are (also) insignificant points.


Quote
Still waiting for that ranking by brand you referred to, by the way.

Can be found if you follow your own link.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: pdq on 2014-08-19 14:40:59
In the late 80's/early 90's I was a Denon dealer and they were one of the top names in CD players back then. For some reason there was a top of the line unit we were throwing in the trash [because it had fallen from a shelf and had a bashed in corner, making it toast, for example]. Out of curiosity I dismantled the remains and discovered, hidden away from view and completely obscured by the main internal circuit board, a very thick steel plate, much heavier than almost all other brands' entire CD players; heck, it was even heavier than some of their own receivers. It wasn't attached to anything and served no electrical or heat dissipation purpose. I'm confident its sole purpose was to make the entire product's heft and "build quality" seem greater.

[I think it was the DCD-3300 or DCD-3000, IIRC, but I'm not 100% sure at this point.]

Plus think of how many forum reviews of amps and receivers we've all read where they felt it to be useful to describe the unit's heft, which they note with some pride, as if that should convey to us something about its quality.

I suppose one could be generous and assume that was a large mu-metal shield. 
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 18:54:56
^Maybe to protect against a nuclear blast EMP fired only from directly below the unit (not from the sides or above).
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-19 20:09:28
Never said "strong".

So? You implied how weight may have skewed the results.

Not just weight. Headphone comfort, cushion size, shape [some contenders are circular and others are ovals of differing shapes and sizes], air seal, headband size, foam thickness, width, configuration [for example rigid/soft/cloth strap/contact area and shape], head clamp pressure, room noise attenuation curve, and in some instances the noise-canceling electronics' background hiss [which Olive acknowledges was a problem which needed to be overcome and implies he was confident he successfully did].

I know I've rejected certain headphones over the years not solely but at least in part due to headband issues, having nothing to do with the sound itself which I liked. It's perceptible and in some instances, sometimes, can be a tad annoying over long periods of listening. [I no longer have a lot of hair up top, so maybe I'm more sensitive to it now than others, but this was even true when I was younger and had a full head of hair.]

In some headphone measurements the inward clamping pressure or contact pressure isn't even accomplished with the headband at all, which will of course vary due to the size setting selected and the user's head size, so instead it is completely bypassed and achieved by a more repeatedly uniform means set to a specific value often expressed in N, newtons:

There seems to be much less focus on how this variable pressure alters the sound these days, compared to the headphone research before the 90's. Exaggerated contact pressure usually ensures a better seal, true, but it also increases level and bass response, which is why images of people adding additional force, by hand, are not uncommon in this random collection of images of people wearing headphones:
Images of people wearing headphones (https://www.google.com/search?q=headphone+clamp+pressure&rls=com.microsoft:en-US:IE-Address&source=lnms&tbm=isch&sa=X&ei=NZjzU_ePJaKnigLyioCoBA&ved=0CAkQ_AUoAg&biw=1280&bih=541#q=people+wearing+headphones&rls=com.microsoft:en-US:IE-Address&tbm=isch)
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-19 20:55:22
Here's a 'best guess' at the headphone  ranking, from the other thread on HA :

HP1 - LCD-2 (Audeze)
HP2 - K701
HP3 - Bose
HP4 - K550
HP5 - Beats
HP6 - Crossfade (v-moda)

So, how well do the weight and feel of each DUT correlate to this ranking?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-19 21:31:04
Not just weight. Headphone comfort, cushion size, shape [some contenders are circular and others are ovals of differing shapes and sizes], air seal, headband size, foam thickness, width, configuration [for example rigid/soft/cloth strap/contact area and shape], head clamp pressure,

But the LCD is not the most comfortable headphone out of the bunch ... not by far. For me personally, the weight alone would be reason enough not to buy it.

Have you even looked at the research? Because there is a comfort ranking and the LCD is the worst out of the bunch.

@krabapple: Bose > K550 > K701 > Crossfade > Beats > LCD2 (least comfortable).
I don't see a correlation.


Quote
room noise attenuation curve

If we compare the LCD to the K701 they are not that different, but the K701 is brighter, has a bit of bass roll-off, higher distortion ... and therefore was ranked lower.
The Bose has extreme isolation but was ranked in the upper half, so I don't see a correlation here either.

Quote
and in some instances the noise-canceling electronics' background hiss [which Olive acknowledges was a problem which needed to be overcome and implies he was confident he successfully did].

Yes, the NC circuitry in the Bose. I was surprised that they even used a NC headphone in the test.
It doesn't seem to be an outlier in terms of sound preference, but you can simply remove that headphone and you will still have 5 passive ones that don't produce noise.

Quote
In some headphone measurements the inward clamping pressure or contact pressure isn't even accomplished with the headband at all, which will of course vary due to the size setting selected and the user's head size, so instead it is completely bypassed and achieved by a more repeatedly uniform means set to a specific value often expressed in N, newtons:

Doesn't matter, since they also did ear-canal measurements across several subjects.

Also, for the different EQ-curves preference test they only used one headphone at a time, so all the differences between headphones are irrelevant. It's all in the papers.


Quote
There seems to be much less focus on how this variable pressure alters the sound these days, compared to the headphone research before the 90's.

It's kinda in their research since they looked into bass response consistency ... and critiqued one of their "own" headphones for it.
In fact, they did not shy away from criticizing AKG headphones at all.


Quote
Exaggerated contact pressure usually ensures a better seal, true, but it also increases level and bass response, which is why images of people adding additional force, by hand, are not uncommon in this random collection of images of people wearing headphones

That's a photo thing, because nobody is listening like that afaik. Also, in most photos they don't even apply force.


edit: Maybe an op should move this into the thread linked by mzil a few posts ago.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-20 19:49:37
Not just weight. Headphone comfort, cushion size, shape [some contenders are circular and others are ovals of differing shapes and sizes], air seal, headband size, foam thickness, width, configuration [for example rigid/soft/cloth strap/contact area and shape], head clamp pressure,

But the LCD is not the most comfortable headphone out of the bunch ... not by far. For me personally, the weight alone would be reason enough not to buy it.


So you just admitted in this quote that at least some people, and you seem to include yourself, can discern differences between headphones they are wearing based on head comfort/feel, independently of seeing with their eyes which brand they are currently wearing, or listening to the sound it produces. We don't need to ponder as to how, exactly, comfort/feel may influence people positively or negatively towards one headphone or another in a "blind" test, any more than we need to ponder how in a blind taste test of Coke versus Pepsi using a paper cup for one versus using a glass for the other might influence the test results. We must serve them the same way, regardless of the test results falling neatly into our preconceived notions of how they "should", based on some other factor(s) we can measure.

I can't for the life of me understand why you don't think head feel needs to be controlled for, regardless of the test results falling neatly into a pattern which matches "flatness of response", considering Olive himself went to great lengths to eliminate any influence from touch via fingers, not head, by his clever use of additional handles being added to each pair of 'phones so users could fiddle with their positioning and ear seal without sensing the headphone cup shape/size...via their fingers.

I would agree discussing if people can discern differences between headphones' comfort and feel is tangential to the thread's topic of level matching and ideally should be split off.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-20 20:02:23
So, how well do the weight and feel of each DUT correlate to this ranking?

I've sold headphones professionally for over 20 years and can assure you comfort is subjective. Some customers I've dealt with may like one design yet others may dislike the exact same pair.

For instance, I was briefly examining some Audeze in a high end store yesterday and I thought they seemed acceptably comfortable and not too heavy for me, yet xnor just wrote:
Quote
But the LCD is not the most comfortable headphone out of the bunch ... not by far. For me personally, the weight alone would be reason enough not to buy it.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-20 21:16:58
So you just admitted in this quote that at least some people, and you seem to include yourself, can discern differences between headphones they are wearing based on head comfort/feel, independently of seeing with their eyes which brand they are currently wearing, or listening to the sound it produces.

Yes, of course you can discern headphones by comfort alone. But the research was not to detect if people can distinguish an LCD from a Bose ... it wasn't an ABX test (which would be trivial considering the 10 dB frequency response differences) ... but which sound signature they preferred. They still tried to reduce influence from comfort (see below).
I said it before and I will repeat it again: the LCD was ranked as the worst in terms of comfort, but as best (out of the bunch) in terms of sound.


Quote
We don't need to ponder as to how, exactly, comfort/feel may influence people positively or negatively towards one headphone or another in a "blind" test, any more than we need to ponder how in a blind taste test of Coke versus Pepsi using a paper cup for one versus using a glass for the other might influence the test results. We must serve them the same way, regardless of the test results falling neatly into our preconceived notions of how they "should", based on some other factor(s) we can measure.

That's quite problematic.
a) Comfort is part of the headphone experience. It is not like a glass through which you serve some drink, but more like part of the drink.
b) Changing the earpads' shape will alter the sound. Changing the pad material will alter the sound. Changing the clamping force will alter the sound. Even equalizing the weight probably would indirectly alter the sound.

Trying to "serve them the same way" is like transplanting speaker drivers out of their enclosure into a standard test enclosure. That doesn't work for obvious reasons.


Quote
I can't for the life of me understand why you don't think head feel needs to be controlled for, regardless of the test results falling neatly into a pattern which matches "flatness of response", considering Olive himself went to great lengths to eliminate any influence from touch via fingers, not head, by his clever use of additional handles being added to each pair of 'phones so users could fiddle with their positioning and ear seal without sensing the headphone cup shape/size...via their fingers.

Exactly, he did as much as he could without altering the sound of the headphones. For further research, like solely comparing different EQ curves, you can use a single headphone - which is what they did.


I've sold headphones professionally for over 20 years and can assure you comfort is subjective. Some customers I've dealt with may like one design yet others may dislike the exact same pair.

Of course!

I hope that clears it up.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-20 21:33:24
Yes, of course you can discern headphones by comfort alone. But the research was not to detect if people can distinguish an LCD from a Bose ... it wasn't an ABX test (which would be trivial considering the 10 dB frequency response differences) ... but which sound signature they preferred. They still tried to reduce influence from comfort (see below).

An ABX test is a subset of another kind of test: a double blind test. The fact that Olive put plastic handles on the headphones proves he agrees with me that the identity of the headphones' shape or size should be obscured as best possible and he admits that in terms of tactile sensation, short of anesthetizing the listeners' skin from the neck up [ha-ha], he couldn't come up with a way to make the test truly blind.

I think he did a much better job than most but whenever we read of non-blind tests we need to always consider that there were possibly expectation biases at play, not fully dictating but rather influencing the decisions, possibly at a subconscious level.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: probedb on 2014-08-20 21:34:50
I've sold headphones professionally for over 20 years and can assure you comfort is subjective. Some customers I've dealt with may like one design yet others may dislike the exact same pair.


And that's one of the reasons there are different headphones and in fact different everything! Everyone is different with everything not just headphones.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-20 22:06:20
Here's a 'best guess' at the headphone  ranking, from the other thread on HA :

HP1 - LCD-2 (Audeze)
HP2 - K701
HP3 - Bose
HP4 - K550
HP5 - Beats
HP6 - Crossfade (v-moda)

So, how well do the weight and feel of each DUT correlate to this ranking?


You can assign a specific value in terms of weight, say in grams, or, as another example, an "accuracy score" to a frequency response using some de rigueur weighting curve, no problem. This was first done by Consumers Union, publishers of Consumer Reports magazine, in I believe the 1970s, and is still done by major brands like my buddies at Etymotic Research, with only minor modifications. But how on earth do you assign a numeric value to "comfort/feel" when we all seem to be in agreement that we all feel differently about it?

http://www.etymotic.com/technology/hwmra.html (http://www.etymotic.com/technology/hwmra.html)

Here's a random observation that caught my eye. Maybe people dig "big circular" earpads over oval shape as their top comfort priority however they also dig cushy foam? Maybe?

Going by the rank you posted above:

HP1- big circular
HP2- big circular
HP3- oval (but super cushy)
HP4- big circular
HP5- oval
HP6- oval

Just a thought.

Plus of course sound quality may have had an important if not overriding role too.

Whenever customers asked me, "I've been reading up on headphones and I've learned all about frequency response, diffuse field equalization, free field equalization to a lateral target source, free field equalization to a 30 degree off axis target, free field equalization to a straight forward target source, square wave reproduction, harmonic distortion, bass extension, treble extension, channel balance, channel balance per frequency, HATS measurements, G.R.A.S cheek and ear measurements, KEMAR measurements...do tell, what is the most important factor?"
My answer was always the same "Comfort, hands down. If a headphone sounds "great" but is uncomfortable to wear, what does it matter if it sounds great?"

Comfort/feel plays a very important role in our assessment of many things, not just headphones, even if we are trained to "ignore that". It's just human nature.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-20 23:37:15
An ABX test is a subset of another kind of test: a double blind test.

So what? It still wasn't an ABX test ...

Quote
The fact that Olive put plastic handles on the headphones proves he agrees with me that the identity of the headphones' shape or size should be obscured as best possible and he admits that in terms of tactile sensation, short of anesthetizing the listeners' skin from the neck up [ha-ha], he couldn't come up with a way to make the test truly blind.

I don't understand what your problem is.

You started this with how you have problems with Harman studies, but after a while it was obvious you didn't even read the research first, and now you try to turn this around by saying Olive's work proves that what you said (which is basically what I just said before) is right?
WTF is going on?!
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-20 23:44:51
His headphones preference test wasn't truly blind. To me that's important and worthy of discussion, but if you disagree, whatever.

I also have an independent gripe about the very concept of what "level matching" means when we have grossly differing response curves, as we often do with headphones and speakers.

These are two separate distinct problems in headphone research, IMHO.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-20 23:54:30
His headphone preference test wasn't truly blind. To me that's important and worthy of discussion, but if you disagree, whatever.

What does a truly blind test with headphones look like?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 00:31:38
I'm currently of the mind it is nearly impossible to conduct a truly fair test of headphones under completely blind conditions with, "perfect level matching (at least according to the weighting curve we're currently using this decade)".

He freely admits his test wasn't blind but dismisses it with a joke about having to anesthetize people above the neck to correct the problem. Some people's takeaway message is, "Well, since we can't do that, then these existing tests are scientifically valid." My takeaway is, "Since we can't make this method truly blind we'll never know for sure what impact headphone comfort/feel played in biasing people's decisions."

If his basic notion is that frequency response is what we really want to assess, then that should be done electrically via EQ, by simulating the Audeze, AKG, Bose, etc. response curves and then feeding that into the exact same pair of pre-calibrated headphones, which have been pre-calibrated to deliver an otherwise neutral response for that individual test subject, via a probe mic at the DRP (eardrum reference point).
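To make the EQ idea concrete, here is a minimal sketch in Python. It assumes both curves are expressed in dB at the same frequencies, and that the single worn headphone (called the "carrier" here, a name made up for this example) has already been measured on the listener. All numbers are invented for illustration and are not measurements of any real headphone.

```python
# Hypothetical illustration: derive the EQ (in dB) that makes one calibrated
# headphone imitate another model's frequency response.

def simulation_eq(target_db, carrier_db):
    """Return per-frequency EQ gain in dB: target response minus carrier response."""
    if target_db.keys() != carrier_db.keys():
        raise ValueError("both curves must be sampled at the same frequencies")
    return {f: target_db[f] - carrier_db[f] for f in sorted(target_db)}

# Invented values, dB relative to an arbitrary reference, keyed by frequency in Hz.
carrier = {100: 1.0, 1000: 0.0, 4000: -2.0, 10000: -5.0}  # the headphone actually worn
target = {100: 6.0, 1000: 0.0, 4000: 3.0, 10000: -8.0}    # the model being imitated

eq = simulation_eq(target, carrier)
for freq, gain in eq.items():
    print(f"{freq:>6} Hz: {gain:+.1f} dB")
```

A real setup would need far finer frequency resolution and an actual filter design; the subtraction only shows where the EQ curve comes from.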

That test isn't perfect either, but in my opinion conducting a "blind" test of Coke vs Pepsi, but using a paper cup for one, yet a glass for the other, doesn't fly, despite any arguments to the contrary that "it shouldn't matter because they won't focus on that" or "it is too difficult to do it any other way".
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 01:02:02
His headphones preference test wasn't truly blind. To me that's important and worthy of discussion, but if you disagree, whatever.

I also have an independent gripe about the very concept of what "level matching" means when we have grossly differing response curves, as we often do with headphones and speakers.

These are two separate distinct problems in headphone research, IMHO.

Maybe we should break this thread into two as well. I'm not kidding. These two topics have almost nothing to do with each other.

I also don't know why ITU BS. 1770 was chosen over other standards, say A-weighting or ISO 226, as examples, which clearly more closely reflect Fletcher Munson equal loudness contours:

Not to say that any one weighting is the correct one.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 01:17:28
Here are four of the contenders I could find with their raw, un-corrected curves, normalized at 1 kHz.  Notice how we have to do some major shifting laterally of some levels by 10 dB or so, if we instead decide to normalize at say 4 kHz, where the ear is more sensitive. But who's to say that's necessarily the correct place either? What if the test subject happens to focus on the sound of a bass instrument in the song, not some 4kHz centric instrument. What then? Are we "level matched" for their listening?
http://graphs.headphone.com/graphCompare.p...11&scale=20 (http://graphs.headphone.com/graphCompare.php?graphType=0&graphID%5b%5d=3221&graphID%5b%5d=703&graphID%5b%5d=3571&graphID%5b%5d=4011&scale=20)

Clearly the level match will differ greatly depending on which weighting you go with.
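A small sketch of why the choice of reference matters, in Python. The two curves below are invented numbers, not the headphones in the graph, and the "mean" rule is just one arbitrary alternative picked for illustration.

```python
# Invented response curves (dB SPL at equal drive level); not real measurements.
FREQS_HZ = [100, 500, 1000, 4000]
PHONE_A = {100: 92.0, 500: 90.0, 1000: 90.0, 4000: 95.0}
PHONE_B = {100: 98.0, 500: 91.0, 1000: 90.0, 4000: 85.0}

def match_gain_at(freq_hz, a=PHONE_A, b=PHONE_B):
    """Gain in dB to apply to B so that it equals A at one single frequency."""
    return a[freq_hz] - b[freq_hz]

def match_gain_mean(a=PHONE_A, b=PHONE_B):
    """Gain in dB to apply to B so the unweighted mean of the dB values matches A."""
    return sum(a[f] - b[f] for f in FREQS_HZ) / len(FREQS_HZ)

gains = {f: match_gain_at(f) for f in FREQS_HZ}
for freq, gain in gains.items():
    print(f"match at {freq:>5} Hz: apply {gain:+.1f} dB to B")
print(f"match on mean level: apply {match_gain_mean():+.2f} dB to B")
```

With these made-up curves, the pair that is "matched" at 1 kHz is far apart under the 4 kHz rule, which is the whole problem in miniature.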
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 07:30:54
The only way I know of to do proper level matching between two sources with differing frequency responses, such as is found with headphones or speakers, in a blind test where you don't want to inadvertently disclose identities, is to have a randomly chosen gain level for your volume knob for each and every listening trial. This needs to be explained to the test subject, the listener, that this has indeed been assigned randomly and that they'll never know what to expect as to how quickly the volume will change as they rotate the knob clockwise up from zero (full mute), so they should go slowly at first so as to not blast their ears out.

On some trials just a 1/3 rotation will achieve a very loud level yet for other trials, even if from the very same source which is of course always obscured from them, they'll have to turn it way up to 3/4 rotation to achieve the same sense of volume, even though it may (or may not) be the very same source. Only in this way will there be no tactile feedback to the test subject to potentially disclose identities, yet they can set volume at will and we don't have to worry about which weighting curve to use. They, in a sense, are using their own unique, customized weighting, yet they have no idea how much gain they had to apply to achieve what pleases them.

For example, if they get exposed to what is unbeknownst to them a somewhat bass shy source, for example,  which they have to really crank up to achieve any semblance of a full rich sound, due to Fletcher-Munson equal loudness contours, they'll never know it, because the amount they need to rotate that volume knob is randomly assigned each time and they won't know the source needed more than what they typically apply.
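Here is a rough Python sketch of that randomized knob. The class name, the 60 dB knob span, and the 20 dB range for the hidden offset are all assumptions chosen for illustration, not values from any published test procedure.

```python
import random

class BlindVolumeKnob:
    """Volume knob whose hidden gain offset is re-drawn for every trial."""

    def __init__(self, rng, offset_range_db=(-20.0, 0.0), knob_span_db=60.0):
        self._rng = rng
        self._offset_range_db = offset_range_db
        self._knob_span_db = knob_span_db
        self._offset_db = 0.0

    def new_trial(self):
        """Draw a fresh hidden offset; the listener is never told its value."""
        low, high = self._offset_range_db
        self._offset_db = self._rng.uniform(low, high)

    def gain_db(self, rotation):
        """Map knob rotation (0.0 = full mute, 1.0 = fully clockwise) to gain in dB."""
        if not 0.0 <= rotation <= 1.0:
            raise ValueError("rotation must be between 0.0 and 1.0")
        if rotation == 0.0:
            return float("-inf")
        return self._offset_db + (rotation - 1.0) * self._knob_span_db

    def rotation_for(self, gain_db):
        """Rotation needed to reach a given gain (for the experimenter's log only)."""
        return 1.0 + (gain_db - self._offset_db) / self._knob_span_db

knob = BlindVolumeKnob(random.Random())
for trial in range(1, 4):
    knob.new_trial()
    print(f"trial {trial}: -30 dB needs rotation {knob.rotation_for(-30.0):.2f}")
```

The same target gain lands at a different knob position on every trial, so the amount of rotation tells the listener nothing about which source is playing.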
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-21 12:20:14
I also don't know why ITU BS. 1770 was chosen over other standards, say A-weighting or ISO 226, as examples, which clearly more closely reflect Fletcher Munson equal loudness contours:

a) Equal loudness contours (ISO 226) deal with loudness of pure tones, not noise or music.
b) A-weighting is based on the old 40 phon Fletcher-Munson curves, so if it is used at all then ideally at very low levels with pure tones. Despite that it is still being (ab)used for noise measurements.
c) BS.1770 deals with loudness monitoring, so out of the family of weighting filters above its "K" filter is the only one that actually fits your purpose.
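For anyone who wants to look at the "K" filter rather than argue about it, here is a minimal Python sketch that evaluates its frequency response. It assumes the two-stage form (a high-shelf stage followed by a high-pass stage) at a 48 kHz sample rate, and the coefficient values are quoted from memory of the recommendation, so check them against the current BS.1770 text before relying on them.

```python
import cmath
import math

FS_HZ = 48000.0

# Biquad coefficients (b0, b1, b2, a1, a2), as recalled from BS.1770 for 48 kHz.
# Treat these as an assumption and verify them against the recommendation.
SHELF = (1.53512485958697, -2.69169618940638, 1.19839281085285,
         -1.69065929318241, 0.73248077421585)
HIGHPASS = (1.0, -2.0, 1.0, -1.99004745483398, 0.99007225036621)

def biquad_response(coeffs, freq_hz, fs_hz=FS_HZ):
    """Complex response of one biquad stage at freq_hz."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-2j * math.pi * freq_hz / fs_hz)  # z^-1 on the unit circle
    z2 = z1 * z1                                     # z^-2
    return (b0 + b1 * z1 + b2 * z2) / (1.0 + a1 * z1 + a2 * z2)

def k_weighting_db(freq_hz):
    """Gain in dB of the two cascaded stages at freq_hz."""
    h = biquad_response(SHELF, freq_hz) * biquad_response(HIGHPASS, freq_hz)
    return 20.0 * math.log10(abs(h))

for f in (20, 50, 100, 500, 1000, 2000, 4000, 10000):
    print(f"{f:>6} Hz: {k_weighting_db(f):+.2f} dB")
```

Printing the curve at a handful of frequencies shows the low-frequency roll-off and the treble shelf, and lets you see for yourself where the response actually bends.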
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-21 17:20:42
While a tactile DBT is not 'truly blind' it seems that what you, mzil, want is something like a PCA (principal component analysis) to see what degree of influence, if any, tactile sensation has on the preference rankings. Have I got that right?
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-21 17:42:28
I honestly don't really see the point, because given the new target curves you can build different types of headphones that sound "equally" awesome.
People will choose whatever type of headphone that they prefer anyway, be it in-ear, on-ear, around-ear, ... light or heavy, (p)leather or velour pads, and so on ...


And such a test would basically be just the comfort ranking that I mentioned before.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 23:32:31
c) BS.1770 deals with loudness monitoring, so out of the family of weighting filters above its "K" filter is the only one that actually fits your purpose.

Isn't it interesting that in examining the inflection point of the K-weighting, it just so happens to occur at exactly 1000 Hz? Funny how in our analysis of humans' perception of loudness to determine that exact point it just so happens to fall at a very convenient number value because our number system is based on 10. If one didn't know any better they might think it was selected for convenience as an easy to remember ballpark figure rather than measured via precise audiometric accuracy! Wouldn't that be absurd. [sarcasm]

"Fits my purpose"? Not at all, in fact it seems rotten. It was based on a broadcast standard for mono which the inventors described as such:

"It may come as a surprise that the ITU uses such basic filtering to define the difference between RMS and loudness, but as they put it, "for typical monophonic broadcast material, a simple energy-based loudness measure is similarly robust compared to more complex measures that may include detailed perceptual models”. The ITU calls such a filter 'K-weighting',

I'll take the "complex methods which involves perceptual models" any day, and even using them I still think the concept of precisely level matching is sketchy; all we can do is "ballpark" levels and keep our fingers crossed it will cover the narrow band of the audible spectrum our test listener happens to focus on. If they happen to focus on a bass instrument we are dead in the water since upper and mid frequencies are given priority status in all these weighting schemes (which I agree is proper looking at the big picture), and the very concept of claiming the two systems are "level matched" regardless of frequency of interest is laughable.

"NOTE 1 – Users should be aware that measured loudness is an estimation of subjective loudness and involves some degree of discrepancy depending on listeners, audio material and listening conditions."

WHAT!? "Estimation"!?..."Discrepancy"?! "Varies by listener, conditions, and material"?! But, but, but I thought this new fangled method was guaranteed to play everything at the same level, no?! ITU, you are admitting there are scenarios where that might not be true? Then since I know any two divergent frequency response curves have to be able to be precisely level matched, because that is important for my blind studies, then I'll just keep on shopping for a better weighting system which promises me it has no flaws, since I've concocted in my mind that there must be one. Bye-bye ITU BS.1770 .

ITU word document (http://www.itu.int/rec/R-REC-BS.1770-0-200607-S/en)
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-21 23:44:46
Claiming a speaker or headphone with a mountainous frequency response has a particular "level" you can match to another one is like saying Da Vinci's Mona Lisa has a particular color value.

All you can do is take a tiny little section of these things and say "I've decided this is the important part to match and to hell with all of the rest".
Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-22 01:03:26
Isn't it interesting that in examining the inflection point of the K-weighting, it just so happens to occur at exactly 1000 Hz.

Yep, only that it doesn't...


If one didn't know any better they might think it was selected for convenience as an easy to remember ballpark figure rather than measured via precise audiometric accuracy! Wouldn't that be absurd. [ sarcasm]

Well, obviously you don't know better.


"Fits my purpose"? Not at all, in fact it seems rotten.

As I said, out of the weighting filters above this is the only one that was actually made for this purpose. I never said it is perfect.


"It may come as a surprise that the ITU uses such basic filtering to define the difference between RMS and loudness, but as they put it, "for typical monophonic broadcast material, a simple energy-based loudness measure is similarly robust compared to more complex measures that may include detailed perceptual models”. The ITU calls such a filter 'K-weighting',

You obviously did not read BS.1770 either, because in the appendix you can see correlation with subjective loudness ratings. There is a stereo and multichannel dataset with correlation r=0.98, even better than the first monophonic dataset.


I'll take the "complex methods which involves perceptual models" any day

Please do. Which algorithms do you suggest? Does it achieve better correlation?


"NOTE 1 – Users should be aware that measured loudness is an estimation of subjective loudness and involves some degree of discrepancy depending on listeners, audio material and listening conditions."

WHAT!? "Estimation"!?..."Discrepancy"?! "Varies by listener, conditions, and material"?! But, but, but I thought this new fangled method was guaranteed to play everything at the same level, no?! ITU, you are admitting there are scenarios where that might not be true? Then since I know any two divergent frequency response curves have to be able to be precisely level matched, because that is important for my blind studies, then I'll just keep on shopping for a better weighting system which promises me it has no flaws, since I've concocted in my mind that there must be one. Bye-bye ITU BS.1770 .

Until you stop being a moron I will also say bye-bye.


ITU word document (http://www.itu.int/rec/R-REC-BS.1770-0-200607-S/en)

That's the 6 year old outdated version..
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-22 03:54:47
Until you stop being a moron I will also say bye-bye.

Your completely unprovoked personal attack calling me a "moron" will not be forgotten.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: krabapple on 2014-08-22 17:57:58
I honestly don't really see the point, because given the new target curves you can build different types of headphones that sound "equally" awesome.
People will choose whatever type of headphone that they prefer anyway, be it in-ear, on-ear, around-ear, ... light or heavy, (p)leather or velour pads, and so on ...


And such a test would basically be just the comfort ranking that I mentioned before.



Which brings us back to the point that Sean Olive is saying otherwise: that people will tend to prefer headphones that sound 'neutral' *if* sighted bias (but not 'touch') is nullified.


Title: Issues with Blind-Testing Headphones and Speakers
Post by: xnor on 2014-08-22 18:11:27
Which brings us back to the point that Sean Olive is saying otherwise: that people will tend to prefer headphones that sound 'neutral' *if* sighted bias (but not 'touch') is nullified.

Well, that is not necessarily mutually exclusive with what I said. If we ignore those people who buy headphones as fashion, but are interested mainly in sound quality, then they will still choose the better sounding headphone from the remaining ones. (The others were eliminated due to type, budget, comfort ...)
Title: Issues with Blind-Testing Headphones and Speakers
Post by: mzil on 2014-08-22 19:36:57
Here's another potential issue that the method I mentioned earlier eliminates (that method being a single pair of individually calibrated headphones which the listener doesn't remove from their head, with the different headphone brands modeled via outboard electrical EQs):
http://www.aes.org/e-lib/browse.cfm?elib=15443 (http://www.aes.org/e-lib/browse.cfm?elib=15443)

" The results indicate that, whatever the headphone model or the excerpt, the modifications caused by different positions were always perceived."

Notice they said "always". This proves that casual tests of things like, say for example headphone "burn in", listening to 'phones and then the same again later after 100 hours of burn-in, is pointless, or at least very problematic.
Title: Issues with Blind-Testing Headphones and Speakers
Post by: ajinfla on 2014-08-22 19:49:08
I've had some disagreement with how Harman test/conclude as well http://www.hydrogenaud.io/forums/index.php?showtopic=81708 (http://www.hydrogenaud.io/forums/index.php?showtopic=81708).
But of course, bravo to them for testing.

cheers,

AJ
Title: Issues with Blind-Testing Headphones and Speakers
Post by: 2Bdecided on 2014-08-26 17:07:02
in fact it seems rotten
What I heard (so take it with a pinch of salt) is that there were several contenders for this loudness standard from reputable companies/institutions, and someone also threw in this really simple solution. Across a fairly wide range of material, including the material chosen for testing, the simple solution did remarkably well. Each of the reputable companies could find material on which their solution performed better, but overall the simple solution worked well enough that everyone came to an agreement to stop pushing their own preferred system, and to just go with this one.

In some contexts where the loudness matching is hopelessly wrong, the way it gets it wrong is kind of handy too - i.e. it won't turn up almost inaudible near-ultrasonic signals in an attempt to make them equally loud, which is quite useful because it avoids frying tweeters in that context!

While you can do better, it's surprising how little benefit doing "better" brings in most real-world contexts.

Cheers,
David.