Topic: Earguy's improved Digital Ear (Read 26299 times)

Earguy's improved Digital Ear

Reply #50
ff123- I am having difficulty downloading those two zips.  It stops before completing.

tw101- The avg bitrate is 323k because I am using the resulting file size for the computation (file size * 8 bits / 10 secs / 1000).  I have to do this to fairly compare different compression formats.  The extra 3k is due to header info, among other things.
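A minimal sketch of the arithmetic above (the 403,750-byte file size is a hypothetical figure chosen so that a 10-second clip comes out at exactly 323 kbps; only the formula itself is from the post):

```python
# Average bitrate implied by a file's on-disk size: size * 8 bits / secs / 1000.
def avg_bitrate_kbps(file_size_bytes: int, duration_secs: float) -> float:
    return file_size_bytes * 8 / duration_secs / 1000

# Hypothetical example: a 10-second clip of 403,750 bytes -> 323.0 kbps,
# so a nominal 320 kbps stream plus header info can easily read ~3k high.
print(avg_bitrate_kbps(403_750, 10))  # -> 323.0
```

Measuring from file size like this is what makes different formats comparable, since every format's container overhead is counted the same way.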

Garf- Any difference value above 0 means VL did hear a difference at that frequency.  In other words, VL believes the difference it heard is audible.

amp- VL works in the time domain.  It divides the cochlea into 251 sections, each with a different resonant frequency, but all processing is done in the time domain (slow).
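As an illustration only (not Baumgarte's actual model), a time-domain bank of resonant sections could be sketched like this; the filter form, Q, and frequency range are all assumptions, and the sample-by-sample loop shows why this approach is slow:

```python
import math

def resonator_coeffs(fc, fs, q=8.0):
    """Coefficients for a simple second-order IIR resonator at fc Hz."""
    w = 2 * math.pi * fc / fs
    r = math.exp(-w / (2 * q))                   # pole radius set by bandwidth fc/q
    return (1 - r), -2 * r * math.cos(w), r * r  # b0, a1, a2

def filterbank(x, fs, n_sections=251, f_lo=50.0, f_hi=15000.0):
    """Feed the signal through every section, one sample at a time
    (deliberately slow, mirroring the post's point about time-domain processing)."""
    out = []
    for k in range(n_sections):
        # Log-spaced resonant frequencies, one per cochlear section.
        fc = f_lo * (f_hi / f_lo) ** (k / (n_sections - 1))
        b0, a1, a2 = resonator_coeffs(fc, fs)
        y1 = y2 = 0.0
        ys = []
        for xn in x:
            yn = b0 * xn - a1 * y1 - a2 * y2   # direct-form difference equation
            y1, y2 = yn, y1
            ys.append(yn)
        out.append(ys)
    return out
```

Each of the 251 output rows is one section's time-domain response; a real model like Baumgarte's adds nonlinear and neural stages on top of the filterbank.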

VL is an exact representation of the model in Frank Baumgarte's dissertation.  You can read it here: PhD Dissertation

I will keep posting results and let the arguments continue...

Reply #51
EarGuy.
Will you ever share VL?

Reply #52
layer3,

Quote
A hundred engineers and a thousand mechanics can agree that Yugos suck. It doesn't matter to my girlfriend - she LOVES hers...


And yet audio codec quality can be ranked, and has been done so for example by the MPEG group when they tested AAC.  I agree that people should use whatever they prefer.  It's only when they want to make a general statement about quality that group preferences need to be considered as well.

Quote
Did I leave you with the impression that I don't approve of skepticism? Skepticism is a "way of life" for someone like me.


Yes, you left me (and I presume others as well) with the impression that you may be giving this tool too much credence before it has really proved itself, even on easy stuff.  Of course proving itself at very subtle levels of artifacting is something else again.  For example:

Quote
No, in fact it's not. MPC is touted as being the best possible encoder for those seeking the highest quality output from a lossy codec. And these results show it's not.


ff123

Reply #53
Earguy,

Quote
ff123- I am having difficulty downloading those two zips. It stops before completing.


My server is very slow at the moment.  I guess you must be using Internet Explorer?  I've had IE cut out in the middle of a slow download on me before.  Very annoying.  I recommend 1) waiting a bit until the connection speeds up, 2) using Netscape, which doesn't cut out in the middle of downloads like IE seems to, or 3) using a specialized download program (e.g., GetRight), which won't make you start from the beginning if you lose the connection.

Sorry for the inconvenience.

ff123

Reply #54
THESE results DO show that alt preset insane is superior to mpc insane. Does that mean that this constitutes proof? Yes and no. It proves that this specific software, with its specific preferences and limitations, on these specific samples, finds the top LAME preset to be superior to MPC's top preset. It proves that. Just like a listening test only proves that a specific group of listeners, with their specific preferences and hearing limitations, on a specific sample or group of samples, find one codec to be superior to others. You allude to this on your own page. One codec can perform better when focusing on one type of artifact than another. And the best codec can change when testing for another type of artifact on another sample. And there is the issue of recognizable frequency range and stereo separation, which isn't an artifact at all. The subtleties Beatles is so concerned with.

Reply #55
Quote
Originally posted by layer3maniac
THESE results DO show that alt preset insane is superior to mpc insane.
I wouldn't say that. I would say the results show that according to VL, api is superior to mpc insane. According to real life, real human tests, this isn't the case.
Quote
Does that mean that this constitutes proof? Yes and no. It proves that this specific software, with its specific preferences and limitations, on these specific samples, finds the top LAME preset to be superior to MPC's top preset. It proves that. Just like a listening test only proves that a specific group of listeners, with their specific preferences and hearing limitations, on a specific sample or group of samples, find one codec to be superior to others.
So which are you going to give more value: Real life provable results by several people or VL results, and why?
Juha Laaksonheimo

Reply #56
Quote
I would say the results show that according to VL, aps is superior to mpc insane.
Exactly! (at least api, that is)
Quote
According to real life, real human tests, this isn't the case. So which are you going to give more value: Real life provable results by several people or VL results, and why?
First, the VL results are real life provable results. They are not, however, human. This is both a plus and a minus for VL. Humans are famous for their fallibility. The very word "human" is often interchangeable with failure and shortcoming. "Hey - I'm only human!"  On the other hand, machines are MADE BY humans. As for me, VL is about as valuable as ONE listener from a listening test. Less valuable than a group of listeners (generally speaking).

Reply #57
Quote
THESE results DO show that alt preset insane is superior to mpc insane. Does that mean that this constitutes proof? Yes and no. It proves that this specific software, with its specific preferences and limitations, on these specific samples, finds the top LAME preset to be superior to MPC's top preset. It proves that.


Ah, ok, this is a misunderstanding of language usage.  You would write:  "The VL results show/prove that Lame insane is better than MPC insane," whereas I would write:  "The VL results (from a utility with unknown correlation to actual human results) claim/say that Lame insane is better than MPC insane."

One difference is that you deftly avoid the first part of the syllogism, the premise.  The premise is that VL is an accurate representation of human hearing.  And in my book, VL results don't "show" or "prove" anything until the premise is demonstrated to be correct to some substantial degree.  That's the other difference in the way I would word things.

So far, I'd say there are indications for and against VL's correlation with human hearing.  On the plus side, in its previous incarnation, it ranked different versions of fatboy the same way I did.  But on the minus side, it didn't appear to hear ringing well.  And I'm unsure whether it hears subtle pre-echo artifacts well.  The best way to find out how VL performs is to subject it to a battery of tests and compare against group listening results.

BTW, anything which can be reliably demonstrated to be audibly different from the original counts as "artifacting" to me, including differences in frequency range and stereo separation.

ff123

Reply #58
Quote
Originally posted by layer3maniac
Exactly! (at least api, that is)  First, the VL results are real life provable results. They are not, however, human. This is both a plus and a minus for VL. Humans are famous for their fallibility. The very word "human" is often interchangeable with failure and shortcoming. "Hey - I'm only human!"  On the other hand, machines are MADE BY humans. As for me, VL is about as valuable as ONE listener from a listening test. Less valuable than a group of listeners (generally speaking).

Of course, for VL to be worth one listener from a listening test, VL will first have to pass the sample series of listener training. Secondly, it has to pass the correlation test. From the way things look, VL might have a problem with the second one.

It would be interesting if we could run a VL test on wayitis and check its correlation.

It all comes down to this point: VL cannot be considered a good listener until it can show that its results can agree with the results of some real listening test.
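The correlation check proposed above could be sketched as a rank correlation (Spearman's) between VL's scores and the group listening-test scores for the same set of codecs; all scores below are hypothetical, purely to show the mechanics:

```python
def rank(xs):
    """Ranks (1 = lowest), averaging over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                       # extend over a run of tied values
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = rank(a), rank(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    den = (sum((x - ma) ** 2 for x in ra) * sum((y - mb) ** 2 for y in rb)) ** 0.5
    return num / den

# Hypothetical scores for five codecs, from VL and from a group listening test:
vl_scores    = [4.6, 4.1, 3.8, 3.0, 2.2]
group_scores = [4.4, 4.3, 3.5, 3.1, 2.0]
print(spearman(vl_scores, group_scores))  # -> 1.0 here: identical codec ranking
```

A correlation near 1 on a real test set (e.g. wayitis) would support treating VL as one listener; a low or negative one would argue against it.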

Reply #59
Quote
Originally posted by layer3maniac
Exactly! (at least api, that is) 
Eh yeah, of course api. (Edited the typo)
Quote
As for me, VL is about as valuable as ONE listener from a listening test. Less valuable than a group of listeners (generally speaking).
Well, according to your reactions, you give quite a lot of value to this ONE listener, even though you have no idea yet how good it is. There's no indication yet that it will in any way correlate with group listening test results, or that it hears different types of distortions/differences in sound. Maybe I sound like an elitist, but I don't believe every listener is equally good or should be given equal value. That's one reason why screening tests are planned for listening tests.
Juha Laaksonheimo


Reply #61
ff123, JohnV- I got the two files.

I will have VL listen to the listener training set as well as the two group listening tests, dogies and wayitis, to see if VL correlates properly.

Reply #62
Just gonna give my thoughts here:

- I think the concept of developing some kind of virtual-ear program sounds like a great idea to me.

- The problem for me lies in the fact that it's, well, VIRTUAL. Don't get me wrong: for now the prog is good in the way that you get a visualisation of the different formats, but will this ever really simulate human hearing? Then there would have to be settings for, let's say, audiophiles and normal users, ... as everyone hears (or doesn't hear) different things.

- Lots of people (even me) have asked whether lame insane is now better than mpc insane.  For now I don't think we can really draw any conclusions, because on one side there's the prog stating lame is better than mpc, and on the other side there are REAL (not VIRTUAL) people stating (after lots of testing) that mpc is better than lame. If I had to choose which one to believe, I'd for sure go for the REAL ears.

- I really would like him (Todd, I thought) to publish this proggie. I'm not even asking for the source code here, but I wonder why he hasn't responded to layer3maniac's request to share his prog. With the prog published, we ourselves (or maybe the tech people here) could do some more and larger testing.

These are my thoughts on the subject. I really hope this prog takes off so we can finally have a graphical (easy to understand) comparison between formats.
-->xmixahlx<-- learn the truth about audio-compression

Reply #63
Quote
Originally posted by Captain_Carnage
- I really would like him (Todd, I thought) to publish this proggie. I'm not even asking for the source code here, but I wonder why he hasn't responded to layer3maniac's request to share his prog. With the prog published, we ourselves (or maybe the tech people here) could do some more and larger testing.

Maybe he's concerned that some commercial company will want to "appropriate" the program somehow & make money off it without even giving him credit.  It wouldn't be the first time that's happened...

Reply #64
Quote
Well, according to your reactions, you give quite a lot of value to this ONE listener, even though you have no idea yet how good it is. There's no indication yet that it will in any way correlate with group listening test results, or that it hears different types of distortions/differences in sound.
What is a good listener? One which focuses on the artifacts you like to focus on? This is the problem I have with listener testing done by people who have engaged in "listener training". It is skewed. It's not natural. Normal humans don't scour a sample searching for specific problems. They listen to the "whole" of the sample and judge it accordingly. I honestly believe that this is a bad way to judge codecs. It's fine for someone tuning an encoder, but it's just not, as you say, "real life".
Quote
Maybe I sound like an elitist, but I don't believe every listener is equally good or should be given equal value. That's one reason why screening tests are planned for listening tests.
See, I completely disagree. I believe every listener is equally valid, as long as they're honest. And an "untrained" listener, to me, is MORE valid than a "trained" one. Is the purpose of a listening test to determine which codec scores highest with people who are distinctly different from the average listener? Shouldn't the purpose of a listening test be to determine which codec scores the highest with "normal" people - the people who will be using it? People who are listening to the whole of the music, and not just those who have been "trained" to scour it looking for specific artifacts, and in doing so, possibly missing everything else?

Reply #65
Quote
BTW, anything which can be reliably demonstrated to be audibly different from the original counts as "artifacting" to me, including differences in frequency range and stereo separation.
Again, as you say, language. To me, an artifact is something which WASN'T in the original - something extra, something added. As opposed to other problems, nuances like missing frequencies and separation. Warmth. Things which have been taken away.

Reply #66
Quote
Originally posted by layer3maniac
Again, as you say, language. To me, an artifact is something which WASN'T in the original - something extra, something added. As opposed to other problems, nuances like missing frequencies and separation. Warmth. Things which have been taken away.
Umm, so for you dropouts are not artifacts? I agree with ff123. I don't usually say "artifacts and other differences". I just say "artifacts".
Juha Laaksonheimo

Reply #67
You say potayto - I say potahto. To some people, head is the bathroom. To others, it's a sex act.  It really doesn't matter.

Reply #68
Quote
Originally posted by layer3maniac
What is a good listener? One which focuses on the artifacts you like to focus on? This is the problem I have with listener testing done by people who have engaged in "listener training". It is skewed. It's not natural. Normal humans don't scour a sample searching for specific problems. They listen to the "whole" of the sample and judge it accordingly.
This is exactly what I do, and I believe what others do as well. I listen to the "whole" sample, stereo separation, frequency response, everything.
Quote
I honestly believe that this is a bad way to judge codecs. It's fine for someone tuning an encoder, but it's just not, as you say, "real life".
You have a completely wrong idea. I don't know how you can even claim that I listen in some "wrong" way. You have no idea how I test music. Just because I may be especially good at detecting, for example, pre-echo doesn't mean I lack in other areas. I would say quite the opposite. I can either "separate" a track (concentrate on only one instrument, for example) or listen to it as a "whole". I of course do both.
Quote
See, I completely disagree. I believe every listener is equally valid, as long as they're honest. And an "untrained" listener, to me, is MORE valid than a "trained" one. Is the purpose of a listening test to determine which codec scores highest with people who are distinctly different from the average listener? Shouldn't the purpose of a listening test be to determine which codec scores the highest with "normal" people - the people who will be using it? People who are listening to the whole of the music, and not just those who have been "trained" to scour it looking for specific artifacts, and in doing so, possibly missing everything else?
I think the group listening tests show that there is pretty often a tendency. And that more sensitive listeners are more "correct" with the results than less sensitive ones, although less sensitive listeners, at least as a group, follow the tendency also.
Juha Laaksonheimo

Reply #69
Quote
What is a good listener? One which focuses on the artifacts you like to focus on? This is the problem I have with listener testing done by people who have engaged in "listener training". It is skewed. It's not natural. Normal humans don't scour a sample searching for specific problems. They listen to the "whole" of the sample and judge it accordingly. I honestly believe that this is a bad way to judge codecs. It's fine for someone tuning an encoder, but it's just not, as you say, "real life".


Well then why don't we tell people to use the cheap computer speakers they normally listen through while they put on some 128 kbit/s mp3's to fill up the background?  The effect would be the same as using untrained listeners -- less reliable and less sensitive results.

Yes, one has to guard against the possibility that small differences will be unrealistically magnified (solution: include a bad sample in the test to remind people what bad really means).

Yes, people don't normally listen for artifacting in short samples.  But it's the most sensitive way to rank codec quality.  Having somebody listen to very long samples (> 30 seconds) is not reliable because human auditory memory is very short.  Not to mention highly fatiguing.

And while we're on the topic of not reaching for the last bit of sensitivity that we can in listening tests, why then is it acceptable to take minuscule differences in VL results and say that this represents a real-life audible difference?

Quote
See, I completely disagree. I believe every listener is equally valid, as long as they're honest. And an "untrained" listener, to me, is MORE valid than a "trained" one.


Untrained listeners become trained listeners over time, as is shown by the changing general opinion on what an acceptable bitrate for mp3 is.

BTW, there are two types of training:  general training to learn how to perform a blind test, and the other type of training (which I assume we are discussing) to hear artifacting.

ff123

Reply #70
Quote
You have a completely wrong idea. I don't know how you can even claim that I listen in some "wrong" way. You have no idea how I test music.
That's true! Did you think I was referring to you personally? Here's the question I have - at what point does sensitivity become oversensitivity?
Quote
I think the group listening tests show that there is pretty often a tendency. And that more sensitive listeners are more "correct" with the results than less sensitive ones, although less sensitive listeners, at least as a group, follow the tendency also.
I don't understand your use of the word "correct" here. Unless someone's running a crooked test, seeking a specific result in advance, how can there be a "correct" result?

Reply #71
Quote
Originally posted by layer3maniac
I don't understand your use of there word "correct" here. Unless someone's running a crooked test, seeking a specific result in advance. How can there be a "correct" result?
Correct meaning closer to, or clearly following (but maybe with higher differences), the tendency.
Basically I mean that more sensitive listeners could create the same kind of tendency with only a few participants, whereas a less sensitive listener group will need lots of listeners, but the tendency will still be the same kind.
Juha Laaksonheimo
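The claim above, that a few sensitive listeners can reveal the same tendency as a much larger group of less sensitive ones, can be sketched with a toy simulation; every number here (sensitivities, quality gap, group sizes, trial count) is invented purely for illustration:

```python
import random

def simulate(n_listeners, sensitivity, trials=2000, true_gap=0.5, seed=1):
    """Fraction of simulated group tests in which the truly better codec wins.
    Each listener's score = true quality + Gaussian noise; less sensitive
    listeners judge more noisily. All parameters are made up for illustration."""
    rng = random.Random(seed)
    noise = 1.0 / sensitivity          # lower sensitivity -> noisier judgments
    wins = 0
    for _ in range(trials):
        better = sum(true_gap + rng.gauss(0, noise) for _ in range(n_listeners))
        worse = sum(rng.gauss(0, noise) for _ in range(n_listeners))
        wins += better > worse
    return wins / trials

# A few sensitive listeners vs. many less sensitive ones:
few_sensitive = simulate(n_listeners=4, sensitivity=2.0)
many_insensitive = simulate(n_listeners=40, sensitivity=0.5)
# Both groups pick out the same tendency most of the time.
```

Under these toy assumptions, both group compositions detect the real quality gap with high probability; the insensitive group just needs many more listeners to get there.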

Reply #72
Quote
Well then why don't we tell people to use the cheap computer speakers they normally listen through while they put on some 128 kbit/s mp3's to fill up the background?  The effect would be the same as using untrained listeners -- less reliable and less sensitive results.
That's just silly. 
Quote
Yes, one has to guard against the possibility that small differences will be unrealistically magnified (solution: include a bad sample in the test to remind people what bad really means).
 
Quote
Yes, people don't normally listen for artifacting in short samples.  But it's the most sensitive way to rank codec quality.
I completely disagree. You are using a specific group of people, looking for specific problems, on specific samples, to make wide-ranging generalizations for all people on all samples. That's just not scientifically or statistically sound thinking.
Quote
And while we're on the topic of not reaching for the last bit of sensitivity that we can in listening tests, why then is it acceptable to take minuscule differences in VL results and say that this represents a real-life audible difference?
Did someone do that? Not me. Again, at what point does sensitivity become oversensitivity?
Quote
BTW, there are two types of training:  general training to learn how to perform a blind test, and the other type of training (which I assume we are discussing) to hear artifacting.
Can you train people to hear high frequencies? Can you train them to detect loss of warmth? Can you train them to detect loss of separation? I say no. This is my objection to doing tests using listeners who are sensitive to specific artifacts, on specific samples, and then making general assumptions based on the results. It doesn't give the whole story.

Reply #73
Quote
Originally posted by layer3maniac
Can you train them to detect loss of separation? I say no. This is my objection to doing tests using listeners who are sensitive to specific artifacts, on specific samples, and then making general assumptions based on the results. It doesn't give the whole story.
Why do you think that learning to hear specific artifacts excludes the other properties? This is not the case at all. People do not become worse listeners in some aspects over time because of listening training (except with ageing/hearing loss). People only become better listeners. I've definitely noticed this personally.
Juha Laaksonheimo

Reply #74
It's just my opinion. Anyway, it was nice discussing this civilly with you two WITHOUT it getting ugly.