Topic: MAD Challenge met

  • ff123
  • Developer (Donating)
MAD Challenge met
Garf attempted the MAD challenge with its current rules (5% significance required with no cherry picking), and met it, achieving 1.1% significance with 74 correct trials out of 122 total, using the samples provided on my page, which were made with MAD 0.13.0b.
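For anyone who wants to check the significance figure, here is a minimal sketch. It assumes the challenge scores ABX runs with a one-sided binomial test against pure guessing (p = 0.5); the function name `abx_p_value` is made up for this illustration and is not taken from the challenge page.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right out of `trials`
    by pure guessing (one-sided binomial test with p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Garf's result: 74 correct out of 122 trials.
p = abx_p_value(74, 122)
print(f"p = {p:.4f}  (challenge threshold: 0.05)")
```

Under that assumption the result comes out at roughly one percent, well below the 5% the rules require.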

http://ff123.net/madchallenge.html

Page has been updated with his results and comments.

So, there is an audible difference using MAD, at least if your name is Garf and you listen to that sample!

BTW, heroic effort, Garf.

ff123

  • rjamorim
MAD Challenge met
Reply #1
Quote
Originally posted by ff123
So, there is an audible difference using MAD


For better or for worse?
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org

  • Gabriel
  • Developer
MAD Challenge met
Reply #2
Can I try the challenge with a sample which was not ripped from a cd?

  • ff123
  • Developer (Donating)
MAD Challenge met
Reply #3
Quote
Originally posted by Gabriel
Can I try the challenge with a sample which was not ripped from a cd?


You can use any mp3 sample you like as long as it doesn't clip and as long as one version passes through the FhG decoder (or Lame decoder, if you like) at 16 bits and the other through MAD at 16 bits.

ff123

  • Garf
  • Developer (Donating)
MAD Challenge met
Reply #4
Quote
Originally posted by rjamorim

For better or for worse?


I wouldn't know. I didn't manage to ABX the FhG or MAD decodes vs the SSRC clip, only the FhG and MAD decodes against each other.

--
GCP

  • Garf
  • Developer (Donating)
MAD Challenge met
Reply #5
Quote
Originally posted by Gabriel
Can I try the challenge with a sample which was not ripped from a cd?


Makes me wonder what you're up to

A -90dB square wave might also do the trick. You should be able to turn the volume high enough to hear the dither.
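A rough sketch of why -90 dB is an interesting level, assuming 16-bit signed PCM with full scale taken as 32768 and the level measured relative to full scale. The function `square_wave`, the 441 Hz frequency and the 44100 Hz sample rate are illustrative choices, not anything specified in this thread.

```python
FULL_SCALE = 32768  # assumption: 16-bit signed PCM, 0 dB = full scale


def square_wave(level_db, freq_hz, sample_rate=44100, seconds=1.0):
    """Return unquantized square-wave samples, in units of 16-bit steps."""
    amplitude = FULL_SCALE * 10 ** (level_db / 20)
    half_period = sample_rate / (2 * freq_hz)  # samples per half cycle
    count = int(sample_rate * seconds)
    return [
        amplitude if int(i / half_period) % 2 == 0 else -amplitude
        for i in range(count)
    ]


samples = square_wave(-90.0, 441.0)
print(f"peak amplitude: {max(samples):.3f} LSB")
```

At -90 dB the peak is about one 16-bit step, so how the decoder rounds or dithers its output decides what survives of the signal.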

--
GCP

  • Gabriel
  • Developer
MAD Challenge met
Reply #6
Garf: I think you basically guessed what I want to do: a square wave BELOW the 16th bit. If it works the way I think it will, I'll get something with MAD and nothing with FhG.
I'll try to find the time to try this week...

  • KikeG
  • Developer
MAD Challenge met
Reply #7
Good job, Garf!

For some time now, and thanks to ff123's work, it is being shown on this amateur forum that dither at the 16-bit level is audible, although only with great difficulty.

By the way, I think relevant tests must be done with real-world signals at real-world listening levels, not with completely non-musical, artificially generated signals and extreme amplification. Real-world sound signals, no matter how weird they are.

I'd like to ask Garf: what did you hear, or how did the two samples "feel" different to you?


Edit: for Garf, I guess the test could have been easier with a better soundcard.

  • KikeG
  • Developer
MAD Challenge met
Reply #8
By the way, wouldn't the initial 9/10 score be valid as a test pass?

  • Garf
  • Developer (Donating)
MAD Challenge met
Reply #9
Quote
Originally posted by KikeG

I'd like to ask Garf: what did you hear, or how did the two samples "feel" different to you?


Edit: for Garf, I guess the test could have been easier with a better soundcard.


There really isn't any difference in casual listening. In my test setup, I could hear differences in the background noise structure when listening very closely. IIRC, the MAD decode was a bit 'sharper' but also a bit more noisy.

A real 24-bit audiophile quality card would have helped of course (got to get me one of those someday), although the SB 128 is pretty good. It's cheap and it's got no fancy features, but it's quite linear and the S/N ratio is good.

--
GCP

  • Garf
  • Developer (Donating)
MAD Challenge met
Reply #10
Quote
Originally posted by KikeG
By the way, wouldn't the initial 9/10 score be valid as a test pass?


If I had stopped at that moment, yes! But the 16/32 ruined the first result.

ff123's test requires you to take all ABX trials into account.
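To put numbers on that, here is a small sketch, again assuming a one-sided binomial test against guessing. The helper `abx_p_value` is a made-up name for this illustration, and pooling the two runs into 25/42 is simply the arithmetic of counting every trial, not a figure quoted from the challenge page.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of at least `correct` hits in `trials` by guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


first = abx_p_value(9, 10)             # the initial run on its own
second = abx_p_value(16, 32)           # the follow-up run on its own
pooled = abx_p_value(9 + 16, 10 + 32)  # all trials counted together

for label, p in (("9/10", first), ("16/32", second), ("25/42 pooled", pooled)):
    verdict = "passes" if p < 0.05 else "does not pass"
    print(f"{label:>13}: p = {p:.3f} -> {verdict} at 5%")
```

The first run clears 5% by itself, but once the 32 later trials are counted the pooled score no longer does, which is why the test had to continue.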

--
GCP

  • KikeG
  • Developer
MAD Challenge met
Reply #11
Quote
Originally posted by Garf

If I had stopped at that moment, yes! But the 16/32 ruined the first result.

ff123's test requires you to take all ABX trials into account.
GCP


There is logic in that from the probability point of view. But there's also something illogical in it from the common-sense point of view.

If you had stopped after the 9/10 test, you could have said you met the challenge. But if you tried again the next day and got worse results, would you have to take back what you said the day before? Would that mean your bad results now invalidate the good results of an earlier test that was considered valid?

Also, it is very easy to get good results when you are "inspired". But if in another round you are somewhat less "inspired", or tired, or affected by anything else that decreases your sensitivity, you get bad results. Shouldn't the first test count as valid too? Not an easy question, I'd say, because probability and temporary insensitivity get mixed together and can't be separated.

Well, thinking further, I guess probability rules, so a higher number of trials should be performed until the results become stable.

So... ABX testing on subtle differences is more difficult than one might think at first, and that annoys me a little bit.

Any thoughts about this?

  • Garf
  • Developer (Donating)
MAD Challenge met
Reply #12
Quote
Originally posted by KikeG

There is logic in that from the probability point of view. But there's also something illogical in it from the common-sense point of view.


Those often don't mix.

Quote
If you had stopped after the 9/10 test, you could have said you met the challenge. But if you tried again the next day and got worse results, would you have to take back what you said the day before? Would that mean your bad results now invalidate the good results of an earlier test?

Also, it is very easy to get good results when you are "inspired". But if in another round you are somewhat less "inspired", or tired, or affected by anything else that decreases your sensitivity, you get bad results. Shouldn't the first test count as valid too? Not an easy question, I'd say, because probability and temporary insensitivity get mixed together and can't be separated.

Well, thinking further, I guess probability rules, so a higher number of trials should be performed until the results become stable.

So... ABX testing on subtle differences is more difficult than one might think at first, and that annoys me a little bit.

Any thoughts about this?


The MAD Challenge is intentionally biased towards not falsely concluding someone hears a difference, at the expense of possibly missing a real difference.

Take the following: I do one ABX test of 10 trials each day, and after three weeks I score a 9/10.

What happened? Did I get lucky or did I have a good-ear-day?

Another example. I set out to do 40 trials. I start with 7/8 but end up with 20/40. Did I pass or not?

If you allowed these results as 'passes', you'd have a problem...
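Both examples can be checked with a few lines. This is a sketch under stated assumptions: a one-sided binomial test against guessing, "three weeks" read as 21 independent daily sessions, and a "lucky day" counted as any session scoring 9/10 or better. The name `abx_p_value` is hypothetical.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of at least `correct` hits in `trials` by guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Example 1: one 10-trial session per day for 21 days, all pure guessing.
p_single = abx_p_value(9, 10)
p_any_lucky_day = 1 - (1 - p_single) ** 21
print(f"one session at 9/10 or better: {p_single:.3f}")
print(f"at least one such session in 21 days: {p_any_lucky_day:.3f}")

# Example 2: a strong start (7/8) that ends at 20/40 overall.
print(f"7/8 alone: {abx_p_value(7, 8):.3f}")
print(f"20/40 overall: {abx_p_value(20, 40):.3f}")
```

A pure guesser has roughly a one-in-five chance of producing at least one 9/10 day over three weeks, and 20/40 is exactly what guessing is expected to give, so accepting either as a pass would let luck through.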

I think maximal scrutiny is a good idea. If the circumstances aren't good (e.g. you're tired), don't test! Only test when your sensitivity is at its maximum. You should always be doing that when ABX testing anyway...

--
GCP