
Top reasons for screwing up a codec ABX test?

Audiophiles the world over claim that differences stem from USB cables, green rocks placed at random in the room, etc., etc. This is nothing new. Our last resort used to be telling them that they wouldn't be able to tell the files apart in an ABX test. In the past this would lead to denial of the utility of ABX tests, but recently I've seen a disturbing tendency for audiophiles to simply produce impossible ABX results, leaving us gibbering "but... but that's impossible! Proper ABX tests have shown these differences to be inaudible! You must have done something wrong! :o " Problem is, WHAT? What are they doing wrong?

I won't even go into the mess that is equipment ABX tests. Let's start with codec ABX tests conducted using the foobar2000 ABX plugin. Just today I read someone claim scoring 87% across 30 trials of 320kbps MP3 vs FLAC encoded from the same source (of unspecified music). Now I wouldn't be so incredulous if this were a known golden ear listening to fatboy or something, but this is just some random person listening to presumably random music. So what do you reckon he screwed up?
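For context on how unlikely that score is by chance: 87% over 30 trials is roughly 26 correct answers, and the one-sided binomial p-value under pure guessing can be sketched with nothing but the standard library (a quick back-of-the-envelope calculation, not part of any ABX tool):

```python
from math import comb

def binom_p_value(successes: int, trials: int, p: float = 0.5) -> float:
    """One-sided p-value: probability of getting at least `successes`
    correct out of `trials` if the listener is purely guessing."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

# 87% of 30 trials is roughly 26 correct answers
print(f"{binom_p_value(26, 30):.2e}")  # ≈ 3e-05, essentially impossible by luck alone
```

So if the result is real, it's not a lucky streak; either the listener genuinely hears something, or something in the setup (or the report) is off.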

1. Could the use of an encoder that's not a recent version of LAME account for it?
2. Would the versions have to be explicitly volume-matched using ReplayGain, or should they be close enough by default?
3. ? ? ?

HELP

 

Re: Top reasons for screwing up a codec ABX test?

Reply #1
The best way to verify potential test problems would be to see the tested files and the environment settings.

MP3 playback volume should match the source file without doing anything. But MP3 in foobar2000 is decoded to floating point, which can, and with typical music will, result in peaks that exceed digital full scale. If these files are played back at full volume using DirectSound output, the Windows mixer will lower the amplitude automatically to prevent clipping. This can result in very noticeable volume differences.

The mixer volume adjustment problem can be avoided by using WASAPI output, by lowering the playback level enough that peaks won't clip, or by forcefully clipping the MP3. The last method can be achieved by converting the MP3 back to a 16- or 24-bit WAV, or by using, for example, the Hard Clip DSP component. ReplayGain could work, but since the frequency response is different it will alter the two files slightly differently, which can itself result in a detectable difference.
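That last method amounts to clamping the decoded float samples back into range before requantizing, which is effectively what converting to 16-bit WAV does. A minimal sketch (my own illustration, not foobar2000's actual DSP code):

```python
def hard_clip_to_16bit(samples):
    """Clamp float samples to [-1.0, 1.0] and requantize to 16-bit ints,
    as converting the decoded MP3 back to a 16-bit WAV would do."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # forcibly clip overshooting peaks
        out.append(int(round(s * 32767)))   # scale into the 16-bit range
    return out

# decoded MP3 peaks can exceed digital full scale (1.0)
print(hard_clip_to_16bit([0.5, 1.07, -1.2]))  # → [16384, 32767, -32767]
```

Once both files are guaranteed to stay within full scale, the Windows mixer has no reason to touch their levels, and the volume cue disappears.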

And you can't exclude the possibility that the person is simply lying about their hearing ability and cheating.


Re: Top reasons for screwing up a codec ABX test?

Reply #3
Extraordinary claims require extraordinary evidence. In this case, if someone claims they ABX'ed 96kbps MP3 vs FLAC, that's not so unbelievable. At the other end of the spectrum, if someone claims they ABX'ed FLAC vs WAV, then a lot more scrutiny is required, and sure as hell just posting results is not nearly enough. Also, obviously, the motivations and trustworthiness of the subject have to be considered.

Excluding dishonesty, there are ways in which people can fool themselves even with good intentions. One common way is to arbitrarily stop the trials when they think it's "enough". One must choose the number of trials before beginning the test. The reason for this can be hard for many people to wrap their heads around (what's the difference between deciding to stop at #10 from the beginning and deciding to stop at #10 after feeling confident it was enough?), and even bona fide scientists fall into this trap, which is one popular form of p-hacking.
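A toy simulation makes the trap concrete: give a purely guessing listener permission to stop whenever the running score looks significant, and their false-positive rate balloons well past the nominal 5%. (This is my own illustration with made-up parameters, not anyone's actual test data.)

```python
import random
from math import comb

def p_value(correct, trials):
    """One-sided binomial p-value for `correct` out of `trials` at p=0.5."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

def guessing_listener(max_trials=30, stop_early=True, alpha=0.05, rng=random):
    """Simulate a purely guessing listener. With stop_early=True, they quit
    the moment the running score dips under alpha ('felt confident')."""
    correct = 0
    for t in range(1, max_trials + 1):
        correct += rng.random() < 0.5
        if stop_early and p_value(correct, t) < alpha:
            return True          # "passed" by stopping on a lucky streak
    return p_value(correct, max_trials) < alpha

rng = random.Random(42)
runs = 2000
early = sum(guessing_listener(stop_early=True, rng=rng) for _ in range(runs)) / runs
fixed = sum(guessing_listener(stop_early=False, rng=rng) for _ in range(runs)) / runs
print(f"false-positive rate, optional stopping: {early:.0%}")  # inflated, typically well above 5%
print(f"false-positive rate, fixed 30 trials:   {fixed:.0%}")  # near the nominal 5%
```

The same coin-flipping listener "passes" several times more often just by being allowed to choose when to stop, which is exactly why the trial count has to be fixed up front.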

Re: Top reasons for screwing up a codec ABX test?

Reply #4
Quote
Audiophiles the world over claim that differences stem from USB cables, green rocks placed at random in the room, etc., etc. This is nothing new. Our last resort used to be telling them that they wouldn't be able to tell the files apart in an ABX test. In the past this would lead to denial of the utility of ABX tests, but recently I've seen a disturbing tendency for audiophiles to simply produce impossible ABX results, leaving us gibbering "but... but that's impossible! Proper ABX tests have shown these differences to be inaudible! You must have done something wrong! :o " Problem is, WHAT? What are they doing wrong?


When an ABX test is down to file versus file, the best check is for others to duplicate the controversial results with those files. When those files are downloadable from a publicly accessible source, it is then the responsibility of the critics to try them out. If they don't try them out, then they are tacitly accepting the test results they are making such a big show of questioning. IOW, they are either hypocrites or should be able to reasonably explain why they are not trying to duplicate the controversial results.

This is how science works.

Re: Top reasons for screwing up a codec ABX test?

Reply #5
Quote
Let's start with codec ABX tests conducted using the foobar2000 ABX plugin. Just today I read someone claim scoring 87% across 30 trials of 320kbps MP3 vs FLAC encoded from the same source (of unspecified music). Now I wouldn't be so incredulous if this were a known golden ear listening to fatboy or something, but this is just some random person listening to presumably random music. So what do you reckon he screwed up?
It's entirely possible that he/she is really hearing an MP3 compression artifact.

320kbps isn't always transparent, and it doesn't necessarily take golden ears to hear the difference. There are compression artifacts that don't improve over a certain bitrate. It does help if you have some "training" so you know what a compression artifact sounds like; it usually takes some careful listening, and it depends on the program material. ...Maybe this particular random song has audible artifacts... If they are claiming ~87% on all program material, I'm more suspicious.


Now, if someone says 320kbps MP3 "sounds terrible", or claims they can always tell an MP3 (without comparing to the original), they are most likely fooling themselves. And in that case there shouldn't be an 87%... they should be getting it right 100% of the time.

Almost every time I thought I heard a compression artifact when listening casually (LAME V0), it turned out that the original CD had the same "defect".

Re: Top reasons for screwing up a codec ABX test?

Reply #6
Almost every time I thought I heard a compression artifact when listening casually (LAME V0), it turned out that the original CD had the same "defect".

I've noticed that too with plenty of music. In particular, I often find cymbals and other similar high-frequency sounds getting garbled in ways similar to mid-to-low-bitrate encodings and/or old MP3 encoders. I hear that everywhere now, but most of the time the source material sounds the same to me. As a person who doesn't get a chance to listen to much live music at all, and is therefore not all that familiar with how these ought to sound, it makes me wonder whether it's me having unreasonable expectations or whether there's something fishy going on in the mastering process.

Re: Top reasons for screwing up a codec ABX test?

Reply #7
When I was a bit of a music student, I bought a MiniDisc player/recorder. As it was a move up from cassette tape, issues of transparency, claims and counter-claims, were not even in my head at that time. It was just wonderful to be able to record people making music and take home something that was, at least to me, hiss/noise-free and really quite authentic-sounding.

But there was one thing that the Sony compression system found hard to handle: the Indian-music drone that comes from an electronic box which simulates the sound of the plucked four-string tambura. It actually messed up that sound. If blind checking had been on the agenda back then, this would have been a simple give-away.

Of course, there is plenty of lossy-compressed Indian music around. This may have been specific to the implementation of the algorithm on that portable box. But it was a glaring hole in what otherwise, to me back then, was like sudden perfection in home recording.
The most important audio cables are the ones in the brain

Re: Top reasons for screwing up a codec ABX test?

Reply #8
I did an MP3 320 kbps vs FLAC test and got 12 out of 12 correct, I believe (I have the log, but not at hand right now). I know why I could tell them apart, though: both were unofficial downloads, and the MP3 simply didn't sound quite as good (more grungy, I suppose you could say). Then when I took the FLAC file and made an MP3 myself, I couldn't tell them apart. It's also possible that the MP3 was a lower bitrate that had been transcoded to 320 kbps.
But there are various people around the world who can tell MP3 and FLAC/WAV apart, although they are few and far between.
There is also the possibility that the person changed the file he/she used (e.g. the volume level) and then changed it back before asking others to test it. It should be possible to cheat in this way, right? But if the files show a modified date later than when the ABX test took place (foobar2000 logs the time), that's a give-away.
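That modified-date check can be sketched in a few lines (filenames and the log timestamp here are hypothetical; in practice you'd read the timestamp from the ABX log):

```python
import os
from datetime import datetime, timezone

def modified_after(path: str, test_time: datetime) -> bool:
    """True if the file was modified after the ABX test supposedly ran,
    which would suggest it isn't the exact file that was tested."""
    mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return mtime > test_time
```

It's only a one-way check, of course: a later modified date is suspicious, but an earlier one proves nothing, since timestamps can be reset.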
"What is asserted without evidence can be dismissed without evidence"
- Christopher Hitchens
"It is always more difficult to fight against faith than against knowledge"
- Sam Harris

Re: Top reasons for screwing up a codec ABX test?

Reply #9
Cheating on an internet-based test is easy, but it is not a good attitude to assume that a successful ABX test is cheating, since that will just discourage people from even thinking about doing an ABX test. This matters especially on this forum: people will stop respecting TOS8 if we too easily dismiss a successful ABX result.

On the other hand, identifying potential technical issues in the ABX procedure is more than welcome. It would be very useful, for example, when such tests are performed to identify an issue in a complex studio setup, or at an audiophile meeting/demonstration where all participants are in the same venue.