


A blind ABX procedure requiring only iTunes, and my findings

Motivation: I use a Mac, so I could not use the ABX plugin for foobar2000, which seems to be the gold standard (or at least the most common tool) for performing ABX listening tests. I did not want to download a piece of software that was built just to do ABX tests (e.g. ABXer), because I was afraid of viruses or crashing my system, and because they all had bad reviews.

Method: In iTunes, create 2 new playlists, one playlist named "AB" and the other named "X". Choose two sound files to test, such as "Help" by The Beatles encoded at 128 kbps and "Help" encoded at 320 kbps. Put the 2 files in playlist "AB", and also put these same 2 files in playlist "X". "AB" and "X" are now identical.

Now, right click on "X" and select "Open in new window". Drag the window to the right until it spills off the side of your monitor, so that the "Bit rate" column is no longer visible. Open "AB" in the main iTunes window, with the bit rate of each file visible on the right.

Training phase: Listen to the 320 and 128 kbps versions in the "AB" playlist, until you think you have heard any differences that might exist.

Test phase: In the "X" playlist/window, click the "Shuffle" icon in the lower left. There is a 50% chance that this will reverse the order of the two tracks in "X", and a 50% chance that it will not. Because the "Bit rate" column is not visible on screen, you cannot tell whether they were switched. Now listen to JUST THE TOP track from "X". Does it sound like 128 or 320? You can go back to "AB" and listen to the open-label versions, and then go back and listen to the top track from "X". Never listen to the bottom track from "X". When satisfied, write down whether you think the top track from "X" was 320 or 128, and then drag the window to the left to reveal its true bit rate.
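The test phase above can be sketched as a small simulation. This is only an illustration, not part of the original procedure: the names `run_trial`, `run_session` and `guess_fn` are made up here, and it assumes that shuffling a two-track playlist really does give each order with equal probability, which is what the method relies on.

```python
import random

BITRATES = ("128", "320")  # the two versions sitting in playlist "X"


def run_trial(rng, guess_fn):
    """One trial: shuffle "X", guess the top track, then reveal."""
    playlist_x = list(BITRATES)
    rng.shuffle(playlist_x)      # assumed: both orders are equally likely
    top_track = playlist_x[0]    # only the top track is ever auditioned
    guess = guess_fn()           # the answer is written down before the reveal
    return guess == top_track    # True if the guess matches the hidden bit rate


def run_session(n_trials, guess_fn, seed=None):
    """Run n_trials independent trials and return the number of correct answers."""
    rng = random.Random(seed)
    return sum(run_trial(rng, guess_fn) for _ in range(n_trials))


# A listener who hears no difference and always answers "128"
# should be right about half the time over many trials.
score = run_session(10_000, lambda: "128", seed=1)
print(f"{score} correct out of 10000")
```

A real listener would replace `guess_fn` with an answer based on what was heard; the point of the sketch is only that the hidden identity of the top track is decided by the shuffle, not by the listener.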

Note that the subject is truly blind to the identity of "X", so this is a properly blinded design. It could even be called "double blind", because there is no proctor.

Caveat: This is not a classic ABX test because the bit rates of A and B are displayed. We could discuss at length the signal detection theory and psychophysical implications of revealing this information, but I don't want to do that. If you want to more closely emulate classic ABX methodology, you could drag the "AB" window to the right as well, and shuffle it as well. This makes testing slightly more laborious to undertake, because two things must be shuffled with each trial, and two windows must be dragged.

Statistics: I performed 5 trials with each pair of stimuli. If I correctly identified X in all 5 trials, then I rejected the null hypothesis and concluded that A and B were significantly different. Using a one-tailed binomial test, 5/5 corresponds to a p-value of (1/2)^5 = 1/32, or about 3%, so there is only a 3% chance that I would get 5/5 if I were guessing randomly. Note that 4 trials are not enough to reject the null hypothesis at an alpha of 5% (which is traditional in science): even a perfect 4/4 has a p-value of 1/16, or about 6%. If I made even 1 mistake, I failed to reject the null hypothesis. Thus, 4/5 was considered "no significant audible difference".
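The arithmetic behind these thresholds can be checked with a few lines of Python. This is a minimal sketch of a one-tailed binomial calculation, assuming each trial is an independent 50/50 guess under the null hypothesis; the function name `abx_p_value` is made up for illustration.

```python
from math import comb


def abx_p_value(correct, trials, p_chance=0.5):
    """Probability of getting at least `correct` hits in `trials` by pure guessing."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


print(abx_p_value(5, 5))  # 0.03125 -> below 5%, reject the null hypothesis
print(abx_p_value(4, 4))  # 0.0625  -> a perfect score on 4 trials is still above 5%
print(abx_p_value(4, 5))  # 0.1875  -> one mistake in 5 trials is far above 5%
```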


Findings:

1. Pandora sucks really bad.
Pandora streams music at 64 kbps AAC. I only need to listen for 1 second in order to reliably discriminate Pandora music from 128 kbps MP3.

2. 96 kbps LAME MP3 (unknown presets) sucks pretty bad.
My friend ripped a bunch of CDs with AudioGrabber using LAME at 96 kbps, and shared the MP3s with me. I asked to borrow the CDs so that I could rip them at a higher bit rate, and he laughed at me. The results speak for themselves: In each of 6 comparisons, I was able to discriminate 96 kbps MP3s from V0 MP3s. That is 30 trials altogether - 5 trials for each stimulus pair, and 6 stimulus pairs (6 songs). That is 30/30 correct identifications.

3. 128 kbps MP3 (unknown encoder, unknown presets) is NOT transparent for me, but I can't decide whether it actually bothers me.
I have a lot of 128 kbps MP3s, which were probably ripped with a number of different setups. I frequently find myself agonizing over the decision, "Should I rip this again, which takes time, and then throw out the old lower bit rate copies, and also lose the play count data? That is a hassle. Can I just live with the 128 kbps version?" It's official: For roughly 20 songs from various genres, I was able to reliably discriminate 128 kbps MP3s from V0 MP3s. In fact, for 4 songs I was able to discriminate (5/5 correct trials) even without listening to the "A" and "B" benchmarks. I would start by listening to "X", and I could just tell "that sounds crappy, that sounds like 128 kbps, I don't even need to compare it to something."

4. 192 kbps MP3 may or may not be transparent for me. For some songs, I can reliably discriminate 192 kbps MP3 from lossless ALAC (5/5 correct trials). For other songs, I cannot.

Conclusion: Damn, this is a tough call. Should I re-rip my 128 kbps MP3s? I think I should. Should I re-rip my 160 and 192 kbps MP3s? I don't know. I think I may re-rip JUST my favorites.


Reply #1
What you have posted is not really wrong, but it is not really meaningful either.

AAC-64: If that is LC-AAC, then most (if not all) encoders use a sampling rate of 32 kHz, if not 22 kHz. The encoded bandwidth is somewhere between 11 and 16 kHz. The lower it is, the easier it is to spot.
If it is HE-AAC: I have heard that iTunes now supports HE-AAC v1, which is what I would expect a file at that bitrate to be. HE-AAC does not aim for transparency.
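The 11 to 16 kHz range follows from the sampling rates mentioned: a signal sampled at a given rate cannot carry frequencies above half that rate (the Nyquist limit). A minimal sketch, assuming "22 kHz" means the usual 22.05 kHz rate; the helper name is made up, and a real encoder may apply a lowpass filter below this limit.

```python
def max_bandwidth_khz(sample_rate_khz):
    """Upper bound on audio bandwidth for a given sampling rate (Nyquist limit)."""
    return sample_rate_khz / 2


for rate in (22.05, 32.0, 44.1):
    limit = max_bandwidth_khz(rate)
    print(f"{rate} kHz sampling -> at most {limit} kHz of audio bandwidth")
```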

MP3-96 CBR: I don't know which version of LAME your friend is using. AudioGrabber is software that is more than 10 years old, and I have no idea which codec he is using with it. On their website, the latest news item is precisely a plugin for using the LAME encoder, and the link currently points to a download of version 3.98.4.
That said, CBR at such a bitrate is generally going to sound relatively bad if transparency is the goal.

MP3-128: If you are again talking about CBR, then the usual limitations of MP3 at this bitrate apply (44.1 kHz sampling but a bandwidth of as much as 16 kHz, plus some ringing, flanging and pre-echo depending on the content). MP3 at 128 is usually good enough, but given the choice, either a higher bitrate or a VBR/ABR preset aiming at a similar bitrate (like V5) will be better. (This is the currently recommended setting for portables when there is a higher-quality copy at home.)

MP3-160: MP3 files at 160 kbps were usually made with the Fraunhofer encoder. It used to switch from joint stereo to simple stereo at that bitrate, which makes 160 kbps files similar in quality to 128 kbps files that use joint stereo.
As for 192 kbps files, quality varies widely. It is expected to be good enough, but once again, CBR tends to be a bad idea if transparency is the goal. VBR/ABR presets aimed at around that bitrate (V2, V3) will be better.

The iTunes MP3 encoder did have some problems in earlier versions. People here at Hydrogenaudio detected a problem that showed up only on multithreaded computers (namely, all the Intel-based Macs). That problem has since been fixed.

So if you want to stick with MP3, you can find quite detailed information on these boards and in the wiki. If you don't mind moving to AAC/MP4, then the iTunes/QuickTime encoder is a good recommendation, using bitrates from 128 to 256 kbps, preferably in VBR mode.


Reply #4
How is ABXer for comparing the same music at 16/44 and 24/192?

It seems fine; it plays back every file I've given it. Since it doesn't do continuous switching, there's no need to worry about switching artifacts. I have no idea how it handles the interface to the audio hardware. Presumably it resamples material at unsupported sample rates. I'm pretty sure the built-in audio on my MacBook Pro doesn't go above 96 kHz sampling. I've come across something online that said that old versions (i.e. 6 and maybe 7) of iTunes always interfaced with the hardware at the sample rate set by the "Audio MIDI Setup" utility; if ABXer uses the relevant API, then it probably does just that. (I've got it set at 96.) I think iTunes 9 is different. I've heard a faint click in iTunes when going between 44.1 and 96 files; I assumed that meant the DAC was switching from 44.1-based clocking to 48-based clocking.

The OP's method looks like it should work, although it isn't really convenient for the user.  And in a minor matter, those of us reading the report have to trust that the user was conscientious about hiding the appropriate information from himself at the relevant time and that he kept good records, whereas with the standard programs we just have to trust that he wouldn't doctor the automatically generated output logs.



Reply #5
Should I rip this again, which takes time, and then throw out the old lower bit rate copies, and also lose the play count data?

In the past I've seen scripts, tools & other hacks linked which can transfer such stats to the new rips. Whether anyone knows a solution that works for your player on your OS might be worth asking in a new thread.
