A blind ABX procedure requiring only iTunes, and my findings
2010-07-06 07:40:36

Motivation: I use a Mac, so I could not use the ABX plugin for foobar2000, which seems to be the gold standard (or at least the most common tool) for performing ABX listening tests. I did not want to download a piece of software built just to do ABX tests (e.g. ABXer), because I was afraid of viruses or of crashing my system, and because the ones I found all had bad reviews.

Method: In iTunes, create 2 new playlists, one named "AB" and the other named "X". Choose two sound files to test, such as "Help" by The Beatles encoded at 128 kbps and "Help" encoded at 320 kbps. Put the 2 files in playlist "AB", and also put the same 2 files in playlist "X". "AB" and "X" are now identical.

Now, right click on "X" and select "Open in new window". Drag the window to the right until it spills off the side of your monitor, so that the "Bit rate" column is no longer visible. Open "AB" in the main iTunes window, with the bit rate of each file visible on the right.

Training phase: Listen to the 320 and 128 kbps versions in the "AB" playlist until you think you have heard any differences that might exist.

Test phase: In the "X" playlist/window, click the "Shuffle" icon in the lower left. There is a 50% chance that this will reverse the order of the two tracks in "X", and a 50% chance that it will not. Because the "Bit rate" column is not visible on screen, you cannot tell whether they were switched. Now listen to JUST THE TOP track from "X". Does it sound like 128 or 320? You can go back to "AB" and listen to the open-label versions, and then go back and listen to the top track from "X". Never listen to the bottom track from "X". When satisfied, write down whether you think the top track from "X" was 320 or 128, and then drag the window to the left to reveal its true bit rate.

Note that the subject is truly blind to the identity of "X", so this is a properly blinded design. It could even be called "double blind", because there is no proctor.

Caveat: This is not a classic ABX test, because the bit rates of A and B are displayed. We could discuss at length the signal detection theory and psychophysical implications of revealing this information, but I don't want to do that. If you want to emulate classic ABX methodology more closely, you could drag the "AB" window to the right and shuffle it as well. This makes testing slightly more laborious, because two playlists must be shuffled on each trial, and two windows must be dragged.

Statistics: I performed 5 trials with each pair of stimuli. If I correctly identified X in all 5 trials, I rejected the null hypothesis and concluded that A and B were significantly different. Using a one-tailed binomial distribution, 5/5 corresponds to a p-value of about 3% (1/32), so there is only about a 3% chance that I would get 5/5 if I were guessing randomly. Note that 4 trials are not enough to reject the null hypothesis with an alpha of 5% (which is traditional in science), because even 4/4 corresponds to a p-value of about 6% (1/16). If I made even 1 mistake, I failed to reject the null hypothesis. Thus, 4/5 was considered "no significant audible difference".

Results:

1. Pandora sucks really bad.
Pandora streams music at 64 kbps AAC. I only need to listen for 1 second in order to reliably discriminate Pandora music from 128 kbps MP3.

2. 96 kbps LAME MP3 (unknown presets) sucks pretty bad.
My friend ripped a bunch of CDs with AudioGrabber using LAME at 96 kbps, and shared the MP3s with me. I asked to borrow the CDs so that I could rip them at a higher bit rate, and he laughed at me. The results speak for themselves: In each of 6 comparisons, I was able to discriminate 96 kbps MP3s from V0 MP3s. That is 30 trials altogether - 5 trials for each stimulus pair, and 6 stimulus pairs (6 songs). That is 30/30 correct identifications.

3. 128 kbps MP3 (unknown encoder, unknown presets) is NOT transparent for me, but I can't decide whether it actually bothers me.
I have a lot of 128 kbps MP3s, which were probably ripped with a number of different setups. I frequently find myself agonizing over the decision, "Should I rip this again, which takes time, and then throw out the old lower bit rate copies, and also lose the play count data? That is a hassle. Can I just live with the 128 kbps version?" It's official: For roughly 20 songs from various genres, I was able to reliably discriminate 128 kbps MP3s from V0 MP3s. In fact, for 4 songs I was able to discriminate (5/5 correct trials) even without listening to the "A" and "B" benchmarks. I would start by listening to "X", and I could just tell "that sounds crappy, that sounds like 128 kbps, I don't even need to compare it to something."

4. 192 kbps MP3 may or may not be transparent for me.
For some songs, I can reliably discriminate 192 kbps MP3 from lossless ALAC (5/5 correct trials). For other songs, I cannot.

Conclusion: Damn, this is a tough call. Should I re-rip my 128 kbps MP3s? I think I should. Should I re-rip my 160 and 192 kbps MP3s? I don't know. I think I may re-rip JUST my favorites.
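
For anyone who wants to check the arithmetic in the Statistics section, here is a minimal Python sketch of the one-tailed binomial p-value. It is not part of the iTunes procedure itself. The function name `p_value_at_least` is made up for illustration, and the sketch assumes every trial is an independent 50/50 guess under the null hypothesis.

```python
from math import comb


def p_value_at_least(correct, trials, p_guess=0.5):
    """One-tailed binomial p-value: the probability of getting at least
    `correct` answers right out of `trials` by pure guessing.

    Assumes independent trials, each with success probability `p_guess`
    (0.5 here, because X is either the 128 or the 320 kbps file).
    """
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Score/trial combinations discussed in the Statistics section.
for correct, trials in [(5, 5), (4, 5), (4, 4)]:
    p = p_value_at_least(correct, trials)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{correct}/{trials}: p = {p:.4f} ({verdict} at alpha = 0.05)")
```

This gives p = 0.03125 for 5/5, p = 0.0625 for 4/4, and p = 0.1875 for 4/5, which matches the reasoning above: 5/5 clears the 5% threshold, while 4/4 and 4/5 do not.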