The flatter the frequency response, the better artifacts are hidden.
This seems biased to me, since boomy bass can mask a badly reproduced cymbal, for instance, and the same goes the other way round. A flatter frequency response may not emphasize one particular artifact, but you're more likely to spot odd reproduction across the whole spectrum.
Let me disagree here. In my experience, reference-class headphones helped me a lot to spot artifacts.
I can only agree wholeheartedly. I ABX consistently with speakers, by the way, not headphones.
I used to have a crappy JBL Creature 2.1 setup, and after some ABX testing I switched from WMA Pro VBR 75 (VBR 90 produced files that were too big for me at the time) to Nero 1.3.3 q.50, and to Nero 1.5.4 q.50 when it came out.
Then I got the Razer Mako (I know, but I got it at half price and I'm still deeply convinced it is a very nice piece of audio gear). A few tracks from my own library sounded so bad on casual listening that I started some ABX testing against the APEs on my external HDD (unfortunately I don't have enough space on my laptop at the moment to keep everything in lossless), and I was able to distinguish some of the aforementioned tracks from the lossless up to q.65. Obviously, most files were perfectly fine even at q.50 when I gave them an ABX try, but some caught my attention while I wasn't doing ABX at all!
I eventually got an entry-level studio system worth 1000€, and exactly the same thing occurred. I also tried again to ABX my "killer" samples and succeeded up to Nero 1.5.4 q.85. I could not ABX any of them at q.90 (I was tired too, constantly taking breaks, since nothing was obvious at that bitrate anyway), but I thought I had done enough with Nero to give the well-regarded QT encoder a try (I used qaac with QT 7.7), and on different samples I was able to ABX TVBR 127.
In a nutshell, it does make a significant difference. A friend of mine can confirm this: he switched from entry-level Sony headphones to the Audio-Technica ATH-A900, and even after having spotted the artifacts on the AT and ABXed them, he could not ABX any of the files with the Sony.
The system you're using matters for mastering issues too. Apart from Death Magnetic, I had never been bothered by clipping on the Mako. With the Tannoy studio monitors I heard clipping even on Load and immediately caught myself: damn, this album is from 1996, it can't be that clipped. Then I opened the APE in Audacity and cried...
Here are a few logs so that I'm not hanged, drawn and quartered:
AAC QT 7.7 TVBR 127 (363kbps)
foo_abx 1.3.4 report
foobar2000 v1.1.1
2011/08/30 16:59:49
File A: D:\Users\Recup\To Listen To\TEMP\02 Ground.flac
File B: D:\Users\Recup\To Listen To\TEMP\02 Ground.m4a
16:59:49 : Test started.
17:00:29 : 01/01 50.0%
17:00:35 : 02/02 25.0%
17:00:41 : 03/03 12.5%
17:00:51 : 04/04 6.3%
17:03:07 : 05/05 3.1%
17:03:17 : 06/06 1.6%
17:04:30 : 07/07 0.8%
17:04:43 : 08/08 0.4%
17:04:57 : 09/09 0.2%
17:05:49 : 10/10 0.1%
17:08:38 : 11/11 0.0%
17:08:54 : 12/12 0.0%
17:09:07 : 13/13 0.0%
17:09:09 : Test finished.
----------
Total: 13/13 (0.0%)
AAC LC .85 (342kbps)
foo_abx 1.3.4 report
foobar2000 v1.1.1
2011/09/05 22:22:23
File A: D:\Users\Recup\To Listen To\TEMP\mmasq.m4a
File B: D:\Users\Recup\To Listen To\TEMP\mmasq.wav
22:22:23 : Test started.
22:22:58 : 01/01 50.0%
22:23:14 : 02/02 25.0%
22:24:11 : 03/03 12.5%
22:26:45 : 04/04 6.3%
22:28:20 : 05/05 3.1%
22:29:32 : 06/06 1.6%
22:30:59 : 07/07 0.8%
22:36:31 : 08/08 0.4%
22:36:59 : 09/09 0.2%
22:38:10 : 10/10 0.1%
22:38:56 : 11/11 0.0%
22:41:09 : 12/12 0.0%
22:42:05 : 13/13 0.0%
22:42:18 : Test finished.
----------
Total: 13/13 (0.0%)
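For anyone wondering how to read the percentage column in those logs: the numbers are consistent with it being the probability of scoring at least that well by pure guessing (50% after one correct trial, 25% after two, and so on). Here is a minimal Python sketch of that calculation, assuming a one-sided binomial test with a 50% chance per trial. The function name `abx_guess_probability` is my own for illustration and is not part of foo_abx.

```python
from math import comb


def abx_guess_probability(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    when every answer is a coin flip (one-sided binomial, p = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more right answers,
    # then divide by the total number of equally likely outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Both logs above end at 13 correct out of 13 trials.
    p = abx_guess_probability(13, 13)
    print(f"13/13 by guessing: {p:.4%}")
```

For 13/13 this gives 1 in 8192, roughly 0.012%, which the log displays rounded to one decimal as 0.0%. The same function handles imperfect scores too, e.g. `abx_guess_probability(12, 16)`.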