
Comparing two different masterings of an album. Am I doing it right?

I think by now everyone is familiar with the loudness war and the dynamic compression of music.  Though a lot of people criticize it, I do find the DR Database at dr.loudness-wars.info to be a good tool to get information about how much an album has been squashed in newer releases.

I'm a big fan of the Moody Blues, and back in the late 80s I went out and bought the Mobile Fidelity Sound Lab (MFSL) release of "Days of Future Passed."  According to the DR Database, this album has an average DR value of 12.

In 2006, the Moodies began releasing remastered versions of their albums with additional bonus tracks.  "Days of Future Passed" came out and I went out and bought it.  The DR database has this release with an average DR value of 8.

At the time I bought the new remaster, I was pretty unfamiliar with the loudness war, and was simply a fan looking for bonus material.  From 2006 until about 2 weeks ago, the 2006 release was what I was listening to.

2 weeks ago, I found the old MFSL release in my basement in a box of old CDs that I had "upgraded" to new remastered releases.

I decided to compare the two, so I ripped the MFSL release to FLAC.

The 2006 release was MUCH LOUDER than the 1989 MFSL release, as expected, but I didn't think it sounded all that terrible when comparing the two.  But louder music tends to fool your ears into believing that it sounds OK.

So I decided to try something.

I applied track ReplayGain to the first track of each CD and told foobar2000 to use it.  This brought both songs to the same volume, to my ears.  The 2006 release suddenly sounded very dull, lifeless and flat.  The 1989 release was very vibrant and just sounded better overall.

Now that you've read that wall of text, my question is, did I do it right?  Is this a good way to compare a squashed release to an older release that has more dynamic range, or should I be doing it some other way?
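For anyone wondering what the level matching described above amounts to numerically, here is a minimal Python sketch. It is only an illustration: ReplayGain scanners use a perceptually weighted loudness measurement, whereas this sketch swaps in plain RMS as a simple stand-in. The function names and the two test signals are hypothetical and not taken from any real tool.

```python
import math

def rms_db(samples):
    """RMS level of float samples (range -1.0..1.0) in dB relative to full scale.

    Assumes the input is not pure silence (log of zero is undefined).
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

def match_level(loud, quiet):
    """Scale `loud` so that its RMS level equals that of `quiet`.

    Returns the scaled samples and the gain (in dB) that was applied.
    """
    gain_db = rms_db(quiet) - rms_db(loud)
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in loud], gain_db

# Hypothetical stand-ins for two masterings: the same tone, one twice as hot.
quiet = [0.25 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
loud = [2.0 * s for s in quiet]

matched, gain_db = match_level(loud, quiet)
print(f"gain applied to the louder version: {gain_db:.2f} dB")
```

Note that this only equalizes level. Two real masterings will still differ in EQ, compression and limiting, and those differences are what is left to listen for once the loudness advantage is gone.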


Reply #1


Blinding the comparison would have helped rule out preference bias.  But now it's too late because you already know which one sounds like what.

Reply #2


Blinding would not tell me which one I like better.  It would only tell me if I can tell a difference.

There are a bunch of other tracks on the album.  I can still ABX another track and see what happens.

I wish there was a foobar plugin that would let me blindly compare two tracks and decide which one I like better.  I haven't found anything like that yet.




Reply #5
Is this a good way to compare a squashed release to an older release that has more dynamic range, or should I be doing it some other way?
"squashed", "more dynamic range". It seems you're biased already
IMO when you're always using replaygain, this should be a fair comparison.
However, increased loudness is one of the most important goals for (re)mastering. It is also known to produce side effects. If you remove the loudness advantage by lowering the level with replaygain, the side effects will become more evident. In that sense your test is unfair.

To find out if you prefer a louder version, all other things being equal, do a preference (abchr) test with two sources that only differ in level. Hint: you might experience other differences than loudness alone, like clarity, brightness, spaciousness etc.
Next thing is to test the DRC differences. There are dozens of interesting tests, like comparing the differently mastered versions in a noisy environment like a car, airplane or in the subway. Your preference might depend on the playback circumstances. Here's another one: listen on speakers and lower the monitoring level a lot, say 30 dB and test again.
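The level-only comparison suggested above needs two sources that are identical apart from gain. A minimal Python sketch of how such a pair could be prepared, assuming the audio has already been decoded to float samples; the function names and sample values are hypothetical, and the -30 dB figure is simply the monitoring reduction mentioned in the post.

```python
def db_to_factor(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def apply_gain(samples, db):
    """Return a copy of `samples` with a fixed gain (in dB) applied."""
    factor = db_to_factor(db)
    return [s * factor for s in samples]

# Hypothetical decoded samples in the range -1.0..1.0.
original = [0.5, -0.25, 0.125, 0.0]

# Same material, differing only in level: a small offset for a preference
# test, and the large 30 dB reduction for the low-level listening test.
slightly_quieter = apply_gain(original, -1.0)
much_quieter = apply_gain(original, -30.0)

print(db_to_factor(-30.0))
```

Because nothing but the gain changes, any difference heard in clarity, brightness or spaciousness between the two files comes from level alone.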
In an ideal world the dynamic range would be adapted (if at all) in the playback device. AFAIK this is not yet technically (at least at high quality) or financially feasible. And content providers don't want to provide several differently mastered versions either, so we're stuck with the average consumer version.
I'm looking forward to your results.

Reply #6


I fully admit I am biased already.

The thing that started me down my "loudness wars" research was the INXS album "Listen Like Thieves."  I had never bought this album even though I loved the song as a youth.  I walked into the used record store and found a 2011 British release of the album that was digitally remastered.  I grabbed it for $6.00 and went home.  I thought the title track sounded awful.  I wondered how a CD release could sound worse than the 1985 radio version I remembered from my youth.  I shelved the album and moved on...

Then a rerun of Miami Vice came on that had "Listen Like Thieves" in it, and I thought "wow, that sounds much better.  I wonder why."  (I'm not reliving the 80s.  I was channel surfing.)  I got myself a copy of the 1985 CD release and played the song, and it sounded so much better.  Now this is obviously not directly related to the loudness wars.  It could easily have just been a bad remaster.  But Googling pointed to the loudness wars as the #1 suspect.

Though I haven't set up ABC/HR yet, I did tinker with 2 other tracks last night.  Both tracks had a dynamic range, according to the DR meter plugin for foobar2000, twice as high in the original release.

Using replaygain, I compared the 25th Anniversary release of Paul Simon's "The Boy In The Bubble" to the original.  I preferred the newer release, but only slightly.

I then compared "Overkill" by Men at Work.  I preferred the original version, but only slightly.  The DR meter claimed the original version had a DR value of 15 and the remaster had a DR value of 6.

I was wondering whether just having a digital master somehow just naturally decreased the dynamic range.  So I went and looked up the album "Brothers In Arms" by Dire Straits.  A full digital recording, recorded at 16/48, I believe.  It has a DR value of 16, telling me that I can ignore that belief.

Well, time to do some blind testing.  Used record store opens at 11:00 AM.  Hopefully I will find some time to test today and can post some results.



Reply #8
Blinding the comparison would have helped rule out preference bias.  But now it's too late because you already know which one sounds like what.


Blinding would not tell me which one I like better.


Um, yes it would.  To be precise, it would tell you which one you preferred when the only thing you knew about it was the sound.

DBTs are used to test for difference *or* for preference. 


Reply #9
http://sourceforge.net/projects/abchr/


Thank you!  I'll be compiling that tomorrow.


 

ABC/HR is a blind test. (Effectively 'double' blind when implemented in software.)

Purpose:
Blind comparison and blind quality rating to remove the effects of personal bias and the placebo effect.



Reply #10
Blind comparison and blind quality rating to remove the effects of personal bias and the placebo effect.
I don't think you can completely eliminate personal bias in preference tests. As soon as you can identify both versions by ear (and passed the ABX test), the bias is back again.
Suppose one version is brighter. You might prefer the less bright one. But I might be able to explain that the microphone was a bit dull, which wasn't corrected in the first version, so the brighter version is more correct. After knowing this fact your preference might change (or not).
Now this was only one variable (bright/dull). Differences in mastering can be caused by dozens of variables (like level, low-eq, mid-eq, hi-eq, de-essing, dynamic compression, limiting, stereo width etc.) and none of these are necessarily static during the track, so you'll have to listen to the whole track.
When asked for a single, overall preference, we have to combine all these variables. For the mastering engineer these preferences for each variable will be different compared to an average consumer or an audiophile. A good mastering engineer however should know what his audience likes and deliver a suitable product, even if it conflicts with his personal preference.
YMMV

Reply #11


Personal bias is a horrible thing.

Though I think it is possible to set it aside to some degree.

I just ABed (sorry, no ABX) "Sports" by Huey Lewis and the News, comparing the 1989 MFSL release with the 30th Anniversary Edition, which came out in 2013.  To my ears, the 2013 version sounded different, but just as good as the 1989 release.  It definitely did not have the loudness cranked up as high as some other remasters (I'm talking to you, Moody Blues!).  I hopped on the Steve Hoffman forums, and the general consensus there was that the 2013 remaster was a good release.  Someone posted screenshots from Audacity showing that there was definitely some increase in loudness, but that there was absolutely no clipping, and the loudness was not cranked up that high.

So I continued to compare the two and came to the conclusion that one was not any better really than the other to my ears.

Then I checked the DR database and saw that the MFSL release had a DR value of 13 and the 2013 release had a DR value of 9.

Had I seen those DR numbers first, I guarantee you, I would have been completely biased against the remaster going into the test.  Makes me wonder how much value one should place in DR values.


Reply #12
Perhaps knowing that TTDR is not infallible may remedy the situation?

https://www.hydrogenaud.io/forums/index.php?showtopic=102963

Reply #13


I know it's not infallible, but it is a tool to look at to get the overall picture when assessing which version of an album to buy.  The Green Day releases on HDTracks only had an increase of 4 points in the DR database vs the initial 2004 release, but I like them a lot better than the previous releases of those albums.  However, I fully admit that I did not go into that purchase blind.  I read a number of posts and watched a YouTube video showing the merits of the release and how it was a victory against the "loudness wars."

But that's human nature.  No one wants to go into a purchasing decision blind.  Especially when you're being forced to pay $20-$25 for an album you already own, just to get the version they should have released in the first place.  I really wish the HDTracks master was released on CD.


Reply #14
You seem to have missed the point.

Now that you know that TTDR can assign a low number to a release that hasn't undergone dynamic range compression, why would you continue to allow yourself to be swayed by a four point difference?

Now before someone mentions babies and bathwater, TTDR is not a baby.

My signature also applies.

Reply #15


Well, I know that NOW.  I did not know that a few months ago when I bought the two Green Day albums.

I have often wondered if just cleaning up artifacts, pops, whistles, etc. in old analog releases could cause the DR value to go down, even though the final release could be subjectively superior to a lot of people.


Reply #16
People often overlook the point that Kees made earlier: DRC is not the only tool that is used when remastering old releases.  Just by itself, EQ can make a significant difference.

I need to back away from my previous reply, however.  While it may be lacking, TTDR is often right, and like all tools, when used judiciously it can be useful.  Humans are also easily prone to cognitive biases, so your position is quite reasonable.  That I so quickly dismiss TTDR, despite it having a pretty good track record for comparing non-vinyl versions, can also be characterized by cognitive bias.

Reply #17


My opinion on TTDR at this point is to use it last, so it doesn't bias me.  The problem, of course, is that there's this big searchable database out there just begging you to go looking.  And I really want to see this kind of data BEFORE I go out and make a purchase.  It's really hard, if not next to impossible, to AB a new release without buying it.  So, when looking at older albums which have 3-4 different remasters, you have some homework to do, unless you want to buy 4 copies, listen to them, and sell the 3 you don't want back used for a fraction of what you paid for them.

I have to say that I really think the loudness war is turning.  I just picked up a copy of the 2015 remaster of Pipes of Peace by Paul McCartney, mostly to hear the new remix of "Say, Say, Say." I was surprised that the loudness level of the album was almost identical to the 1983 release, which is kind of rare for a remaster.  Then I go look on the DR database and see a DR value of 13, while the 1983 release had a DR value of 14.

A friend of mine bought the "Five Years" David Bowie boxed set, and again the loudness level is pretty close to the original CD release, if not identical.

Hopefully other artists will follow suit.  It's annoying to have Maroon 5 shuffle on my phone while driving in my car and have it blow my ears out.  In my old car, I used to listen to most music at 7 on the dial.  When Maroon 5 came on, I had to turn it down to 2.

Reply #18

Use replaygain.

Reply #19


Better yet, EBU R128. I know it is implemented in JRiver and probably plenty of other players (quick google turned up this for Foobar: https://www.foobar2000.org/components/view/foo_r128norm).

OP, what you describe with comparing remasters is demonstrated very nicely in this video: https://www.youtube.com/watch?v=j-O5l6NSsdY


Reply #20
Better yet, EBU R128.

Replaygain doesn't preclude the use of R128 (in fact, foobar2000 uses the R128 algorithm when scanning for replaygain), though I should have simply said use loudness normalization.

EDIT1: Whether R128 was recently brought into question:
https://www.hydrogenaud.io/forums/index.php?showtopic=110561

EDIT2: Regarding the linked fb2k component, the OP was talking about listening in his car, so it doesn't apply.


Reply #21
Hopefully other artists will follow suit.  It's annoying to have Maroon 5 shuffle on my phone while driving in my car and have it blow my ears out.  In my old car, I used to listen to most music at 7 on the dial.  When Maroon 5 came on, I had to turn it down to 2.

Use replaygain.


As I clean up my ID3/Vorbis tags, I have been adding ReplayGain info.


Reply #22
Blind comparison and blind quality rating to remove the effects of personal bias and the placebo effect.
I don't think you can completely eliminate personal bias in preference tests. As soon as you can identify both versions by ear (and passed the ABX test), the bias is back again.


Only if you 'know' something else about the audio source.

The idea is not to *identify* which is which, in any given trial, it's simply to decide which you like better.

Sure, if the difference is easy to hear ('big'), then your preference will probably be set in the earliest trials.  You'll probably keep 'preferring' the same one.  But at least the 'bias' will have been set by the *sound alone*. The crucial thing is that you must not have previously heard A and B when you 'knew' what A and B were.  For the OP this would mean picking a track from each release and setting replaygain, all without actually listening to them.  *Then* do the DBT.  (Better still, have someone else do the picking/processing.) *

This is why I told the OP that he couldn't do it with the track he's already 'learned'.

(Btw, the quote you responded to is from the HA Wiki article on ABC/HR.)


*Better still, I would think, if 'A' and 'B' were not fixed (i.e., the same file could be 'A' in one trial and then 'B' in the next).  I don't know why DBT tools don't always do this.
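The idea of re-randomizing which file is 'A' and which is 'B' on every trial can be sketched in a few lines of Python. This is a hypothetical harness, not how ABC/HR or any foobar2000 component actually works: the `choose` callback stands in for "play both hidden versions and ask the listener which position they prefer", and the names "old" and "new" are placeholders for the two masterings.

```python
import random

def run_preference_test(trials, choose, rng=None):
    """Run blind preference trials between two versions, 'old' and 'new'.

    `choose(first, second)` receives the two versions in presentation order
    and returns 0 or 1 for the preferred position. In a real test the
    listener would only hear the audio and never see the names.
    """
    if rng is None:
        rng = random.Random()
    tally = {"old": 0, "new": 0}
    for _ in range(trials):
        order = ["old", "new"]
        rng.shuffle(order)            # positions are re-randomized every trial
        picked = choose(order[0], order[1])
        tally[order[picked]] += 1
    return tally

# Simulated listener who reliably prefers the sound of the old mastering,
# wherever it happens to be placed.
def prefers_old(first, second):
    return 0 if first == "old" else 1

# Simulated listener with no real preference who always picks position 0.
def always_first(first, second):
    return 0

print(run_preference_test(10, prefers_old))
print(run_preference_test(20, always_first, rng=random.Random(1)))
```

With the positions shuffled per trial, a listener who just favors whatever is presented first ends up splitting their votes by chance, while a preference that follows the sound shows up as a lopsided tally.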