Topic: DSP Loudness Control (Read 40872 times) previous topic - next topic

DSP Loudness Control

Reply #25
... Unusual music will give it fits, though, I suspect.


The loudness control? Or the listener?
I'm getting an impression that there's more to perception of bass levels than the steady state indicated by the "equal loudness" curves.
Given a piece of recorded music with quiet parts and loud parts, played at a level different than that for which it was originally mixed, you appear to be saying that a different amount of loudness compensation will be required during the playing of the two different parts in order to make them both sound as "balanced" as they do when played at their intended level.
Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #26
I'm getting an impression from you and JJ that loudness compensation EQ has to vary as the signal level varies within a given musical performance, and I find that hard to accept. If it is in fact the case, I have some learning to do. I'll have to get Holman's paper that you referenced.


Journal of the AES, July/August 1978, volume 26, number 7/8, pp 526 - 536.

There's so much to quote that is relevant, but here's two portions of one paragraph:

"A completely technically correct system would also need to be dynamic, that is, it should compensate for the fact that the recording is not always playing at the peak level.  Instead, the system needs to look at the amount of attenuation between the original and reproduced sound pressure level and the amount of attenuation at any given point in time below the peak level." ..."...the translation between original and reproduced sound pressure level calls for less compensation at the upper end of the dynamic range than the lower end: the level translation causes the need for dynamic compensation." (bold italics, mine).


Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.

Reply #27
You can buy the paper from the AES; if I get time I might be able to scan my physical copy. But it's clearly an old paper, and the core of it, along with its outgrowth using modern technology, can also be gleaned from this page:

http://www.audyssey.com/audio-technology/dynamic-eq

Reply #28
... Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.


Is this the differential you refer to?
At the 40 phon level, the differential between 1 KHz and 20 hz is about 60 dB.
At the 60 phon level, the differential between 1 KHz and 20 hz is about 50 dB.
At the 80 phon level, the differential between 1 KHz and 20 hz is about 40 dB.

Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #29
... Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.


Is this the differential you refer to?
At the 40 phon level, the differential between 1 KHz and 20 hz is about 60 dB.
At the 60 phon level, the differential between 1 KHz and 20 hz is about 50 dB.
At the 80 phon level, the differential between 1 KHz and 20 hz is about 40 dB.


From memory the actual differences above seem too high, but that's the general idea.

Reply #30
From memory the actual differences above seem too high, but that's the general idea.


Those are the ISO 226:2003 differences. Here are the approximate differences for Fletcher-Munson, Robinson-Dadson, and ISO 226:2003:
Phon  F-M  R-D  ISO
----  ---  ---  ---
40    42   50   60
60    26   41   50
80    10   32   40


But the important figure is the delta between the differences:

Phon  F-M  R-D  ISO
----  ---  ---  ---
40    42   50   60
Δ20   Δ16  Δ9   Δ10
60    26   41   50
Δ20   Δ16  Δ9   Δ10
80    10   32   40


This holds true within a dB or two for other phon levels, especially in the ISO curves.
OK so far?
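
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the deltas from the approximate figures in the table above. The dictionary layout and the function names are only illustrative, and the numbers are the rounded values quoted in this post, not the published curve data.

```python
# Approximate 1 kHz-to-20 Hz differentials in dB, copied from the table in
# this post (rounded figures, not the published curve data).
DIFFERENTIALS = {
    "F-M": {40: 42, 60: 26, 80: 10},
    "R-D": {40: 50, 60: 41, 80: 32},
    "ISO": {40: 60, 60: 50, 80: 40},
}


def deltas(by_phon):
    """Change in differential between each pair of adjacent phon levels."""
    phons = sorted(by_phon)
    return [by_phon[lo] - by_phon[hi] for lo, hi in zip(phons, phons[1:])]


def db_per_phon(by_phon):
    """Average change in differential per phon over the tabulated range."""
    phons = sorted(by_phon)
    span = phons[-1] - phons[0]
    return (by_phon[phons[0]] - by_phon[phons[-1]]) / span


for name, table in DIFFERENTIALS.items():
    print(name, deltas(table), db_per_phon(table))
```

With these figures each curve family shows the same delta for both 20 phon steps, which is the constancy the post is pointing at.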

Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #31
This holds true within a dB or two for other phon levels, especially in the ISO curves.
OK so far?

Sorry, I don't have a convenient way to check your data for accuracy.  But assuming it is (accurate), what's your point?

 

Reply #32
... Sorry, I don't have a convenient way to check your data for accuracy.  But assuming it is (accurate), what's your point?


Maybe an example will help:


We have a piece of "music" that has the following characteristics:

1. It consists of a "loud part" and a "quiet part", with 10 dB difference.

2. The "instruments" produce a 1 KHz tone and a 20 Hz tone in both parts. 

3. The music sounds "as composed" when the two tones are of equal perceived loudness in both parts and the loud part is played at the 80 phon level. (The two parts sound the same, except that one is quieter than the other.)

4. The dB values in the following examples assume the use of the ISO 226:2003 curves. The values will be different if using other curves such as those by Fletcher-Munson or Robinson-Dadson, but the same principle and relative ratios still apply.

For ISO 226:2003, the SPL differences between "equal loudness" 1 KHz and 20 Hz tones are approximately:

80 phon level = 80 dB SPL 1 KHz, 120 dB SPL 20 Hz = 40 dB difference.
60 phon level = 60 dB SPL 1 KHz, 110 dB SPL 20 Hz = 50 dB difference.
40 phon level = 40 dB SPL 1 KHz, 100 dB SPL 20 Hz = 60 dB difference.
20 phon level = 20 dB SPL 1 KHz,  90 dB SPL 20 Hz = 70 dB difference.


Under the above conditions, the resultant SPLs for the example "music" played at its intended level are as follows:

Loud part:
1 KHz =  80 dB SPL
20 Hz = 120 dB SPL (40 dB difference)
Quiet part:
1 KHz =  70 dB SPL
20 Hz = 115 dB SPL (45 dB difference)

No loudness compensation is required for the quiet part. The music is being played at its "composed" level. The perceived balance between the "instruments" is as intended by the artist: in this case, both "instruments" are of equal perceived loudness during both parts.

The listener finds this too loud for their environment so reduces the "volume" level by 20 dB. The resultant SPLs are now as follows:
Loud part:
1 KHz =  60 dB SPL
20 Hz = 100 dB SPL (40 dB difference)
Quiet part:
1 KHz =  50 dB SPL
20 Hz =  95 dB SPL (45 dB difference)

The listener perceives a loss of bass, because the curves indicate that the "loud part" difference should be 50 dB and the "quiet part" difference should be 55 dB at this listening level. 10 dB of boost at 20 Hz needs to be applied to both the loud and quiet parts in order to restore the equal perceived loudness.

The listener reduces the "volume" level by a further 20 dB. The resultant SPLs are now as follows:
Loud part:
1 KHz =  40 dB SPL
20 Hz =  80 dB SPL (40 dB difference)
Quiet part:
1 KHz =  30 dB SPL
20 Hz =  75 dB SPL (45 dB difference)

The listener perceives a further loss of bass, because the curves indicate that the "loud part" difference should now be 60 dB and the "quiet part" difference should be 65 dB. Another 10 dB of boost at 20 Hz needs to be applied to both the loud and quiet parts in order to restore the equal perceived loudness.

The key points are that:

1. For each 10 dB of difference between the "artist intended" SPL and the actual listening SPL, 5 dB (at 20 Hz) of "loudness compensation" needs to be applied. (Assuming the ISO 226:2003 curves are used.)

2. The same amount of "loudness compensation" needs to be applied to the loud and quiet parts of the music.

3. "Dynamic" compensation, that alters according to the changing dynamics within the piece of music, will alter the artist's intended balance. The compensation amount should be determined only by the difference between the "artist intended" SPL and the actual listening SPL, set by adjusting the "volume control".
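
The worked example above can be reproduced with a few lines of Python. This is only a sketch of the steady-state arithmetic in this post: it assumes the straight-line relationship implied by the approximate ISO 226:2003 figures quoted earlier (40 dB differential at 80 phon, growing by 0.5 dB per dB of level reduction), and the constant and function names are made up for illustration.

```python
REF_PHON = 80.0     # level at which the loud part sounds "as composed"
REF_DIFF_DB = 40.0  # 20 Hz minus 1 kHz SPL at 80 phon (approximate figure from this post)
SLOPE = 0.5         # extra differential per dB below 80 phon (approximate figure from this post)


def required_diff(phon):
    """20 Hz minus 1 kHz SPL needed for equal loudness when 1 kHz sits at `phon` dB SPL."""
    return REF_DIFF_DB + SLOPE * (REF_PHON - phon)


def boost_at_20hz(part_offset_db, volume_drop_db):
    """Boost at 20 Hz that restores equal loudness after turning the volume down."""
    composed_mid = REF_PHON - part_offset_db
    composed_bass = composed_mid + required_diff(composed_mid)
    played_mid = composed_mid - volume_drop_db
    played_bass = composed_bass - volume_drop_db
    wanted_bass = played_mid + required_diff(played_mid)
    return wanted_bass - played_bass


for drop in (0.0, 20.0, 40.0):
    loud = boost_at_20hz(part_offset_db=0.0, volume_drop_db=drop)
    quiet = boost_at_20hz(part_offset_db=10.0, volume_drop_db=drop)
    print(f"volume drop {drop:4.0f} dB: loud part boost {loud:4.1f} dB, quiet part boost {quiet:4.1f} dB")
```

Under this linear assumption the loud and quiet parts always come out needing the same boost, which is key point 2. Whether real hearing follows such a straight line for time-varying material is exactly what is being debated in this thread.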


For a practical implementation, two parameters need to be determined.

The first is the time constant of the tone control. This can be determined by examining the difference between two curves on the "equal loudness" curve. Two curves that differ by, for example, 20 dB SPL at 1 KHz, will end up differing by only 10 dB SPL by the time they get to 20 Hz. The curve derived from the differences will determine the required time constant for the tone control network. (Note that it's the difference between two different phon levels that is important, not the absolute phon level. For example, the curve derived from the difference between the 20 phon and 40 phon curves is the same as the curve derived from the difference between the 60 and 80 phon curves.) I've got as far as using Jeff Hackett's Matlab routine to spit out a spreadsheet of the curve values.

The second is the amount of loudness compensation needed for a given volume change. Again, it can be determined from the curves, noting for ISO 226:2003 that a 20 dB SPL change at 1 KHz requires a 10 dB change at 20 Hz in order to maintain a perceived balance between bass and midrange.

Does this make sense, or have I over-simplified due to an inadequate knowledge of how the auditory system responds?

Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #33
Oversimplified. 

Your example assumes that all stimuli within the entire spectrum start out at a perceived equal loudness relative to each other.  But what, for example, if a bass note were intended to be 10dB below a mid-band note?  At the original level, all is well, but if you alter the play level, you'll find that the two notes land on entirely different theoretical curves.  And even that example is too static to be realistic.

The problem is, the curves, all of them, were derived from data taken by using stimuli presented at "equal loudness levels" with respect to each other, and all were steady-state, with no variation over time, be they tones or band-limited noise signals.  It had to be done that way to characterize the ear/brain response in some easily presented manner.  The data shows what it would take for a stimulus to be perceived as equal in loudness to a mid-band reference signal.  So long as we realize that that is what the curves show, we're fine.  And we can see the nonlinear nature of LF ear/brain response represented by a curve family.  But the equal-loudness contour curves do not represent the actual correction needed, because they are not correction curves at all, they are graphs of stimuli level required for perceived equal loudness.  The actual correction curve must be derived from a known original reference level, and the new play level of each independent stimulus frequency, and the predicted differential in ear/brain response shown by the contours.  Note: you can't apply a single curve to the spectrum, you have to correct for the level of each frequency because (and this is the key) the ear/brain system is non-linear to a different degree for each frequency below mid-band. 

So, for proper loudness compensation you need to know:
1. The acoustic original level (at the time of recording)
2. The new acoustic level during playback of each frequency or group below 1KHz (actually, 400Hz would work as well)
3. The amount of correction needed for each frequency to re-establish spectral balance for playback at the new level considering its specific moment to moment acoustic level.

The problem is, for #1 you have to make certain assumptions, unless the original was mixed in a standardized control environment (again, this happens routinely in film and has for decades, not so much for music).  For #2, you have to know a lot about the play system to establish any meaningful relationship between real acoustic level and a knob position.  For #3, we open up the whole "which curve" can of worms, F/M, R/D, ISO, or what?  J.J. likes F/M, I've always felt the Stevens data was closer, then there's the wizards at Audyssey, who didn't use any existing data, but gathered their own, and didn't use an anechoic chamber, or tones, or band-limited noise.  They did it with actual program material, a wide variety at that, and they took terabytes of data points.  And their solution is dynamic.
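
To make the dependence on moment-to-moment level concrete, here is a small Python sketch. It compares a straight-line mapping from loudness level (phon) to SPL at 20 Hz with a deliberately curved one. The straight line is fitted to the approximate ISO 226:2003 figures quoted earlier in the thread; the curved mapping is entirely made up for illustration and is not measured data. The sketch also assumes the goal is to keep a bass tone the same number of phons away from the midrange after the volume is turned down, which is a steady-state simplification.

```python
def linear_20hz_spl(phon):
    """Straight-line fit to the thread's approximate ISO 226:2003 figures at 20 Hz."""
    return 80.0 + 0.5 * phon


def curved_20hz_spl(phon):
    """Hypothetical mapping whose slope changes with level. NOT measured data."""
    return 80.0 + 0.5 * phon + 0.002 * (phon - 50.0) ** 2


def boost(spl_of_phon, bass_phon, drop_db):
    """Boost that keeps a 20 Hz tone the same number of phons from the midrange.

    bass_phon is the tone's loudness level at the original playback level;
    drop_db is how far the volume control has been turned down.
    """
    played_spl = spl_of_phon(bass_phon) - drop_db  # what the volume control delivers
    wanted_spl = spl_of_phon(bass_phon - drop_db)  # what equal perceived balance needs
    return wanted_spl - played_spl


for name, mapping in (("linear", linear_20hz_spl), ("curved", curved_20hz_spl)):
    for bass_phon in (80.0, 70.0):
        print(name, bass_phon, round(boost(mapping, bass_phon, 20.0), 2))
```

With the straight line the boost is 10 dB for both tones, so a fixed correction per volume setting is enough. With the curved stand-in the two tones need about 8.4 dB and 9.2 dB, so the correction would have to follow the programme level. Which picture is closer to real hearing is the open question here.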

Considering music as the stimulus, we must note that it changes moment by moment.  And yes, some music probably does present stimuli across the spectrum at equally perceived loudness levels, but certainly not all. If we expand our possible signals to include non-music material such as film soundtracks, the problem becomes even more obvious.  Moreover, the stimuli don't happen exactly concurrently in time or level.

If you're willing to make certain compromises, you could apply a fixed correction to certain music because it doesn't have a lot of dynamic range, and is mixed with a relatively fixed relationship between bass and mid-band.  In fact, it's because of that fact that loudness compensation has worked at all over the years.  But you must realize that this solution is situational at best, certainly far from universal. 

The only volume setting at which a fixed, non-variable correction curve would be right would be that which results in the exact acoustic level of the original performance, at the original listening position.  In other words, the level at which no loudness compensation is required.

So, just because I'm running out of gas on this, here's the "agree to disagree" statement: feel free to build up your favorite curve set in a DSP, and give it a whirl. Make it adjustable using non-variable curves. Use the old one-knob or two, or three, or no knobs. Tweak it by ear, RTA, FFT, or fuzzy clustering.  One of the nice things is, today, it's not so big a deal to do that, even if you do prefer to ignore the research.  Who knows? You just might be thrilled with your result!  And that, I would have to say, will make it valid for you.  But you may want to grab a few of those loudness research papers and look them over.  At very least, you'll have a cure for insomnia.

Reply #34
Oversimplified.


Understood. I appreciate your point that sources with dynamics cause the ear to exhibit different behaviour than it does with steady state tones. I have some learning to do.

Your example assumes that all stimuli within the entire spectrum start out at a perceived equal loudness relative to each other.  But what, for example, if a bass note were intended to be 10dB below a mid-band note?  At the original level, all is well, but if you alter the play level, you'll find that the two notes land on entirely different theoretical curves.  And even that example is too static to be realistic.


Please bear with my static, unrealistic model for a moment. I ran the figures again assuming the bass in the loud part was 10 dB (SPL) below the level required to provide "equal loudness" compared with the midrange. The non-intuitive part was, "if the bass in the loud part is 10 dB SPL below the midrange, what level must the bass be in the quiet part to maintain the same perceived loudness difference?" And the answer was, 10 dB SPL below the midrange. (Note that in both cases, the "perceived loudness" difference is actually 20 dB. In other words, for both parts you would need to drop the midrange by 20 dB to make it "equal loudness" with the bass.) Dropping the listening level by 20 dB still showed the same difference: for both loud and quiet parts, 10 dB of "loudness compensation" at 20 Hz was required. Dropping the volume by a total of 40 dB, a total of 20 dB of compensation was required. So if you say it's wrong, I must be making a fundamental error in my calculations.
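
The re-run described above can be checked with a short Python sketch. It assumes the same straight-line fit to the approximate ISO 226:2003 figures used earlier in the thread (a 20 Hz tone at P phon needs roughly 80 + 0.5 P dB SPL) and extrapolates that line freely, so it says nothing about time-varying signals. The function names are illustrative only.

```python
def spl_20hz(phon):
    """SPL a 20 Hz tone needs to reach a given loudness level (straight-line fit)."""
    return 80.0 + 0.5 * phon


def phon_20hz(spl):
    """Inverse of spl_20hz: loudness level of a 20 Hz tone at a given SPL."""
    return (spl - 80.0) / 0.5


def analyse(mid_phon, bass_shortfall_db, drop_db):
    """Return (composed gap in phon, gap after the volume drop, boost needed at 20 Hz)."""
    bass_spl = spl_20hz(mid_phon) - bass_shortfall_db
    gap = mid_phon - phon_20hz(bass_spl)
    played_gap = (mid_phon - drop_db) - phon_20hz(bass_spl - drop_db)
    wanted_spl = spl_20hz(mid_phon - drop_db - gap)
    boost = wanted_spl - (bass_spl - drop_db)
    return gap, played_gap, boost


# Loud part (midrange at 80 phon) and quiet part (70 phon), bass 10 dB SPL
# below the equal-loudness level, volume turned down by 20 dB.
print(analyse(80.0, 10.0, 20.0))
print(analyse(70.0, 10.0, 20.0))
```

Both calls report a composed gap of 20 phon and a required boost of 10 dB, matching the figures in the post. That agreement only shows the calculation is self-consistent under the linear assumption.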


But the equal-loudness contour curves do not represent the actual correction needed, because they are not correction curves at all, they are graphs of stimuli level required for perceived equal loudness.  The actual correction curve must be derived from a known original reference level, and the new play level of each independent stimulus frequency, and the predicted differential in ear/brain response shown by the contours.


I'm quite clear on the point that the curves do not represent the actual correction needed. But I believe the compensation required can be derived from the curves, at least for steady state signals, because the relationship between the curves follows a single law. That is, although the curves are all different in shape, the differences between them work out to a single set of deltas. Applying this to each curve makes them all "line up" over each other. Take a curve at a given phon level, apply the correction, and it matches closely the next phon curve up. I'll have to post some graphs from my spreadsheet.

Note: you can't apply a single curve to the spectrum, you have to correct for the level of each frequency because (and this is the key) the ear/brain system is non-linear to a different degree for each frequency below mid-band.


I thought that's what I was doing? Apparently not.

So, for proper loudness compensation you need to know:
1. The acoustic original level (at the time of recording)
2. The new acoustic level during playback of each frequency or group below 1KHz (actually, 400Hz would work as well)
3. The amount of correction needed for each frequency to re-establish spectral balance for playback at the new level considering its specific moment to moment acoustic level.


I do understand that there has to be an initial reference level set that matches the intended playback level (not necessarily the original recording level, it may have been rebalanced in mixing / mastering to suit a different playback level.) I'm taking that as a given, in order to focus on what I'm doing wrong.

... The only volume setting at which a fixed, non-variable correction curve would be right would be that which results in the exact acoustic level of the original performance, at the original listening position.  In other words, the level at which no loudness compensation is required.


I'm not suggesting a "fixed, non-variable curve". I'm suggesting that there is a specific curve for every difference between the "original performance" level and the listener's choice of level. And this curve follows a simple law. I appreciate that this is only valid for steady-state signals. I'll need to learn more about how we hear, in particular the differences in response to time-varying signals versus steady state.

... So, just because I'm running out of gas on this, here's the "agree to disagree" statement: feel free to build up your favorite curve set in a DSP, and give it a whirl. Make it adjustable using non-variable curves. Use the old one-knob or two, or three, or no knobs. Tweak it by ear, RTA, FFT, or fuzzy clustering.  One of the nice things is, today, it's not so big a deal to do that, even if you do prefer to ignore the research.  Who knows? You just might be thrilled with your result!  And that, I would have to say, will make it valid for you.  But you may want to grab a few of those loudness research papers and look them over.  At very least, you'll have a cure for insomnia.


Thank you for your time. I appreciate it. I have plenty of areas to investigate now. Insomnia? I haven't had that problem for a very long time. My biggest problem is that I get involved in reading some paper, I start scribbling notes and diagrams and suddenly it's 3 in the morning... 



Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #35
I'd reiterate, one last time, three points:

1. In research many have concluded the control has to respond to the input signal dynamically.  Those include, but are not limited to, the Holman paper, the Stevens data (apparently his Mark VI system revealed the issue, though he was focused on annoyance prediction), I suspect the data in the Torick-Bauer papers and others too, though my loudness file is almost two inches thick, and I'm not prone to read it all again having had that fact established quite well the first time.

2. Today's loudness compensation solutions implemented in DSP include the aforementioned Audyssey Dynamic Volume and Audyssey Dynamic EQ, Dolby Volume, and THX Loudness Plus.  All are dynamic and respond to the input signal.  You may Google at will. All three had the same basic goal, all three are commonly available on inexpensive equipment. None of those organizations would put man-years and $$$ of R and D resources into such a project if the simple solution were adequate.

3. A very brief and informal survey of mine, though admittedly incomplete, shows that there are no loudness compensation systems on the market today that use a non-dynamic, steady state, user variable contour solution.  I may have missed one, but the high-end market would have nothing whatever to do with EQ of any kind, and loudness comp passed out of the consumer receiver market decades ago, including a single switch/single knob control, a two-knob system, and even a three-knob system approach.  If I have somehow not described the solution you suggest correctly, I apologize.  Again, I'd like to reference the Holman paper where it discusses an implementation of a three-control loudness comp system as impractical.

The above three points should be enough to send anyone interested in loudness comp to the library to see what the Loudness Jedi have learned over the years.  I'd just hate to see anyone try to re-invent the square wheel (even in DSP) when experts have already agreed it needs to be round.

Reply #36
I'd reiterate, one last time, three points:


On your point 1: I accept that dynamic compensation is required in addition to "fixed" compensation for optimum performance. I interpret "dynamic" to mean "compensation EQ varies based on variations in the level of the program material", and "fixed" to mean "The EQ curve and amplitude will vary depending on the system volume setting, but is not affected by the program material".

On your point 2: I've spent a couple of days reading through the relevant threads of the Audyssey forum over at AVS. (Over 52,000 posts...)  Audyssey Dynamic Volume is not concerned with "loudness compensation". Audyssey Dynamic EQ is. The two systems work differently to different purposes. Audyssey Dynamic EQ does, as its name suggests, have a dynamic component on top of a "fixed" (varies with Master Volume setting, not with program level) component similar to the scheme I have been describing, though it uses proprietary curves rather than the published curves. I haven't looked at Dolby Volume yet. I've been told that THX Loudness Plus doesn't have a dynamic component, though I would expect it to do so.

On your point 3: The foundation of the Audyssey DEQ is "non-dynamic, steady state". They added a "user variable" gain offset in response to feedback from users. The control ("RLO") effectively reduces the amount of DEQ applied to a given signal by offsetting the level set by the Master Volume. The result is a user variable contour. Audyssey suggest different settings for film versus music, and different settings for different music genres / dynamic structures.

The above three points should be enough to send anyone interested in loudness comp to the library to see what the Loudness Jedi have learned over the years.  I'd just hate to see anyone try to re-invent the square wheel (even in DSP) when experts have already agreed it needs to be round.


Why develop my own when I could buy a ready-made solution which benefits from much more R&D than I could possibly afford to do? For the experience.
I'm not into re-inventing wheels, but I am into making wheels that fit my requirements rather than buying ready-made ones that aren't quite right.
I'm reminded of a saying about car manufacturers: They require their tyres to be round, black and cheap. And they aren't too fussy about the round part.

My original aim was to provide a solution to a different problem - how to manage the bass in an OB speaker system with optional subwoofer, using an analog implementation. I've come a long way from that... Next step on that path, get up to speed with the literature. I also want to write up an explanation of how the "fixed" part of loudness compensation works, that can be understood by non-technical users. There have been years of debate on whether the behaviour indicated by the "equal loudness" curves can be compensated for with a single control law, and the argument continues. I'll also need to point out that it is only part of the solution.

Edit: Just looking back to the OP in this thread, after all this he still doesn't have a workable solution. It reminds me of this cartoon:
http://www.projectcartoon.com/cartoon/2648
Regards,
   Don Hills
"People hear what they see." - Doris Day

Reply #37
I don't know if I can solve the problem in its fullest form but I have done a little work.

I hope the board finds this useful. I found a MATLAB M-file for computing the ISO226 equal loudness contours.

I used this to compute what I believe to be reasonable loudness compensation curves.

I assume there is a reference dB SPL level (choose 80 or 90), and then a desired "level difference", i.e. how many dB below the reference level we are generally listening at, and for which we need boost.

I take the equal loudness contours from REFDB and then REFDB-LEVEL_DIFF.  Then I normalize them by forcing them to match exactly at 1 kHz.  This is the somewhat arbitrary part.  Then the difference between them forms the loudness correction curve.
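
The steps just described can be sketched in Python as follows. This is not the MATLAB routine itself: `contour` is a placeholder for whatever function returns the equal loudness SPL for a given phon level and frequency, and `toy_contour` is a two-frequency stand-in built from the approximate figures quoted earlier in the thread, included only so the example runs.

```python
def loudness_correction(contour, ref_db, level_diff, freqs, norm_freq=1000.0):
    """Boost per frequency: contour at (ref_db - level_diff) minus contour at ref_db,
    after shifting both so they are 0 dB at norm_freq."""

    def normalised(phon):
        anchor = contour(phon, norm_freq)
        return [contour(phon, f) - anchor for f in freqs]

    ref = normalised(ref_db)
    low = normalised(ref_db - level_diff)
    return [lo - hi for lo, hi in zip(low, ref)]


def toy_contour(phon, freq_hz):
    """Stand-in for a real ISO 226 routine. Knows only 20 Hz and 1 kHz (approximate figures)."""
    if freq_hz == 1000.0:
        return float(phon)
    if freq_hz == 20.0:
        return 80.0 + 0.5 * phon
    raise ValueError("toy_contour only covers 20 Hz and 1 kHz")


freqs = [20.0, 1000.0]
print(loudness_correction(toy_contour, ref_db=80.0, level_diff=20.0, freqs=freqs))
print(loudness_correction(toy_contour, ref_db=90.0, level_diff=20.0, freqs=freqs))
```

With the stand-in data both calls give a 10 dB boost at 20 Hz and none at 1 kHz, whichever reference level is chosen. A real contour function covering the full frequency range would be needed to get a usable EQ curve.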

I found that the "reference dB level" is rather unimportant for the loudness correction over a reasonable range (e.g. 70 to 90 dB; the underlying approximation to ISO 226 doesn't accept values above 90 dB).

It would be very nice if the computations involved could be wrapped into a nice foobar2000 plug-in (e.g. a specialized version of the great Graphical Equalizer plug-in from xnor, which has great filters).
Loudness computations for MATLAB