HydrogenAudio

Hydrogenaudio Forum => Scientific Discussion => Topic started by: easily_confused on 2012-01-10 04:39:26

Title: DSP Loudness Control
Post by: easily_confused on 2012-01-10 04:39:26
I want to implement a bass boost based on the volume setting: basically the low end (below 2 kHz) of the Fletcher-Munson curves.  I don't want to dynamically load interpolated biquad filter coefficients based on the volume control.  Is there some other filter architecture that will produce a bass boost ranging from approx. 0 dB per octave (loud) to 12 dB per octave (low volume) in a simpler way?  An example of a simple 6 dB-only analog control is at extron.com: http://www.extron.com/company/article.aspx?id=loudnesscontrol_ts&version=print .  This would be simple to implement as an IIR filter, but I want to go a little fancier. Thanks
Title: DSP Loudness Control
Post by: pdq on 2012-01-10 12:46:50
How about taking the unfiltered data, and the data with max bass boost applied, and calculate some linear combination of the two based on volume setting?
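A minimal sketch of this suggestion, in Python. It assumes you already have two sample blocks of equal length, one unfiltered and one run through a fixed maximum-boost filter. The name `blend_bass_boost` and the linear mapping from volume position to mix factor are illustrative assumptions, not something specified in the thread.

```python
def blend_bass_boost(dry, boosted, volume, max_volume=1.0):
    """Mix unfiltered and fully boosted samples; more boost at low volume."""
    # Assumption: `volume` is a linear control position in [0, max_volume].
    position = min(max(volume / max_volume, 0.0), 1.0)
    mix = 1.0 - position  # 0.0 at full volume (flat), 1.0 at minimum (max boost)
    return [(1.0 - mix) * d + mix * b for d, b in zip(dry, boosted)]
```

Only the mix factor changes with the volume control, so no filter coefficients have to be recomputed. The blend is linear in amplitude, so the intermediate boost in dB does not track the mix factor linearly.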
Title: DSP Loudness Control
Post by: Woodinville on 2012-01-10 13:13:27
You need a time-varying system with knowledge of the absolute gain of the entire system, including the transducer, in order to make this work.
Title: DSP Loudness Control
Post by: DVDdoug on 2012-01-10 19:34:20
I think what you are looking for is a shelving filter (or something based on shelving filters) rather than a standard filter that boosts x dB/octave with a constant slope.
Title: DSP Loudness Control
Post by: easily_confused on 2012-01-10 20:43:59
Hmmm, implementing the Extron example is not as simple as I expected.

I guess a biquad configured as a variable-gain shelving filter is the simplest. That requires recomputing the coefficients based on a sine table.  I wonder where I could look to see who has already done a loudness control for a DSP?
Title: DSP Loudness Control
Post by: bandpass on 2012-01-15 09:48:58
SoX has an accurate FFT FIR filter implementation for loudness control; it also has simple biquad shelving filters (based on RBJ's biquad cookbook). I'm guessing maybe you want to take the code from the loudness filter that determines the bass boost (for a particular freq and loudness) and use this to help configure a low-shelf biquad.
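For reference, a low-shelf biquad following the formulas in RBJ's Audio EQ Cookbook can be sketched as below. This is not SoX's code: the function names, the default shelf slope of 1.0 and the normalised coefficient layout are assumptions made for the example. The gain to request would come from whatever loudness model you choose.

```python
import math

def low_shelf_coeffs(gain_db, f0_hz, fs_hz, slope=1.0):
    """Return normalised (b0, b1, b2, a1, a2) for an RBJ-style low shelf."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((A + 1.0 / A) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(A) * alpha
    b0 = A * ((A + 1.0) - (A - 1.0) * cos_w0 + k)
    b1 = 2.0 * A * ((A - 1.0) - (A + 1.0) * cos_w0)
    b2 = A * ((A + 1.0) - (A - 1.0) * cos_w0 - k)
    a0 = (A + 1.0) + (A - 1.0) * cos_w0 + k
    a1 = -2.0 * ((A - 1.0) + (A + 1.0) * cos_w0)
    a2 = (A + 1.0) + (A - 1.0) * cos_w0 - k
    # Divide through by a0 so the filter can run as
    # y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)

def dc_gain_db(coeffs):
    """Gain of the normalised biquad at 0 Hz, in dB (a sanity check)."""
    b0, b1, b2, a1, a2 = coeffs
    return 20.0 * math.log10((b0 + b1 + b2) / (1.0 + a1 + a2))
```

The trigonometric terms depend only on the corner frequency, so if the corner stays fixed they can be computed once and only the gain-dependent terms need updating when the volume changes.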
Title: DSP Loudness Control
Post by: Woodinville on 2012-01-16 00:05:16
Again, you need a signal-dependent EQ that operates with knowledge of the exact overall system gain. You've chosen a hard problem.
Title: DSP Loudness Control
Post by: hellokeith on 2012-01-16 04:20:47
Again, you need a signal-dependent EQ that operates with knowledge of the exact overall system gain. You've chosen a hard problem.

JJ,

How is low-volume bass boost different than other "maintaining intensity" problems?
Title: DSP Loudness Control
Post by: saratoga on 2012-01-16 04:39:23
I've always wondered whether there are any good open source implementations of these bass boost algorithms.  Or do people typically cook their own from scratch?
Title: DSP Loudness Control
Post by: Woodinville on 2012-01-16 05:24:14
Again, you need a signal-dependent EQ that operates with knowledge of the exact overall system gain. You've chosen a hard problem.

JJ,

How is low-volume bass boost different than other "maintaining intensity" problems?


It's not, and they all have the same problem.
Title: DSP Loudness Control
Post by: splice on 2012-03-18 11:00:39
... How is low-volume bass boost different than other "maintaining intensity" problems?

It's not, and they all have the same problem.


I don't see it as a hard problem to solve, if you restrict the solution to recorded music. Most recorded music has been deliberately equalised to sound "right" (as the artist, or at least the mastering engineer, intended) at a specific SPL. Many mastering engineers use an average level of -20dB ref to FS, and adjust monitor levels to give about 86 dB SPL C weighted at their listening position. Audio for home theatre uses  a reference of 105 dB at FS. That's pretty close to 86 dB at -20 dB FS. So a loudness compensation control doesn't have to be complicated. It just has to increase the bass level by a fixed ratio as the level (volume control) is reduced.

The required loudness compensation appears to be in the order of 2:1. For example, if you decrease the overall gain by 10 dB, you need to increase the bass by 5 dB at 20 Hz. Most simple tone control circuits max out at about 12 to 18 dB boost or cut. Matching this with a coupled volume control would result in a volume adjustment range of plus or minus 24 to 36 dB. Assuming an 86 dB SPL centre point, the 36 dB figure would correspond to an in-room range from 50 to 122 dB, which is adequate.
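The arithmetic in this paragraph can be written out as a small Python sketch. The 2:1 figure (a ratio of 0.5) and the 12 to 18 dB tone-control limit come from the post; the function names and the clamping behaviour are assumptions for illustration.

```python
def bass_boost_db(volume_change_db, ratio=0.5, max_boost_db=18.0):
    """Bass boost at 20 Hz for a given overall gain change (negative = quieter)."""
    boost = -ratio * volume_change_db
    # Assumption: the tone circuit simply saturates at +/- max_boost_db.
    return max(-max_boost_db, min(max_boost_db, boost))

def volume_range_db(max_boost_db, ratio=0.5):
    """Volume change (either direction) that uses up the available boost or cut."""
    return max_boost_db / ratio

# Example from the post: 10 dB quieter needs 5 dB more bass.
example_boost = bass_boost_db(-10.0)
# With 18 dB of boost/cut and an 86 dB SPL centre point:
spl_low = 86.0 - volume_range_db(18.0)
spl_high = 86.0 + volume_range_db(18.0)
```

Here `example_boost` is 5 dB, and `spl_low` and `spl_high` give the 50 to 122 dB in-room range quoted above.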

In practice, you set the control to "flat" and adjust a master gain so that the music sounds "right". The music should then still sound "right" as you decrease or increase the level from that point.
You might argue that you would need to repeat the setup for each source or track, but in practice most modern music is so heavily compressed ("loudness wars") that one calibration will suffice for most of it. Even before CD, most LPs had reasonably similar levels to each other - set, in this case, by the limitations of the playback cartridge. 

Most of the above was excerpted from posts I made a couple of years ago here:

http://www.diyaudio.com/forums/solid-state...ss-control.html (http://www.diyaudio.com/forums/solid-state/154209-reverse-old-loudness-control.html)

See post #31 for proof of concept circuit.


Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-30 09:40:12
First, the Fletcher-Munson data is not accurate.  See the work of S. S. Stevens, much better. Ever notice how a Fletcher-Munson loudness compensation control never sounds right?  That's because they got it wrong to begin with. 

Second, Woodinville is right, it IS a hard problem, it's not a fixed curve, and it is highly dependent on the specific acoustic play level, so it has to be dynamic. 

Next...why number them...BIG assumption that everybody mixes to a standard level in a standardized monitoring environment.  Not in the music industry!  Film, yes, but not music.  And that -20 dBFS would be nice, but doesn't happen after mastering, especially pop stuff.  Not even close.  Pretty much have to ignore dBFS in this case, it's not relevant.  System acoustic play level is though.  But in the context of correcting for differing hearing response at differing levels.  You're in no way matching the mix environment, there's just no way to know what it was, and it's not important anyway.

No,  you can't do it based on a volume control setting.  Been tried by many people for many years, but it doesn't work.  The reason is simple: the correction required is dependent on SPL, which a volume control may influence but doesn't predict and is not the only thing that affects it.  Hotter signal into it, and you turn it down, but that would change the compensation inappropriately.  There were even attempts to calibrate the compensation by adding another control, but it doesn't work because program dynamics are not fixed.  No, the correction must be tied to specific SPL, not a control setting.  That's actually where many people trying this messed up. 
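To make the distinction concrete, here is a rough Python sketch of a correction driven by estimated playback SPL rather than by the control position. Everything numeric is a placeholder: the calibration constant (105 dB SPL at full scale), the 86 dB reference and the 0.5 ratio are borrowed from figures mentioned elsewhere in this thread purely for illustration, and the function names are hypothetical. A real system would also need level smoothing and a frequency-dependent mapping.

```python
import math

def block_spl_db(samples, spl_at_full_scale_db=105.0):
    """Estimate playback SPL of a block from its RMS level and a calibration constant."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return float("-inf")  # digital silence; the caller should treat this specially
    return spl_at_full_scale_db + 10.0 * math.log10(mean_square)

def boost_from_spl(spl_db, reference_spl_db=86.0, ratio=0.5, max_boost_db=18.0):
    """Placeholder mapping: boost grows as playback SPL falls below the reference."""
    shortfall = reference_spl_db - spl_db
    return max(0.0, min(max_boost_db, ratio * shortfall))
```

With the same volume setting, a hotter block yields a higher estimated SPL and therefore less boost, which is the behaviour a control-position scheme cannot reproduce.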

If you look at the Stevens data, or even Fletcher-Munson for that matter, you'll see that the compensation curve families resemble that of a dynamics processor with a dynamic transfer function that changes with frequency, not a fixed filter.  The amount you need at 20Hz changes with level at a different rate than the amount you need at 200Hz. 

And finally, it's been done, and done quite well.  It's called Audyssey Dynamic Volume and Dynamic EQ.  Rather than base their idea on existing loudness research, their algorithm is based on what was essentially reverse-engineering human loudness perception.  They took LOTs of data on lots of subjects, with lots of different program material and the result is pretty darn good.  The big advantage is, once an Audyssey system has been calibrated it knows the exact SPL at every moment regardless of volume control setting or variations in program material, so it can apply the right correction dynamically. Pretty darn smart, those guys.
Title: DSP Loudness Control
Post by: splice on 2012-03-30 12:12:36
First, the Fletcher-Munson data is not accurate.  See the work of S. S. Stevens, much better. Ever notice how a Fletcher-Munson loudness compensation control never sounds right?  That's because they got it wrong to begin with.


That's true, but you're putting up a straw man. I haven't said which "loudness curve" I based my reasoning on, either here or in the thread I referenced. In fact, I used ISO 226:2003, which was referenced by someone else early in the thread.
Is Stevens still about? I haven't seen any work from him for many years.

Second, Woodinville is right, it IS a hard problem, it's not a fixed curve, and it is highly dependent on the specific acoustic play level, so it has to be dynamic.


Ah, yes, but that doesn't make it hard to solve, at least approximately. Say you increase the level at 1 kHz by 6 dB. The change required to produce a similar perceived level increase at, say, 20 Hz is about 3 dB. This ratio holds true over a wide range of phons.  So all you need is a coupled level and bass equalisation control that, for every 6 dB of level reduction, adds equalisation resulting in 3 dB of boost at 20 Hz relative to 1 kHz.  (Every time the midrange level drops by 6 dB, the bass level drops by only 3 dB.)

Next...why number them...BIG assumption that everybody mixes to a standard level in a standardized monitoring environment.  Not in the music industry!  Film, yes, but not music.  And that -20 dBFS would be nice, but doesn't happen after mastering, especially pop stuff.  Not even close.  Pretty much have to ignore dBFS in this case, it's not relevant.  System acoustic play level is though.  But in the context of correcting for differing hearing response at differing levels.  You're in no way matching the mix environment, there's just no way to know what it was, and it's not important anyway.


Standardised, or at least similar, monitoring levels are more common than you might think once you move up the ladder a bit. Ask Bob Katz, he could "bore for Africa" on the subject. And the -20 dBFS is relevant for monitoring when mastering, it has no relevance to the final released media level. As for matching the mix environment, try it yourself, assuming you have a competent reproduction chain. For most genres other than the highly artificial (electronica etc), there is a definite SPL at which they sound "right". So even though it may not match the mix environment levels, it sounds balanced to you on your system.

No,  you can't do it based on a volume control setting.  Been tried by many people for many years, but it doesn't work.  The reason is simple: the correction required is dependent on SPL, which a volume control may influence but doesn't predict and is not the only thing that affects it.  Hotter signal into it, and you turn it down, but that would change the compensation inappropriately.  There were even attempts to calibrate the compensation by adding another control, but it doesn't work because program dynamics are not fixed.  No, the correction must be tied to specific SPL, not a control setting.  That's actually where many people trying this messed up.


You need two controls. One to set the initial volume level so that it sounds "right", then the coupled control to change the volume to the setting you want to listen at.  In theory you would need to do this for each track, or at least each album, but in practice most sources of a given genre and age have similar levels. If you play old vinyl, you should be familiar with the way that the majority of LPs end up being played within a relatively small arc of the volume control. Ditto but different setting for old CDs, and again for current "loudness war" CDs. Apple's Sound Check and ReplayGain standardise the levels even more.
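The two-control arrangement described here could be modelled roughly as follows. This is a Python sketch only: the class and field names are hypothetical, and the 0.5 ratio is the ballpark 2:1 figure from earlier in the thread.

```python
from dataclasses import dataclass

@dataclass
class CoupledLoudnessControl:
    master_gain_db: float = 0.0  # set once, with EQ flat, until the music sounds "right"
    offset_db: float = 0.0       # listening-level change made afterwards (negative = quieter)
    ratio: float = 0.5           # assumed bass-to-midrange tracking ratio

    def midrange_gain_db(self):
        """Overall gain applied at 1 kHz."""
        return self.master_gain_db + self.offset_db

    def bass_gain_db(self):
        """Gain applied at 20 Hz: follows only `ratio` of the offset."""
        return self.master_gain_db + self.ratio * self.offset_db

    def relative_bass_boost_db(self):
        """Shelf boost the EQ must supply at 20 Hz relative to 1 kHz."""
        return self.bass_gain_db() - self.midrange_gain_db()
```

Changing `master_gain_db` alone leaves the relative boost at zero, so the calibration step stays flat; only the coupled `offset_db` control introduces equalisation.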

... And finally, it's been done, and done quite well.  It's called Audyssey Dynamic Volume and Dynamic EQ. Rather than base their idea on existing loudness research, their algorithm is based on what was essentially reverse-engineering human loudness perception.  They took LOTs of data on lots of subjects, with lots of different program material and the result is pretty darn good.  The big advantage is, once an Audyssey system has been calibrated it knows the exact SPL at every moment regardless of volume control setting or variations in program material, so it can apply the right correction dynamically. Pretty darn smart, those guys.


... and missing the point when it comes to music dynamics. Chris acknowledges that the Audyssey dynamically changes the EQ in response to changing program levels. But this is exactly what you do not want when listening to music. As I said elsewhere:
".... Take Ravel's "Bolero". The double bass initially comes in while the levels are still moderate. The loudness of the bass is chosen to be audible but not overpowering. As the piece progresses and the overall levels rise, the bass level also rises but still in proportion to the rest of the players - if the overall level rises by 10 dB, the bass level rises by somewhat less. The point is that "loudness compensation" is built into music by the composers / musicians / mix engineers, and if you make a static adjustment to the volume, you only need to make a static adjustment to the loudness compensation. The rest is already taken care of in the music. "

And why I think loudness compensation is needed:
"... In my opinion, music is best listened to at the SPL at the listener position that it was created for. (Creation may mean the original performance, or the engineer's creation of a mix of separate components recorded at different times in different acoustics - or no acoustic at all in many cases.) If we normally listened at this SPL, there would be no need for any loudness compensation. But we do like to listen at different levels for several good reasons, and when we do so we no longer hear the intended tonal balance. Many of us like to adjust the tonal balance at our chosen listening level so that it is similar to the perceived tonal balance at the "correct" level. Done properly, we find this adjustment effective and pleasing. It is an effective mitigation of the degradation forced by having to listen at a different level to that which the work was intended for.
..."

And on the topic of tonal balance change with level:
"... In the specific case of loudness compensation, we aren't correcting for human hearing deficiencies. We're compensating for deficiencies in the reproduction environment.

In a "live" situation, if we move away from the source we experience an overall level decrease. In addition, the treble decreases somewhat faster than the midrange, and the bass somewhat less. We perceive this as a natural tonal balance change, which needs no correction.

In a reproduction scenario, if we reduce the volume by a similar amount, the tonal balance does not change. Compared to a distance increase, we have too much treble and not enough bass. We perceive this as unnatural. This is why I believe in leaving the HF compensation alone and just boosting the bass. The natural change in HF sensitivity of the ears, as illustrated by the "loudness curves", takes care of the required additional HF attenuation, so only the bass requires compensation. ..."

I suggest you read the referenced thread in DIYAudio, if you haven't done so already. All of the points you raised were also raised there.
Title: DSP Loudness Control
Post by: Woodinville on 2012-03-30 21:05:35
First, the Fletcher-Munson data is not accurate.  See the work of S. S. Stevens, much better. Ever notice how a Fletcher-Munson loudness compensation control never sounds right?  That's because they got it wrong to begin with.


Whoa, there, the reason a "loudness control" doesn't work is simple, it doesn't work because it is not time varying (according to signal and absolute presentation level) or signal dependent.

Stevens' curves and Fletcher's curves are not far off, if you remember that one used open ear canals and one closed ear canals.  Unsurprisingly, the frequency of the ear canal resonance shifts by approximately an octave as a result. No surprise there.

I wouldn't be so fast to dismiss Fletcher, especially since by any reasonable reading, Stevens is more confirmation than anything else.  Claiming "Fletcher got it wrong" is just unjustifiable, and is almost as bad yellow journalism as the crap put forth in the article in Spectrum where it was asserted that Fletcher was trying to figure out how cheap AT&T could make transmission.

If you build a codec based on Fletcher's results (using modern understanding), you get to AAC, via the original version of AT&T PAC, from before the "trivestiture". That's not conjecture, that's personal experience.

And Splice, please realize that the thing you want to build must be signal dependent, and must be tied to absolute presentation level as a function of frequency.  Signal dependency is not an option, it's a requirement.
Title: DSP Loudness Control
Post by: splice on 2012-03-31 01:42:22
... And Splice, please realize that the thing you want to build must be signal dependent, and must be tied to absolute presentation level as a function of frequency.  Signal dependency is not an option, it's a requirement.


I'm missing some crucial piece of understanding. Please bear with this "bear of very small brain" for a bit...
"Signal dependent" - do you mean in time or frequency?
To me, one implies a dynamic EQ that adjusts itself according to the current level or spectral content of the signal (e.g. the Audyssey processor), the other a static EQ, the curve of which is adjusted according to the auditory system behaviour described by the "equal loudness" curves.

"Tied to absolute presentation level as a function of frequency" - I interpret this as saying that each chosen presentation SPL must have a matching EQ curve. My assertion is that each *change* in presentation SPL requires a fixed *change* in the EQ curve. Almost, but not quite, the same thing.

"Signal dependency being a requirement" - I take that to be the first part of the process. With no EQ, adjust the listening SPL until the source sounds "right" or "natural" or otherwise sounds pleasing. Now use the "loudness" control to make all level adjustments after that.

(If I were to implement this in DSP instead of analog controls, I'd make it more user friendly by unidirectionally coupling the level and loudness controls for the initial adjustment.)

I think part of the understanding problem is the way that the function being compensated for is dynamic - the amount of compensation required changes as the absolute SPL changes, so how can it be compensated for by a statically adjusted EQ? As I tried to explain earlier, the spectral balance of the music is fixed at source - the "Bolero" example - so a fixed compensation is appropriate. You don't have to adjust the bass tone control as a piece of music goes from pianissimo to fortissimo - the musicians have done that for you already. All the "loudness" control does is mimic what happens when you walk from the front of the hall to the back - although a concert hall is a bad example, perhaps more like an open-air concert. 

I've been procrastinating because of the difficulty of building such a circuit in the analog domain - not difficult for me, but a disincentive for anyone wanting to try it who doesn't have constructional skills. It occurred to me last night that I should try my hand at coding a foobar plugin implementation. It would make it easy for anyone wanting to try it out.


---------------
Regards,
  Don Hills 


Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-31 03:13:51
Perhaps my statement as to Fletcher-Munson getting it wrong was a bit too generalized.  Their data was accurate for the conditions in which it was taken, and the test equipment available in that day.  But, since those conditions included pure tones as stimuli presented as a frontal field in an anechoic space, the resulting curves don't represent the actual correction needed for real listening environments.  The really unfortunate part of Fletcher-Munson is that the curves became widely adopted, but almost entirely misunderstood.  They were applied as complete loudness correction curves, when in fact, they represent human hearing response (in those specific test conditions).  Loudness compensation doesn't need to correct for human hearing response, it just needs to correct for the variance in response at differing levels.

Later research using more modern measurement equipment and more appropriate methods yielded better data.  Yet even though adopted as a standard, Robinson-Dadson's data (pure tones again, but presented with headphones or random incidence) isn't as pertinent to real listening conditions as is really required, and they freely admit that fact in their paper.

There have been many, historically, who have attempted to characterize human loudness perception, some of them fairly well known (Zwicker, or the Bauer-Torick papers), and what's interesting is that there is reasonable correlation between all of them, particularly in that the ear response is anything but flat even at live music levels of 100 dB or so.  The exception, interestingly, is the Fletcher-Munson data, which shows response at 100 dB that is much flatter than any other curve family.  For that reason alone, the F/M data would not be appropriate to apply in a loudness compensation scheme.

To complicate things, as most equal loudness curves go, even Fletcher-Munson, the high-frequency portions of the curves above 1 kHz are parallel, and so no adjustment is required in that range.  But designers applying the F/M equal loudness data to a loudness comp circuit often used the entire curve!  So we had boost at the top and bottom, and of course, the wrong amount at the bottom in any case.

Stevens' work included a wide variety of stimulus methods, from diffuse, free-field, earphones, etc., and included several subjective quantities as well (annoyance, etc.).  One of his test systems extended down to 1 Hz!  And while that's not useful for loudness compensation, it's notable since other research stops at 20 Hz.

At the risk of repetition, it's important to note that the equal-loudness contours found in Stevens, F/M or any other do not represent the actual correction needed, but reflect hearing response.  The correction system would actually apply a differential curve, which would, in fact, have to be dynamically variable, a fact easily seen on any equal loudness contour family.

So, I'm afraid I'll modify my Fletcher-Munson comment only slightly: Their data is valid for their test conditions, and considering the limitations of test equipment of the day.  However, to consider it at all relevant to an actual loudness compensation algorithm would be an error.  Perhaps that's more accurate than "they got it wrong", but you see what I mean. 

As to the supposed loudness compensation built into music by composers (Ravel, et al),  their hypothetical compensation is valid for only one listening position: the conductor's podium.  Every other seat in the house will hear something else.  However, no seat will have a basic level change anywhere near 20 dB.  Yet that's the kind of level shift we see in recorded music played in private listening conditions.  With that kind of offset, and looking at any equal loudness contour curve family, anyone can see that for this to work it must be dynamic and must operate with the knowledge of actual playback SPL.  No fixed modifier would be correct at anything but one specific SPL.

The dual-control loudness compensation idea has been tried (Yamaha, late 1970s, early 1980s,  Apt-Holman, 1978), but has not survived even though the Apt-Holman implementation actually applied correction based on the Stevens data.  The reason is simple: people can't be depended upon to make continual subjective evaluation and apply correction.  Two knobs might get you close, but only at one SPL (at least some music still has dynamic range), and one volume setting.  The knob would require constant adjustment, something no listener will do. 

Bob Katz has made some excellent inroads in studios, but there's decades of music already recorded and released without any of that, and still today volumes of music released without standardization.  The film industry became standardized, at least in the high-fidelity sense, when Dolby Labs became involved.  That's been 40 years.  No, it's wild in music to this day, though getting better. 

I don't know how else to make the point that compensation must be dynamic, but if all of the above doesn't do it, perhaps ask yourself: if it's so simple as to be a fixed, static correction, why at this point in history have we moved completely away from fixed-curve and dual control systems? Why do the pre-eminent voices in this field all say it has to be dynamic?  Must be something they know.
Title: DSP Loudness Control
Post by: Woodinville on 2012-03-31 03:40:11
I'm missing some crucial piece of understanding. Please bear with this "bear of very small brain" for a bit...
"Signal dependent" - do you mean in time or frequency?
To me, one implies a dynamic EQ that adjusts itself according to the current level or spectral content of the signal (e.g. the Audyssey processor), the other a static EQ, the curve of which is adjusted according to the auditory system behaviour described by the "equal loudness" curves.


You need frequency domain equalization (i.e. a filter curve) that varies with the signal (and of course frequency and presentation level), and where the actual gain of the system post-filter is known to a dB or so.
Title: DSP Loudness Control
Post by: Woodinville on 2012-03-31 03:44:24
Perhaps my statement as to Fletcher-Munson getting it wrong was a bit too generalized.  Their data was accurate for the conditions in which it was taken, and the test equipment available in that day.  But, since those conditions included pure tones as stimuli presented as a frontal field in an anechoic space, the resulting curves don't represent the actual correction needed for real listening environments.  The really unfortunate part of Fletcher-Munson is that the curves became widely adopted, but almost entirely misunderstood.  They were applied as complete loudness correction curves, when in fact, they represent human hearing response (in those specific test conditions).  Loudness compensation doesn't need to correct for human hearing response, it just needs to correct for the variance in response at differing levels.


Interestingly, F-M and Stevens disagree on loudness growth at low frequencies, and having built systems using both models for loudness growth, I've been much, much more successful with a variation on F-M than I have with Stevens (annoyingly that work belongs to long-former employer, not even a recently former employer, and it hasn't been put to any use at all).

As to the flatness concern, once you realize that the bandwidth of the critical bands emerges as a factor, F-M makes a great deal of sense, actually.

But, as far as loudness ratio, I've had much more success with loudness ratios using a model I can't talk about (snarl, hiss, grumble) very much that are derived from F-M. Fletcher and Munson show more loudness growth at threshold than Stevens, and that's also been my experience.
Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-31 06:20:39
Interestingly, F-M and Stevens disagree on loudness growth at low frequencies, and having built systems using both models for loudness growth, I've been much, much more successful with a variation on F-M than I have with Stevens (annoyingly that work belongs to long-former employer, not even a recently former employer, and it hasn't been put to any use at all).

As to the flatness concern, once you realize that the bandwidth of the critical bands emerges as a factor, F-M makes a great deal of sense, actually.

But, as far as loudness ratio, I've had much more success with loudness ratios using a model I can't talk about (snarl, hiss, grumble) very much that are derived from F-M. Fletcher and Munson show more loudness growth at threshold than Stevens, and that's also been my experience.


Ah, someone with real hands on, now that's a treat!

What would be your comment on why F-M didn't work historically?  And why the Stevens-based systems worked markedly better, if much more rarely? I'd have some ideas, but I'd rather hear it from someone who made an F-M system actually work.

This would fill in a few holes in what has been a 35 year hot topic for me.  F-M always seemed to do too much  in the LF in the classic realizations.
Title: DSP Loudness Control
Post by: splice on 2012-03-31 07:56:00
(Excuse my trimming of quotes, I'm trying to keep post lengths down. If you think I've trimmed too much and misrepresented your points, please say so.)

...  the resulting curves don't represent the actual correction needed for real listening environments.  ...  Loudness compensation doesn't need to correct for human hearing response, it just needs to correct for the variance in response at differing levels.


That's it exactly.  You understand it here, but why do you remain skeptical at the end of your post?

To complicate things, as most equal loudness curves go, even Fletcher-Munson, the high-frequency portions of the curves above 1 kHz are parallel, and so no adjustment is required in that range.  But designers applying the F/M equal loudness data to a loudness comp circuit often used the entire curve!  So we had boost at the top and bottom, and of course, the wrong amount at the bottom in any case.


This is where so many seem to get it wrong. As you point out, there's no need to apply an EQ curve that's the inverse of a given "equal loudness" contour of whatever provenance. All that is needed is a correction to compensate for reproducing a source (music) at a different level from the one it was originally performed / mastered for.  Take a look at the ISO 226:2003 "equal loudness" curves. As a crude example, imagine you're listening to two tones - 20 Hz and 1 kHz - at the 60 phon level. You perceive them as equally loud, although the 1 kHz tone is at 60 dB SPL and the 20 Hz tone is at 110 dB SPL. Now you turn down the "volume" by 20 dB. This is equivalent to moving the 60 phon curve down to the 40 phon curve. The problem is that they don't match. You've lowered the 1 kHz signal from 60 dB SPL to 40 dB SPL, and you've lowered the 20 Hz signal from 110 dB SPL to 90 dB SPL. But you can see from the curves that a 20 Hz signal should be reproduced at 100 dB SPL to match the 40 dB SPL 1 kHz signal in loudness. In short, if you change the level at 1 kHz by x dB, you have to change the level at 20 Hz by x/2 dB. This is a ratio, not a fixed EQ. It doesn't need an absolute reference level to work.
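The worked example above, restated as a few lines of Python. The table holds only the approximate SPL values quoted in this post, not the full ISO 226:2003 data, and the names are made up for the example.

```python
# Approximate SPL values (dB) as quoted in the post, keyed by (phon, frequency in Hz).
QUOTED_CONTOUR_SPL = {
    (60, 1000): 60.0, (60, 20): 110.0,
    (40, 1000): 40.0, (40, 20): 100.0,
}

def bass_shortfall_db(start_phon, end_phon, bass_hz=20, ref_hz=1000,
                      spl=QUOTED_CONTOUR_SPL):
    """Boost needed at bass_hz after a flat gain change between two contours."""
    flat_change = spl[(end_phon, ref_hz)] - spl[(start_phon, ref_hz)]
    bass_after_flat_change = spl[(start_phon, bass_hz)] + flat_change
    return spl[(end_phon, bass_hz)] - bass_after_flat_change

def bass_to_mid_ratio(start_phon, end_phon, bass_hz=20, ref_hz=1000,
                      spl=QUOTED_CONTOUR_SPL):
    """Level change needed at bass_hz per dB of change at ref_hz."""
    bass_change = spl[(end_phon, bass_hz)] - spl[(start_phon, bass_hz)]
    ref_change = spl[(end_phon, ref_hz)] - spl[(start_phon, ref_hz)]
    return bass_change / ref_change
```

For the 60 to 40 phon step, the flat 20 dB cut leaves the 20 Hz tone at 90 dB SPL where 100 dB SPL is needed, so the shortfall is 10 dB and the ratio is 0.5.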

Stevens' work included a wide variety of stimulus methods, from diffuse, free-field, earphones, etc., and included several subjective quantities as well (annoyance, etc.).  One of his test systems extended down to 1 Hz!  And while that's not useful for loudness compensation, it's notable since other research stops at 20 Hz.


Other researchers have done work in the 1-20 Hz area recently. I have papers by Yeowart and Evans, and Møller and Pedersen, but there may well be others.

As to the supposed loudness compensation built into music by composers (Ravel, et al),  their hypothetical compensation is valid for only one listening position: the conductor's podium.  Every other seat in the house will hear something else.  However, no seat will have a basic level change anywhere near 20 dB.  Yet that's the kind of level shift we see in recorded music played in private listening conditions.  With that kind of offset, and looking at any equal loudness contour curve family, anyone can see that for this to work it must be dynamic and must operate with the knowledge of actual playback SPL.  No fixed modifier would be correct at anything but one specific SPL.


I'm not proposing a fixed modifier. I'm proposing a fixed ratio (2:1 as a ballpark figure). If the loudness compensation (crudely, the bass level relative to the midrange level) is correctly set at one specific SPL, and the ratio is applied to any level change, then the compensation will also be correct at the new SPL.

The dual-control loudness compensation idea has been tried (Yamaha, late 1970s, early 1980s,  Apt-Holman, 1978), but has not survived even though the Apt-Holman implementation actually applied correction based on the Stevens data.  The reason is simple: people can't be depended upon to make continual subjective evaluation and apply correction.  Two knobs might get you close, but only at one SPL (at least some music still has dynamic range), and one volume setting.  The knob would require constant adjustment, something no listener will do.


I'm aware of the earlier schemes. They did not accurately couple (or in some cases couple at all) the level and "loudness compensation EQ" controls. I do. "Continual subjective evaluation" is thus not required, and the right amount of correction is applied regardless of the listening level.

Bob Katz has made some excellent inroads in studios, but there's decades of music already recorded and released without any of that, and still today volumes of music released without standardization.


Most of my listening is to various sub-genres of "rock". My vinyl collection spans some 20 years. Almost all of it plays back within a 20 degree arc of the volume control. My early CDs play back as a group, there's the 90s transition, then most of the last 10 years play back as another group. I accept that other genres may be more varied in their playback levels.  My point is that it's not hard to establish a playback level that the music was intended to be heard best at, and this level doesn't vary all that much between like grouped sources.

I don't know how else to make the point that compensation must be dynamic, but if all of the above doesn't do it, perhaps ask yourself: if it's so simple as to be a fixed, static correction, why at this point in history have we moved completely away from fixed-curve and dual control systems? Why do the pre-eminent voices in this field all say it has to be dynamic?  Must be something they know. ...


I remain unconvinced. I'm not proposing a "fixed, static" correction. My proposal is also different than any "dual control" system I have seen, and I have been looking hard. And if by "dynamic" you mean that the EQ adjusts itself based on the (varying) level of the source, then I disagree strongly. That would be equivalent to twiddling the bass tone control to match the loud and quiet parts of the music, and we just don't do that. (Well, I don't, anyway.)

One more time... My system has two knobs. As I originally envisaged it, one knob is more or less "set and forget" for a given genre and input source, especially if the source has Soundcheck or Replaygain. The other knob is the main "volume" control. Adjusting this control also applies the correct amount of "loudness compensation" for that volume. In concept, the bass tone control is ganged to the volume control. Where this differs from other schemes is that the ratio of bass to overall level is fixed, and matches the ratio inherent in the "equal loudness" curves.

An alternative scheme which may be more user friendly is to again have two knobs - one labeled "volume" and one labeled "bass", which actually sets the operating point of the loudness compensation. Adjust the volume control to your desired level, regardless of the original intended playback level, then adjust the bass control to your taste - "not too heavy, not too light". But behind the scenes, the two controls are actually linked, so any subsequent adjustment of the volume control automatically applies the correct level of loudness compensation.


Title: DSP Loudness Control
Post by: Woodinville on 2012-03-31 09:25:55
What would be your comment on why F-M didn't work historically?  And why the Stevens-based systems worked markedly better, if much more rare? I'd have some ideas, but I'd rather hear it from someone who made an F-M system actually work.


Not having hands-on other Stevens systems, I suspect it was getting the skirts on the cochlear filters right.  The upward spread that reduces loudness of higher frequencies near masking level can bite pretty hard if you don't get it right.

But that is a conjecture.
Title: DSP Loudness Control
Post by: Woodinville on 2012-03-31 09:27:49
An alternative scheme which may be more user friendly is to again have two knobs - one labeled "volume" and one labeled "bass", which actually sets the operating point of the loudness compensation. Adjust the volume control to your desired level, regardless of the original intended playback level, then adjust the bass control to your taste - "not too heavy, not too light". But behind the scenes, the two controls are actually linked, so any subsequent adjustment of the volume control automatically applies the correct level of loudness compensation.


For a given standard genre, this has a shot at working "ok", I think.

Unusual music will give it fits, though, I suspect.
Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-31 10:01:41
I remain unconvinced. I'm not proposing a "fixed, static" correction. My proposal is also different than any "dual control" system I have seen, and I have been looking hard. And if by "dynamic" you mean that the EQ adjusts itself based on the (varying) level of the source, then I disagree strongly. That would be equivalent to twiddling the bass tone control to match the loud and quiet parts of the music, and we just don't do that. (Well, I don't, anyway.)
Ok, but that's precisely what is required. 
One more time... My system has two knobs. As I originally envisaged it, one knob is more or less "set and forget" for a given genre and input source, especially if the source has Soundcheck or Replaygain. The other knob is the main "volume" control. Adjusting this control also applies the correct amount of "loudness compensation" for that volume. In concept, the bass tone control is ganged to the volume control. Where this differs from other schemes is that the ratio of bass to overall level is fixed, and matches the ratio inherent in the "equal loudness" curves.
Yes, I understand what you are saying, but please understand that this has been done, and did not succeed because of an error in concept outlined in your previous sentence, "the ratio of bass to overall level is fixed, and matches the ratio inherent in the "equal loudness" curves."  The curve families show the ratio of bass to overall level is not fixed, it's a non-linear relationship.  Because it's non-linear, every time you change the overall level, you operate at a point where the rate-of-change in bass sensitivity is different, and the lower the overall level the faster the rate of change in bass sensitivity.  It's the rate-of-change problem that dictates the fact that compensation cannot be fixed.  It must track the rate of change of bass sensitivity of the ear.  The ear/brain system has what is essentially a volume expander that is both frequency and level dependent.  The expansion ratio is dependent on the specific SPL as well as the specific frequency of stimulus.  That's why it takes a family of equal loudness curves to show what's actually going on, and also takes something fairly complex and dynamic to perform the compensation.  We're kicking around the details of what curve-set to follow in other posts, but they all have this non-linear ratio characteristic.
An alternative scheme which may be more user friendly is to again have two knobs - one labeled "volume" and one labeled "bass", which actually sets the operating point of the loudness compensation. Adjust the volume control to your desired level, regardless of the original intended playback level, then adjust the bass control to your taste - "not too heavy, not too light". But behind the scenes, the two controls are actually linked, so any subsequent adjustment of the volume control automatically applies the correct level of loudness compensation.

Actually, this is exactly what Yamaha and Holman did, as I mentioned before.  Yamaha had a Volume knob and a Loudness knob, which set the amount of fixed bass boost.  You set the volume, then set the loudness to taste.  The Apt-Holman preamp had a Volume control and a Bass control with a contour that matched a differential curve derived from Stevens.  Again, you set the volume as desired, then set the bass for the best subjective compensation.  However, in Holman's 1977 AES paper, "Loudness Compensation: Use and Abuse", from which his bass control-loudness comp was derived, among his conclusions he states that while variable compensation is required, the only correct solution would be for it to react to program material (sorry, I'll post the exact quote later).  The technology to accomplish that accurately and economically didn't exist then, but does now.

It would seem that this conclusion is supported by the fact that loudness compensation began to vanish from consumer audio gear from the 1980s onward, and with the exception of several DSP based dynamic systems such as the aforementioned Audyssey (which, by the way, Holman was involved with) and Dolby Volume (subjectively less effective, because while it's dynamic, it doesn't have specific SPL information on which to base its correction), loudness compensation has not generally made a reappearance in today's products.  If you can answer the question of why that might be, you'll also answer the question of why a simple approach isn't effective and therefore not used.
Title: DSP Loudness Control
Post by: splice on 2012-03-31 11:05:28
You need frequency domain equalization (i.e. a filter curve) that varies with the signal (and of course frequency and presentation level), and where the actual gain of the system post-filter is known to a dB or so.


OK, I think I get it. Let me explain it as I understand it and please tell me if I have it right.

As I see it, the source was performed / mastered to sound "right" at a given SPL. This implies listening to it at that SPL, where the relative levels of bass vs midrange are as intended. "Loudness compensation" maintains the perceived balance between bass and midrange when the "volume control" is adjusted to listen at SPLs other than those for which the music was "designed". I suspect we're all in agreement so far?

Where we appear to differ is in how this compensation might be achieved, and how difficult it would be for the listener to use.
I see it as easy to achieve. In principle it is a Baxandall type bass tone control coupled to a "volume" control - lowering the listening level increases the bass boost and vice versa. The coupling ratio is set so that a change in the listening level is matched by a change in the bass level that restores the perceived balance between bass and midrange. The coupling is "one way" - you can adjust the bass separately from the volume, but adjusting the volume will always adjust the bass.
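One way to sketch the "tone control ganged to the volume control" idea in DSP terms is a low-shelf biquad whose gain is computed from the volume setting. The coefficient formulas below are the widely used Audio EQ Cookbook low-shelf form; the 200 Hz corner, the shelf slope of 1, the 48 kHz sample rate, the 0.5 ratio and all function names are assumptions for illustration. A single shelf is only a crude stand-in for the real difference between two contours.

```python
import math


def low_shelf_coeffs(gain_db, f0=200.0, fs=48000.0, slope=1.0):
    """Low-shelf biquad (Audio EQ Cookbook form), normalised so a0 == 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((A + 1.0 / A) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(A) * alpha
    b0 = A * ((A + 1) - (A - 1) * cos_w0 + k)
    b1 = 2 * A * ((A - 1) - (A + 1) * cos_w0)
    b2 = A * ((A + 1) - (A - 1) * cos_w0 - k)
    a0 = (A + 1) + (A - 1) * cos_w0 + k
    a1 = -2 * ((A - 1) + (A + 1) * cos_w0)
    a2 = (A + 1) + (A - 1) * cos_w0 - k
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]


def coupled_controls(volume_db, bass_trim_db=0.0, ratio=0.5):
    """Volume knob drives the shelf gain; the bass knob only adds a trim.

    volume_db is relative to the level where the balance was set
    (0 = reference, negative = quieter).
    """
    shelf_gain_db = -ratio * volume_db + bass_trim_db
    b, a = low_shelf_coeffs(shelf_gain_db)
    master_gain = 10.0 ** (volume_db / 20.0)
    return master_gain, b, a


gain, b, a = coupled_controls(volume_db=-20.0)
dc_gain_db = 20.0 * math.log10(sum(b) / sum(a))
print(round(dc_gain_db, 3))  # shelf boost at DC, in dB
```

The coupling is one way, as described: changing `bass_trim_db` leaves the master gain alone, while changing `volume_db` always moves the shelf gain.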

To use it, you play the music, and adjust the volume control to your desired listening level. You then adjust the bass until it sounds right to you - "not too heavy, not too light." (Alternatively, you can initially adjust the volume to a "realistic" or "life size" level and set the bass to "flat".) You can then change the listening level and the "loudness compensation" will track the change, so the perceived bass - midrange balance doesn't change. This will likely require readjustment of the bass as well as the volume control if you make a significant change to the input level, for example from an 80s vintage CD to a 00s vintage CD, but I submit that it is an intuitive adjustment and a small price to pay for the added performance. This assumes a standalone implementation for use with classic analog sources, but it can be fully automated if you have all your music on a server with Soundcheck / Replaygain set.

This differs from earlier schemes of loudness compensation.
The "tapped volume control" is at best only an approximation of the desired curve, only works over part of the travel of the volume control, and only tracks properly if the normal listening level occurs close to maximum clockwise rotation of the pot.
The separate loudness control (Yamaha et al) does not track the volume control setting at all. It has to be readjusted for each volume setting change.
If you know of any implementation where the loudness control is coupled to the volume control, I'd like to hear of it. It's so obvious (to me at least) that I can't see why it hasn't been done before. I tend to agree that there must be a catch, that if it really were that easy it would have been done before, but so far no-one has come up with a practical reason why it wouldn't work beyond the added complexity to the user.


Title: DSP Loudness Control
Post by: splice on 2012-03-31 11:36:06
...  if by "dynamic" you mean that the EQ adjusts itself based on the (varying) level of the source, then I disagree strongly. That would be equivalent to twiddling the bass tone control to match the loud and quiet parts of the music, and we just don't do that. (Well, I don't, anyway.)
Ok, but that's precisely what is required.


Why do you think it is required? If you wouldn't manually adjust the bass control during the "performance", for example during the playing of Ravel's "Bolero", why would you want a DSP to do it for you?

...  Yes, I understand what you are saying, but please understand that this has been done, and did not succeed because of an error in concept outlined in your previous sentence, "the ratio of bass to overall level is fixed, and matches the ratio inherent in the "equal loudness" curves."  The curve families show the ratio of bass to overall level is not fixed, it's a non-linear relationship.  Because it's non-linear, every time you change the overall level, you operate at a point where the rate-of-change in bass sensitivity is different, and the lower the overall level the faster the rate of change in bass sensitivity.  It's the rate-of-change problem that dictates the fact that compensation cannot be fixed.  It must track the rate of change of bass sensitivity of the ear.  The ear/brain system has what is essentially a volume expander that is both frequency and level dependent.  The expansion ratio is dependent on the specific SPL as well as the specific frequency of stimulus.  That's why it takes a family of equal loudness curves to show what's actually going on, and also takes something fairly complex and dynamic to perform the compensation.  We're kicking around the details of what curve-set to follow in other posts, but they all have this non-linear ratio characteristic.


What's non-linear about a 2:1 ratio that holds over at least an 80 dB range? I'm missing something here. (Not my marbles, every morning I look in my toybox and there they are.) I'm getting an impression from you and JJ that loudness compensation EQ has to vary as the signal level varies within a given musical performance, and I find that hard to accept. If it is in fact the case, I have some learning to do. I'll have to get Holman's paper that you referenced.

Title: DSP Loudness Control
Post by: splice on 2012-03-31 13:03:29
... Unusual music will give it fits, though, I suspect.


The loudness control? Or the listener?
I'm getting an impression that there's more to perception of bass levels than the steady state indicated by the "equal loudness" curves.
Given a piece of recorded music with quiet parts and loud parts, played at a level different than that for which it was originally mixed, you appear to be saying that a different amount of loudness compensation will be required during the playing of the two different parts in order to make them both sound as "balanced" as they do when played at their intended level.
Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-31 17:19:04
I'm getting an impression from you and JJ that loudness compensation EQ has to vary as the signal level varies within a given musical performance, and I find that hard to accept. If it is in fact the case, I have some learning to do. I'll have to get Holman's paper that you referenced.


Journal of the AES, July/August 1978, volume 26, number 7/8, pp 526 - 536.

There's so much to quote that is relevant, but here's two portions of one paragraph:

"A completely technically correct system would also need to be dynamic, that is, it should compensate for the fact that the recording is not always playing at the peak level.  Instead, the system needs to look at the amount of attenuation between the original and reproduced sound pressure level and the amount of attenuation at any given point in time below the peak level." ..."...the translation between original and reproduced sound pressure level calls for less compensation at the upper end of the dynamic range than the lower end: the level translation causes the need for dynamic compensation." (bold italics, mine).


Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.
Title: DSP Loudness Control
Post by: dc2bluelight on 2012-03-31 17:38:36
You can buy the paper from the AES; if I get time I might be able to scan my physical copy. But clearly it's an old paper; the core of it and the outgrowth using modern technology can also be gleaned from this page:

http://www.audyssey.com/audio-technology/dynamic-eq (http://www.audyssey.com/audio-technology/dynamic-eq)
Title: DSP Loudness Control
Post by: splice on 2012-04-01 11:41:33
... Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.


Is this the differential you refer to?
At the 40 phon level, the differential between 1 KHz and 20 Hz is about 60 dB.
At the 60 phon level, the differential between 1 KHz and 20 Hz is about 50 dB.
At the 80 phon level, the differential between 1 KHz and 20 Hz is about 40 dB.

Title: DSP Loudness Control
Post by: dc2bluelight on 2012-04-01 17:03:35
... Take any equal loudness contour curve family and re-graph the data for any bass frequency as a gain differential between 1khz and that frequency.  You'll see the resulting graphs for lower frequencies are not straight lines, they are non-linear curves.


Is this the differential you refer to?
At the 40 phon level, the differential between 1 KHz and 20 Hz is about 60 dB.
At the 60 phon level, the differential between 1 KHz and 20 Hz is about 50 dB.
At the 80 phon level, the differential between 1 KHz and 20 Hz is about 40 dB.


From memory the actual differences above seem too high, but that's the general idea.
Title: DSP Loudness Control
Post by: splice on 2012-04-01 22:04:36
From memory the actual differences above seem too high, but that's the general idea.


Those are the ISO 226:2003 differences. Here are the approximate differences for Fletcher-Munson, Robinson-Dadson, and ISO 226:2003:
Code: [Select]
Phon  F-M  R-D  ISO
----  ---  ---  ---
40    42   50   60
60    26   41   50
80    10   32   40


But the important figure is the delta between the differences:

Code: [Select]
Phon  F-M  R-D  ISO
----  ---  ---  ---
40    42   50   60
=20   =16   =9  =10
60    26   41   50
=20   =16   =9  =10
80    10   32   40


This holds true within a dB or two for other phon levels, especially in the ISO curves.
OK so far?
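The deltas in the table above can be recomputed from the three quoted rows. This is only a sketch over the approximate figures given in this post; the dictionary layout and function name are illustrative.

```python
# Approximate (20 Hz minus 1 kHz) differences in dB, as quoted in the table above.
differences = {
    "F-M": {40: 42, 60: 26, 80: 10},
    "R-D": {40: 50, 60: 41, 80: 32},
    "ISO": {40: 60, 60: 50, 80: 40},
}


def compensation_ratios(table):
    """dB of bass compensation per dB of volume change, per adjacent phon pair."""
    phons = sorted(table)
    return [
        (table[lo] - table[hi]) / (hi - lo)
        for lo, hi in zip(phons, phons[1:])
    ]


for name, table in differences.items():
    print(name, compensation_ratios(table))
```

For these figures the ratio comes out the same for both phon steps within each curve set: 0.8 for F-M, 0.45 for R-D and 0.5 for ISO.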

Title: DSP Loudness Control
Post by: dc2bluelight on 2012-04-01 22:37:42
This holds true within a dB or two for other phon levels, especially in the ISO curves.
OK so far?

Sorry, I don't have a convenient way to check your data for accuracy.  But assuming it is (accurate), what's your point?
Title: DSP Loudness Control
Post by: splice on 2012-04-02 01:49:14
... Sorry, I don't have a convenient way to check your data for accuracy.  But assuming it is (accurate), what's your point?


Maybe an example will help:


We have a piece of "music" that has the following characteristics:

1. It consists of a "loud part" and a "quiet part", with 10 dB difference.

2. The "instruments" produce a 1 KHz tone and a 20 Hz tone in both parts. 

3. The music sounds "as composed" when the two tones are of equal perceived loudness in both parts and the loud part is played at the 80 phon level. (The two parts sound the same, except that one is quieter than the other.)

4. The dB values in the following examples assume the use of the ISO 226:2003 curves. The values will be different if using other curves such as those by Fletcher-Munson or Robinson-Dadson, but the same principle and relative ratios still apply.

For ISO 226:2003, the SPL difference between "equal loudness" 1 KHz and 20 Hz tones are approximately:

80 phon level = 80 dB SPL 1 KHz, 120 dB SPL 20 Hz = 40 dB difference.
60 phon level = 60 dB SPL 1 KHz, 110 dB SPL 20 Hz = 50 dB difference.
40 phon level = 40 dB SPL 1 KHz, 100 dB SPL 20 Hz = 60 dB difference.
20 phon level = 20 dB SPL 1 KHz,  90 dB SPL 20 Hz = 70 dB difference.


Under the above conditions, the resultant SPLs for the example "music" played at its intended level are as follows:

Loud part:
1 KHz =  80 dB SPL
20 Hz = 120 dB SPL (40 dB difference)
Quiet part:
1 KHz =  70 dB SPL
20 Hz = 115 dB SPL (45 dB difference)

No loudness compensation is required for the quiet part. The music is being played at its "composed" level. The perceived balance between the "instruments" is as intended by the artist: in this case, both "instruments" are of equal perceived loudness during both parts.

The listener finds this too loud for their environment so reduces the "volume" level by 20 dB. The resultant SPLs are now as follows:
Loud part:
1 KHz =  60 dB SPL
20 Hz = 100 dB SPL (40 dB difference)
Quiet part:
1 KHz =  50 dB SPL
20 Hz =  95 dB SPL (45 dB difference)

The listener perceives a loss of bass, because the curves indicate that the "loud part" difference should be 50 dB and the "quiet part" difference should be 55 dB at this listening level. 10 dB of boost at 20 Hz needs to be applied to both the loud and quiet parts in order to restore the equal perceived loudness.

The listener reduces the "volume" level by a further 20 dB. The resultant SPLs are now as follows:
Loud part:
1 KHz =  40 dB SPL
20 Hz =  80 dB SPL (40 dB difference)
Quiet part:
1 KHz =  30 dB SPL
20 Hz =  75 dB SPL (45 dB difference)

The listener perceives a further loss of bass, because the curves indicate that the "loud part" difference should now be 60 dB and the "quiet part" difference should be 65 dB. Another 10 dB of boost at 20 Hz needs to be applied to both the loud and quiet parts in order to restore the equal perceived loudness.

The key points are that:

1. For each 10 dB of difference between the "artist intended" SPL and the actual listening SPL, 5 dB (at 20 Hz) of "loudness compensation" needs to be applied. (Assuming the ISO 226:2003 curves are used.)

2. The same amount of "loudness compensation" needs to be applied to the loud and quiet parts of the music.

3. "Dynamic" compensation, that alters according to the changing dynamics within the piece of music, will alter the artist's intended balance. The compensation amount should be determined only by the difference between the "artist intended" SPL and the actual listening SPL, set by adjusting the "volume control".
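The example above can be replayed numerically. This sketch assumes a straight-line fit through the four ISO 226:2003 figures quoted in this post for 20 Hz; the function names are made up for illustration, and the fit says nothing about other frequencies or about levels outside 20 to 80 phon.

```python
def bass_spl_for_equal_loudness(phon):
    """20 Hz SPL judged as loud as a 1 kHz tone at `phon` dB SPL.

    Straight-line fit to the four quoted points
    (20 -> 90, 40 -> 100, 60 -> 110, 80 -> 120); an assumption elsewhere.
    """
    return 80.0 + 0.5 * phon


def needed_boost_db(part_phon, volume_change_db):
    """Boost at 20 Hz that restores equal loudness after a volume change."""
    played_bass = bass_spl_for_equal_loudness(part_phon) + volume_change_db
    wanted_bass = bass_spl_for_equal_loudness(part_phon + volume_change_db)
    return wanted_bass - played_bass


for volume_change in (0.0, -20.0, -40.0):
    for label, phon in (("loud", 80.0), ("quiet", 70.0)):
        print(volume_change, label, needed_boost_db(phon, volume_change))
```

Under this straight-line assumption the loud and quiet parts always need the same boost: 0 dB at the intended level, 10 dB at 20 dB down, 20 dB at 40 dB down.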


For a practical implementation, two parameters need to be determined.

The first is the time constant of the tone control. This can be determined by examining the difference between two curves on the "equal loudness" curve. Two curves that differ by, for example, 20 dB SPL at 1 KHz, will end up differing by only 10 dB SPL by the time they get to 20 Hz. The curve derived from the differences will determine the required time constant for the tone control network. (Note that it's the difference between two different phon levels that is important, not the absolute phon level. For example, the curve derived from the difference between the 20 phon and 40 phon curves is the same as the curve derived from the difference between the 60 and 80 phon curves.) I've got as far as using Jeff Hackett's Matlab routine to spit out a spreadsheet of the curve values.

The second is the amount of loudness compensation needed for a given volume change. Again, it can be determined from the curves, noting for ISO 226:2003 that a 20 dB SPL change at 1 KHz requires a 10 dB change at 20 Hz in order to maintain a perceived balance between bass and midrange.

Does this make sense, or have I over-simplified due to an inadequate knowledge of how the auditory system responds?

Title: DSP Loudness Control
Post by: dc2bluelight on 2012-04-02 04:38:08
Oversimplified. 

Your example assumes that all stimuli within the entire spectrum start out at a perceived equal loudness relative to each other.  But what, for example, if a bass note were intended to be 10dB below a mid-band note?  At the original level, all is well, but if you alter the play level, you'll find that the two notes land on entirely different theoretical curves.  And even that example is too static to be realistic.

The problem is, the curves, all of them, were derived from data taken by using stimuli presented at "equal loudness levels" with respect to each other, and all were steady-state, with no variation over time, be they tones or band-limited noise signals.  It had to be done that way to characterize the ear/brain response in some easily presented manner.  The data shows what it would take for a stimulus to be perceived as equal in loudness to a mid-band reference signal.  So long as we realize that that is what the curves show, we're fine.  And we can see the nonlinear nature of LF ear/brain response represented by a curve family.  But the equal-loudness contour curves do not represent the actual correction needed, because they are not correction curves at all, they are graphs of stimuli level required for perceived equal loudness.  The actual correction curve must be derived from a known original reference level, and the new play level of each independent stimulus frequency, and the predicted differential in ear/brain response shown by the contours.  Note: you can't apply a single curve to the spectrum, you have to correct for the level of each frequency because (and this is the key) the ear/brain system is non-linear to a different degree for each frequency below mid-band. 

So, for proper loudness compensation you need to know:
1. The acoustic original level (at the time of recording)
2. The new acoustic level during playback of each frequency or group below 1KHz (actually, 400Hz would work as well)
3. The amount of correction needed for each frequency to re-establish spectral balance for playback at the new level considering its specific moment to moment acoustic level.
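To make the disagreement concrete, the three inputs listed above can be fed into one small function that takes the contour as a parameter. Everything here is a sketch: `straight_contour` is a straight-line fit to the ISO 226:2003 20 Hz figures quoted earlier in the thread, while `curved_contour` is a made-up curve with invented coefficients, included only to show what happens when the contours are not evenly spaced. Neither is measured data.

```python
def boost_for_moment(contour, momentary_phon, attenuation_db):
    """Bass boost needed at one instant of the programme.

    contour(phon) -> bass SPL heard as loud as a 1 kHz tone at `phon` dB SPL.
    momentary_phon is the level the passage has at the original playback level.
    """
    played = contour(momentary_phon) - attenuation_db
    wanted = contour(momentary_phon - attenuation_db)
    return wanted - played


def straight_contour(phon):
    # Straight-line fit to the 20 Hz figures quoted earlier (assumption).
    return 80.0 + 0.5 * phon


def curved_contour(phon):
    # HYPOTHETICAL curvature, invented for illustration only.
    return 85.0 + 0.3 * phon + 0.0025 * phon * phon


for contour in (straight_contour, curved_contour):
    loud = boost_for_moment(contour, 80.0, 20.0)
    quiet = boost_for_moment(contour, 70.0, 20.0)
    print(contour.__name__, round(loud, 2), round(quiet, 2))
```

With the straight contour the loud and quiet passages need the same boost, so a correction tied only to the volume setting suffices. With the invented curved contour the quiet passage needs about 1 dB more than the loud one, which is the case where the correction would have to follow the programme level.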

The problem is, for #1 you have to make certain assumptions, unless the original was mixed in a standardized control environment (again, this happens routinely in film and has for decades, not so much for music).  For #2, you have to know a lot about the play system to establish any meaningful relationship between real acoustic level and a knob position.  For #3, we open up the whole "which curve" can of worms, F/M, R/D, ISO, or what?  J.J. likes F/M, I've always felt the Stevens data was closer, then there's the wizards at Audyssey, who didn't use any existing data, but gathered their own, and didn't use an anechoic chamber, or tones, or band-limited noise.  They did it with actual program material, a wide variety at that, and they took terabytes of data points.  And their solution is dynamic.

Considering music as the stimulus, we must note that it changes moment by moment.  And yes, some music probably does present stimuli across the spectrum at equally perceived loudness levels, but certainly not all. If we expand our possible signals to include non-music material such as film soundtracks, the problem becomes even more obvious.  Moreover, the stimuli don't happen exactly concurrently in time or level.

If you're willing to make certain compromises, you could apply a fixed correction to certain music because it doesn't have a lot of dynamic range, and is mixed with a relatively fixed relationship between bass and mid-band.  In fact, it's because of that fact that loudness compensation has worked at all over the years.  But you must realize that this solution is situational at best, certainly far from universal. 

The only volume setting at which a fixed, non-variable correction curve would be right would be that which results in the exact acoustic level of the original performance, at the original listening position.  In other words, the level at which no loudness compensation is required.

So, just because I'm running out of gas on this, here's the "agree to disagree" statement: feel free to build up your favorite curve set in a DSP, and give it a whirl. Make it adjustable using non-variable curves. Use the old one-knob or two, or three, or no knobs. Tweak it by ear, RTA, FFT, or fuzzy clustering.  One of the nice things is, today, it's not so big a deal to do that, even if you do prefer to ignore the research.  Who knows? You just might be thrilled with your result!  And that, I would have to say, will make it valid for you.  But you may want to grab a few of those loudness research papers and look them over.  At the very least, you'll have a cure for insomnia.
Title: DSP Loudness Control
Post by: splice on 2012-04-02 06:23:39
Oversimplified.


Understood. I appreciate your point that sources with dynamics cause the ear to exhibit different behaviour than it does with steady state tones. I have some learning to do.

Your example assumes that all stimuli within the entire spectrum start out at a perceived equal loudness relative to each other.  But what, for example, if a bass note were intended to be 10dB below a mid-band note?  At the original level, all is well, but if you alter the play level, you'll find that the two notes land on entirely different theoretical curves.  And even that example is too static to be realistic.


Please bear with my static, unrealistic model for a moment. I ran the figures again assuming the bass in the loud part was 10 dB (SPL) below the level required to provide "equal loudness" compared with the midrange. The non-intuitive part was, "if the bass in the loud part is 10 dB SPL below the midrange, what level must the bass be in the quiet part to maintain the same perceived loudness difference?" And the answer was, 10 dB SPL below the midrange. (Note that in both cases, the "perceived loudness" difference is actually 20 dB. In other words, for both parts you would need to drop the midrange by 20 dB to make it "equal loudness" with the bass.) Dropping the listening level by 20 dB still showed the same difference: for both loud and quiet parts, 10 dB of "loudness compensation" at 20 Hz was required. Dropping the volume by a total of 40 dB, a total of 20 dB of compensation was required. So if you say it's wrong, I must be making a fundamental error in my calculations.
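That recalculation can be reproduced with a few lines. The sketch again assumes the straight-line fit to the quoted ISO 226:2003 20 Hz figures (and its inverse); the function names are hypothetical, and the result only says what this static model predicts, not how the ear treats real programme material.

```python
def bass_spl_for_phon(phon):
    # Straight-line fit to the quoted 20 Hz figures (assumption).
    return 80.0 + 0.5 * phon


def phon_for_bass_spl(spl):
    # Inverse of the fit above: loudness level of a 20 Hz tone at `spl` dB SPL.
    return 2.0 * (spl - 80.0)


def boost_to_keep_loudness_gap(mid_spl, bass_spl, volume_change_db):
    """20 Hz boost that keeps the bass/midrange gap (in phon) unchanged."""
    gap_phon = mid_spl - phon_for_bass_spl(bass_spl)
    new_mid = mid_spl + volume_change_db
    wanted_bass = bass_spl_for_phon(new_mid - gap_phon)
    played_bass = bass_spl + volume_change_db
    return wanted_bass - played_bass


# Bass mixed 10 dB SPL below its "equal loudness" level in both parts.
loud = (80.0, 110.0)    # 1 kHz SPL, 20 Hz SPL
quiet = (70.0, 105.0)
for mid, bass in (loud, quiet):
    print(mid - phon_for_bass_spl(bass),                 # perceived gap in phon
          boost_to_keep_loudness_gap(mid, bass, -20.0),
          boost_to_keep_loudness_gap(mid, bass, -40.0))
```

Both parts show a 20 phon gap at the intended level and need 10 dB of boost at 20 dB down and 20 dB at 40 dB down, matching the figures in the paragraph above.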


But the equal-loudness contour curves do not represent the actual correction needed, because they are not correction curves at all, they are graphs of stimuli level required for perceived equal loudness.  The actual correction curve must be derived from a known original reference level, and the new play level of each independent stimulus frequency, and the predicted differential in ear/brain response shown by the contours.


I'm quite clear on the point that the curves do not represent the actual correction needed. But I believe the compensation required can be derived from the curves, at least for steady state signals, because the relationship between the curves follows a single law. That is, although the curves are all different in shape, the differences between them work out to a single set of deltas. Applying this to each curve makes them all "line up" over each other. Take a curve at a given phon level, apply the correction, and it matches closely the next phon curve up. I'll have to post some graphs from my spreadsheet.

Note: you can't apply a single curve to the spectrum, you have to correct for the level of each frequency because (and this is the key) the ear/brain system is non-linear to a different degree for each frequency below mid-band.


I thought that's what I was doing? Apparently not.

So, for proper loudness compensation you need to know:
1. The acoustic original level (at the time of recording)
2. The new acoustic level during playback of each frequency or group below 1 kHz (actually, 400 Hz would work as well)
3. The amount of correction needed for each frequency to re-establish spectral balance for playback at the new level considering its specific moment to moment acoustic level.


I do understand that there has to be an initial reference level set that matches the intended playback level (not necessarily the original recording level, it may have been rebalanced in mixing / mastering to suit a different playback level.) I'm taking that as a given, in order to focus on what I'm doing wrong.

... The only volume setting at which a fixed, non-variable correction curve would be right would be that which results in the exact acoustic level of the original performance, at the original listening position.  In other words, the level at which no loudness compensation is required.


I'm not suggesting a "fixed, non-variable curve". I'm suggesting that there is a specific curve for every difference between the "original performance" level and the listener's choice of level. And this curve follows a simple law. I appreciate that this is only valid for steady-state signals. I'll need to learn more about how we hear, in particular the differences in response to time-varying signals versus steady state.

... So, just because I'm running out of gas on this, here's the "agree to disagree" statement: feel free to build up your favorite curve set in a DSP, and give it a whirl. Make it adjustable using non-variable curves. Use the old one-knob or two, or three, or no knobs. Tweak it by ear, RTA, FFT, or fuzzy clustering.  One of the nice things is, today, it's not so big a deal to do that, even if you do prefer to ignore the research.  Who knows? You just might be thrilled with your result!  And that, I would have to say, will make it valid for you.  But you may want to grab a few of those loudness research papers and look them over.  At very least, you'll have a cure for insomnia.


Thank you for your time. I appreciate it. I have plenty of areas to investigate now. Insomnia? I haven't had that problem for a very long time. My biggest problem is that I get involved in reading some paper, I start scribbling notes and diagrams and suddenly it's 3 in the morning... 



Title: DSP Loudness Control
Post by: dc2bluelight on 2012-04-02 16:14:06
I'd iterate, one last time, three points:

1. In research many have concluded the control has to respond to the input signal dynamically.  Those include, but are not limited to, the Holman paper, the Stevens data (apparently his Mark VI system revealed the issue, though he was focused on annoyance prediction), I suspect the data in the Torick-Bauer papers and others too, though my loudness file is almost two inches thick, and I'm not prone to read it all again having had that fact established quite well the first time.

2. Today's loudness compensation solutions implemented in DSP include the aforementioned Audyssey Dynamic Volume and Audyssey Dynamic EQ, Dolby Volume, and THX Loudness Plus.  All are dynamic and respond to the input signal.  You may Google at will. All three had the same basic goal, and all three are commonly available on inexpensive equipment. None of those organizations would put man-years and $$$ of R and D resources into such a project if the simple solution were adequate.

3. A very brief and informal survey of mine, though admittedly incomplete, shows that there are no loudness compensation systems on the market today that use a non-dynamic, steady state, user variable contour solution.  I may have missed one, but the high-end market would have nothing whatever to do with EQ of any kind, and loudness comp passed out of the consumer receiver market decades ago, including a single switch/single knob control, a two-knob system, and even a three-knob system approach.  If I have somehow not described the solution you suggest correctly, I apologize.  Again, I'd like to reference the Holman paper where it discusses an implementation of a three-control loudness comp system as impractical.

The above three points should be enough to send anyone interested in loudness comp to the library to see what the Loudness Jedi have learned over the years.  I'd just hate to see anyone try to re-invent the square wheel (even in DSP) when experts have already agreed it needs to be round.
Title: DSP Loudness Control
Post by: splice on 2012-04-02 21:37:53
I'd iterate, one last time, three points:


On your point 1: I accept that dynamic compensation is required in addition to "fixed" compensation for optimum performance. I interpret "dynamic" to mean "compensation EQ varies based on variations in the level of the program material", and "fixed" to mean "The EQ curve and amplitude will vary depending on the system volume setting, but is not affected by the program material".
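
To make the two definitions concrete, here is a rough sketch of how I read them. It is my own illustration, not how any commercial product works: the 0.5 dB-per-dB factor is the 20 Hz figure from my earlier arithmetic, and the function names, the `reference_db` setting and the idea of feeding in a short-term `program_level_db` are all assumptions.

```python
def fixed_boost_db(volume_setting_db, reference_db=0.0, comp_per_db=0.5):
    """'Fixed' compensation: depends only on the volume setting.

    volume_setting_db is relative to the reference playback level
    (0 dB = reference, negative = quieter).
    """
    drop = max(0.0, reference_db - volume_setting_db)
    return comp_per_db * drop


def dynamic_boost_db(volume_setting_db, program_level_db,
                     reference_db=0.0, comp_per_db=0.5):
    """'Dynamic' compensation: also follows the program material.

    program_level_db is a hypothetical short-term measurement of the
    program relative to its nominal level (0 dB = nominal,
    negative = quiet passage).
    """
    drop = max(0.0, reference_db - (volume_setting_db + program_level_db))
    return comp_per_db * drop


print(fixed_boost_db(-20.0))            # same answer for loud and quiet passages
print(dynamic_boost_db(-20.0, 0.0))     # nominal-level passage
print(dynamic_boost_db(-20.0, -20.0))   # quiet passage gets more boost
```

With the program at its nominal level the two functions agree; they only diverge when the material itself gets quieter or louder, which is the distinction being argued about.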

On your point 2: I've spent a couple of days reading through the relevant threads of the Audyssey forum over at AVS. (Over 52,000 posts...)  Audyssey Dynamic Volume is not concerned with "loudness compensation". Audyssey Dynamic EQ is. The two systems work differently to different purposes. Audyssey Dynamic EQ does, as its name suggests, have a dynamic component on top of a "fixed" (varies with Master Volume setting, not with program level) component similar to the scheme I have been describing, though it uses proprietary curves rather than the published curves. I haven't looked at Dolby Volume yet. I've been told that THX Loudness Plus doesn't have a dynamic component, though I would expect it to do so.

On your point 3: The foundation of the Audyssey DEQ is "non-dynamic, steady state". They added a "user variable" gain offset in response to feedback from users. The control ("RLO") effectively reduces the amount of DEQ applied to a given signal by offsetting the level set by the Master Volume. The result is a user variable contour. Audyssey suggest different settings for film versus music, and different settings for different music genres / dynamic structures.

The above three points should be enough to send anyone interested in loudness comp to the library to see what the Loudness Jedi have learned over the years.  I'd just hate to see anyone try to re-invent the square wheel (even in DSP) when experts have already agreed it needs to be round.


Why develop my own when I could buy a ready-made solution which benefits from much more R&D than I could possibly afford to do? For the experience.
I'm not into re-inventing wheels, but I am into making wheels that fit my requirements rather than buying ready-made ones that aren't quite right.
I'm reminded of a saying about car manufacturers: They require their tyres to be round, black and cheap. And they aren't too fussy about the round part.

My original aim was to provide a solution to a different problem - how to manage the bass in an OB speaker system with optional subwoofer, using an analog implementation. I've come a long way from that... Next step on that path, get up to speed with the literature. I also want to write up an explanation of how the "fixed" part of loudness compensation works, that can be understood by non-technical users. There have been years of debate on whether the behaviour indicated by the "equal loudness" curves can be compensated for with a single control law, and the argument continues. I'll also need to point out that it is only part of the solution.

Edit: Just looking back to the OP in this thread, after all this he still doesn't have a workable solution. It reminds me of this cartoon:
http://www.projectcartoon.com/cartoon/2648
Title: DSP Loudness Control
Post by: mbkennel on 2013-06-03 00:07:51
I don't know if I can solve the problem in its fullest form but I have done a little work.

I hope the board finds this useful. I found a MATLAB M-file for computing the ISO 226 equal loudness contours.

I used this to compute what I believe to be reasonable loudness compensation curves.

I assume there is a reference dB SPL level (choose 80 or 90), and then a desired "level difference", i.e. how many dB below the reference level we are generally listening at, and for which we need boost.

I take the equal loudness contours from REFDB and then REFDB-LEVEL_DIFF.  Then I normalize them by forcing them to match exactly at 1 kHz.  This is the somewhat arbitrary part.  Then the difference between them forms the loudness correction curve.

I found that the "reference dB level" is rather unimportant for the loudness correction over a reasonable range (e.g. 70 to 90 dB; the underlying approximation to ISO 226 doesn't accept values above 90 dB).
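
For anyone without MATLAB, the normalize-and-subtract step can be sketched in a few lines of Python. This is only an illustration of the procedure described above, not a port of the M-file: the function name and arguments are made up, and the two contours in the example are invented placeholder numbers, NOT ISO 226 values. Real contours would have to come from an ISO 226 implementation.

```python
def loudness_correction(freqs_hz, contour_ref_db, contour_low_db, norm_hz=1000.0):
    """Boost (dB) per frequency when listening at the lower level.

    contour_ref_db: equal-loudness contour at REFDB
    contour_low_db: equal-loudness contour at REFDB - LEVEL_DIFF
    Both contours are shifted to 0 dB at norm_hz, then subtracted.
    """
    if not (len(freqs_hz) == len(contour_ref_db) == len(contour_low_db)):
        raise ValueError("all inputs must have the same length")
    i = list(freqs_hz).index(norm_hz)  # raises ValueError if norm_hz is absent
    ref_norm = [v - contour_ref_db[i] for v in contour_ref_db]
    low_norm = [v - contour_low_db[i] for v in contour_low_db]
    return [low - ref for low, ref in zip(low_norm, ref_norm)]


# Placeholder contours: invented numbers for illustration only.
freqs = [20.0, 100.0, 1000.0]
ref_contour = [120.0, 92.0, 80.0]   # stand-in for the REFDB curve
low_contour = [110.0, 77.0, 60.0]   # stand-in for the REFDB - 20 curve

for f, boost in zip(freqs, loudness_correction(freqs, ref_contour, low_contour)):
    print(f, "Hz:", boost, "dB")
```

Forcing the match at 1 kHz means the correction is zero there by construction, so the result is purely a change in spectral shape; picking a different normalization frequency would shift the whole curve up or down.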

It would be very nice if the computations involved could be wrapped into a foobar2000 plug-in (e.g. a specialized version of the great Graphical Equalizer plug-in from xnor, which has great filters).
Loudness computations for MATLAB (https://dl.dropboxusercontent.com/u/87796515/Loudness_mbkennel_matlab_2013-06-02.zip)

(https://dl.dropboxusercontent.com/u/87796515/loudness_correction_curves.png)