Properly downmixing 5.1 to stereo

I'm currently working on a plugin to downmix 5.1 to stereo. First, I thought about what an ideal speaker setup looks like. I came up with this image (omitting the subwoofer):

[Image: diagram of the speaker layout, subwoofer omitted]

What I conclude from this image is that the stereo separation of the rear channels is stronger than the stereo separation of the front channels. This means I have to mix the front channels differently into the stereo channels than the rear channels.

There are two extreme points that a speaker can have. It can be located at 0° (like the center channel). In that case the channel should go equally to the left and right channel. Or it can be located at ±90°, which means that 100% of the channel goes either to the left or to the right.

The front channels are positioned 30° from the 0° point, so the calculation would be as follows:
Front: 30° / 90° * 50 + 50 = 67%
So 67% of the channel goes to the same side, while the rest (33%) goes to the other side.

The calculation for the rear channels is similar:
Rear: 70° / 90° * 50 + 50 = 89%
So 89% of the channel goes to the same side, while the rest (11%) goes to the other side.

But then I noticed that this would be suitable for headphones but not for speakers. So I decided to set 70° as the maximum and not 90°:
Front: 30° / 70° * 50 + 50 = 71% (other side: 29%)
Rear: 70° / 70° * 50 + 50 = 100% (other side: 0%)

This way I have the widest possible stereo separation while maintaining the separation ratio between front and rear. But I still feel that it's just a compromise and not an ideal solution.
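The linear mapping used in the calculations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post; the function name `same_side_share` is made up for this example.

```python
def same_side_share(angle_deg, max_angle_deg=90.0):
    """Percentage of a channel routed to its own side.

    Linear mapping from the post: 0 degrees -> 50 %, max angle -> 100 %.
    """
    if not 0.0 <= angle_deg <= max_angle_deg:
        raise ValueError("angle must lie between 0 and max_angle_deg")
    return angle_deg / max_angle_deg * 50.0 + 50.0


for label, angle in (("front", 30.0), ("rear", 70.0)):
    for max_angle in (90.0, 70.0):
        same = same_side_share(angle, max_angle)
        print(f"{label}, max {max_angle:.0f} deg: "
              f"same side {same:.0f}%, other side {100.0 - same:.0f}%")
```

With a 90° maximum this gives 67%/89% for front/rear, and with a 70° maximum 71%/100%, matching the figures above.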

Then something else came to my mind. I noticed that most applications don't mix the center channel 50%/50% into the stereo channels but 71%/71% (-3.01 dB, i.e. a factor of sqrt(2)/2). So, aren't two speakers with half the amplitude each as loud as one speaker at full amplitude? If I should indeed use 71% instead of 50%, I wonder how I have to apply this to the other channels.
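For reference, the relation between the linear factors and the dB figures mentioned here is the usual amplitude conversion, 20*log10(gain). A minimal sketch (the helper names are just for illustration):

```python
import math


def gain_to_db(gain):
    """Convert a linear amplitude factor to decibels."""
    return 20.0 * math.log10(gain)


def db_to_gain(db):
    """Convert decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


print(round(gain_to_db(math.sqrt(2) / 2), 2))  # -3.01
print(round(gain_to_db(0.5), 2))               # -6.02
```

So 71% corresponds to about -3 dB and 50% to about -6 dB.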

Reply #1
But then I noticed that this would be suitable for headphones but not for speakers.

It really depends on what you're trying to achieve.

For a good headphone experience you could simulate virtual sound sources (your 5 channels) using a simple model for head-related transfer functions. Search the web for "HRTF model" and/or check out this paper; it was one of the first more promising Google results I got. Of course, since the locations of your sound sources are fixed, you could simply calculate all impulse responses in an offline process beforehand or just download HRIR recordings. This, for example, looks interesting.

If you want a downmix for your home stereo I don't think you can do much better (*) than simply mixing the surround channels to the front channels like this:
Lt = FL + s*SL + c*C
Rt = FR + s*SR + c*C
where s (=surround mix) is usually something between 0.5 and 1
and c (=center mix) is usually 0.7. This would be a "normal" stereo downmix.

If you want a "pro logic" downmix you could use something like this:
Lt = FL + s*(SL+SR) + c*C
Rt = FR - s*(SL+SR) + c*C  // 180° phase shift for SL+SR
with s=0.5 and c=0.7.

HRTF = head-related transfer function
HRIR = head-related impulse response
FL/FR = front left/right
SL/SR = surround left/right
C = center
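The two formulas above translate directly into code. The following is a minimal sketch in plain Python, assuming each channel is a list of float samples of equal length; the function names and the parameter name `cmix` (for the center mix factor c) are made up for this example. It follows the formulas exactly as posted, including the simple sign inversion for the surround sum.

```python
def downmix_plain(fl, fr, c, sl, sr, s=0.5, cmix=0.7):
    """Lt = FL + s*SL + c*C and Rt = FR + s*SR + c*C, sample by sample."""
    lt = [l + s * a + cmix * m for l, a, m in zip(fl, sl, c)]
    rt = [r + s * b + cmix * m for r, b, m in zip(fr, sr, c)]
    return lt, rt


def downmix_matrixed(fl, fr, c, sl, sr, s=0.5, cmix=0.7):
    """Lt = FL + s*(SL+SR) + c*C and Rt = FR - s*(SL+SR) + c*C."""
    lt = [l + s * (a + b) + cmix * m for l, a, b, m in zip(fl, sl, sr, c)]
    rt = [r - s * (a + b) + cmix * m for r, a, b, m in zip(fr, sl, sr, c)]
    return lt, rt


# One sample with signal only in the left surround channel.
print(downmix_plain([0.0], [0.0], [0.0], [1.0], [0.0]))
print(downmix_matrixed([0.0], [0.0], [0.0], [1.0], [0.0]))
```

Note that neither function guards against clipping: the coefficients sum to more than 1, so loud material may need additional scaling.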

Cheers!
SG

edit: (*) There's this "Dolby Virtual Surround". I have no idea what it does. But it might actually work. 

Reply #2
If you want a downmix for your home stereo I don't think you can do much better than simply mixing the surround channels to the front channels like this:
Lt = FL + s*SL + c*C
Rt = FR + s*SR + c*C
where s (=surround mix) is usually something between 0.5 and 1
and c (=center mix) is usually 0.7. This would be a "normal" stereo downmix.

I read an article about stereo panning laws and now I know why it's 0.7(071...) and not 0.5 for the center. That also told me that my previous calculations are all wrong, because I have to apply the same panning law to the other channels, too. I calculated the new panning coefficients using the -3.01dB panning law. But the result is quite bad, because it makes the front channels nearly mono with only 33.3% left panning (79% to the left, 53% to the right and vice versa). So, I guess trying to preserve the original stereo separation ratios between the front and the rear channels doesn't really work out the way I wished.
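To illustrate the effect described here, below is a sketch of one common -3 dB law, the sine/cosine constant-power law. The post does not say which law was actually used, so this is an assumption and the numbers need not match the 79%/53% quoted above; the qualitative result (a 33.3% pan leaves a lot of signal on the opposite side) is the same.

```python
import math


def constant_power_pan(pan):
    """Sine/cosine pan law (assumed here, not taken from the post).

    pan: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    Returns (left_gain, right_gain) with left**2 + right**2 == 1.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie between -1 and 1")
    theta = (pan + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


center = constant_power_pan(0.0)
front_left = constant_power_pan(-30.0 / 90.0)  # 33.3 % towards the left
print(f"center:     L={center[0]:.3f} R={center[1]:.3f}")
print(f"front left: L={front_left[0]:.3f} R={front_left[1]:.3f}")
```

Under this law the center lands at 0.707 on both sides, and a source panned 33.3% to the left ends up at roughly 0.87 left and 0.50 right.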

Reply #3
It seems to me that there is no direct formulaic way to do this.

If you want to simply make things sound good, you can use  Lt = L +Ls +C/sqrt(2) and the converse.

If you are worried about dialog audibility you want to up the relative gain of C somewhat.

There is no "right" answer, really.
-----
J. D. (jj) Johnston

Reply #4
How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

Reply #5
How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.
-----
J. D. (jj) Johnston

Reply #6

How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.

So it's done differently on each disc then? I've often wondered whether it's downmixed with Dolby Pro Logic in mind or just to plain stereo when using line out instead of S/PDIF. Any info on this could be useful when doing your own downmixes.

Reply #7


How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.

So it's done differently on each disc then? I've often wondered whether it's downmixed with Dolby Pro Logic in mind or just to plain stereo when using line out instead of S/PDIF. Any info on this could be useful when doing your own downmixes.

May I direct you to atsc.org? Have a look at the A/52B standard there, which describes AC-3 and E-AC-3. Chapter 7.8.1 gives a good idea of what Dolby thinks about downmixing, and therefore of how it is implemented in any DVD player...

Reply #8
According to the AC-3 specification on atsc.org this is the way to downmix 5.1 to stereo:
Quote
Lo = 1.0 * L + clev * C + slev * Ls ;
Ro = 1.0 * R + clev * C + slev * Rs ;
clev (center level) and slev (surround level) are provided by the AC-3 file.
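A rough sketch of how the quoted equations could be applied per sample is shown below. The lookup tables reflect the 2-bit cmixlev and surmixlev codes as I recall them from A/52 (-3/-4.5/-6 dB for the center, -3 dB/-6 dB/off for the surrounds); treat them as an assumption and check them against the standard before relying on them. The function name and the decision to reject the reserved code are choices made for this example only.

```python
# Assumed code-to-coefficient tables (verify against the A/52 standard).
CMIXLEV = {0b00: 0.707, 0b01: 0.595, 0b10: 0.500}   # center mix level
SURMIXLEV = {0b00: 0.707, 0b01: 0.500, 0b10: 0.0}   # surround mix level


def ac3_lo_ro(l, r, c, ls, rs, cmixlev_code, surmixlev_code):
    """Apply Lo/Ro = 1.0*front + clev*C + slev*surround to one sample."""
    if cmixlev_code not in CMIXLEV or surmixlev_code not in SURMIXLEV:
        raise ValueError("reserved or unknown mix level code")
    clev = CMIXLEV[cmixlev_code]
    slev = SURMIXLEV[surmixlev_code]
    lo = 1.0 * l + clev * c + slev * ls
    ro = 1.0 * r + clev * c + slev * rs
    return lo, ro


# Center-only sample with the -6 dB center code.
print(ac3_lo_ro(0.0, 0.0, 1.0, 0.0, 0.0, 0b10, 0b00))
```

This sketch omits any overall scaling, so the output can exceed full scale when several channels are loud at once.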

That brings up a new question for me: Until now I relied on the correctness of the AC-3 and DTS decoding plugins available for foobar2000. But these plugins don't do any downmixing. So, are clev and slev also used for decoding a 5.1 signal without downmixing to stereo? Because if they weren't, there would be no way to use these values for downmixing afterwards.

Reply #9
According to the AC-3 specification on atsc.org this is the way to downmix 5.1 to stereo:
Quote
Lo = 1.0 * L + clev * C + slev * Ls ;
Ro = 1.0 * R + clev * C + slev * Rs ;
clev (center level) and slev (surround level) are provided by the AC-3 file.

That brings up a new question for me: Until now I relied on the correctness of the AC-3 and DTS decoding plugins available for foobar2000. But these plugins don't do any downmixing. So, are clev and slev also used for decoding a 5.1 signal without downmixing to stereo? Because if they weren't, there would be no way to use these values for downmixing afterwards.

According to the spec, no: if the number of input channels equals the number of output channels, the signal is routed directly. In that case you'd either have to extract clev and slev from the decoder or use the "worst case" downmix equation, which is also given somewhere in the doc, I think. I hope I understood your question correctly, by the way...

Reply #10
The decoder should do the downmixing since it has access to the undecoded stream, which may contain downmixing hints (clev and slev), as Woodinville and mcbear already noted. If you can only get 5.1 data and downmixing is up to you, you should go with the most commonly used clev and slev values. (clev=sqrt(0.5), slev=1?)

Cheers!
SG

Reply #11
So, clev and slev are only used for downmixing? In that case I would finally be able to do a proper downmix "by the book", at least if it's AC-3.

But what about DTS? I only found a rear channel attenuation setting in the encoder options (screenshot). What I don't know is if that's only a flag (that decoders must take into account) or if the rear channels are preprocessed so that it doesn't matter to the decoder anymore. And what are the "global" downmix rules for DTS if there are none embedded into each file?

Reply #12
So, clev and slev are only used for downmixing? In that case I would finally be able to do a proper downmix "by the book", at least if it's AC-3.

Correct, according to the spec!

But what about DTS? I only found a rear channel attenuation setting in the encoder options (screenshot). What I don't know is if that's only a flag (that decoders must take into account) or if the rear channels are preprocessed so that it doesn't matter to the decoder anymore. And what are the "global" downmix rules for DTS if there are none embedded into each file?

DTS isn't as open as Dolby in this respect, i.e. to my knowledge you won't find any detailed information like that in the ATSC documents. So posting any such information here would probably get the poster into some trouble :-)
Depending on what you want to achieve, I would go with the general approach, i.e. assume you get 5.1 channels of PCM and implement a downmix that prevents overload under worst-case conditions. That may lead to some loss, but it will work.
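One simple way to get such an overload-safe downmix is to scale all coefficients so that they sum to 1; then even five full-scale channels cannot push the output past full scale. The sketch below assumes clev = slev = 0.7071 as defaults, and the function names are invented for this example.

```python
def overload_safe_gains(clev=0.7071, slev=0.7071):
    """Scale (front, center, surround) gains so that they sum to 1.0."""
    total = 1.0 + clev + slev
    return 1.0 / total, clev / total, slev / total


def downmix_sample(l, r, c, ls, rs, clev=0.7071, slev=0.7071):
    """Downmix one 5.0 sample to stereo with the normalized gains."""
    front, center, surround = overload_safe_gains(clev, slev)
    lo = front * l + center * c + surround * ls
    ro = front * r + center * c + surround * rs
    return lo, ro


# Worst case: every input channel at full scale at the same instant.
print(overload_safe_gains())
print(downmix_sample(1.0, 1.0, 1.0, 1.0, 1.0))
```

The price is level: with these defaults the front channels end up at about 0.41 of their original amplitude, which is presumably the "loss" mentioned above.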

Reply #13
I've evaluated a DTS encoder and found out that the -3dB rear channel attenuation is in fact "hardcoded" into the stream. Therefore no additional attenuation of the rear channels should be necessary for a downmix.

I think it's reasonable to expect that the center channel is attenuated by 3 dB. So the DTS downmixing formula should be as follows:
Code:
Lo = 1.0 * L + 0.7071 * C + 1.0 * Ls;
Ro = 1.0 * R + 0.7071 * C + 1.0 * Rs;

I derive the downmix factor of 1.0 for the rear channels from the fact that the rear speakers are at the same distance from the listener as the front speakers and thus should be equally loud (ignoring the orientation of the outer ears). But on the other hand, the default value for AC-3 is -3 dB. That's why I'm still not absolutely sure what to use.

Reply #14
DTS isn't as open as Dolby in this respect, i.e. to my knowledge you won't find any detailed information like that in the ATSC documents. So posting any such information here would probably get the poster into some trouble :-)

You can get those docs legally without paying bucks. Unfortunately I don't remember the website. I registered somewhere and was allowed to download one document per day for free -- including the DTS specification.

Cheers!
SG

Reply #15
There's publicly available technical documentation for DTS here.

Two passages caught my attention. Chapter 3.1.11 ("Stereo Down Mix") states that dynamic 2-channel downmixing coefficients can be embedded into the stream. Chapters 7.1.10 ("Embedded down mix flag") and 7.3.10 ("Stereo down mix coefficients") seem to specify that a bit further. It is strange however that I didn't find any options concerning this in the SurCode encoder and the screenshot of the official encoder. So maybe this was never used in any recording.

Reply #16
This is "only" the white paper, btw.
...looks quite comprehensive. But I'm positive I got the real spec for free.
IIRC it was from ETSI.org / ETSI, free standards download page.

Cheers!
SG

Reply #17
There's publicly available technical documentation for DTS here.

Two passages caught my attention. Chapter 3.1.11 ("Stereo Down Mix") states that dynamic 2-channel downmixing coefficients can be embedded into the stream. Chapters 7.1.10 ("Embedded down mix flag") and 7.3.10 ("Stereo down mix coefficients") seem to specify that a bit further. It is strange however that I didn't find any options concerning this in the SurCode encoder and the screenshot of the official encoder. So maybe this was never used in any recording.

Thanks for the link(s)... at least there is now something open to the public that can be referred to.
In any case, using the "fail safe" coefficients seems advisable, since obviously you can't rely on the downmix coefficients being embedded in the stream, and you'd need the means to extract/access them.

Reply #18
Let me revive this thread. Looking for some embedded downmix coefficients isn't really the way to go for me as not all surround formats provide this information. AC-3 has it, but DTS doesn't (at least it's not available in the programs I know). DVD-Audio might provide it but only two of my seven discs actually have it. Then I thought again about the best general formula and came to this:
Code:
L = Lf + C/2 + Ls
R = Rf + C/2 + Rs

I know that this looks quite different from what is most often used. But let me explain how I came to these values.

I. Center Channel
I was especially concerned about the center level, as an attenuation of 3 dB rather than 6 dB seems to be the general rule of thumb. My goal was to achieve a phantom center channel that is as loud as the original dedicated center channel. I imagined: what if I split the speaker in half (halving the amplitude of each side, i.e. attenuating by 6 dB) and moved one half next to the left speaker and the other half next to the right one? In my theory this would not change the overall volume, because both halved amplitudes would combine to the full amplitude when they reach the listener (0.5 + 0.5 = 1).

But I still wasn't sure whether two speakers (each with half the amplitude) really output the same energy as one speaker (with the full amplitude). So I conducted a test. I placed my two stereo speakers directly next to each other, placed a microphone in front of them, calibrated both sides so that the microphone picked up the same amplitude from left and right, and then played back three test samples while measuring the amplitude the microphone registered.

Test sample 1: A sound (sine wave of 500 Hz) of full amplitude on one side and the other side being silent (factor: 1.0).

Test sample 2: The same sound distributed to both sides with an attenuation of 3 dB (factor: 0.707).

Test sample 3: The same sound distributed to both sides with an attenuation of 6 dB (factor: 0.5).

The conclusion is that my theory is correct in that sample 3 reproduces the same amplitude as sample 1. Sample 2 has a higher amplitude (by about 3 dB).

So, I think that if one wants to downmix 5.1 to stereo and wants to keep the same volume for the center channel one has to use a center channel attenuation of 6 dB and not 3 dB.
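The measurement can be mimicked numerically under an idealized assumption: the microphone simply adds the two speaker signals, perfectly in phase (which is roughly what adjacent speakers and an on-axis microphone give you). The sample rate, duration and helper names below are arbitrary choices for this sketch.

```python
import math

SAMPLE_RATE = 48000   # assumed sample rate
FREQ_HZ = 500.0       # test tone from the post
NUM_SAMPLES = 4800    # 0.1 s, a whole number of periods


def tone(gain):
    """500 Hz sine scaled by `gain`."""
    return [gain * math.sin(2.0 * math.pi * FREQ_HZ * n / SAMPLE_RATE)
            for n in range(NUM_SAMPLES)]


def peak_at_mic(left, right):
    """Idealized microphone: the two speaker signals simply add."""
    return max(abs(a + b) for a, b in zip(left, right))


def to_db(ratio):
    return 20.0 * math.log10(ratio)


reference = peak_at_mic(tone(1.0), tone(0.0))       # test sample 1
minus_3db = peak_at_mic(tone(0.707), tone(0.707))   # test sample 2
minus_6db = peak_at_mic(tone(0.5), tone(0.5))       # test sample 3

print(f"sample 2 vs sample 1: {to_db(minus_3db / reference):+.2f} dB")
print(f"sample 3 vs sample 1: {to_db(minus_6db / reference):+.2f} dB")
```

This reproduces the measured result: with in-phase summation the -6 dB version matches the single speaker and the -3 dB version comes out about 3 dB higher. The usual argument for -3 dB rests on a different assumption, namely that the two speaker signals add in power rather than in amplitude at the listener (0.707² + 0.707² ≈ 1), which tends to be closer to what happens with widely spaced speakers in a real room.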

II. Surround Channels
Now, why no attenuation of the surround channels? I think if you want to preserve the relative volume of the original channels you should not attenuate the surround channels. In a 5.1 setup all speakers (ignoring the LFE channel) are equidistant from the listener. So why should one channel be attenuated while another channel at the same distance from the listener isn't? And I don't think that the human head attenuates signals coming from behind by 3 to 6 dB compared to signals coming from the front (correct me if I'm wrong). So, attenuating the surround channels cannot be called the "most accurate reproduction" but can only be seen as "adjusting it to one's tastes". Or am I wrong?

Reply #19
First of all, sorry to revive this old thread, but I'm still looking for the "optimal" way to downmix a movie's multichannel audio to 2.0.

I found this interesting (2013) research and want to understand how to implement it using FFmpeg:

Novel 5.1 Downmix Algorithm with Improved Dialogue Intelligibility

Thanks in advance to anyone who can help.
F.O.R.A.R.T. npo

 

Reply #20
First of all, sorry to revive this old thread, but I'm still looking for the "optimal" way to downmix a movie's multichannel audio to 2.0.

I found this interesting (2013) research and want to understand how to implement it using FFmpeg:

Novel 5.1 Downmix Algorithm with Improved Dialogue Intelligibility

Thanks in advance to anyone who can help.

something along the lines of

Code:
ffmpeg -i 6chan-input.wav -af "pan=stereo|FL < 1.0*FL + 0.707*FC + 0.707*BL|FR < 1.0*FR + 0.707*FC + 0.707*BR" stereo.wav


should do the trick, and should be straightforward to adjust as necessary. I only glanced at the paper.
I suggest trying it with a short clip that includes effects and dialog, and comparing that output with ffmpeg's default stereo mixdown via `-ac 2`.
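If you want to try several coefficient sets, it can be handy to generate the filter string instead of editing it by hand. The following Python sketch only builds the `pan` filter argument and the argument list; the helper names are made up, and the BL/BR channel names assume ffmpeg's "5.1" layout (the "5.1(side)" layout uses SL/SR instead). With `<` ffmpeg renormalizes the gains of each output channel so that they sum to 1, while `=` uses them as given.

```python
def pan_filter(front=1.0, center=0.707, surround=0.707, normalize=True):
    """Build an ffmpeg `pan` filter string for a 5.1 -> stereo downmix."""
    op = "<" if normalize else "="
    left = f"FL{op}{front}*FL+{center}*FC+{surround}*BL"
    right = f"FR{op}{front}*FR+{center}*FC+{surround}*BR"
    return f"pan=stereo|{left}|{right}"


def ffmpeg_args(infile, outfile, **coefficients):
    """Argument list suitable for e.g. subprocess.run (not executed here)."""
    return ["ffmpeg", "-i", infile, "-af", pan_filter(**coefficients), outfile]


print(pan_filter())
print(ffmpeg_args("6chan-input.wav", "stereo.wav", center=0.5, surround=1.0))
```

For example, `center=0.5, surround=1.0` corresponds to the C/2, unattenuated-surround formula proposed in Reply #18.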

Reply #21
Sorry to revive this old thread, but if anyone knows software with a graphical user interface that allows mixing 5.1 channels down to stereo for a home system, like this:
Code:
FL=1.0*FL + 0.707*FC + 0.707*BL
please let me know. Unfortunately the command-line interface is too ambiguous for me.
Thanks in advance :)