5.1 downmix to 2.0 (again) and buried dialogs

2022-08-28 21:27:36

Hi,

Since I have only a stereo setup (albeit a decent one) attached to my TV, I started a while ago to generate downmixed 2.0 tracks with ffmpeg for my video files that have 5.1 (or 7.1) tracks.

My original motivation was that the dialogs were perceived as too quiet compared to the music/ambient sound in *some* movies (not all of them!). My hypothesis at the time was that the built-in downmixing of my equipment was overweighting the left and right channels (both front and side) compared to the center channel, where most dialogs are supposed to be placed.

So I started with the "-ac 2" option in ffmpeg... which basically changed nothing (as far as I could tell, at least). Investigating further, I then found the -af "pan=stereo| FL< ... | FR< ..." syntax to choose the weighting coefficient of each 5.1 channel used to build the stereo channels.

There were recommended coefficients:
FL < 1.0*FL + 0.707*FC + 0.707*SL
FR < 1.0*FR + 0.707*FC + 0.707*SR
These gave the same result as -ac 2 to my ears.
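As an illustration, the recommended coefficients can be turned into a pan filter string with a small helper. This is only a sketch: `pan_filter` and the file names `input.mkv` / `output.mkv` are hypothetical, and it assumes the source is tagged as 5.1(side), so that the surround channels are named SL/SR (files tagged as plain 5.1 use BL/BR instead).

```python
import shlex

def pan_filter(left, right):
    """Build an ffmpeg pan filter string for a stereo downmix.

    left/right map input channel names (e.g. "FL", "FC", "SL") to gains.
    The "<" operator asks ffmpeg to renormalize the gains of each output
    channel so that they sum to 1, which avoids clipping.
    """
    def side(name, gains):
        terms = "+".join(f"{g:g}*{ch}" for ch, g in gains.items())
        return f"{name}<{terms}"
    return "pan=stereo|" + side("FL", left) + "|" + side("FR", right)

recommended = pan_filter(
    {"FL": 1.0, "FC": 0.707, "SL": 0.707},
    {"FR": 1.0, "FC": 0.707, "SR": 0.707},
)

# Hypothetical file names; "-c:v copy" leaves the video stream untouched.
cmd = ["ffmpeg", "-i", "input.mkv", "-c:v", "copy",
       "-af", recommended, "-c:a", "aac", "output.mkv"]
print(shlex.join(cmd))
```

Printing the command rather than running it makes it easy to check the filter string before launching a long encode.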

There are also tons of alternative formulas described on various web sites... I ended up with
FL < 0.707*FL + 1.0*FC + 0.707*SL
FR < 0.707*FR + 1.0*FC + 0.707*SR
It did what it was supposed to do: louder dialogs compared to music and ambient sounds.

However, I eventually noticed that it was also narrowing the stereo image. Indeed, FC does not contain only voices but also a large part of the music and ambient sounds. Overweighting FC would not narrow the stereo image if it contained only the voices, but this is not the case.

I kept wondering why the dialog loudness is sometimes perceived as too low after downmixing, and I have a possible explanation: the brain is very good at isolating a voice buried in ambient noise because it can locate where the voice comes from. That's why people with hearing aids still have difficulty following a conversation when multiple people speak at the same time: the hearing aids can restore the volume, but the directivity is (mostly) lost... So, with a real 5.1 or 7.1 setup, the brain is not bothered by the side/rear channels when it focuses on the center dialogs, because they come from completely different directions. But after downmixing, what was coming from the side/rear channels now comes from the front channels, making the separation task more difficult for the brain. The solution is hence to downweight the side/rear channels... Therefore I am now using:

FL < 1.0*FL + 0.707*FC + 0.4*SL
FR < 1.0*FR + 0.707*FC + 0.4*SR

And it seems better to me: dialogs are clearer, without narrowing the stereo image. But maybe this is just what I desperately want to hear...
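To put numbers on the three coefficient sets discussed in this post, the sketch below expresses the center and surround gains in dB relative to the front channel. The labels ("recommended", "center-heavy", "low-surround") are mine, not standard names. Because the `<` operator scales all gains of an output channel by the same factor, these ratios are unaffected by the normalization.

```python
import math

# Gains for the left output channel; the right side is symmetric.
formulas = {
    "recommended":  {"FL": 1.0,   "FC": 0.707, "SL": 0.707},
    "center-heavy": {"FL": 0.707, "FC": 1.0,   "SL": 0.707},
    "low-surround": {"FL": 1.0,   "FC": 0.707, "SL": 0.4},
}

def db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)

def relative_levels(gains):
    """Return (center, surround) levels in dB relative to the front channel."""
    return db(gains["FC"] / gains["FL"]), db(gains["SL"] / gains["FL"])

for name, gains in formulas.items():
    center_db, surround_db = relative_levels(gains)
    print(f"{name:13s} center {center_db:+5.1f} dB, "
          f"surround {surround_db:+5.1f} dB (relative to front)")
```

With the last set, the center keeps the same level relative to the front channels as in the recommended formula, while the surrounds drop by roughly 5 dB more, which matches the idea of clearer dialogs without a narrower stereo image.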

Any thoughts on all of this?

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #1 – 2022-08-28 22:30:16

It certainly seems plausible, but how many movies have you tried? It's possible that this method will produce better results on some movies and worse results on others. The -ac 2 switch in ffmpeg uses Dolby's official downmixing algorithm, so the result should be the same as if you played the 5.1 track and downmixed to stereo in realtime. Note that if you downmix to pcm_s16le in ffmpeg, the volume will be cut in half to prevent clipping, so you may want to downmix to pcm_f32le instead to avoid the volume reduction.
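For a rough idea of the size of such a reduction, here is a back-of-the-envelope sketch. It assumes the normalization divides each gain by the sum of the gains of that output channel, as the pan filter's `<` operator does; whether -ac 2 with integer output scales in exactly this way is not verified here.

```python
import math

def normalization_gain(coefficients):
    """Gain applied when coefficients are rescaled so that they sum to 1."""
    return 1.0 / sum(coefficients)

def to_db(gain):
    """Convert an amplitude gain to decibels."""
    return 20 * math.log10(gain)

# Front, center and surround gains of the usual stereo downmix.
gain = normalization_gain([1.0, 0.707, 0.707])
print(f"normalized downmix: x{gain:.3f} ({to_db(gain):.2f} dB)")
print(f"exactly half:       x0.500 ({to_db(0.5):.2f} dB)")
```

Under that assumption the attenuation is a little more than a halving of the amplitude, so "cut in half" is a reasonable rule of thumb.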

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #2 – 2022-08-29 07:24:42

I'm not worried about a global volume reduction (if any... I am encoding to AAC and haven't noticed such a reduction), since I just have to turn up the volume knob when playing the movie.

I have tried on a few movies/series where I could feel this "buried dialogs" problem, which are indeed only a minority. Most movies are OK when downmixed in realtime during playback. That's why I always keep the 5.1 track and play it by default, switching to my downmixed stereo track only when I don't like the realtime downmix.

And yes, I haven't noticed any difference between ffmpeg -ac 2 and realtime downmixing.

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #3 – 2022-08-29 21:46:29

If you only have a stereo system, I'd say you ought to create 2 stereo tracks. One would use -ac 2, and the other would use your custom downmixing formula, just in case. Keeping the 5.1 track takes a lot of extra space if you're never able to listen to it in 5.1.

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #4 – 2022-08-29 22:05:48

Quote from: Aleron Ives on 2022-08-29 21:46:29

Keeping the 5.1 track takes a lot of extra space if you're never able to listen to it in 5.1.

At the moment, yes, but who knows in the future... Plus, I occasionally pass the files to friends who have 5.1 systems...

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #5 – 2022-08-29 23:37:51

Quote from: Aleron Ives on 2022-08-29 21:46:29

Keeping the 5.1 track takes a lot of extra space if you're never able to listen to it in 5.1.

Universally supported AC3 at 448 kbit/s will use about 385 MiB of space for a 2-hour movie, and it will be transparent.

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #6 – 2022-08-30 00:33:28

AAC at 128 kbps will use ~110 MiB, be transparent, and not take the downmixing volume hit that 5.1 AC3 will. Any device that can decode AVC will also decode AAC, which is not true for AC3.
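The size figures quoted in these replies can be checked with simple arithmetic. The sketch below assumes a constant bitrate and ignores container overhead; `audio_size` is a hypothetical helper name.

```python
def audio_size(bitrate_kbps, hours):
    """Return (MB, MiB) for a constant-bitrate stream of the given duration."""
    total_bytes = bitrate_kbps * 1000 / 8 * hours * 3600
    return total_bytes / 1e6, total_bytes / 2**20

for label, kbps in [("AC3 5.1", 448), ("AAC 2.0", 128)]:
    mb, mib = audio_size(kbps, 2)
    print(f"{label} at {kbps} kbit/s, 2 h: {mb:.1f} MB = {mib:.1f} MiB")
```

For a 2-hour movie this gives about 385 MiB (403 MB) at 448 kbit/s and about 110 MiB (115 MB) at 128 kbit/s.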

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #7 – 2022-08-30 02:22:41

I was talking about 5.1, not 2.0.
5.1 AAC is not supported by anything. It will only work with brand new devices that support uncompressed PCM (once decoded) via HDMI ...
... or it will be encoded to AC3 in real-time. Lossy -> lossy is a big no-no.
AC3 will work everywhere. My 10+ year old LG TV will happily send AC3 over Toslink to my 20 year old Sony receiver and it will play just fine.

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #8 – 2022-08-30 04:37:30

I know. I was talking about 2.0, not 5.1. He's already listening in stereo, so he can save quite a bit of space when transcoding movies by downmixing to stereo AAC and discarding the 5.1 track he can't hear, anyway. AVC 1080p (and 720p) + AAC 2.0 will play on anything, whereas some devices (e.g. iPad) can't decode AC3, no matter how many channels it has. Manual stereo downmixing also gives you the benefit of being able to configure DRC, should your playback device not give you that option.

Re: 5.1 downmix to 2.0 (again) and buried dialogs

Reply #9 – 2022-08-30 13:44:32

Quote from: Aleron Ives on 2022-08-30 04:37:30

Manual stereo downmixing also gives you the benefit of being able to configure DRC

By the way, any advice on applying soft/mild DRC with ffmpeg to arbitrary streams (drc_scale is limited to AC3 input streams AFAIK)? I played a bit with -filter_complex "compand=..." but never clearly understood the parameters (in particular, I could not reproduce the compressor effect of Audacity...)
