
Making sense of the pipeline for spatial sound in Windows

Hi all,
I've been gratefully using this forum as a source of information on all things audio for years but have never posted before. I hope this question is appropriate for this subforum.

I have noticed quite variable results when upmixing stereo sources to different surround formats, including a dependence on the order of operations in the DSP chain. I would like to understand the underlying pipeline better, and I hope you can help me. Please bear with me as I give some relevant info as briefly as possible.

I do not have any surround (5.1 / 7.1) audio setup in my home. My setups are all geared primarily for music, which is stereo, and I've always felt that audio is best enjoyed played back in the same format as it was mixed, rather than upmixing stereo to 5.1 etc. I also generally dislike the 'virtual surround' effect that can be found on various stereo equipment and in certain software packages.

However, I have recently been very surprised by how much I like the 'virtual surround' effect when upmixing ordinary stereo recordings to 4.0 / 5.1 / 7.1 in combination with Dolby Atmos for Headphones. I upmix the stereo sources in Foobar2000 using the various native upmixing DSPs that Foobar has (4.0, 5.1, 7.1), and have Atmos for Headphones selected as the global spatial audio processor in Windows. Several things strike me:
  • All upmixed stereo sources (4.0, 5.1, 7.1) sound surprisingly good using Atmos.
  • 4.0 sounds quite different from 5.1 and 7.1, whereas the latter two sound quite similar (though not identical)
  • When the upmixing DSP is followed by a downmix to stereo, the result is different from both the unprocessed stereo mix and the only-upmixed stereo source (not followed by explicit downmix)

In all cases my DSP chain in Foobar is:
1. Resampler
2. Upmix [4.0 / 5.1 / 7.1]
3. [Downmix to stereo]

That last observation especially surprised me, and I would love to understand what is going on here. I would expect one of the following three things to happen, but clearly that is not the whole story:
  • When only upmixing (not downmixing) in Foobar, Windows recognizes there's more than 2 channels and passes the upmixed audio to Atmos. Atmos does its magic and passes a downmixed stereo signal to the output stages
  • When upmixing, followed by downmixing, in Foobar, all audio is routed 'inside' Foobar until after the last DSP. It 'leaves' Foobar as stereo, and hence Windows does not see it as spatial sound and does not pass it to Atmos; output should be identical to the situation without any upmixing. However, this is clearly not the case.
  • When upmixing, followed by downmixing, in Foobar, Windows initially passes the audio to Atmos, which does its magic, produces a downmixed stereo signal, and passes that back to Foobar, which downmixes it again. That effectively achieves nothing because the sound was already stereo. Output should be identical to the situation with upmixing, but without downmixing, in Foobar, but this is also clearly not the case.

Can anybody shed some light on the different steps in the pipeline and at which points (before/after up/downmixing) the audio is handled by which piece of software?

Many thanks for any help! Just to be clear: I have no stake in Dolby in any way and frankly I'm almost a bit irked by how much I like it, having basically been a 2.0 purist for my entire adult life.

Cheers



Re: Making sense of the pipeline for spatial sound in Windows

Reply #1
Maybe this would be better asked on the foobar2000 forum.

Anyway, if you use multiple DSPs in foobar2000, nothing goes out until the last step, so you can discard your third case since that does not happen.

What you seem to assume is that upmixing and downmixing would make no difference, but it's not like that.

I don't know how it is implemented (I guess Peter is the only one who knows), but think about this:

stereo to 5.1 means:
- extract the center channel from the stereo channel
- Maybe, but not necessarily, remove center channel from stereo.
- generate the rear stereo satellites from the stereo signal, most probably at lower volume and possibly with some feedback. Possibly also filtering out some bass.
- the .1 in 5.1 and 7.1 usually means "FX channel", not really "subbass channel", so I don't really know what would be put there from upmixing.

Downmixing is a bit more common, but generally it would mean
- add center to both left and right, usually with some reduction in volume,
- add .1
- and satellites might be mixed with lower volume or with some additional effects.
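To make the two lists above concrete, here is a minimal Python sketch of a naive upmix followed by a fold-down. It is not foobar2000's algorithm (that isn't documented in this thread); the function names, the 0.5 rear gain and the 0.707 mix gains are assumptions picked purely for illustration.

```python
def upmix_stereo_to_5_1(left, right, rear_gain=0.5):
    """Naive passive upmix (illustrative only, not foobar2000's DSP)."""
    # Centre is derived from what both channels have in common (here: the average).
    center = [0.5 * (l + r) for l, r in zip(left, right)]
    return {
        "FL": list(left),                      # fronts kept as-is (centre not removed)
        "FR": list(right),
        "FC": center,
        "LFE": [0.0] * len(left),              # nothing sensible to put here from stereo
        "SL": [rear_gain * l for l in left],   # rears: quieter copies of the fronts
        "SR": [rear_gain * r for r in right],
    }


def downmix_5_1_to_stereo(ch, center_gain=0.707, rear_gain=0.707):
    """Fold 5.1 back to stereo: centre and rears are mixed in at reduced level."""
    out_l = [fl + center_gain * fc + rear_gain * sl
             for fl, fc, sl in zip(ch["FL"], ch["FC"], ch["SL"])]
    out_r = [fr + center_gain * fc + rear_gain * sr
             for fr, fc, sr in zip(ch["FR"], ch["FC"], ch["SR"])]
    return out_l, out_r


def round_trip(left, right):
    """Upmix then downmix; the result is generally NOT the original stereo."""
    return downmix_5_1_to_stereo(upmix_stereo_to_5_1(left, right))
```

With these assumed gains, a sample that is hard left going in comes back louder on the left and also leaks into the right channel, so "upmix then downmix" is not a no-op even in this simple model.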

Atmos is just an implementation of downmixing, that also does additional post-processing knowing that left and right will be fully panned (i.e. one on one ear and the other in the other).


Re: Making sense of the pipeline for spatial sound in Windows

Reply #2
Thanks for your answer! I think I have figured out why it sounds different when I upmix then downmix in foobar vs. skipping up- and downmixing altogether; volume isn't normalized properly after downmixing; the difference was just a difference in volume, not quality. So, I guess that mystery is solved.
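For anyone wanting to rule out a plain level difference before comparing two chains by ear, a rough Python sketch of RMS level matching follows. The helper names are hypothetical, and RMS is only a crude stand-in for perceived loudness.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(a, b):
    """How much louder a is than b, in dB (positive means a is louder)."""
    rms_a, rms_b = rms(a), rms(b)
    if rms_a == 0.0 or rms_b == 0.0:
        raise ValueError("cannot compare against a silent signal")
    return 20.0 * math.log10(rms_a / rms_b)


def match_level(signal, reference):
    """Scale signal so that its RMS level equals that of reference."""
    rms_sig = rms(signal)
    if rms_sig == 0.0:
        raise ValueError("cannot scale a silent signal")
    gain = rms(reference) / rms_sig
    return [gain * s for s in signal]
```

If the two versions still sound different after matching levels like this, the difference is more than just volume.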

Re: Making sense of the pipeline for spatial sound in Windows

Reply #3
Quote
stereo to 5.1 means:
There's no "standard" way of doing it. My home theater receiver has a selection of "Sound Field" options. For music I use a "hall" effect that adds reverb to the rear channels.

Of course some older movies have matrix-encoded "Dolby Surround" (AKA "Pro Logic") and the Pro Logic Movie Mode correctly decodes the surround. If you have a really old movie with mono, Pro Logic Movie Mode will play it "correctly" with all of the sound coming only from the center speaker. Regular stereo can get messed up in Movie Mode.

Quote
- the .1 in 5.1 and 7.1 usually means "FX channel", not really "subbass channel", so I don't really know what would be put there from upmixing.
Right. The point-one channel is "low frequency effects" (booms & explosions).

In a "real theater" the regular surround speakers handle regular bass and ONLY the LFE goes to the subwoofer. With regular stereo the subwoofer isn't used.

But most home systems use "small" surround speakers that can't accurately reproduce bass, so home theater receivers have optional "Bass Management" where the bass from the 5 (or 7) surround speakers is mixed with the LFE and sent to the subwoofer, so all of the bass goes to the subwoofer. And with stereo, ALL of the bass is re-routed to the subwoofer.
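As a rough illustration of the bass-management idea, here is a small Python sketch. It assumes an 80 Hz crossover and a simple one-pole filter; real receivers typically use steeper crossovers and may apply extra LFE gain, none of which is modeled here. All names are hypothetical.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Very gentle (6 dB/octave) low-pass filter; real crossovers are steeper."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


def bass_manage(mains, lfe, cutoff_hz=80.0, sample_rate=48000):
    """Split bass off each main channel and add it to the LFE for the subwoofer.

    mains: dict of channel name -> list of samples ("small" speakers)
    lfe:   list of samples for the .1 channel (same length as the mains)
    Returns (high-passed mains, subwoofer feed).
    """
    sub = list(lfe)
    small = {}
    for name, samples in mains.items():
        low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
        # What the small speaker still gets: the original minus its bass.
        small[name] = [x - l for x, l in zip(samples, low)]
        # The removed bass is redirected to the subwoofer.
        sub = [s + l for s, l in zip(sub, low)]
    return small, sub
```

With a stereo source the `lfe` list is all zeros, and the subwoofer feed consists entirely of bass redirected from the left and right channels, which matches the behaviour described above.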


Here are the standard formulas for downmixing. But you may have to reduce the levels to prevent clipping. (Movies usually have enough headroom so you can downmix without clipping.)

Note that the LFE is NOT used when downmixing. But in a system with bass management the "regular bass" will be sent to the sub, so the sub is still used.
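The linked formulas aren't reproduced in this thread, so the sketch below assumes the commonly cited ITU-R BS.775 style coefficients (centre and surrounds mixed in at about -3 dB, LFE dropped). The optional scaling is just one conservative way to guarantee no clipping; the function name is hypothetical.

```python
import math

# Assumed coefficients: -3 dB (1/sqrt(2)) for centre and surrounds.
CENTER_GAIN = 1.0 / math.sqrt(2.0)
SURROUND_GAIN = 1.0 / math.sqrt(2.0)


def downmix_frame(fl, fr, fc, lfe, sl, sr, normalize=True):
    """Downmix one 5.1 sample frame to stereo (Lo, Ro).

    The lfe argument is accepted but deliberately ignored.
    With normalize=True the result is scaled by the worst-case gain,
    so full-scale inputs on every channel can never exceed full scale.
    """
    lo = fl + CENTER_GAIN * fc + SURROUND_GAIN * sl
    ro = fr + CENTER_GAIN * fc + SURROUND_GAIN * sr
    if normalize:
        scale = 1.0 / (1.0 + CENTER_GAIN + SURROUND_GAIN)
        lo *= scale
        ro *= scale
    return lo, ro
```

Worst-case scaling costs roughly 7.7 dB of level, which is more than most material needs; in practice a smaller fixed reduction is often enough, at the risk of occasional clipping on very hot mixes.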

Re: Making sense of the pipeline for spatial sound in Windows

Reply #4
Thank you also for a very informative reply! I was also looking for some explanation on how upmixing works and found this link @ soundonsound. Does that general scheme seem more or less correct / broadly applicable or do you know of any other sources on this?

I also have two more questions about the LFE/sub channel.

1) I believe my receiver sends a full spectrum mono signal through the LFE/sub channel when I'm playing a stereo source; I tune the cutoff frequency of that on my subwoofer itself. Is this also happening in a native x.1 mix? Because if only the true subbass is passed on the LFE channel, setups without whole-range satellites would lose a lot of the 'normal' bass sounds.

2) I read that receivers and setups are often incorrectly labeled as x.2.x, because there is only ever one LFE channel, regardless of how many subwoofers you have. A '5.2.2' system might have two LFE/subwoofer outputs, but these are effectively the same channel. Is that correct or are there actually stereo LFE sources?

 

Re: Making sense of the pipeline for spatial sound in Windows

Reply #5
Quote
1) I believe my receiver sends a full spectrum mono signal through the LFE/sub channel when I'm playing a stereo source
Do you have something else you can plug in? Maybe powered "computer speakers"? Or if you have a computer with a regular soundcard you can plug it into line-in and record it.

Quote
Because if only the true subbass is passed on the LFE channel, setups without whole-range satellites would lose a lot of the 'normal' bass sounds.
That's the "bass management". I believe the setup for my receiver has settings for "small", "medium" or "large" speakers.

Quote
2) I read that receivers and setups are often incorrectly labeled as x.2.x, because there is only ever one LFE channel, regardless of how many subwoofers you have.
I don't know. And I don't have any DVDs or Blu-Rays with "x.2" or Atmos. ...I do have two subs on one output with a Y splitter.

Quote
...found this link @ soundonsound. Does that general scheme seem more or less correct / broadly applicable or do you know of any other sources on this?
Like I said, I normally use the "soundfield" settings and let my receiver do it in real-time during playback.

But a few years ago I made a 5.1 up-mix/re-mix of a mono video concert. I went crazy with it!!! I used complementary EQ adjustments to make "fake stereo" with the regular music parts. I can't remember exactly, but I think I included the center, maybe for the mid-vocal range.

I copied those EQ'd front channels to the rear at reduced levels with some delay and added reverb.
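The "reduced level plus delay" part of that step can be sketched in a few lines of Python. The 15 ms delay and 0.5 gain are placeholder values, not the ones used in that remix (which aren't stated), and the reverb is left out.

```python
def delayed_copy(front, delay_ms=15.0, gain=0.5, sample_rate=48000):
    """Make a rear channel as a delayed, attenuated copy of a front channel.

    delay_ms and gain are illustrative placeholders. The output has the
    same length as the input, so the tail of the delayed signal is cut off.
    """
    delay_samples = round(delay_ms * sample_rate / 1000.0)
    delayed = [0.0] * delay_samples + list(front)
    return [gain * s for s in delayed[:len(front)]]
```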

When there was talking between songs I panned to the center channel. When there was applause, I panned toward the rear and I mixed in some additional applause-only clips to the rear whenever there was music and applause at the same time. I used different applause clips for the left & right rear.

I also used a sub-harmonic bass generator effect for the LFE channel. (Of course, you're not "supposed" to use the LFE with music.)

I burned it to a DVD so I was able to keep the original mono option and there's a menu selection if you want the surround re-mix.