curious difference between DTS and DSD version of same multichannel mix


curious difference between DTS and DSD version of same multichannel mix

2023-05-12 19:01:59

Questions often arise about the supposed audible difference between a DTS/Dolby version and a lossless version of the same multichannel mix. I was investigating a recent claim about this on a surround audio forum and found something that looks odd, and I'm wondering what the correct explanation is.

I own both the official DTS 96/24 and SACD multichannel versions of Genesis' A Trick of the Tail. The mix itself is the same on both. I long presumed that there's no substantial difference between them (e.g., ultrasonic content in one or the other doesn't matter to anyone's ears), but when I rip both of them -- decoding to 96/24 PCM in the DTS case using ffmpeg, and to 88/24 PCM in the DSD case using foobar's SACD plugin, with its level and LFE adjustments at 0 -- I see undeniable differences in levels in Audacity, even after overall peak level-matching.

The attached figures show the DTS rip/decode (top) and the SACD rip/PCM conversion (bottom, amplified to peak at 0dB*).

(*The raw DSD-->PCM rip (not shown) was 4.5dB lower in level than the DTS version in the main channels, something I expected because the SACD spec usually means a DSD track will be lower than its PCM counterpart. So for the rest of the comparison in Audacity I amplified it to peak at 0dB. I did NOT use Audacity's Normalize, since it seems to handle 6-channel audio badly.)
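For anyone wanting to repeat the level matching outside Audacity, here is a minimal Python sketch of peak matching with one common gain for all channels, so the balance between channels is untouched. The function names and the sample values are made up for illustration; this is not what Audacity's Amplify does internally, only the same arithmetic.

```python
import math

def peak_dbfs(channels):
    """Sample peak across all channels, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for ch in channels for s in ch)
    return 20.0 * math.log10(peak)

def amplify_to_peak(channels, target_dbfs=0.0):
    """Apply one common gain so the loudest sample lands on target_dbfs."""
    gain_db = target_dbfs - peak_dbfs(channels)
    gain = 10.0 ** (gain_db / 20.0)
    return [[s * gain for s in ch] for ch in channels], gain_db

# Hypothetical two-channel excerpt whose loudest sample sits near -4.5 dBFS.
quiet = [[0.10, -0.5957, 0.30], [0.05, 0.20, -0.25]]
matched, gain_db = amplify_to_peak(quiet)
print(f"gain applied: {gain_db:+.2f} dB")
```

Because the gain is derived from the loudest sample in any channel, a quieter channel such as the LFE stays proportionally quieter after matching.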

Right away, in the DTS decode (top), many of the peaks are at 0dBFS (red lines, found by Audacity View-->Show Clipping**), including more than a few instances of actual clipping (according to Audacity's Analyze-->Find Clipping), and this is NOT the case for the peak level-matched DSD-->PCM (bottom). I zoomed in on several instances (not shown) to verify that this was truly the case.

(**The Show Clipping tool flags any 0dB peak as 'clipping', though sometimes it doesn't! The 'Find Clipping' tool seems more reliable, as it only flags 3 or more consecutive 0dB peaks as 'clipping'. The 'Peak Finding' plugin gives yet another output, not always in accord, but I won't deal with that now.)
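The difference between the two tools, as described here, comes down to how long a run of full-scale samples has to be before it is reported. A rough Python sketch of that idea follows; it is not Audacity's actual code, and the function name, threshold, and test data are assumptions for illustration.

```python
def find_clipped_runs(samples, threshold=1.0, min_run=3):
    """Return (start_index, length) for each run of at least min_run
    consecutive samples whose magnitude is at or above threshold."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs

track = [0.2, 1.0, 0.7, 1.0, 1.0, 1.0, 1.0, 0.4, -1.0, -1.0, 0.1]
print(find_clipped_runs(track))             # [(3, 4)]
print(find_clipped_runs(track, min_run=1))  # [(1, 1), (3, 4), (8, 2)]
```

With `min_run=1` every lone full-scale sample is flagged, which is roughly the behaviour attributed to Show Clipping above; with `min_run=3` only sustained runs are reported.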

So how might this come about? Is it a mastering difference, or an artifact of the DTS decoding versus DSD conversion, or something else? I do need to compare some other examples, and throw in some DVD-A/Blu-ray, to know if this is peculiar to the Genesis series.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #1 – 2023-05-12 20:59:38

Quote from: krabapple on 2023-05-12 19:01:59

So how might this come about?

Unless you're using the lossless DTS-HD MA extension, DTS is a lossy codec. Samples that were originally at or near 0dBFS can end up above 0dBFS as a result of the lossy transformation.

Something similar can happen when resampling: if you measure the peak by the highest sample and not the highest point on the smooth audio waveform, you might see the "peak" increase after resampling when one of the new samples is closer to the real peak.
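A small worked example of that distinction, in Python. It assumes an illustrative worst case that is not taken from the discs in question: a sine at a quarter of the sample rate, sampled 45 degrees away from its crests, so no sample ever lands on the real peak.

```python
import math

# One cycle of a sine whose frequency is a quarter of the sample rate,
# sampled 45 degrees away from its crests (illustrative worst case).
samples = [math.sin(math.pi / 4 + n * math.pi / 2) for n in range(4)]

sample_peak = max(abs(s) for s in samples)   # what a sample-peak meter reads
true_peak = 1.0                              # amplitude of the underlying sine

shortfall_db = 20 * math.log10(sample_peak / true_peak)
print(f"sample peak {sample_peak:.4f}, true peak {true_peak:.4f}, "
      f"difference {shortfall_db:.2f} dB")
```

Every sample has magnitude of about 0.707, so the highest sample sits about 3 dB below the highest point of the waveform. If such a signal is normalized by its highest sample, the waveform itself ends up about 3 dB above full scale.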

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #2 – 2023-05-12 21:22:15

Quote from: krabapple on 2023-05-12 19:01:59

Right away, in the DTS decode (top), many of the peaks are at 0dBFS (red lines, found by Audacity View-->Show Clipping**), including more than a few instances of actual clipping (according to Audacity's Analyze-->Find Clipping), and this is NOT the case for the peak level-matched DSD-->PCM (bottom). I zoomed in on several instances (not shown) to verify that this was truly the case.

DSD itself cannot "clip" because the bitstream can only be +/-1 so in this sense it is always "clipped".

When converting DSD to PCM, the bitstream is first converted to multibit, for example, floating point with +/-1.0. Then a filter is applied. The filter changes the amplitude values so that there are no runs of identical values to trigger the clip detector.

Relevant topic:
https://hydrogenaud.io/index.php/topic,121906.msg1006400.html#msg1006400
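To make the two steps concrete, here is a toy Python sketch. It assumes a first-order sigma-delta modulator and a plain block average as the low-pass filter; real DSD encoders use higher-order noise shaping and real converters use proper FIR filters, so treat this only as an illustration of the principle.

```python
import math

def sigma_delta_1bit(x):
    """First-order sigma-delta modulator: returns a stream of +1/-1 values."""
    bits, state = [], 0.0
    for sample in x:
        bit = 1.0 if state >= 0.0 else -1.0
        bits.append(bit)
        state += sample - bit   # accumulate the error left by this 1-bit decision
    return bits

def boxcar_decimate(values, factor):
    """Crude low-pass and decimation: average each block of `factor` values."""
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]

OSR = 64       # oversampling ratio (assumed for illustration)
N_PCM = 32     # number of PCM samples to produce
x = [0.5 * math.sin(2 * math.pi * n / (OSR * 16)) for n in range(OSR * N_PCM)]

bits = sigma_delta_1bit(x)          # every value is +1 or -1
pcm = boxcar_decimate(bits, OSR)    # multibit values after filtering
target = boxcar_decimate(x, OSR)    # block averages of the input, for comparison

print(sorted(set(bits)))
print(max(abs(p - t) for p, t in zip(pcm, target)))
```

The 1-bit stream only ever holds +1 and -1, while the filtered output takes intermediate values that follow the input sine, so nothing in it sits pinned at the extremes the way the raw bitstream does.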

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #3 – 2023-05-13 06:47:35

Here is an example using real music instead of a test signal. The upper channel is PCM converted to DSD and then back to PCM; the lower channel is the original PCM file, which was clipped to begin with.

Zooming in on the highlighted red region, the peaks of the upper channel are no longer perfectly clipped.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #4 – 2023-05-16 18:07:04

Quote from: Octocontrabass on 2023-05-12 20:59:38

Quote from: krabapple on 2023-05-12 19:01:59
So how might this come about?
Unless you're using the lossless DTS-HD MA extension, DTS is a lossy codec. Samples that were originally at or near 0dBFS can end up above 0dBFS as a result of the lossy transformation.

The only transformation occurring here for the DTS 96/24 file (which is lossy, of course) is its decoding to PCM; all (lossy) DTS files decode to PCM at whatever their specified rate is -- in this case 96kHz 24 bits.

Quote

Something similar can happen when resampling: if you measure the peak by the highest sample and not the highest point on the smooth audio waveform, you might see the "peak" increase after resampling when one of the new samples is closer to the real peak.

I don't really get your distinction between 'highest sample' and 'highest point on the smooth audio waveform'.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #5 – 2023-05-16 18:33:02

Quote from: bennetng on 2023-05-12 21:22:15

Quote from: krabapple on 2023-05-12 19:01:59
Right away, in the DTS decode (top), many of the peaks are at 0dBFS (red lines, found by Audacity View-->Show Clipping**), including more than a few instances of actual clipping (according to Audacity's Analyze-->Find Clipping), and this is NOT the case for the peak level-matched DSD-->PCM (bottom). I zoomed in on several instances (not shown) to verify that this was truly the case.
DSD itself cannot "clip" because the bitstream can only be +/-1 so in this sense it is always "clipped".

When converting DSD to PCM, the bitstream is first converted to multibit, for example, floating point with +/-1.0. Then a filter is applied. The filter changes the amplitude values so that there are no runs of identical values to trigger the clip detector.

Relevant topic:
https://hydrogenaud.io/index.php/topic,121906.msg1006400.html#msg1006400

This brings up the question of what steps were taken from the original analog multitrack masters to the final multichannel DVD-V and SACD releases.

I suspect, but don't know for sure, that the old analog multitracks are digitized to PCM, not DSD.
I extremely strongly suspect that multichannel mixing and production and mastering are done in the PCM realm.
The final step is either release as lossless PCM (DVD-A or BluRay), or PCM encoded to DTS / AC3 (DVD-V), or PCM converted to DSD (SACD), depending on release format.

So let's assume the SACD version of this track already underwent a PCM master -->DSD conversion. And the DTS version is a lossy encode of the PCM master.

Does that mean the clipping* seen in the DTS version was on the PCM master... or can encoding to DTS 96/24 introduce it?

And the elimination of clipping seen in the DSD (after conversion back to PCM): is that an artifact of the original PCM-->DSD conversion, or of my DSD-->PCM conversion?

*defined as consecutive 0dBFS PCM samples

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #6 – 2023-05-16 19:05:06

Quote from: krabapple on 2023-05-16 18:07:04

The only transformation occurring here for the DTS 96/24 file (which is lossy, of course) is its decoding to PCM; all (lossy) DTS files decode to PCM at whatever their specified rate is -- in this case 96kHz 24 bits.

I'm referring to the lossy transformation that occurred during encoding.

Quote from: krabapple on 2023-05-16 18:07:04

I don't really get your distinction between 'highest sample' and 'highest point on the smooth audio waveform'.

Open the attached file in Audacity. Zoom way in so you can see the samples. Normalize it to 0dBFS and then resample it to 48kHz. You'll see samples above 0dBFS. The audio didn't change, but some of the new samples have a higher amplitude than any of the original samples.

(For anyone curious, this particular file is based on a test signal specified in ITU-T Recommendation G.711.)
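The same experiment can be approximated in Python. The attached file is not reproduced here, so this sketch assumes a stand-in: a 1 kHz sine sampled at 8 kHz with a 22.5 degree phase offset, which gives two equal top samples per half cycle. The resampler is a simple Hann-windowed sinc interpolator written for this example, not Audacity's.

```python
import math

def upsample_sinc(x, factor, half_width=32):
    """Upsample by an integer factor with a Hann-windowed sinc interpolator."""
    out = []
    for m in range(len(x) * factor):
        t = m / factor                      # output position, in input samples
        centre = int(math.floor(t))
        acc = 0.0
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < len(x):
                d = t - k
                sinc = 1.0 if d == 0.0 else math.sin(math.pi * d) / (math.pi * d)
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                acc += x[k] * sinc * window
        out.append(acc)
    return out

FS_IN, FACTOR = 8000, 6                     # 8 kHz -> 48 kHz
tone = [math.sin(math.pi / 8 + 2 * math.pi * 1000 * n / FS_IN) for n in range(128)]
peak_in = max(abs(s) for s in tone)
normalized = [s / peak_in for s in tone]    # highest sample is now at 0 dBFS

resampled = upsample_sinc(normalized, FACTOR)
interior = resampled[40 * FACTOR:-40 * FACTOR]   # skip filter edge effects
peak_out = max(abs(s) for s in interior)
overs = sum(1 for s in interior if abs(s) > 1.0)
print(f"peak after resampling: {20 * math.log10(peak_out):+.2f} dBFS, "
      f"{overs} samples above full scale")
```

For this stand-in signal the underlying sine is roughly 0.7 dB above its highest 8 kHz sample, so the 48 kHz version contains samples above full scale. In the ideal case seven consecutive 48 kHz samples per half cycle reach or exceed full scale, which would show up as a run of seven full-scale samples if the result were stored as fixed-point PCM.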

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #7 – 2023-05-16 19:18:48

Quote from: krabapple on 2023-05-16 18:33:02

I suspect, but don't know for sure, that the old analog multitracks are digitized to PCM , not DSD.

Likely.

Quote

I extremely strongly suspect that multichannel mixing and production and mastering are done in the PCM realm

It is certainly the case because digital mixing requires more than 1 bit, just a simple math 1+1=2. With 1 bit the bits can only be flipped. It does not make any sense to do all the mixing and effect processing using analog equipment then re-digitize to DSD again.
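A tiny Python illustration of the 1+1=2 point, using two made-up 1-bit streams in which each sample is +1 or -1:

```python
# Two hypothetical 1-bit streams, each sample being +1 or -1.
stream_a = [+1, -1, +1, +1, -1, -1, +1, -1]
stream_b = [+1, +1, -1, +1, -1, +1, -1, -1]

mixed = [a + b for a, b in zip(stream_a, stream_b)]
print(mixed)               # [2, 0, 0, 2, -2, 0, 0, -2]
print(sorted(set(mixed)))  # [-2, 0, 2]
```

The sum takes three distinct levels, so it no longer fits in one bit per sample and has to be carried as multibit data.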

Quote

The final step is either release as lossless PCM (DVD-A or BluRay), or PCM encoded to DTS / AC3 (DVD-V), or PCM converted to DSD (SACD), depending on release format.

So let's assume the SACD version of this track already underwent a PCM master -->DSD conversion. And the DTS version is a lossy encode of the PCM master.

That means the clipping* seen in the DTS version was on the PCM master... or can encoding to DTS 96/24 introduce it?

I am not familiar with the DTS lossy codec, but in general, transform-based lossy codecs work in a way like this:
https://izotope-rx.livejournal.com/5760.html

So basically, if you convert these lossy formats to fixed-point PCM instead of floating point, clipping can be introduced because fixed-point formats cannot hold any sample value above 0dBFS. Is the conversion software you use capable of floating-point decoding?
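A short Python sketch of that effect. The decoder output here is invented: a sine that overshoots full scale by about 0.5 dB, standing in for whatever a lossy decoder might produce. The conversion function is a generic clamp-and-round, not any particular decoder's code.

```python
import math

FULL_SCALE_24 = 2 ** 23 - 1     # largest positive 24-bit sample value

def to_int24(x):
    """Convert a float sample (full scale = 1.0) to a 24-bit integer, clamping."""
    scaled = int(round(x * FULL_SCALE_24))
    return max(-FULL_SCALE_24 - 1, min(FULL_SCALE_24, scaled))

# Hypothetical decoder output: one cycle of a sine peaking at 1.06 (about +0.5 dB).
decoded = [1.06 * math.sin(2 * math.pi * n / 32) for n in range(32)]

as_float = decoded                              # overshoot is preserved
as_int24 = [to_int24(s) for s in decoded]       # overshoot is flattened

flat_top = sum(1 for v in as_int24 if v == FULL_SCALE_24)
print(f"float peak: {max(as_float):.3f}")
print(f"24-bit samples pinned at positive full scale: {flat_top}")
```

The float version keeps the peak at 1.06 and can simply be turned down later, while the 24-bit version ends up with a short row of identical full-scale samples, which is exactly what a consecutive-sample clip detector looks for.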

Quote

And the elimination of clipping seen in the DSD (after conversion back to PCM) -- that is an artifact of the original PCM--DSD conversion, or of my DSD-->PCM conversion

Both, because PCM-DSD conversions involve sample rate conversion. The distinction between 'highest sample' and 'highest point on the smooth audio waveform' may be easier to understand after reading this (again, iZotope) article:
https://techblog.izotope.com/2015/08/24/true-peak-detection/

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #8 – 2023-05-16 20:17:41

Quote from: Octocontrabass on 2023-05-16 19:05:06

Quote from: krabapple on 2023-05-16 18:07:04
The only transformation occurring here for the DTS 96/24 file (which is lossy, of course) is its decoding to PCM; all (lossy) DTS files decode to PCM at whatever their specified rate is -- in this case 96kHz 24 bits.
I'm referring to the lossy transformation that occurred during encoding.

OK clear now.

Quote

Quote from: krabapple on 2023-05-16 18:07:04
I don't really get your distinction between 'highest sample' and 'highest point on the smooth audio waveform'.
Open the attached file in Audacity. Zoom way in so you can see the samples. Normalize it to 0dBFS and then resample it to 48kHz. You'll see samples above 0dBFS. The audio didn't change, but some of the new samples have a higher amplitude than any of the original samples.

(For anyone curious, this particular file is based on a test signal specified in ITU-T Recommendation G.711.)

What I see after that is that where there were two samples at 0dBFS in the normalized 8kHz SR file, there are now 7 @0dBFS in the 48kHz SR file. Audacity doesn't show me 'overs' (samples above 0dBFS).

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #9 – 2023-05-16 20:32:08

Quote from: bennetng on 2023-05-16 19:18:48

Quote from: krabapple on 2023-05-16 18:33:02
I suspect, but don't know for sure, that the old analog multitracks are digitized to PCM , not DSD.
Likely.

Quote
I extremely strongly suspect that multichannel mixing and production and mastering are done in the PCM realm
It is certainly the case because digital mixing requires more than 1 bit, just a simple math 1+1=2. With 1 bit the bits can only be flipped. It does not make any sense to do all the mixing and effect processing using analog equipment then re-digitize to DSD again.

Agreed, but you might be surprised. It is definitely not unheard of in the mixing/mastering suite for a signal to go through an analog process again after being digitized (and then be redigitized for the final product).

Quote

Quote
The final step is either release as lossless PCM (DVD-A or BluRay), or PCM encoded to DTS / AC3 (DVD-V), or PCM converted to DSD (SACD), depending on release format.

So let's assume the SACD version of this track already underwent a PCM master -->DSD conversion. And the DTS version is a lossy encode of the PCM master.

That means the clipping* seen in the DTS version was on the PCM master... or can encoding to DTS 96/24 introduce it?

I am not familiar with the DTS lossy codec, but in general, transform-based lossy codecs work in a way like this:
https://izotope-rx.livejournal.com/5760.html

So basically, if you convert these lossy formats to fixed-point PCM instead of floating point, clipping can be introduced because fixed-point formats cannot hold any sample value above 0dBFS. Is the conversion software you use capable of floating-point decoding?

I used ffmpeg to decode the raw DTS file. Simply this:

ffmpeg -i inputfile.dts outputfile.wav

If the input dts file is dts 96/24, the decoded output file is 96/24 PCM. If it's 'plain' (aka core) DTS, the decoded output is 48/24 PCM

version info:


ffmpeg version N-93774-gfec4212d8e Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 8.3.1 (GCC) 20190414
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil      56. 26.101 / 56. 26.101
libavcodec     58. 52.101 / 58. 52.101
libavformat    58. 27.103 / 58. 27.103
libavdevice    58.  7.100 / 58.  7.100
libavfilter     7. 50.100 /  7. 50.100
libswscale      5.  4.100 /  5.  4.100
libswresample   3.  4.100 /  3.  4.100
libpostproc    55.  4.100 / 55.  4.100

Foobar2K, Audiomuxer, and DVD Audio Extractor also do DTS decoding, but I didn't use them and I'm not sure if they all decode DTS 96/24 , and how.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #10 – 2023-05-16 20:45:53

Quote from: krabapple on 2023-05-16 20:32:08

ffmpeg -i inputfile.dts outputfile.wav

If the input dts file is dts 96/24, the decoded output file is 96/24 PCM. If it's 'plain' (aka core) DTS, the decoded output is 48/24 PCM

By default ffmpeg does truncation to 16 bit if output format is wav.
Also, lossy DTS has no fixed bitdepth (just like any other lossy codec).
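The output sample format can be requested explicitly with ffmpeg's `-c:a` option; `pcm_s24le` (24-bit integer) and `pcm_f32le` (32-bit float) are standard ffmpeg PCM encoder names. A small Python wrapper as a sketch follows; the helper name is made up, the file names are the placeholders used above, and whether values above 0dBFS survive in the float file depends on the decoder handing over floating-point samples, which is assumed here rather than verified.

```python
import os
import shutil
import subprocess

def build_decode_command(src, dst, codec="pcm_f32le"):
    """Build an ffmpeg command line that sets the WAV sample format explicitly.

    pcm_s24le writes 24-bit integer PCM; pcm_f32le writes 32-bit float PCM.
    """
    return ["ffmpeg", "-i", src, "-c:a", codec, dst]

cmd = build_decode_command("inputfile.dts", "outputfile.wav")
print(" ".join(cmd))

# Only attempt the decode when ffmpeg and the placeholder input file exist.
if shutil.which("ffmpeg") and os.path.exists("inputfile.dts"):
    subprocess.run(cmd, check=True)
```

Passing `codec="pcm_s24le"` instead gives a 24-bit integer file, which avoids the 16-bit default but still cannot represent samples above full scale.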

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #11 – 2023-05-16 20:51:24

Quote from: krabapple on 2023-05-16 20:32:08

Agreed but you might be surprised. It is definitely not unheard of in the mixing/mastering suite for a signal to go through an analog process again after being digitized (and then being redigitized for the final product)

Yes possible, but rather pointless. Here is an example of "analog upsampling":
https://hydrogenaud.io/index.php/topic,93853.msg1006575.html#msg1006575

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #12 – 2023-05-16 21:51:42

Quote from: Bogozo on 2023-05-16 20:45:53

Quote from: krabapple on 2023-05-16 20:32:08
ffmpeg -i inputfile.dts outputfile.wav

If the input dts file is dts 96/24, the decoded output file is 96/24 PCM. If it's 'plain' (aka core) DTS, the decoded output is 48/24 PCM

By default ffmpeg does truncation to 16 bit if output format is wav.
Also, lossy DTS has no fixed bitdepth (just like any other lossy codec).

You're right, my mistake about ffmpeg out.

foobar2k (v2) decodes a 96/24 DTS file to 96/16-bit PCM too.

Audiomuxer 0.9.6.4's default decode of the same file is 96/24-bit PCM. Same for DVD Audio Extractor. 32-bit float is not an option in either, though <24 is.

The same file dropped into Audacity 3.3.2 decodes to 48kHz in its default 32-bit float environment. So Audacity only decodes the 'core' DTS.

The bitrate of the starting .dts file btw is 1536 kbps, which I believe is the highest that DTS offers for lossy home media.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #13 – 2023-05-17 02:29:42

Quote from: krabapple on 2023-05-16 20:17:41

What I see after that is that where there were two samples at 0dBFS in the normalized 8kHz SR file, there are now 7 @0dBFS in the 48kHz SR file. Audacity doesn't show me 'overs' (samples above 0dBFS).

Adjust the zoom level to see them. (Right-click on the amplitude scale.)

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #14 – 2023-05-17 05:57:22

Just to add that fixed-point decoding is not doomed to clipping if appropriate tools are available and used correctly. A popular example is mp3gain, which can change the gain structure of mp3 and aac files internally so that they won't clip when decoded to fixed-point.

In either case, with resampling, lossy processing or analog chain (analog is always lossy anyway) it is unreliable to detect clipping by finding consecutively identical sample values in time domain.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #15 – 2023-05-17 15:42:59

Quote from: Octocontrabass on 2023-05-17 02:29:42

Quote from: krabapple on 2023-05-16 20:17:41
What I see after that is that where there were two samples at 0dBFS in the normalized 8kHz SR file, there are now 7 @0dBFS in the 48kHz SR file. Audacity doesn't show me 'overs' (samples above 0dBFS).
Adjust the zoom level to see them. (Right-click on the amplitude scale.)

I did. I see nothing above 0dB.

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #16 – 2023-05-17 16:12:48

Quote from: bennetng on 2023-05-17 05:57:22

Just to add that fixed-point decoding is not doomed to clipping if appropriate tools are available and used correctly. A popular example is mp3gain, which can change the gain structure of mp3 and aac files internally so that they won't clip when decoded to fixed-point.

I guess it comes down to knowing the details of how DTS encoding works. In this instance I'm inclined to believe that the original audio was often peaking near 0dB (i.e., dynamic range compression was liberally applied) before DTS encoding.
One line of evidence comes from the spectral view ranging up to 48kHz: the decoded DTS audio is visibly band-limited at ~26kHz (which is a strange value to me, rather than 22 or 24 or 48, but anyway) with a ragged, spiky 'haircut' profile like the one shown on that iZotope tutorial page as evidence of lossy encoding. So that's as expected. The SACD audio (DSD-->PCM @ 88kHz) has low-level 'musical' content above 26kHz and a 'natural' profile with no obvious haircut/bandwidth limit. When normalized to full scale, it also has numerous samples at or near 0dB in the same areas the decoded DTS does. So I'm inclined to think this mix was just born 'hot' rather than having been made so by lossy encoding.

Quote

In either case, with resampling, lossy processing or analog chain (analog is always lossy anyway) it is unreliable to detect clipping by finding consecutively identical sample values in time domain.

I'm not sure I understand why that would be true. If the audio was clipped (a row of samples at 0dB) but the overall level was then lowered by 1 dB, a simplistic 'clip detector' wouldn't find the row, but that audio would still be 'clipped'. So I think consecutive samples at the peak of the audio (whatever that peak is) is a good indicator that there was clipping. But maybe you're referring to the possibility that clipped audio , after resampling, might no longer look like a row of identical peak samples? Or something else?

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #17 – 2023-05-17 16:14:10

Here is what I got:
Before

After

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #18 – 2023-05-17 16:23:44

Quote from: krabapple on 2023-05-17 16:12:48

But maybe you're referring to the possibility that clipped audio , after resampling, might no longer look like a row of identical peak samples?

This. A few clipped samples may appear smoothed out after resampling. Of course, if the clipping is really severe, for example, more than dozens of consecutive samples before resampling, even if the exact sample values may not be identical after resampling, the clipping can still be easily identified visually.

[edit]This should be relevant, identifying clipping after a cassette deck loopback:
https://www.audiosciencereview.com/forum/index.php?threads/measurements-of-nakamichi-dragon-cassette-deck.5595/post-124666
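A Python sketch of the smoothing effect. It assumes a made-up test signal (a sine driven 30% past full scale and hard-clipped) and stands in for a resampler with a Hann-windowed sinc interpolator that re-evaluates the waveform halfway between the original samples; both the helper names and the parameters are illustrative.

```python
import math

def half_sample_shift(x, half_width=16):
    """Evaluate the band-limited waveform midway between input samples,
    using a Hann-windowed sinc interpolator."""
    out = []
    for n in range(half_width, len(x) - half_width):
        acc = 0.0
        for k in range(n - half_width + 1, n + half_width + 1):
            d = (n + 0.5) - k               # never zero, so no division by zero
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += x[k] * math.sin(math.pi * d) / (math.pi * d) * window
        out.append(acc)
    return out

def longest_flat_run(samples):
    """Length of the longest run of consecutive, exactly equal samples."""
    best = run = 1
    for prev, cur in zip(samples, samples[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

# Hypothetical clipped signal: a sine overdriven by 30% and limited to +/-1.0.
clipped = [max(-1.0, min(1.0, 1.3 * math.sin(0.3 + 2 * math.pi * n / 40)))
           for n in range(200)]
shifted = half_sample_shift(clipped)

print("longest flat run before:", longest_flat_run(clipped))
print("longest flat run after: ", longest_flat_run(shifted))
```

Before the shift the clipped tops are rows of identical samples; afterwards the same tops are still flattened to the eye but the sample values differ slightly from one another, so a detector that counts identical consecutive samples no longer fires.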

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #19 – 2023-05-17 16:44:46

Quote from: bennetng on 2023-05-17 16:14:10

Here is what I got:
Before

After

I was told to do this:

Normalize it to 0dBFS and then resample it to 48kHz.

So within Audacity, I imported the alaw file, normalized to 0dB, set the project rate to 48kHz, then exported, then re-imported.

It looks like you did something else?

Re: curious difference between DTS and DSD version of same multichannel mix

Reply #20 – 2023-05-17 17:04:04

After normalize, select Tracks > Resample..., don't export.
