Help understanding downsampling of "fake" high-res files

I have a number of "high-res" blu-ray audio discs that I've ripped that largely or entirely consist of 44.1 kHz music basically stuffed into a 96 kHz container.  See first attachment for an example - that was pretty clearly either recorded or previously mixed at 44.1, despite being a "hi-res" file.

I can run that file through sox to downsample it (sox in.wav -V -G --comment '' -r 44100 out.wav) and create a 44.1 kHz version that, at least as far as spek visualizes it, seems largely identical to the original.  See second attached image for example.  Doing this reduces the file size significantly while seemingly preserving the quality.

What I'm trying to understand, and would appreciate any feedback on, is how that downsampling process actually affects the audio.  Just looking at the spectrums, for example, it looks like the process just discards the empty/unused portion of the data and leaves the audio behind.  I'm sure it's not simply lopping off the upper half of the audio data like the spectrum implies, though - the audio data must be getting processed as well, but to what extent?  In cases like this, how much am I actually affecting or reducing audio quality?

Thanks.

Re: Help understanding downsampling of "fake" high-res files

Reply #1
Looking at the spectrogram of the 96kHz file, there's practically nothing of significance above ~21kHz. This is still below the Nyquist frequency (which is half of the sample rate, and the highest frequency a given sample rate can reproduce) of a 44.1kHz signal. You can see from the downsampled image that the signal still cuts off at around 21kHz, so you're not losing anything (or at least, anything you can hear anyway). Of course, when you're downsampling, the signal has to be lowpass filtered to fit into the new sample rate; in case of this signal though, what gets lopped off (as you put it) or filtered out is the ~22-48kHz range which is about the upper half of the original 96kHz file, which in this case contains for all practical purposes only silence, and which a 44.1kHz sample rate can't even reproduce.
Here's how SoX actually filters the signal in a 96->44.1 conversion: https://src.infinitewave.ca/?Top=SoX14_VHQ_LP&Bot=SoX144_HQ&Spec=0122 (I don't know which mode, HQ or VHQ SoX defaults to, but in either case, you'll be fine).
There is what looks like a faint sine wave at around 44kHz (and a very faint around 36kHz) in the 96kHz file. I have no idea what purpose that serves (other than perhaps "proving" that the file does contain high frequency signals that justify the 96kHz sample rate), but in any case, you can't hear those anyway.
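
The Nyquist reasoning above can be checked with a bit of arithmetic. Here is a minimal Python sketch; the 21 kHz figure is just the approximate cutoff read off the spectrogram, the function names are made up for illustration, and a real resampler's filter starts rolling off slightly below Nyquist, which this ignores:

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


def fits_after_resampling(highest_content_hz: float, target_rate_hz: float) -> bool:
    """True if content up to highest_content_hz lies below the target's Nyquist frequency."""
    return highest_content_hz < nyquist_hz(target_rate_hz)


print(nyquist_hz(96_000))                      # Nyquist of the 96 kHz container
print(nyquist_hz(44_100))                      # Nyquist of the 44.1 kHz target
print(fits_after_resampling(21_000, 44_100))   # content seen in the "fake" hi-res file
print(fits_after_resampling(40_000, 44_100))   # genuine ultrasonic content would not fit
```

So for a file whose content stops around 21 kHz, the band removed by the lowpass filter is the part that was empty anyway.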

Re: Help understanding downsampling of "fake" high-res files

Reply #2
(These are all stereo, it isn't just that you picked a stereo for the illustration?)

There are some tests around. SoX resampler has a good reputation and is fast.
https://hydrogenaud.io/index.php/topic,118566.0.html refers to https://audiophilesoft.ru/publ/my/foo_resamplers/11-1-0-34 where you see that the difference between the steep filters and the smoother ones (in SoX: between 99 percent and 95 percent passband) amounts to covering frequencies somewhere between a quarter-tone and a half-tone higher.
(Puzzlingly to an ignorant like myself, these dive before 22 kHz even if target is set at 48 kHz. If you want a "bigger safer distance to audible", 48 kHz should be playable on anything as well.)

Curious though:
What are the file size differences pre and post resampling? New FLAC 1.4.0 makes for significant improvements of upsampled material. For WavPack, I am touting the -hx4 (or -hhx4) switch for high resolution material, while TAK ... just go for -p4m, but it doesn't support 7.1 surround. On 5.1 it rules supreme.

Re: Help understanding downsampling of "fake" high-res files

Reply #3
Neither upsampling nor downsampling is lossless - you won't get bit-to-bit the same file after upsampling and then downsampling back (except for some very special cases and resampling algorithms which are not normally used). Unless the resampler is faulty, the difference is far beyond audibility though.

Re: Help understanding downsampling of "fake" high-res files

Reply #4
Puzzlingly to an ignorant like myself, these dive before 22 kHz even if target is set at 48 kHz.
They are testing upsampling from 44.1k, so that's expected, I think.

Re: Help understanding downsampling of "fake" high-res files

Reply #5
and resampling algorithms which are not normally used).
Microsoft Windows' resampler might not be the best: http://archimago.blogspot.com/2015/11/measurements-windows-10-audio-stack.html
The bottom graph indicates -2.5 dB at 15 kHz. I wouldn't be surprised if some (younger) ears could hear it.
(And this is when upsampling - isn't it completely unnecessary to make artefacts like that then?)

@danadam Thanks. Maybe I linked to less relevant tests then, I might imagine an "improvement in relevance" over comparing upsampling to downsampling.


Re: Help understanding downsampling of "fake" high-res files

Reply #7
What are the file size differences pre and post resampling?

I know you didn't ask me :) but I did a test with a release I have: https://microcosmos.bandcamp.com/album/microcosmos-chill-out-vol-5 (files are 96kHz/24-bit but contain nothing above ~22kHz)

Original total file size: 2,011,692,282
flac -8 @ 96/24: 929,280,630 (flac v1.4.1: https://github.com/xiph/flac/releases/tag/1.4.1)
Resampled to 44.1/24 with r8brain free, then flac -8: 695,240,243 (r8brain free 2.9: https://www.voxengo.com/product/r8brain/)
Same as above, 48/24: 740,150,372
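
To put those totals in relative terms, here is a small Python sketch that only divides the numbers quoted above; the labels and the helper name are made up for illustration:

```python
# Totals exactly as quoted in the post above.
original_total = 2_011_692_282
flac_sizes = {
    "96/24, flac -8": 929_280_630,
    "44.1/24 (r8brain), flac -8": 695_240_243,
    "48/24 (r8brain), flac -8": 740_150_372,
}


def percent_of(size: int, reference: int) -> float:
    """Size as a percentage of a reference size, rounded to one decimal."""
    return round(100.0 * size / reference, 1)


flac_96 = flac_sizes["96/24, flac -8"]
for label, size in flac_sizes.items():
    print(f"{label}: {percent_of(size, original_total)}% of the original, "
          f"{percent_of(size, flac_96)}% of the 96/24 FLAC")
```

For this release, resampling to 44.1 kHz takes roughly a quarter off the 96/24 FLAC size, which is much less than halving the sample rate might suggest, presumably because FLAC already compresses the empty upper band well.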

Re: Help understanding downsampling of "fake" high-res files

Reply #8
Thanks for the replies!  Replying to a few specific points below, but the gist of what I'm getting is that downsampling with sox as I've shown only discards effectively unused data and the way it resamples the actual audio data should not negatively affect how it sounds on playback.  Is that an accurate summary?

There is what looks like a faint sine wave at around 44kHz (and a very faint around 36kHz) in the 96kHz file. I have no idea what purpose that serves (other than perhaps "proving" that the file does contain high frequency signals that justify the 96kHz sample rate), but in any case, you can't hear those anyway.

Yeah, I've seen a lot of these "fake" high-res tracks that have a line like that in the upper range of the graph.  I'm not sure what that's for, either.  Figured it was some kind of artifact from the conversion process.

(These are all stereo, it isn't just that you picked a stereo for the illustration?)

Yes, most of the tracks like this (in my collection) are stereo.  I haven't looked as closely at the surround tracks.

What are the file size differences pre and post resampling? New FLAC 1.4.0 makes for significant improvements of upsampled material.

For one of my albums I get:
original 96/24 wav: 11G
original 96/24 flac: 6.3G
resampled 44.1/24 flac: 3.5G
resampled 44.1/16 flac: 1.9G

So, close to half the size just dropping to 44.1.  That's why I'm so interested in this - lot of space savings, especially when dealing with multiple albums.  But I've read conflicting things on the internet about the effects of downsampling, hence my post here.
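
For comparison, this is what the raw (uncompressed) PCM data rates work out to. A rough Python sketch, assuming 2-channel audio; the function name is made up for illustration, and FLAC sizes will not follow these ratios exactly since they depend on how compressible the content is:

```python
def pcm_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw PCM data rate: samples per second * bytes per sample * channels."""
    return sample_rate_hz * (bit_depth // 8) * channels


base = pcm_bytes_per_second(96_000, 24)
for rate, bits in [(96_000, 24), (44_100, 24), (44_100, 16)]:
    bps = pcm_bytes_per_second(rate, bits)
    print(f"{rate} Hz / {bits}-bit: {bps} bytes/s, {bps / base:.3f} of 96/24")
```

Uncompressed, 44.1/24 is about 46% of 96/24 and 44.1/16 about 31%.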

And thanks for the tip about FLAC 1.4.0!  I'm on 1.3.4 right now and hadn't heard about those new improvements yet.  Excited to check that out.

Neither upsampling nor downsampling is lossless - you won't get bit-to-bit the same file after upsampling and then downsampling back (except for some very special cases and resampling algorithms which are not normally used). Unless the resampler is faulty, the difference is far beyond audibility though.

This is getting to the heart of what I'm asking.  I understand the operation isn't lossless, but as long as it's not destroying / discarding actual audio data like a lossy encoder (which seems to be the case) then it seems like a worthwhile tradeoff.

Re: Help understanding downsampling of "fake" high-res files

Reply #9
Thanks for the replies!  Replying to a few specific points below, but the gist of what I'm getting is that downsampling with sox as I've shown only discards effectively unused data and the way it resamples the actual audio data should not negatively affect how it sounds on playback.  Is that an accurate summary?

In case of these fake 96kHz files, what's effectively discarded is silence. Of course if you have actual 96kHz files with frequency content up to 48kHz, then actual audio data is also filtered out. Whether you can actually hear this has been debated to death, so your best judge is your own ears here :)

Quote
Yeah, I've seen a lot of these "fake" high-res tracks that have a line like that in the upper range of the graph.  I'm not sure what that's for, either.  Figured it was some kind of artifact from the conversion process.

That test I posted in reply #7 doesn't have these high frequency signals, so it varies on a case by case basis.

Quote
And thanks for the tip about FLAC 1.4.0!  I'm on 1.3.4 right now and hadn't heard about those new improvements yet.  Excited to check that out.

FLAC 1.4.1 was just released today: https://github.com/xiph/flac/releases/tag/1.4.1

 

Re: Help understanding downsampling of "fake" high-res files

Reply #10
In case of these fake 96kHz files, what's effectively discarded is silence. Of course if you have actual 96kHz files with frequency content up to 48kHz, then actual audio data is also filtered out. Whether you can actually hear this has been debated to death, so your best judge is your own ears here :)
I think the question concerns whether the filtering process has any side effects that take out more than a clean "remove all above Nyquist".

There is also the issue of whether a 24-bit file is faked up from, say, 16 bits. Indeed for DVD (I think also for BluRay, but then size constraint shouldn't be an issue) one can peel off the bottom bits to save space. But FLAC detects that and compresses "16 bits wrapped in 24-bit file" just as well as 16 ...
... but, if it is upconverted with dither applied, the compressed file becomes much bigger.
Anyway, 16 bits properly dithered is good enough. One should maybe check whether each DVD/BluRay has a peak reasonably close to 0 dB (I mean, anything above digital 0.5 is very reasonably close!), and should it for stupid reasons not be, one can handle that as well.
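
To see what "above digital 0.5" means in dB, here is a minimal Python sketch using the usual 20*log10 conversion for amplitudes, with 1.0 taken as full scale; the function name is made up for illustration:

```python
import math


def dbfs(linear_peak: float) -> float:
    """Convert a linear peak (1.0 = digital full scale) to dB relative to full scale."""
    if linear_peak <= 0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(linear_peak)


print(round(dbfs(1.0), 2))   # full scale
print(round(dbfs(0.5), 2))   # half of full scale
```

A peak of 0.5 sits about 6 dB below full scale, i.e. roughly one bit of headroom left unused.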

Re: Help understanding downsampling of "fake" high-res files

Reply #11
Of course if you have actual 96kHz files with frequency content up to 48kHz, then actual audio data is also filtered out. Whether you can actually hear this has been debated to death, so your best judge is your own ears here :)

Yeah, I've seen a lot of those debates.  I'm generally of the mind that if the audio is there, preserve it, even if I can't hear it.  But wasting so much space on silence is what's really galling to me.

BTW, I just checked out flac 1.4.0 (1.4.1 isn't in my package manager's repo yet, and looks like it's just minor bug fixes anyway) and did a quick comparison against 1.3.4.  This particular set of tracks data compresses down to 6.2 GB with 1.3.4 and 5.9 GB with 1.4.0.  Not as impressive as the examples in that post you linked to, but hey, saving 300 MB for free?  I'll take it.  Seems like some great improvements in this release.

Re: Help understanding downsampling of "fake" high-res files

Reply #12
There is also the issue of whether a 24-bit file is faked up from, say, 16 bits. Indeed for DVD (I think also for BluRay, but then size constraint shouldn't be an issue) one can peel off the bottom bits to save space. But FLAC detects that and compresses "16 bits wrapped in 24-bit file" just as well as 16 ...
... but, if it is upconverted with dither applied, the compressed file becomes much bigger.
Anyway, 16 bits properly dithered is good enough. One should maybe check whether each DVD/BluRay has a peak reasonably close to 0 dB (I mean, anything above digital 0.5 is very reasonably close!), and should it for stupid reasons not be, one can handle that as well.

This is another topic I've been looking into, but not sure how to tackle it.  With sample rate, I can use a tool like spek to very clearly tell when a 44.1 or 48 kHz track has been upsampled.  But I can't find any obvious way to do the same for 16 -> 24-bit conversions.  If it's been converted from 16- to 24-bit, then there shouldn't be any real loss converting back to 16-bit, but I don't want to do that if it's a native 24-bit track.  I kind of assume that if the track was originally 44.1 kHz then it was also 16-bit, but I don't know how true that generally is.

Re: Help understanding downsampling of "fake" high-res files

Reply #13
This is getting to the heart of what I'm asking.  I understand the operation isn't lossless, but as long as it's not destroying / discarding actual audio data like a lossy encoder (which seems to be the case) then it seems like a worthwhile tradeoff.
Technically, it is destroying (altering) audio data. But with a high quality resampler like SoX, the "destruction" is outside of what's considered humanly audible. So I wouldn't worry about it.

But I can't find any obvious way to do the same for 16 -> 24-bit conversions.
Check this: https://hydrogenaud.io/index.php/topic,114816.msg1010860.html#msg1010860 (and the whole thread for the discussion)
And this: https://www.stillwellaudio.com/plugins/bitter/

Quote
I kind of assume that if the track was originally 44.1 kHz then it was also 16-bit, but I don't know how true that generally is.
A lot of music is produced at 44.1k, but the bit depth during production is always higher than 16bit. So if it's released on Blu-ray, it's possible that it's genuine 24bit audio.

Re: Help understanding downsampling of "fake" high-res files

Reply #14
Don't over-think this... If you have a reason for downsampling, go ahead and do it. If you don't have a good reason, don't resample unnecessarily.

16/44.1 is generally better than human hearing so it's usually good enough for anything. If you want small files, a good quality MP3 or AAC file can often be transparent, or you might have to listen very carefully and compare to hear a difference.

Quote
With sample rate, I can use a tool like spek to very clearly tell when a 44.1 or 48 kHz track has been upsampled.
Or maybe the ultrasonics (and maybe subsonics) were filtered out, either intentionally or unintentionally, during mixing or mastering, without down-sampling. Some mastering engineers like to use analog equipment and some analog equipment doesn't go much beyond the audio range. Some audio engineers just don't want inaudible "junk" in the file.

And, most natural sounds don't have much ultrasonic energy to begin with.    And, most microphones don't go into the ultrasonic range, and they often start rolling-off in the upper-audio range.

Quote
But I can't find any obvious way to do the same for 16 -> 24-bit conversions.
There are tools that try to find the "true" bit depth and it's possible to upsample "carefully" in a way that fills the 8 least significant bits with zeros.   But if you also

It's also possible to fake both if you really want to....  Almost any change (including a slight volume change) will fill the extra bits with non-zero data.    And there are effects that add high-frequency information.
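
The zero-padding case described above is easy to check for. Here is a minimal Python sketch, not what any particular tool does: it only counts low-order bits that are zero in every sample, so it says nothing about files that were dithered or otherwise processed after conversion. The function name and the sample values are made up for illustration:

```python
def effective_bit_depth(samples: list[int], container_bits: int = 24) -> int:
    """Container bits minus the number of low-order bits that are zero in every sample."""
    mask = (1 << container_bits) - 1
    combined = 0
    for s in samples:
        combined |= s & mask  # masking makes negative values behave like two's complement
    if combined == 0:
        return 0  # pure digital silence: nothing to measure
    trailing_zeros = (combined & -combined).bit_length() - 1
    return container_bits - trailing_zeros


# Hypothetical 16-bit sample values padded into a 24-bit container (shifted left by 8).
padded = [s << 8 for s in (1200, -3051, 32767, -32768, 7)]
print(effective_bit_depth(padded))   # 16

# A slight volume change (about -0.1 dB), rounded back to integers, fills the low bits.
gain = 10 ** (-0.1 / 20)
scaled = [round(s * gain) for s in padded]
print(effective_bit_depth(scaled))   # 24
```

This is why a result of "24 bits in use" doesn't prove the source was genuinely 24-bit, while "16 bits in use" is fairly strong evidence of padding.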

The "pro studio standard" has been 24/96 for a long time so there should be very few up-sampled commercial Blu-Ray discs. Even recordings that were digitized from analog are probably digitized at 24/96, even if the recording will eventually be released on CD or MP3. 44.1kHz is not standard for video. Most DVDs are 48kHz and Blu-Rays are 48kHz or higher.

Dolby Digital and DTS on DVDs is lossy compression and it doesn't have a "bit depth".   LPCM stereo on DVDs is usually 16/48.     The advanced Blu-Ray formats are lossless up to 24/192.

Re: Help understanding downsampling of "fake" high-res files

Reply #15
The following attitude might not suit you, but think over it:
Lossless is lossless and everything else is lossy.

Meaning:
* So you have the physical disc, and since you are willing to tamper with the files, I assume that it means you are happy with the physical disc as "archival/backup original" in case your conversion should not be good enough (I mean, user error could mess up even when the tools are good enough).
* If you want to convert like this, you are doing lossy conversion. If you are willing to do lossy conversion, then a lossy format should be on the table.

... now whenever I downconvert (or transcode lossy->lossy) I make sure that it is duly flagged, even if I should happen to delete a tag. One way of doing that, is using something you haven't found in the wild. Like possibly Vorbis at q9 or q10 (not applicable to me, some artist released an album that way once ...) - or Musepack, if you have playback support for it (have you seen an .mpc in the wild?)

Re: Help understanding downsampling of "fake" high-res files

Reply #16
Something like personal recordings from mic/line-in of MD walkmans before the Net MD era.

Even with a compatible MD player that supports USB ripping and without copy protection (MZ-RH1), the best format one can get from these recordings is 16/44 PCM decoded from 292kbps ATRAC, but not the original ATRAC data.

https://hydrogenaud.io/index.php?topic=98039
https://www.minidisc.org/NetMD_faq.html

I bought an MZ-R3 in 1996 before knowing what is "lossy", and wondered why MD recorders were cheaper than DAT recorders. I was lucky enough to borrow an MZ-RH1 about 10 years ago so that I could rip to PCM via USB without paying anything.