xiphmont’s ‘There is no point to distributing music in 24 bit/192 kHz’

Re: xiphmont’s ‘There is no point to distributing music in 24 bit/192 kHz’

Reply #125 – 2022-01-06 22:11:24

That 768 Ray Charles track is available for free ... if a GB download is free where you are. So I picked it up for curiosity. It WavPacks to around 70 percent, which is on par with a downsample to 48/24.

This stupid file size takes some CPU to process. Monkey's Insane not making it to 2x realtime decoding - how's that for a trip down memory lane, folks? ;-)
Actually, this track fools Monkey's Insane into making 1.7 percent larger files than Monkey's Extra High. Size order Extra high < High < Normal < Insane < Fast. (Since I was at it: Running WavPack at -hx4 brought the file sizes down to below Monkey's Extra high. Half realtime encoding, saves power if you want to listen to it a few times. OptimFrog wins on size, even at default setting.)

Quote from: bennetng on 2022-01-06 16:14:29

yes, flac can no longer handle these sample rates, and no usable 32-bit encoder. WavPack has a chance to dominate the Hi-Res market now

OptimFrog needs that all-important DSD mode to compete!

(Say what you will about DSD; some of the SACD masters are different than their CDDA counterparts - likely in order to fool users into thinking that better sound was due to format. And, I like WavPack.)

Reply #126 – 2022-01-06 22:13:18

Quote from: guruboolez on 2022-01-06 15:48:25

So if I understand correctly:
— sound is digitally recorded at DXD "format" (PCM-352,8 KHz) [A-D encoding]
— mixing and mastering is then recorded on analog tape [D-A]
— then it's converted again in the digital domain… at twice the original sampling rate

My pet bat says the Studer's high range is "Not bad for a human", and then does her best Alien screech impersonation. Just kidding

Quote

For the sake of curiosity I bought a 32 bit DXD triple album:

Did they use float or integer?

Reply #127 – 2022-01-07 06:46:44

32 bit floating for this recording.
Thanks for the lossless comparison !

Reply #128 – 2022-01-07 09:46:47

Quote from: guruboolez on 2022-01-07 06:46:44

32 bit floating for this recording.

Float protects against clipping, so it makes sense to use in processing. (Cf. the topic; Monty took care to make the point that the considerations applied to formats to deliver to end-users.)

Quote from: guruboolez on 2022-01-07 06:46:44

Thanks for the lossless comparison !

Since you mentioned .zip; the 768 Gomes track was delivered as .wav, but:
Windows send to zip: 94 percent (while NTFS compression saved 38 ppm ...)
7z ultra: 80 percent
Audio compressors: 70 percent

Other vendors would also deliver single files as they are, but start zipping when there is more than one; so not primarily for compression, but for delivering a "folder". Using anything but the lowest-common-denominator compression algorithm will get customer support overworked, I guess.

Reply #129 – 2022-01-07 10:32:41

Quote from: Porcus on 2022-01-06 22:11:24

Monkey's Insane not making it to 2x realtime decoding - how's that for a trip down memory lane, folks? ;-)

Reminds me that FhG WinPlay3 has an option to instruct mp3 to decode for 80486 or Pentium class processors.

If this trend continues, .wav will no longer be usable either. While it is possible to have 64-bit float and multi-MHz .wav files, the maximum legitimate file size is 4 GB, because the RIFF size fields are 32-bit. Individual software can have its own hacked version of .wav, but that will cause compatibility issues with other software. There are uncompressed formats like w64 and caf, but I suppose these formats are not usually supported in standalone players and streamers.
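
To put the 4 GB ceiling in perspective, here is a minimal sketch of how much audio fits in a plain .wav file. It assumes only that the RIFF size fields are unsigned 32-bit and ignores header overhead; the 768 kHz / 32-bit stereo parameters are illustrative assumptions, not taken from any particular release.

```python
# RIFF/WAVE stores chunk sizes in unsigned 32-bit fields, so the payload
# cannot exceed 2**32 - 1 bytes (header overhead is ignored in this sketch).
MAX_RIFF_BYTES = 2**32 - 1


def max_wav_seconds(sample_rate_hz: int, bytes_per_sample: int, channels: int) -> float:
    """Longest duration, in seconds, whose PCM data fits in a plain .wav file."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return MAX_RIFF_BYTES / bytes_per_second


cd = max_wav_seconds(44_100, 2, 2)        # 16-bit stereo CD audio
hires = max_wav_seconds(768_000, 4, 2)    # hypothetical 768 kHz / 32-bit stereo

print(f"CD audio:   {cd / 3600:.1f} hours")
print(f"768 kHz/32: {hires / 60:.1f} minutes")
```

Under these assumptions CD audio fits for several hours, while the 768 kHz / 32-bit case runs out of room in under twelve minutes.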

Reply #130 – 2022-01-07 10:54:13

That godawful site disables the in-page text search feature.

Reply #131 – 2022-01-07 11:00:06

Quote from: kode54 on 2022-01-07 10:54:13

That godawful site disables the in-page text search feature.

Firefox user here. Ctrl-F doesn't work but Edit > Find in Page works.

Reply #132 – 2022-01-08 17:59:54

Quote from: bennetng on 2022-01-06 16:14:29

And yes, flac can no longer handle these sample rates

Well, actually, the FLAC format goes up to 2^20 (1'048'576Hz), but it is not subset and currently not supported by the reference encoder. Adding it is trivial however, and there is a pull request waiting for merge: https://github.com/xiph/flac/pull/219

Reply #133 – 2022-01-08 19:34:36

So there is no point to distribute music in 24 bit/192 kHz because flac supports up to 1048576Hz

The FAQ says floats will not be supported, but would there be 32-bit integer support?

Reply #134 – 2022-01-08 22:09:06

Quote from: bennetng on 2022-01-08 19:34:36

The FAQ says floats will not be supported, but would there be 32-bit integer support?

I would imagine that supporting the less common (for good reason) of the 32-bit formats, will increase the noise to signal ratio in the bug report system.
But has that hypothesis been put to the test? ALAC supports 32-bit integer but not float (well, at least that goes for refalac).

Then on the other hand, if the reference encoder wants to stick to "--lax" being required for non-subset (... and, *checks notes*, not reassign the "reserved" 011 to 32), then one doesn't need to tout new 32-bit support as loudly as the warnings ("non-subset, may not play, --lax required").

Edit: some SVN version of flake could do 32 bits, without that leading to hordes of desperate users asking HA to decode their .flac files

Reply #135 – 2022-01-09 00:08:01

Quote from: Porcus on 2022-01-08 22:09:06

some SVN version of flake could do 32 bits, without that leading to hordes of desperate users asking HA to decode their .flac files

Here is the .exe, attached for the curious. fb2k (which converts all 32-bit integer input to 32-bit float) and ffmpeg can decode the 32-bit integer FLAC it creates if stereo decorrelation is disabled (option -s 0).

Reply #136 – 2022-01-09 15:27:52

Quote from: Rollin on 2022-01-09 00:08:01

ffmpeg can decode 32 bit integer FLAC created by it if stereo decorrelation is disabled (option -s 0).

It probably causes signed integer overflow in the decoder though. This might work, but is undefined behaviour, so it might also stop working at some point.

I didn't want to say this right away (I too think 32-bit int audio is completely nonsensical) but as it is being discussed now anyway....

Here's a very recent (i.e. sent in yesterday) patch for ffmpeg to create such 32 bit files backwards compatible with libFLAC (not the flac command line utility though) from 1.2.1 onwards and ffmpeg from May 2015 onwards. It is specifically crafted to not cause overflow issues. If it cannot find a work-around for a certain subframe, it falls back to using a verbatim (i.e. uncompressed) subframe.

I haven't checked, but I don't think flake did the same. This seems like a safer way.

Still, I'd like to stress this again: I think it is dumb to use 32-bit int for audio, but if it happens, I prefer a backwards-compatible approach.

Reply #137 – 2022-01-10 02:04:57

I'm surprised that stereo decorrelation can't just take advantage of largely different values having a small difference if you allow numbers to wrap within the target bit size. But then you'd need special versions of the math functions that are either supposed to saturate or wrap, depending on the purpose.
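
The wrap-around idea can be sketched as follows. This is only an illustration that modular arithmetic makes the difference invertible within the sample width; it is not how FLAC or any existing decoder works, and the function names are hypothetical. Python integers are unbounded, so the masking below emulates a fixed 32-bit register.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(value: int) -> int:
    """Interpret the low BITS bits of value as a two's-complement integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def encode_side(left: int, right: int) -> int:
    """Left minus right, deliberately wrapped to BITS bits."""
    return to_signed(left - right)


def decode_left(side: int, right: int) -> int:
    """Undo encode_side, again wrapping to BITS bits."""
    return to_signed(side + right)


# Worst case: the true difference is 2**32 - 1, which does not fit in 32 bits.
left, right = 2**31 - 1, -(2**31)
side = encode_side(left, right)

print(side)                              # the wrapped difference is small
print(decode_left(side, right) == left)  # yet the original is recovered
```

The catch is that every addition and subtraction in the decoder then has to be known to wrap rather than saturate or overflow.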

Reply #138 – 2022-01-10 06:35:38

Quote from: kode54 on 2022-01-10 02:04:57

But then you'd need special versions of the math functions that are either supposed to saturate or wrap

Yes, and as those functions aren't available in hardware on many platforms (especially embedded platforms in for example a receiver or a portable audio player) these functions would have to be implemented in software. I think you'd be looking at a 5x slower decoder in that case, for very, very little gain.

.... but this has nothing to do with why this 32-bit ffmpeg patch or the flake binary can't use stereo decorrelation. That is simply because libFLAC rejects 33-bit subframes outright, and because creating a work-around on frame level instead of at subframe level is a lot more complicated.
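
Where the 33 bits come from can be shown with a little arithmetic. The sketch below assumes a plain left-minus-right difference channel and signed two's-complement samples; it does not reproduce FLAC's actual stereo modes, and the helper names are made up for illustration.

```python
def signed_bits_needed(lowest: int, highest: int) -> int:
    """Smallest two's-complement width that can hold every value in [lowest, highest]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lowest and highest <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


def side_channel_bits(sample_bits: int) -> int:
    """Width needed to store left - right exactly for signed samples of sample_bits."""
    lo = -(1 << (sample_bits - 1))       # most negative sample value
    hi = (1 << (sample_bits - 1)) - 1    # most positive sample value
    return signed_bits_needed(lo - hi, hi - lo)


for width in (16, 24, 32):
    print(width, "->", side_channel_bits(width))
```

A difference channel always needs one bit more than its inputs, so 32-bit input produces a 33-bit subframe.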

Reply #139 – 2022-01-17 13:24:42

So it is better to assume flac won't support any 32-bit format. In fact, both 32-bit formats are useless as distribution formats. 32-bit DACs/ADCs use integer math. That makes perfect sense, because they are used to convert to and from analog, so 1500dB of dynamic range is useless. The DSP involved in these devices, such as interpolation and decimation filters and modulators, is more about precision within a reasonable range than about 1500dB. If I were to propose a 32-bit float format for these kinds of operations, I would rather strip 3 exponent bits and use them as mantissa bits.
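
As a rough check on the 1500dB figure, the sketch below compares the ratio between the largest finite and smallest normal IEEE 754 single-precision values with the range of 24-bit and 32-bit integers. It counts only normal floats (subnormals are ignored) and treats dynamic range as a plain amplitude ratio, which is a simplification.

```python
import math


def db(ratio: float) -> float:
    """Amplitude ratio expressed in decibels."""
    return 20 * math.log10(ratio)


# IEEE 754 single precision: 8 exponent bits, 23 stored mantissa bits.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127
FLOAT32_MIN_NORMAL = 2.0**-126

float32_range_db = db(FLOAT32_MAX / FLOAT32_MIN_NORMAL)
int24_range_db = db(2**24)
int32_range_db = db(2**32)

print(f"float32 (normal range): {float32_range_db:.0f} dB")
print(f"24-bit integer:         {int24_range_db:.1f} dB")
print(f"32-bit integer:         {int32_range_db:.1f} dB")
```

The float result lands a little above 1500 dB, roughly ten times the range of a 24-bit integer expressed in dB.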

On the other hand, some field recorders use floating point recording format as a form of marketing, even provide sample files to impress consumers:
https://www.sounddevices.com/sample-32-bit-float-and-24-bit-fixed-wav-files/

Basically, these things record two copies of the same audio at the same time, but with different analog input levels, then route the ADC outputs to a floating point DSP. When one of the ADCs clips, it seamlessly crossfades to the other ADC. However, it is common sense that the preceding physical and analog chain (mic and preamp) can still clip, and faulty DSP logic (rather than "not enough" bits) can also create glitches when combining different ADCs. Here is a review of a Zoom floating point recorder:
https://www.audiosciencereview.com/forum/index.php?threads/zoom-f6-portable-field-recorder-review.15668/

Here are some RMAA results of several converters:
https://www.audiosciencereview.com/forum/index.php?threads/rmaa-tests-welcome-to-add-others.16332/post-532248

In fact, high quality traditional 24-bit converters have far better results. Also, even if floating point math is involved in combining different ADCs, it still makes much more sense for the recorder to scale the resulting waveform to normal range and save it as 24-bit or below. Just like the old Pro Tools TDM, which is externally 24-bit but internally 56-bit fixed point (48-bit processing with 8 headroom bits), while the recent versions are 32/64-bit float, as are some other DAWs like Reaper.

Reply #140 – 2022-01-17 14:22:12

I suspect that if one forty years ago had discussed a floating-point format for the single purpose of audio, we might have had a 12+4 or something? Or, any votes for 13+3?

Quote from: bennetng on 2022-01-17 13:24:42

If I were to propose a 32-bit float format for these kinds of operations

So one settled for an already-established general purpose format, with some limitations: yes you can get the pain threshold and either a mosquito or a nuclear explosion at the same time, and even all three in the same file - but for the seconds you nuke the Earth, the mosquito will be lost.
As audio might very well live happily with a volume control that cannot compress air into metal, you might argue that it wasn't an even trade-off between mantissa and exponent, but ... ...

... but unless a good nuking makes you desperate for that mosquito, then I'd say that it is just fine to use an already established general purpose 32-bit float format, a standard which has been hardware-supported since long before the famous Pentium bug. At least when it losslessly contains your 24-bit signal.
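
The "losslessly contains your 24-bit signal" part can be checked with a small round trip. The sketch assumes the common convention of scaling 24-bit integers by 2**23 into the range [-1.0, 1.0); the helper names are hypothetical. Single precision carries a 24-bit significand, which is why this works.

```python
import struct


def roundtrip_float32(x: float) -> float:
    """Store x as IEEE 754 single precision and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def int24_survives(sample: int) -> bool:
    """True if a 24-bit sample comes back unchanged after a float32 round trip."""
    scaled = sample / 2**23          # power-of-two scaling is exact
    return roundtrip_float32(scaled) * 2**23 == sample


extremes = [-(2**23), -(2**23) + 1, -1, 0, 1, 2**23 - 1]
print(all(int24_survives(s) for s in extremes))

# A value that needs 26 significant bits does not survive the same trip.
too_wide = 2**25 + 1
print(roundtrip_float32(too_wide / 2**25) * 2**25 == too_wide)
```

The first check passes and the second fails, which is the whole argument in miniature: float32 is a safe container for 24-bit audio, but not for every 32-bit integer.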

Quote from: bennetng on 2022-01-17 13:24:42

On the other hand, some field recorders use floating point recording format as a form of marketing, even provide sample files to impress consumers:
https://www.sounddevices.com/sample-32-bit-float-and-24-bit-fixed-wav-files/

Doesn't that describe the reason for float? If you think your recording might be 30 dB low, then boost it 30 dB without ever having to worry about clipping.

Reply #141 – 2022-01-17 14:54:00

Quote from: Porcus on 2022-01-17 14:22:12

I suspect that if one forty years ago had discussed a floating-point format for the single purpose of audio

For example, there is also bfloat16, which is optimized for AI, and half-precision float for GPUs: RGBA for colors and XYZW for coordinates, 16 bits for each component. Many older GPUs, for example, only support 32/64-bit float, and the addition of 16-bit floats saves a lot of memory. bfloat16 and half-float are both 16 bits, but bfloat16 has more exponent bits while half-float has more mantissa bits.
https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
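
The exponent versus mantissa trade-off between the two 16-bit formats can be made concrete with a short sketch. The bit layouts (5+10 for IEEE half precision, 8+7 for bfloat16, plus a sign bit each) are the standard ones; the `FloatFormat` class is a hypothetical helper written for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    exponent_bits: int
    mantissa_bits: int   # explicitly stored fraction bits

    @property
    def max_finite(self) -> float:
        """Largest finite value: full mantissa at the largest normal exponent."""
        bias = 2 ** (self.exponent_bits - 1) - 1
        return (2 - 2.0 ** -self.mantissa_bits) * 2.0 ** bias

    @property
    def epsilon(self) -> float:
        """Spacing between 1.0 and the next representable value."""
        return 2.0 ** -self.mantissa_bits


HALF = FloatFormat("IEEE half", exponent_bits=5, mantissa_bits=10)
BFLOAT16 = FloatFormat("bfloat16", exponent_bits=8, mantissa_bits=7)

for fmt in (HALF, BFLOAT16):
    print(f"{fmt.name}: max {fmt.max_finite:.4g}, epsilon {fmt.epsilon:.4g}")
```

Half precision tops out at 65504 with finer steps, while bfloat16 reaches the same range as 32-bit float with much coarser steps.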

Quote

Doesn't that describe the reason for float? If you think your recording might be 30 dB low, then boost it 30 dB without ever having to worry about clipping.

30dB is quite different from 1500dB and can be perfectly handled with integer math. For example, a 32-bit integer with 5 bits of headroom and 27 bits of signal is still plenty. The headroom doesn't need to be exposed to end users: they see 0dBFS, while the DSP adds several bits on top of that. Actually, many products use this approach, including RME:
https://www.google.com/search?q=rme+totalmix+42-bit
Thousands of dB make more sense for mixing hundreds of tracks or software synth/samplers with a lot of complex patches and voices (polyphony), on top of other insert and auxiliary effects. The RMEs have FPGA mixers for basic signal routing task with low latency, but never as complex as a complete DAW.
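
The 5-bit headroom figure follows from the usual rule of about 6 dB per bit. A minimal sketch of that conversion, with hypothetical helper names:

```python
import math


def bits_to_db(bits: int) -> float:
    """Amplitude range covered by the given number of integer bits, in dB."""
    return 20 * math.log10(2 ** bits)


def headroom_bits_for(gain_db: float) -> int:
    """Whole bits of headroom needed to absorb a boost of gain_db without clipping."""
    return math.ceil(gain_db / bits_to_db(1))


print(f"5 bits of headroom: {bits_to_db(5):.1f} dB")
print(f"27 bits of signal:  {bits_to_db(27):.1f} dB")
print(f"bits needed for a 30 dB boost: {headroom_bits_for(30)}")
```

So a 30 dB boost fits in 5 headroom bits, and the remaining 27 bits still cover more range than a 24-bit file.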

Reply #142 – 2022-01-17 16:14:53

Quote from: bennetng on 2022-01-17 14:54:00

30dB is quite different from 1500dB and can be perfectly handled with integer math.

Not if you started out anywhere close to peak-normalized. Which, so it happens, often is the case. No, certainly it does not have to be that way, but it seems to be. Integer formats go to full volume, and more bits -> more at the bottom.
(Someone here - I don't remember who - once championed float for end-user format if only to screw up the loudness war. I'd give that a "fascinating thought experiment!".)

Quote

Actually, many products use this approach, including RME:

Google link led me to https://archiv.rme-audio.de/en/support/techinfo/hdsp_totalmix_hardware.php , which is interesting. One format per operation type?!
The "multiplier" uses 24 bits integer, but now you got a full mighty sixteen bits volume control, that's more like it. ("65563" :-) )

Reply #143 – 2022-01-17 16:53:18

Quote from: Porcus on 2022-01-17 16:14:53

Not if you started out anywhere close to peak-normalized. Which, so it happens, often is the case. No, certainly it does not have to be that way, but it seems to be. Integer formats go to full volume, and more bits -> more at the bottom.

From the perspective of a file format, integer has to pad to the MSB for a consistent level to end users. From the perspective of processing, the DSP designers can have their own implementations. That means, regardless of integer or floating point math, a field recorder only needs to spew out a normalized 24 or 16-bit file, and that's why I said in Reply #139 "both 32-bit formats are useless as distribution formats". In fact, integer math often involves an "accumulator" (the internal bit depth) of 48-64 bits, even in apparently simple hardware like DAC chips. It is just a trade-off between processing speed and memory bandwidth. Floats need processing power to shift bits; integers need more bandwidth so that no shifting math is used. Dither is not always used, but if it is, it is applied only when converting from the accumulator's bit depth to the destination bit depth, not at every intermediate step.
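
Why a wide accumulator matters can be seen in a toy mixer. This is a sketch under stated assumptions, not the implementation of any DAC chip or of TotalMix: gains are taken to be 16-bit fixed-point fractions (a hypothetical layout where 0x10000 would mean unity), and dither is left out so the arithmetic stays deterministic.

```python
GAIN_SHIFT = 16  # gains are fixed-point fractions with 16 fractional bits (assumed layout)


def mix_wide(samples, gains):
    """Accumulate full-precision products, then round once to the output width."""
    acc = sum(s * g for s, g in zip(samples, gains))       # wide accumulator
    return (acc + (1 << (GAIN_SHIFT - 1))) >> GAIN_SHIFT   # single rounding step


def mix_narrow(samples, gains):
    """Truncate every product back to the sample width before summing."""
    return sum((s * g) >> GAIN_SHIFT for s, g in zip(samples, gains))


samples = [3, 3, 3, 3]     # four very quiet input samples
gains = [0x4000] * 4       # each input mixed at a gain of 0.25

print(mix_wide(samples, gains))    # the exact sum 4 * 0.75 = 3 is preserved
print(mix_narrow(samples, gains))  # every partial product truncates to 0
```

Reducing the word length once, at the end, keeps low-level detail that is lost when each intermediate result is cut back to the output width.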

Quote

Google link led me to https://archiv.rme-audio.de/en/support/techinfo/hdsp_totalmix_hardware.php , which is interesting. One format per operation type?!
The "multiplier" uses 24 bits integer, but now you got a full mighty sixteen bits volume control, that's more like it. ("65563" :-) )

Considering that these products either output to analog or SPDIF/AES3/ADAT (24-bit integer) or, in reverse, record from these sources, more precision means nothing other than added cost or fewer simultaneous real-time operations.

Reply #144 – 2022-01-18 11:35:40

Download the pdf files and see how poor the results are, even after the firmware update. They really don't justify the use of a 32-bit float file format, as users ultimately need to scale down the misconfigured float file to see the full waveform and listen to it anyway. Bad DSP logic and/or a broken analog stage could have led to these results.
https://www.audiosciencereview.com/forum/index.php?threads/zoom-f6-portable-field-recorder-review.15668/post-786854

Attached are RMAA results of my 12-bit version of UA-law for comparison; the Zoom's 32-bit float 192kHz file is not really better than the 12-bit 96kHz version of UA-law. Notice that when the signal is stronger, UA-law's noise floor and noise shaping are also stronger, and vice versa. The internal math is not 12-bit of course; the programming language used simply doesn't have such data types. Even 24-bit is created by combining 3 bytes into a 32-bit integer.
https://hydrogenaud.io/index.php?topic=121181.msg1005031#msg1005031
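
Combining 3 bytes into a wider integer might look like the sketch below. It assumes little-endian signed 24-bit PCM, as stored in .wav files; the function names are hypothetical and this is not the code used for the UA-law experiment.

```python
def int24_from_bytes(b: bytes) -> int:
    """Decode one little-endian signed 24-bit PCM sample from three bytes."""
    if len(b) != 3:
        raise ValueError("expected exactly 3 bytes")
    value = b[0] | (b[1] << 8) | (b[2] << 16)
    # Sign-extend: if bit 23 is set, the sample is negative.
    return value - (1 << 24) if value & 0x800000 else value


def int24_to_bytes(sample: int) -> bytes:
    """Encode a signed 24-bit sample back into three little-endian bytes."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample out of 24-bit range")
    return (sample & 0xFFFFFF).to_bytes(3, "little")


print(int24_from_bytes(b"\xff\xff\x7f"))   # largest positive sample
print(int24_from_bytes(b"\x00\x00\x80"))   # most negative sample
print(int24_from_bytes(int24_to_bytes(-12345)))
```

The same pattern works for a hypothetical 12-bit type: keep the samples in an ordinary wide integer and enforce the range yourself.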

Actually in ASR, MC, RME's boss, pointed me to that old TotalMix article, while I and other members were speculating whether TotalMix FX (FX = DSP effects including reverb, EQ etc) is completely done on the FPGA or not. We were talking about Chord's poor use of FPGA processing power on the M-Scaler, a very expensive hardware resampler.
https://www.audiosciencereview.com/forum/index.php?threads/chord-hugo-m-scaler-stereophile-review-measurements-also.11868/post-658146

Scroll down to see MC's replies, as well as the next pages.
https://www.audiosciencereview.com/forum/index.php?threads/chord-hugo-m-scaler-stereophile-review-measurements-also.11868/post-677200

Reply #145 – 2022-01-18 22:11:37

Off topic alert, but:
Who the f**c is the apparent forum joke Rob Watts? The musician? Has he made claims about how his ears need higher resolution audio than Neil Young's or something? Released a sad song about "My dog left me out of envy for my hearing and he took my wife with him"?

Reply #146 – 2022-01-19 06:13:46

Rob Watts = Chord's boss.

Reply #147 – 2022-01-19 09:09:39

If I'm paying for something, I'll take as many bits as possible please! I don't care about a race to the bottom, or calculating the exact frequency that I can no longer hear - if this music was produced in 192kHz @ 32bit, I'll take that please.

Reply #148 – 2022-01-19 10:26:24

https://hydrogenaud.io/index.php?topic=108864.msg948272#msg948272
If I were the one selling music, I'd be extremely happy to have these kinds of customers.

http://melancholyaudio.blog.fc2.com/blog-entry-17.html
Google translate:

Quote

Next, let's take a look at the graph with DSEE HX. The result is ... Oh, the high frequencies are higher than I expected. What surprised me most was that it did not drop at all around 20kHz to 22kHz and was decaying naturally.

What the industry needs is more of these kinds of plugins to fool the customers.

Reply #149 – 2022-01-19 11:45:53

Quote from: NateHigs on 2022-01-19 09:09:39

If I'm paying for something, I'll take as many bits as possible please! I don't care about a race to the bottom, or calculating the exact frequency that I can no longer hear - if this music was produced in 192kHz @ 32bit, I'll take that please.

If what I am offered is a file straight from the artist's DAW - the artist likely not even taking note that it is a different kind of .wav than in the stores - then yes please, float or not.
Not because of the audio quality, but because it is closer to the artist's hand. Just like if a painter releases a litho series and there is a video with "and this time I finally learned to do the damn printing by hand without help" - it is not because it makes the printing anything more accurate. Heck, when I realized that WavPack handles all the quirks of 32-bit formats and can restore the original .wav bit by bit, I went back to the .wavs and re-WavPack'd them, using the WavPack exe rather than fb2k.

Especially if the .wav is free, there is less incentive to fake it. (I have a hunch that 32-bit .wavs tend to disappear from Soundcloud when fans ask why they cannot play them.)

One example still up on Soundcloud: https://soundcloud.com/termo-records/sets/the-opium-cartel . One of the label owners' own projects, with Tim Bowness of No-Man. The last file is as weird as 44.1 kHz/32-bit. And then there's a nice little Blue Öyster Cult cover.