HydrogenAudio

Lossless Audio Compression => WavPack => Topic started by: bennetng on 2023-05-10 18:19:28

Title: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-10 18:19:28
Hello @bryant

https://www.soundliaison.com/index.php/6-compare-formats

The file:
A Fool For You - DXD 352kHz-32bit
Can be losslessly saved as float, probably because the DAW used to render the file uses float as internal format. By using x4 and x6 I got these results with non-audio data stripped:
Code:
   Length Name
   ------ ----
642175416 a-fool-for-you-carmen-gomes-inc-dxd352-32.wav
341974990 float x4.wv
340610026 float x6.wv
396557238 int x4.wv
395185830 int x6.wv
Is it possible to take advantage of this during encoding, but keeping the format as fixed point during decoding?
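As a rough check on what is at stake, the sizes in the listing above can be turned into compression ratios. This is only arithmetic on the posted numbers; the helper name `ratio` is made up for illustration.

```python
# Sizes in bytes, copied from the listing above.
ORIGINAL = 642175416
SIZES = {
    "float x4": 341974990,
    "float x6": 340610026,
    "int x4": 396557238,
    "int x6": 395185830,
}

def ratio(compressed, original=ORIGINAL):
    """Compressed size as a fraction of the uncompressed WAV."""
    return compressed / original

for name, size in SIZES.items():
    print(f"{name}: {ratio(size):.2%} of original")

# Gap between the integer and float encodes at the same setting,
# expressed as a fraction of the original file size.
gap_x6 = ratio(SIZES["int x6"]) - ratio(SIZES["float x6"])
print(f"x6 gap: {gap_x6:.2%} of original")
```

At x6 the float encode comes out smaller by roughly 8.5% of the original file size.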
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-05-11 00:12:55
Interesting find! Yeah, your analysis makes sense too. The 32-bit integer code in WavPack looks for redundancy in the LSBs, but only constant redundancy sample-to-sample. In this case the number of zeroed LSBs would shift with the sample’s magnitude.
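To illustrate the point about zeroed LSBs, here is a minimal sketch (not WavPack's actual code) that compares the zero LSBs shared by a whole block with the zero LSBs of each sample. The function names and the sample values are hypothetical; the only assumption is that a 32-bit float carries 24 significant bits, so a float-derived integer spans at most 24 bits from its highest to its lowest set bit.

```python
def trailing_zeros(x, width=32):
    """Number of zero bits below the lowest set bit (width if x == 0)."""
    return (x & -x).bit_length() - 1 if x else width

def common_zero_lsbs(block):
    """Zero LSBs shared by every sample in the block (constant redundancy)."""
    combined = 0
    for x in block:
        combined |= x
    return trailing_zeros(combined)

def significant_bits(x):
    """Span from the highest to the lowest set bit of |x|."""
    mag = abs(x)
    return mag.bit_length() - trailing_zeros(mag) if mag else 0

# Hypothetical float-derived samples: one loud, one quiet.
block = [0x7FFFFF80, 0x00000123]
print("shared zero LSBs:", common_zero_lsbs(block))
for x in block:
    print(hex(x), "zero LSBs:", trailing_zeros(x),
          "significant bits:", significant_bits(x))
```

The loud sample has spare zero bits at the bottom, the quiet one has none, so a detector that only looks for zeros common to every sample finds nothing to remove.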

I wonder how many 32-bit files, which are already kind of rare compared to float, would fall into this category.

This would be pretty easy to implement. The only issue is there would be no way to make it backward compatible, which is a big drawback for me, especially considering how long it’s been since I made a decoder-breaking change.

Anyway, thanks for finding this and letting me know, and I’ll give it some thought.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-11 05:41:01
Thanks. I was thinking about WavPack 4 compatibility too. In this case, how about offering a command-line option to let the user explicitly convert to float, while keeping all non-audio data intact?

I don't expect this could be done when using a pipe: the encoder would need to ensure the whole file can be converted without loss, and when that is not possible, the process should pause or quit and notify the user. This would still be much more convenient than requiring users to manually check for bit-perfectness, convert the file, and worry about whether non-audio data is being altered.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-11 08:57:26
how about offering a command-line option to let the user explicitly convert to float, while keeping all non-audio data intact?
I was kinda thinking one that kills the non-audio data rather than preserving it. Preserving non-audio is for getting the file back bit by bit; removing non-audio (with -r) is for those who don't wish that; this conversion is "even more severe". But then OTOH: does --pre-quantize keep headers? That is even lossy.
The opposite way could be interesting too. I have found in the wild something that apparently was a 16 bit signal opened in some application and saved as 32-bit float (no other processing, no dither no nothing).

This would still be much more convenient than requiring users to manually check for bit-perfectness, convert the file, and worry about whether non-audio data is being altered.

Hadn't it been for the compression gains, that would have been an idea for wvunpack. It can already do some source-format-override (say with --wav), and it is a more natural workflow to keep the source until you know that you want to change it.
Hadn't it been for the compression gains, yes.

Musing aloud: a "-R" (letter selected for including -r functionality) that takes a numerical argument.
Either
-R 0 = do nothing. (Switch off a previously given -R)
-R 1 = -r
-R 2 = if file is integer, convert it to float if that is lossless (will not be reversible by WavPack 5 and below)
[-R 3: 2&1]
4: if file is float, convert it to integer if that is lossless
8: this bit controls peak-normalization.
16: go ahead do whatever that improves compression

Or a different scheme, keeping "-r" functionality out of it:
-R 0: do nothing.
-R 1 to -R 3: prefer WAVE type 1 to 3 format, amend headers accordingly (type 3 is float and only float, right?) and something analogous for AIFF, with 3 being float. WARNING: will not be reversible by WavPack 5 and below
add 4 or 8: peak-normalize / brute-force "prefer smallest file".

I've also mistaken --normalize-floats for being peak-normalization, so ... maybe suggest something that does precisely that  O:)
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-11 11:23:06
how about offering a command-line option to let the user explicitly convert to float, while keeping all non-audio data intact?
I was kinda thinking one that kills the non-audio data rather than preserving it. Preserving non-audio is for getting the file back bit by bit; removing non-audio (with -r) is for those who don't wish that; this conversion is "even more severe". But then OTOH: does --pre-quantize keep headers? That is even lossy.
The opposite way could be interesting too. I have found in the wild something that apparently was a 16 bit signal opened in some application and saved as 32-bit float (no other processing, no dither no nothing).
Anything <= 24-bit should be pretty easy to check and I am keeping these files in my own projects as well: the oldish Audition I use opens float files much faster than 24-bit.

At this point perhaps WavPack can detect uncompressed μ-law and A-law too, which are basically 8-bit floats without infinities and NaNs.

Keeping non-audio data or not of course is a case-by-case and user specific choice. It would be disastrous if for example, DAW projects lose all loop points and regions when importing samples.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-11 13:06:24
At this point perhaps WavPack can detect uncompressed μ-law and A-law too, which are basically 8-bit floats without infinities and NaNs.

Which brings me to a question:
Lossless compressors reject µ-law and A-law in WAVE/AIFF (well at one point there was a compressor which by mistake failed to weed them out), presumably because they won't come out right if they are played back without the expansion. I use "expansion" for "reverse dynamic compression", the second step of "companding" ...
In principle, they could have been compressed. Equipping the file with a flag saying "do not play the stream, read the WAVE header and play as you would play the WAVE", kicking that decoding further down the road, rather than passing "unexpanded" and hence wrong audio down the playback chain.
Question 1: Is there any codec that can do this?
Uneducated guess is that the answer is negative. And:
Question 2: Do I guess correctly that this would be hard to retro-fit into a format, because existing implementations would indeed pass the stream out and play it without the expansion?
Which brings me to:
Question 3: Is it easier for <codec/format X which unpacks to bit-exactly the source file> to implement a flag that prevents playback (but not decoding to file)? If you cannot enforce correct playback, can you prevent playback?

Part of this came out of how @bryant explained that - if I understood it correctly, and that is a substantial reservation - AIFF allows non-integer sampling rate, but WavPack will have to use integer upon playback (and well, what sound card handles float there anyway) even if wvunpacking gets the right thing back. Potentially, if a WavPack file could assign "playback" to be done with a sampling rate of "0" (meaning: wait infinitely long for the next sample, so if you are smart then do not even try!) then the file would be unplayable but decodable.

(What is the use of an unplayable file? Storing it in a checksummed format that can detect corruption, and with tags. Any compression gain would be a bonus.)


Although the market for µ-law or A-law compression (hm, could *ADPCM in WAVE be handled that way too?) might be quite meager to say the least, the idea got me curious. Since memory does not always serve me right, there is even a risk that I might have pestered bryant with the question on some occasion already.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-11 17:39:25
Without getting too complicated, the easiest thing to do regarding the original topic is to encode the input to a WavPack 4 compliant float file and check for bit-perfectness, preserving non-audio data by default unless told to strip it. This should be extremely easy to do given that both formats are 32-bit, so chunk sizes and such don't need to be changed.

The only drawback I can think of is some devices (e.g. streamers, DAPs) don't support float. A reminder on the help file would be enough.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-11 18:02:44
Yeah, if user wants a functionality to convert if that is lossless - kinda trivial but would require some temporary files: Maybe you don't want to wait for a -hhx6 only to find out that sorry, lossy, deleting - doing that to keep filename.f.wv and filename.s.wv (for float and signed) should be "optional" then?
Meaning, you need a temporary float file to check for losslessness, either uncompressed or a temporary encode with -f?

to a WavPack 4 compliant float file
As far as I understand, the WavPack 5 file format is "WavPack 4 compliant" in the sense that more primitive executables decode them just fine.
(At least as long as you stay clear of features that WavPack 4 couldn't even handle, like huge channel count.)
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-11 18:23:55
For example, Bryant offers several Cool Edit/Audition plugins, from the oldest Syntrillium ones to the latest Adobe CC ones. The plugin that I am using should be WavPack 4 based; it can decode WavPack 5 files and WavPack DSD, but is only capable of decoding to PCM.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-11 20:52:16
Yeah, if user wants a functionality to convert if that is lossless - kinda trivial but would require some temporary files: Maybe you don't want to wait for a -hhx6 only to find out that sorry, lossy, deleting - doing that to keep filename.f.wv and filename.s.wv (for float and signed) should be "optional" then?
Meaning, you need a temporary float file to check for losslessness, either uncompressed or a temporary encode with -f?
This shouldn't require big temp files or a lot of RAM if done within the native wavpack executable without pipe. It just needs to run a decoding pass (if the input is a wavpack file) and verify the possibility of lossless float conversion. If the current block fails then immediately quit, or proceed to another file in the queue. This would be as fast as verifying integrity via decoding. Encoding should only start if the first pass is completed.

If the input is already an uncompressed PCM file it is even easier, just parse the whole file and only use float to encode if it is bit-perfect, otherwise fallback to the original input format.
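The verification pass described above could look roughly like the sketch below. It is an illustration only, not how the wavpack executable works: the function names are hypothetical, samples are assumed to be already decoded into Python integers in the int32 range, and full scale is assumed to map to 2**31.

```python
import struct

def survives_float32(x):
    """True if int32 sample x converts to 32-bit float and back unchanged."""
    # Packing with "<f" rounds the value to the nearest 32-bit float.
    as_float = struct.unpack("<f", struct.pack("<f", x / 2**31))[0]
    return int(as_float * 2**31) == x

def first_lossy_block(samples, block_size=4096):
    """Index of the first block holding a non-convertible sample, else None."""
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        if not all(survives_float32(x) for x in block):
            return start // block_size  # quit at the first failure
    return None

# Tiny made-up example: the third sample needs more than 24 significant bits.
print(first_lossy_block([0, 256, 0x7FFFFFFF, 5], block_size=2))
```

Because the pass stops at the first failing block, a file that is not float-derived is rejected almost immediately, and encoding only needs to start once the pass returns `None`.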
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-05-12 00:04:27
Created a quick proof-of-concept of this for my own curiosity. I was surprised to discover that the 32-bit integer file in the WavPack test suite also has this “feature”, and both that file and the “fool” file improve by about 9% when compressed as float. Maybe most 32-bit integer files are like this? I was amused once when a user requesting 64-bit float support sent me a sample needle-drop file and it was also losslessly representable in 32-bit float!

The easiest approach would be to add an option (e.g., --32bit-int-to-float) that would work with either PCM files or WavPack files and would fail if the operation was lossy, or perhaps optionally just display a warning at the end (e.g., --force-32bit-int-to-float). Starting over on failure would obviously be a complication with pipes, and there's no provision for that behavior now. I think it’s extremely unlikely that a file would only fail near the end; if it’s going to fail it would be right away. Maybe this could also be a way to handle 64-bit floats?

With respect to non-audio metadata the simplest thing would be to discard it (because it’s obviously no longer valid and you wouldn’t want to make a file with it). Of course, most formats can be switched “in-place” from integer to float (with the notable ugly exception of AIFF) so maybe that would be an option, but unfortunately the current architecture makes that ugly. Since 32-bit integers don’t make sense as a DAW format (they don’t clip gracefully and don’t process efficiently), my guess is that this is mostly a “distribution” format where metadata would not be relevant anyway (except for ID3 tags, which this file had).

And yes, WavPack does already handle all the cases where, for example, 32-bit float data was sourced directly from a fixed integer size (which would be commonly found in DAW files).

Regarding μ-law and A-law, I just by crazy coincidence have a μ-law file because my voicemail to e-mail service uses them. I tried compressing that file using a whole assortment of tools including my two general-purpose compressors and WavPack hacked to ignore the format specifier and pack as 8-bit PCM (which it kinda is with a little flipping of values):
Code:
-rw-rw-r-- 1 david david 183738 Apr 28 16:07 message.wav
-rw-rw-r-- 1 david david 148280 May 11 09:04 message.wav.lzw
-rw-rw-r-- 1 david david 131878 Apr 28 16:07 message.wav.gz
-rw-rw-r-- 1 david david 127799 May 11 09:12 message.wav.newpack
-rw-rw-r-- 1 david david 119673 Apr 28 16:07 message.wav.bz2
-rw-rw-r-- 1 david david 113548 Apr 28 16:07 message.wav.xz
-rw-rw-r-- 1 david david 108888 May 11 15:02 message.wv
I think that an algorithm optimized for μ-law and A-law could improve on that by converting to linear and making the prediction there, and then converting the prediction back to non-linear and entropy encoding the difference. In any event, I can’t think of a way this could be done as a non-breaking change, and so it almost certainly won’t happen. Unless someone wants it…  :)
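A toy sketch of that idea follows: predict in the linear domain, convert the prediction back to a μ-law code, and keep only the difference between codes. The expand/compress routines follow the widely used G.711 μ-law reference arithmetic (bias 132). Everything else is an assumption made for illustration: the second-order extrapolating predictor stands in for WavPack's decorrelation, the monotonic code index is one way of undoing the value flipping, and the entropy coder is left out entirely.

```python
BIAS = 0x84    # 132, the usual G.711 mu-law bias
CLIP = 32635   # largest magnitude accepted before the bias is added

def ulaw_to_linear(u):
    """Expand one mu-law byte to a linear sample."""
    u = ~u & 0xFF
    exponent, mantissa = (u >> 4) & 0x07, u & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude

def linear_to_ulaw(sample):
    """Compress a linear sample to one mu-law byte (clips large values)."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent = max(magnitude.bit_length() - 8, 0)
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

def code_to_index(u):
    """Map a mu-law byte to an index that rises with the linear value."""
    inv = ~u & 0xFF
    return -((inv & 0x7F) + 1) if inv & 0x80 else inv & 0x7F

def index_to_code(i):
    """Inverse of code_to_index."""
    inv = i if i >= 0 else 0x80 | (-i - 1)
    return ~inv & 0xFF

def _predicted_index(prev1, prev2):
    # Hypothetical predictor: linear extrapolation from two past samples.
    return code_to_index(linear_to_ulaw(2 * prev1 - prev2))

def encode(codes):
    """Turn mu-law bytes into residuals in the code-index domain."""
    prev1 = prev2 = 0
    residuals = []
    for u in codes:
        residuals.append(code_to_index(u) - _predicted_index(prev1, prev2))
        prev2, prev1 = prev1, ulaw_to_linear(u)
    return residuals

def decode(residuals):
    """Rebuild the exact mu-law bytes from the residuals."""
    prev1 = prev2 = 0
    codes = []
    for r in residuals:
        u = index_to_code(_predicted_index(prev1, prev2) + r)
        codes.append(u)
        prev2, prev1 = prev1, ulaw_to_linear(u)
    return codes
```

Taking the difference between codes rather than between linear values matters for losslessness: μ-law has two codes for zero, and a round trip through linear values alone would merge them.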
 
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-12 06:24:47
Created a quick proof-of-concept of this for my own curiosity. I was surprised to discover that the 32-bit integer file in the WavPack test suite also has this “feature”, and both that file and the “fool” file improve by about 9% when compressed as float. Maybe most 32-bit integer files are like this? I was amused once when a user requesting 64-bit float support sent me a sample needle-drop file and it was also losslessly representable in 32-bit float!
Software like Audacity, for example, uses 32-bit float as its internal format but allows saving as 32-bit integer and 64-bit float. On the other hand, the SoX command-line tool uses 32-bit integer as its internal format but also allows saving as 32/64-bit float.

Quote
The easiest approach would be to add an option (e.g., --32bit-int-to-float) that would work with either PCM files or WavPack files and would fail if the operation was lossy, or perhaps optionally just display a warning at the end (e.g., --force-32bit-int-to-float). Starting over on failure would obviously be a complication with pipes, and there's no provision for that behavior now. I think it’s extremely unlikely that a file would only fail near the end; if it’s going to fail it would be right away. Maybe this could also be a way to handle 64-bit floats?
The same can also be offered in wvunpack if users want to switch to another 32-bit format during decoding for whatever reasons (e.g. compatibility), and let them know whether the conversion is lossy or not.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-12 07:42:10
Software like Audacity, for example, uses 32-bit float as its internal format but allows saving as 32-bit integer and 64-bit float. On the other hand, the SoX command-line tool uses 32-bit integer as its internal format but also allows saving as 32/64-bit float.
There was some discussion here somewhere on DAWs using 32-bit integer, but I don't find it. Apart from SoX I don't remember any to name and shame, but it isn't that long since 24-bit integer was teh shitz and fancy names like "DXD" were introduced.
Other applications?


By the way, sizes.
403 375 548 with -hx
394 134 751 with -hhx6, ever so slightly beating every monkey
393 584 015 FLAC -8pe -l32 -b8192 --keep-foreign-metadata.
387 724 880 MPEG-4 ALS -7 -p and that even beats OptimFROG --preset 10. Also the frog throws an error, likely due to some WAVE metadata, and even if using --incorrectheader.

So the big difference down to the 340 610 026 cannot be explained by this being a particularly WavPack-unfriendly signal. Yeah sure you could point out that -hhx4 could often outcompress FLAC (and by more of a margin, Monkey's) on this sort of resolution, but the impact is too big to put it down to that.
A gamechanger ... well admittedly, in the "bragging rights" game, since I guess the overall hard drive cost saved would hardly be worth the effort  O:)
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-12 07:52:56
The same can also be offered in wvunpack if users want to switch to another 32-bit format during decoding for whatever reasons (e.g. compatibility), and let them know whether the conversion is lossy or not.
Especially clipping, which is a serious thing.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-12 07:56:42
"bragging rights" game
Another bragging right is: Now WavPack can authenticate the master quality of the hi-res files you purchased!
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-12 10:56:57
Are you saying that some vendor now offers .wv downloads "featuring" the MQA death spasms?
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-12 11:16:18
Are you saying that some vendor now offers .wv downloads "featuring" the MQA death spasms?
No, just want to say that the upcoming update of WavPack command-line tool can check whether the origin of the 32-bit file is float or not. Because some audio interfaces are capable of 32-bit integer recording if done in the right way, e.g. record and edit with Reaper and set the recording/bounce format to 32-bit integer.

Related topic:
https://hydrogenaud.io/index.php/topic,114816.msg1026865.html#msg1026865
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-13 13:55:11
So here are some "integer friendly" floating point mixing techniques that would never happen in the real world. For example, I think the song "The Saga Of Harrison Crabfeathers" sounds great:
https://cambridge-mt.com/ms/mtk/#Araujo

Click the "223 MB" link to download the multitrack 24-bit archive. Basically, drag all files into Audacity, select all tracks and mixdown to stereo directly without any adjustment.
[Image not preserved]

Naturally, after mixdown, there will be some overs. Typical users may either do a peak normalize or apply a limiter to deal with them, but in this case, open the Nyquist prompt and apply an integer friendly gain value to the mixed track:
[Image not preserved]

Then render to a 32-bit float file. Now the file can be losslessly saved as fixed point.
Code:
-------------------------------------------------------------------------------
E:\download\01_KickIn.wav
00:03:26.0606122 = 18174546 samples / 2-ch @ 44100 Hz
32-bit floating point
          Ch    Position     Value                     dBFS
Maximum   0     7680784      0.7279655933380127        -2.757782935073358
Minimum   0     5459222      -0.92048293352127075      -0.71968518494290912
Abs.min   0     152          7.4505805969238281e-08    -142.55619765854985
Round Trip: 27
-------------------------------------------------------------------------------
oldsCool 1.0.0.4 read-only mode
Code:
All tracks decoded fine, no differences found.

Comparing:
"E:\download\8p.flac"
"E:\download\gx4-32f.wv"
Compared 9087273 samples.
No differences in decoded data found.
Channel peaks: 0.920483 (-0.72 dBTP) 0.919624 (-0.73 dBTP)

Comparing:
"E:\download\01_KickIn.wav"
"E:\download\gx4-32i.wv"
Compared 9087273 samples.
No differences in decoded data found.
Channel peaks: 0.920483 (-0.72 dBTP) 0.919624 (-0.73 dBTP)

Now the float file is bloated.
Code:
  Length Name
  ------ ----
72698272 01_KickIn.wav
35661546 8p.flac
39560446 gx4-32f.wv
35732920 gx4-32i.wv
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-05-19 15:19:26
Of course µ-law and A-law are the key to future world domination ... just kidding, the following question is out of curiosity:
I think that an algorithm optimized for μ-law and A-law could improve on that by converting to linear and making the prediction there, and then converting the prediction back to non-linear and entropy encoding the difference.
Is that obvious?
It would be if the integer-encoded LPCM (before the µ-/A-law transformation) were AR(n) or the suitable generalization - but that assumption doesn't hold up. It surely has enough of a linear component to it, that linear decorrelation captures big %%s of the size. (... I don't even know WavPack's internals ...), but you would expect the µ-/A-law byte stream to share that property to a certain degree too - larger or smaller.
Sure an algorithm specifically optimized for µ-/A-law would be expected to outperform one that has already been optimized for LPCM, but is there any reason it would be out of that particular measure?
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-05-24 03:16:03
So the more I thought about just converting 32-bit integer files to float files the less I liked it. It would be a mess while still leaving some situations unhandled.

Then I started thinking it would be best to byte the bullet and break decoders for this. But then I thought of submitting patches to FFmpeg and Rockbox and the fact that some hardware devices (like I describe here (https://hydrogenaud.io/index.php/topic,119143.0.html)) would never be compatible. I haven’t broken decoding for almost 20 years, and I decided I didn’t want to do that either.

But I didn’t want to leave 9% on the table either, even though I think these files are broken and have no good reason for existing.

Finally I came up with a solution that has very minimal downside. The way WavPack stores 32-bit formats (either float or integer) is first by converting to normalized 24-bit integers (which is the most that the regular WavPack integer code can handle). This is accompanied by a metadata chunk indicating by how much the values are normalized (shift count) so that this 24-bit audio can be converted to either 32-bit float or integer. Of course that is mathematically lossy some of the time, but obviously it’s for all practical purposes lossless. In fact, a long time ago I actually had a -p option (for “practical”) that was just that. Since each block can have a different normalization it retains the dynamic range / clipping advantage of float, and the compression performance is significantly better than lossless.

Additionally, for true lossless mode there was a bitstream that provided the missing data to get to the complete 32-bit float or integer values. This stream resided in the main file for regular lossless and in the "correction" file for hybrid lossless. For lossy modes that was the first thing to go, and then the hybrid mode would simply act on the 24-bit data to get to the target bitrate.

So my idea for handling these float-derived integer files is to create a new, smaller bitstream for the “completion” data with a new metadata ID. Old decoders will simply ignore it and consider the file as “lossy” (although we’re still talking less noise than the best DACs). Updated decoders will recognize the new ID and do the lossless decoding. Rockbox never did anything with that info anyway, so it will be unaffected (what’s Rockbox going to do with more than 24 bits...I think it is 16-bit internally). The new format would only be used when it actually makes a difference, and we get the extra compression without breaking anything.
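The following is a toy model of that idea, not WavPack's actual bitstream. It assumes a fixed, hypothetical block shift of 8 and a sign-magnitude view of each sample, and all names are made up. The 24-bit part is what an old decoder would play; the extra bits are the "completion" data, and their count depends on the sample's magnitude because a float-derived sample has known zero bits at the bottom.

```python
SHIFT = 8  # hypothetical block shift: 32-bit samples normalized to 24 bits

def known_zero_bits(hi):
    """Low bits that must be zero if the sample has <= 24 significant bits."""
    return max(0, hi.bit_length() - 16)

def split_sample(x):
    """Split an int32 into (negative, 24-bit magnitude, extra bits, bit count)."""
    mag = abs(x)
    hi = mag >> SHIFT
    low = mag & ((1 << SHIFT) - 1)
    zeros = known_zero_bits(hi)
    if low & ((1 << zeros) - 1):
        # A real encoder would need a fallback for such samples.
        raise ValueError("sample is not float-derived")
    return x < 0, hi, low >> zeros, SHIFT - zeros

def join_sample(negative, hi, extra):
    """Rebuild the int32; the decoder derives the bit count from hi alone."""
    mag = (hi << SHIFT) | (extra << known_zero_bits(hi))
    return -mag if negative else mag

# Made-up samples: loud ones need few extra bits, quiet ones need all 8.
for x in (0x7FFFFF80, -0x00ABCDE0, 0x00000123, -5):
    negative, hi, extra, nbits = split_sample(x)
    assert join_sample(negative, hi, extra) == x
    print(f"{x:#x}: {nbits} extra bits instead of {SHIFT}")
```

In this model a decoder that ignores the extra bits still gets `hi`, the normalized 24-bit value, which is the sense in which older decoders would treat such a file as lossy.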

I’ve already created simulated files and verified that Rockbox and VLC have no issues, so I think it’s going to work. I still have to try them on my Plenue and my Lotoo, but I don’t expect any surprises.

As for the u-Law and A-Law, @Porcus, when you listen to u-Law as linear (once you’ve corrected for the crazy value order) it sounds distorted compared to the properly decoded version (and louder, obviously, but that’s understood). That distortion would hurt compression compared to doing the prediction in the linear domain, but I don’t know whether that would be a little or a lot. Unless it gives a big improvement though it would not be worth the complexity...just convert to signed 8-bit and be happy. Like DSD, applications that don’t understand the formats get PCM.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-05-24 05:46:24
byte the bullet
Good to know that there is a way to improve compression without breaking compatibility with hardware devices so there is no need to byte the bullet :D

Quote
But I didn’t want to leave 9% on the table either, even though I think these files are broken and have no good reason for existing.
Agreed. In fact, another reason I like WavPack's ability to preserve the whole file instead of only the audio data is that I can keep these files as specimens. So it would be great if the upcoming decoder can decode to the exact same file.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-06-16 20:01:52
So I have implemented this optimized mode for 32-bit integer data and it seems to work as expected, consistently improving compression by around 9% for audio that is sourced from 32-bit float. Thanks again @bennetng for the tip!

After a little thinking I realized that there’s actually a symmetrical relationship with conversions between 32-bit float and 32-bit integer; both conversion directions are lossy (even without clipping). But once the conversion has been done once (either way) you end up with data values that can losslessly be represented in either format and converted back and forth forever without loss (assuming a non-broken conversion algorithm). And in theory, since they’re essentially the same values, either should losslessly compress by the same amount.
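The symmetry can be checked with a few lines. This is a sketch under assumptions: full scale maps to 2**31, float-to-integer conversion truncates toward zero and clips, and the helper names are hypothetical. Other conversion algorithms (rounding, dithering) would behave differently in detail.

```python
import struct

SCALE = 2**31

def to_float32(value):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def int_to_float(x):
    """32-bit integer sample to 32-bit float sample."""
    return to_float32(x / SCALE)

def float_to_int(f):
    """32-bit float sample to 32-bit integer, truncating and clipping."""
    return max(-SCALE, min(SCALE - 1, int(f * SCALE)))

# Integer -> float is lossy for an integer with more than 24 significant bits...
x = 0x12345679
y = float_to_int(int_to_float(x))
print(hex(x), "->", hex(y))
# ...but the result then survives any number of further round trips.
assert float_to_int(int_to_float(y)) == y

# Float -> integer is lossy for a float far below one integer step...
f = to_float32(2.0 ** -40)
g = int_to_float(float_to_int(f))
print(f, "->", g)
# ...and again the result is stable afterwards.
assert int_to_float(float_to_int(g)) == g
```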

However I noticed that the target DXD file now compresses almost half a percent better in 32-bit integer than in 32-bit float. So I needed to add some similar optimizations to the 32-bit float code to properly handle that case, and to also handle the case of a reduced-width mantissa. After that, the 32-bit float version of this file is only 2216 bytes larger than the 32-bit integer version (0.0003%).

The new version is attached (the option is --optimize-32bit), and also has the multithreading improvements discussed in this thread (https://hydrogenaud.io/index.php/topic,124188.0.html).

As for the μ-law and A-law data, I tried my idea of using the WavPack decorrelation code to generate predictions in the linear domain, convert them back into an 8-bit non-linear code, and store the non-linear deltas using the standard WavPack entropy encoder. The improvement was decent (about 5% to 10%) and I was able to get the message file above down to 97274 bytes.

However, I also discovered that lossless compression of non-linear audio is already a thing. Our friend Florin Ghido wrote a paper about this (https://ieeexplore.ieee.org/abstract/document/4217066) and ITU-T Recommendation G.711.0 (https://www.itu.int/rec/T-REC-G.711.0-200909-I/en) covers lossless compression of μ-law and A-law audio signals. I didn’t build the code supplied there, but looking at the test vectors it seems that it compresses about the same as the best I achieved, but with much smaller frames. So my curiosity is satisfied, and I don’t think this is anything I am going to spend any more effort on. The morbidly curious can take a look at the branch on GitHub (https://github.com/dbry/WavPack/tree/experimental-muLaw).   :D
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-06-17 08:31:06
Thanks Bryant, here are some benchmarks using "A Fool For You - DXD 352kHz-32bit":

Encoding time in seconds
9.36 optimize
7.84 normal
96.55 optimize x4
95.12 normal x4

Decoding time in seconds
4.41 optimize x4
4.21 normal x4

I also tried applying a 32-bit integer fade to the file so that it partially contains values that exist only in 32-bit integer, and found no issues.

The example in Reply #17 (https://hydrogenaud.io/index.php/topic,124142.msg1027165.html#msg1027165) has also been improved.

Is it true that decoding will be lossless (or as lossless as possible within limitations of individual third party programs) in updated Audition plugins and third party programs that properly utilize the updated decoder?
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-06-18 17:29:31
Thanks for the benchmarking; these match my results. Unfortunately there is a little more work for this during decoding to check the magnitude of the result to determine how many bits to read from the “make up” stream (previously we always read a fixed number of bits). This penalty does not occur for floats.

As for the compatibility, any application that is built with (or uses dynamically) a new libwavpack will transparently get the correct fully lossless decoding. Applications that use older versions of libwavpack (or FFmpeg) will decode such files as lossy. The reason this doesn’t bother me too much is that calling them “lossy” is rather academic. Each frame is stored as normalized 24-bit, so the difference between the lossy and lossless versions is always going to be around 150 dB down, and of course except when using those crazy 32-bit DACs, the data going to hardware is going to be identical.
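For a sense of scale, the textbook quantization-noise estimate for an ideal N-bit quantizer driven by a full-scale sine (about 6.02 N + 1.76 dB) can be evaluated for N = 24. This is a generic rule of thumb brought in for illustration, not a measurement of WavPack output, and the function name is made up.

```python
import math

def quantization_snr_db(bits):
    """Textbook SNR of an ideal quantizer for a full-scale sine, in dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

print(f"24-bit: {quantization_snr_db(24):.1f} dB")
print(f"16-bit: {quantization_snr_db(16):.1f} dB")
```

That lands in the mid-140s of dB for 24 bits, relative to each block's own normalized full scale, which is the same ballpark as the figure quoted above.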

I’ll eventually build a new Cool Edit / Audition 1.0 – 3.0 filter. I’m not sure about the later Audition versions (CS and CC) because I can’t even test those any more and I get the impression there’s not a lot of development going on there (and since Audition immediately converts to float anyway, there’s even less difference). I'll also submit a patch to FFmpeg, but that doesn't mean it will ever go in.

Using the transcoding option it’s possible to switch the optimization on and off, so there’s no real danger of these files becoming obsolete. That said, I would not use this unless the final use case was well defined, and I haven’t decided yet whether to include the switch in the help / man page in the first release, or just keep it "experimental" for now (but the decoding part is so simple that I'll definitely leave that in there).
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-06-18 18:39:17
Thanks for the explanations. I am using Audition 1.5 released in 2004 and for more complex stuff I can use Reaper so it should be lossless in future releases.

I hate the fact that the Audition FLAC filter I am using was released in 2007 and it cannot open 32-bit FLAC files at all.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-06-19 16:01:27
Quote
any application that is built with (or uses dynamically) a new libwavpack will transparently get the correct fully lossless decoding. Applications that use older versions of libwavpack (or FFmpeg) will decode such files as lossy.

"new" as in WavPack 5, or "new" as in "5.6.6" meaning it will break compatibility?
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-06-22 20:00:44
Quote
any application that is built with (or uses dynamically) a new libwavpack will transparently get the correct fully lossless decoding. Applications that use older versions of libwavpack (or FFmpeg) will decode such files as lossy.

"new" as in WavPack 5, or "new" as in "5.6.6" meaning it will break compatibility?
"new" as in "new new" (5.6.6). Still haven't decided how to approach this.

@bennetng I created a new Cool Edit / Audition filter with read-only support for this. The gain with float is only around 0.5% and the filter cannot write 32-bit integers anyway (although the improvement would obviously show up then).

However, I also turned on the new multithreading and the speedup is huge.  ;D

Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-06-23 08:55:48
Quote
@bennetng I created a new Cool Edit / Audition filter with read-only support for this. The gain with float is only around 0.5% and the filter cannot write 32-bit integers anyway (although the improvement would obviously show up then).

However, I also turned on the new multithreading and the speedup is huge.  ;D
Works great, thank you very much.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-09-14 21:10:49
Sound Liaison replied, so keep an eye on file changes.
https://www.audiosciencereview.com/forum/index.php?threads/sound-liaison-pcm-dxd-dsd-free-compare-formats-sampler-a-new-2-0-version.23274/post-1716135
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-12-02 14:32:14
Here is a gigabyte-sized 32-bit integer file in the wild, where the impact is 12 percentage points, closing in on twenty percent: https://soundcloud.com/kyrokotei/in-the-distant-travels .
Remixing dreamy synths into some Alcest-ish post-black metal track. (Original at https://sadnessmusic.bandcamp.com/album/i-want-to-be-there and also put up later at higher price tag.)

1187200698 bytes AIFF at 192 kHz. --optimize-32bit saves around 144 MB both for -f and for -hhx6 --threads (the only multi-threaded I ran), that is about 12 percentage points. Ranging from 17.4 to 19.5 percent - not quite catching 20 :-o
It will also outdo OptimFROG: those 144 MB are an order of magnitude above the megabytes that the frog can save over -hhx4 (after converting to WAVE, because the frog doesn't support AIFF).

At 192 kHz, the file behaves in a way that would be "surprising", hadn't I been "surprised" at hi-rez artefacts too many times already:
* -fx beats -hhx, -hx and -x (with and without --optimize-32bit). Although at -x4, order is restored.
* flac -0 beats all "single x" (without --optimize-32bit, of course). flac -0 uses fixed predictors only and dual mono. (I gave --keep-foreign-metadata to keep it comparable.) And from -7, FLAC mingles between the x4's.
* Not so surprising is that all apes and ALS are found between "worst x" and "best x4".
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-12-02 17:22:22
So 32-bit integer files are as suspicious as viruses. ::)

Did you (or other members) find any float file which can be improved by using --optimize-32bit? I posted a theoretical example in Reply #17 (https://hydrogenaud.io/index.php/topic,124142.msg1027165.html#msg1027165) but I can't find such files in the wild, so Bryant may not include this optimization in the release version.
https://github.com/dbry/WavPack/commit/67435bc8d61d73707cf883812c5addc9fa503332#commitcomment-129941853
Quote
I may drop the floating-point portion because the gain is so rare/minimal
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-12-04 17:29:19
I recall I tested --optimize-32bit on float "by mistake", I thought it was an integer-only thing and then a wildcard sent it off on more files than I intended. The files were different, so I realized then it does make a "difference" - but the size savings were small indeed. I tested one now for 0.002 percent size difference, and then (using fb2k bitcompare, uses an official WavPack that doesn't read those 5.6.6 corrections) I infer that --optimize-32bit was invoked in 79 percent of the samples.

So I am not sure this warrants any --optimize-32bit=<0 for none, 1 for integer, 2 for float, 3 for both, defaulting to 1> except in a version for testing. If we postulate that 32-bit integer is "kinda stupid anyway", then it might be a good idea to stick to handling the waste those create, keeping compatibility for everything else.

Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-12-04 21:24:45
A few more tests on floats with and without --optimize-32bit. All downloaded from https://soundcloud.com/soundchaotic
11 .wav files, ran on default, -hx, -hx4, -hhx6 --threads, all four with and without --optimize-32bit

Actually, --optimize-32bit makes every file bigger.
Depending on how it was implemented, that may indicate (1) nope, not worth it - or (2) not implemented optimally yet. Maybe both. For example, I have no idea whether the optimize-32bit choice can be made per frame or has to be made per file - but it looks like fat chance that even a per-frame brute-force-select between "integer algorithm" and "float algorithm" could make a big impact on this corpus.

File sizes for those who care (all with -m):
Code: [Select]
135 892 540 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-g--optimize-32bit.wv
135 891 042 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-g.wv
134 574 576 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-hx--optimize-32bit.wv
134 573 078 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-hx.wv
131 648 216 SoundChaotic - Brutoaler Supermarkt.-g--optimize-32bit.wv
131 646 774 SoundChaotic - Brutoaler Supermarkt.-g.wv
130 604 540 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-mvhx4--optimize-32bit.wv
130 603 042 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-mvhx4.wv
129 999 292 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-hhx6--threads--optimize-32bit.wv
129 997 794 Jasmine Thompson & Calum Scott - Love is just a Word (SoundChaotic Hard-Kick-Bootleg).-hhx6--threads.wv
129 527 162 SoundChaotic - Brutoaler Supermarkt.-hx--optimize-32bit.wv
129 525 720 SoundChaotic - Brutoaler Supermarkt.-hx.wv
126 855 624 SoundChaotic - peppräparat.-hx--optimize-32bit.wv
126 854 176 SoundChaotic - peppräparat.-hx.wv
124 780 032 SoundChaotic - peppräparat.-g--optimize-32bit.wv
124 778 584 SoundChaotic - peppräparat.-g.wv
121 951 848 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-g--optimize-32bit.wv
121 950 572 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-g.wv
120 737 574 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-hx--optimize-32bit.wv
120 736 298 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-hx.wv
119 672 162 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-mvhx4--optimize-32bit.wv
119 670 886 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-mvhx4.wv
119 222 184 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-hhx6--threads--optimize-32bit.wv
119 220 908 Monsieur Periné feat. Vicente García - Nuestra Canción (feat. Vicente García)(SoundChaotic Hard-Kick-Vocal-Edit).-hhx6--threads.wv
115 175 634 SoundChaotic - peppräparat.-mvhx4--optimize-32bit.wv
115 174 186 SoundChaotic - peppräparat.-mvhx4.wv
113 144 540 SoundChaotic - Brutoaler Supermarkt.-mvhx4--optimize-32bit.wv
113 143 098 SoundChaotic - Brutoaler Supermarkt.-mvhx4.wv
112 945 604 SoundChaotic - Brutoaler Supermarkt.-hhx6--threads--optimize-32bit.wv
112 944 162 SoundChaotic - Brutoaler Supermarkt.-hhx6--threads.wv
105 812 012 SoundChaotic - peppräparat.-hhx6--threads--optimize-32bit.wv
105 810 564 SoundChaotic - peppräparat.-hhx6--threads.wv
 77 824 390 wunderbare schließfächer(191).-g--optimize-32bit.wv
 77 823 644 wunderbare schließfächer(191).-g.wv
 77 705 398 wunderbare schließfächer(191).-hx--optimize-32bit.wv
 77 704 652 wunderbare schließfächer(191).-hx.wv
 77 587 660 wunderbare schließfächer(191).-mvhx4--optimize-32bit.wv
 77 586 914 wunderbare schließfächer(191).-mvhx4.wv
 77 529 236 wunderbare schließfächer(191).-hhx6--threads--optimize-32bit.wv
 77 528 490 wunderbare schließfächer(191).-hhx6--threads.wv
 66 905 162 SoundChaotic - Ave Maria (Bootleg).-g--optimize-32bit.wv
 66 904 490 SoundChaotic - Ave Maria (Bootleg).-g.wv
 66 418 906 SoundChaotic - Ave Maria (Bootleg).-hx--optimize-32bit.wv
 66 418 234 SoundChaotic - Ave Maria (Bootleg).-hx.wv
 66 022 726 SoundChaotic - Ave Maria (Bootleg).-mvhx4--optimize-32bit.wv
 66 022 054 SoundChaotic - Ave Maria (Bootleg).-mvhx4.wv
 65 832 444 SoundChaotic - Ave Maria (Bootleg).-hhx6--threads--optimize-32bit.wv
 65 831 772 SoundChaotic - Ave Maria (Bootleg).-hhx6--threads.wv
 65 590 442 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-g--optimize-32bit.wv
 65 589 054 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-g.wv
 64 921 404 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-hx--optimize-32bit.wv
 64 920 678 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-hx.wv
 63 017 692 french-flow-top (adilette 4.2).-g--optimize-32bit.wv
 63 017 088 french-flow-top (adilette 4.2).-g.wv
 62 395 966 french-flow-top (adilette 4.2).-hx--optimize-32bit.wv
 62 395 362 french-flow-top (adilette 4.2).-hx.wv
 62 135 468 french-flow-top (adilette 4.2).-mvhx4--optimize-32bit.wv
 62 134 864 french-flow-top (adilette 4.2).-mvhx4.wv
 62 012 880 french-flow-top (adilette 4.2).-hhx6--threads--optimize-32bit.wv
 62 012 276 french-flow-top (adilette 4.2).-hhx6--threads.wv
 60 941 328 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-mvhx4--optimize-32bit.wv
 60 940 602 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-mvhx4.wv
 60 326 594 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-hhx6--threads--optimize-32bit.wv
 60 325 868 Koen Groeneveld - Tonight The Music Seems So Loud (SoundChaotic Early Frantic Style).-hhx6--threads.wv
 51 611 052 SoundChaotic - Teddybären Core.-g--optimize-32bit.wv
 51 610 516 SoundChaotic - Teddybären Core.-g.wv
 51 239 586 SoundChaotic - Teddybären Core.-hx--optimize-32bit.wv
 51 239 050 SoundChaotic - Teddybären Core.-hx.wv
 50 932 904 SoundChaotic - Teddybären Core.-mvhx4--optimize-32bit.wv
 50 932 368 SoundChaotic - Teddybären Core.-mvhx4.wv
 50 842 498 french-flow-top (morgens bin ich immer müde 3).-g--optimize-32bit.wv
 50 841 946 french-flow-top (morgens bin ich immer müde 3).-g.wv
 50 821 604 SoundChaotic - Teddybären Core.-hhx6--threads--optimize-32bit.wv
 50 821 068 SoundChaotic - Teddybären Core.-hhx6--threads.wv
 50 392 592 french-flow-top (morgens bin ich immer müde 3).-hx--optimize-32bit.wv
 50 392 040 french-flow-top (morgens bin ich immer müde 3).-hx.wv
 50 005 536 french-flow-top (morgens bin ich immer müde 3).-mvhx4--optimize-32bit.wv
 50 004 984 french-flow-top (morgens bin ich immer müde 3).-mvhx4.wv
 49 779 740 french-flow-top (morgens bin ich immer müde 3).-hhx6--threads--optimize-32bit.wv
 49 779 188 french-flow-top (morgens bin ich immer müde 3).-hhx6--threads.wv
 38 282 888 Soundchaotic - Du Bist Liebe 3.-g--optimize-32bit.wv
 38 282 180 Soundchaotic - Du Bist Liebe 3.-g.wv
 37 735 506 Soundchaotic - Du Bist Liebe 3.-hx--optimize-32bit.wv
 37 735 148 Soundchaotic - Du Bist Liebe 3.-hx.wv
 36 653 190 Soundchaotic - Du Bist Liebe 3.-mvhx4--optimize-32bit.wv
 36 652 832 Soundchaotic - Du Bist Liebe 3.-mvhx4.wv
 36 377 552 Soundchaotic - Du Bist Liebe 3.-hhx6--threads--optimize-32bit.wv
 36 377 194 Soundchaotic - Du Bist Liebe 3.-hhx6--threads.wv

Quote
If we postulate that 32-bit integer is "kinda stupid anyway", then it might be a good idea to stick to handling the waste those create, keeping compatibility for everything else.
Does WavPack's compatibility policy extend to 5.70 decoding whatever 5.66 can encode?  :))
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-12-05 05:54:17
The small amount of overhead could be due to the creation of an additional data stream. Notice in your test data that the overhead is proportional to file size; if it were done on a per-file basis, the increment would be fixed.

I think it is like adding a flag in each frame and expecting optimizable incoming data, but when there is none the flag itself will take some space, and there could be some technical difficulties to remove the flag and reclaim the space after knowing the frame cannot be optimized.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-12-11 04:34:23
Quote
If we postulate that 32-bit integer is "kinda stupid anyway", then it might be a good idea to stick to handling the waste those create, keeping compatibility for everything else.
Does WavPack's compatibility policy extend to 5.70 decoding whatever 5.66 can encode?  :))
Of course, since this version has been out for a while, and the “help” display mentions nothing about it being experimental, I will leave the decoder portion in so as not to obsolete any existing files, even if I decide to remove the encoder option.

Quote
The small amount of overhead could be due to the creation of an additional data stream. Notice in your test data that the overhead is proportional to file size; if it were done on a per-file basis, the increment would be fixed.

I think it is like adding a flag in each frame and expecting optimizable incoming data, but when there is none the flag itself will take some space, and there could be some technical difficulties to remove the flag and reclaim the space after knowing the frame cannot be optimized.
Yes, this is exactly right. The new format adds 2 bytes per frame even if it achieves nothing. It would be possible to back up and rewrite if I detect that there was no improvement, but that would slow everything down significantly, and there was actually another reason to not do that. If the new format was invoked, I wanted a frame to indicate it as soon as possible, otherwise it would increase the likelihood of a file being incorrectly identified by an old decoder as lossless that was actually lossy (if the first occurrence of the improved stream did not occur until well into the file). In any event, the loss is very tiny.
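
For anyone who wants to check the numbers, the sketch below takes a few size pairs from the listing above and turns the difference into an implied frame count. It assumes the entire difference is the 2-bytes-per-frame cost and nothing else, so the frame counts are an inference, not something read from the files. `overhead_bytes` and `implied_frames` are names made up for this example.

```python
BYTES_PER_FRAME = 2  # per-frame cost of the new format, as stated above

def overhead_bytes(size_with: int, size_without: int) -> int:
    """Extra bytes in the file encoded with --optimize-32bit."""
    return size_with - size_without

def implied_frames(size_with: int, size_without: int) -> int:
    """Frame count implied by the overhead, assuming 2 bytes per frame and no other change."""
    return overhead_bytes(size_with, size_without) // BYTES_PER_FRAME

# (with --optimize-32bit, without), sizes in bytes, copied from the -g rows of the listing
pairs = {
    "Love is just a Word": (135_892_540, 135_891_042),
    "Brutoaler Supermarkt": (131_648_216, 131_646_774),
    "Du Bist Liebe 3": (38_282_888, 38_282_180),
}

for name, (size_with, size_without) in pairs.items():
    extra = overhead_bytes(size_with, size_without)
    print(f"{name}: +{extra} bytes ({extra / size_without:.5%}), "
          f"about {implied_frames(size_with, size_without)} frames")
```

The overhead stays around a thousandth of a percent for all three, which matches "the loss is very tiny".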

As for the float version, the improvement in float files that were derived from 32-bit integer files is about 0.5%, and these are probably not that common. Except maybe files directly converted from 32-bit ADCs? The situations where it makes a big difference are files with mantissas truncated to less than 24 bits, and in these cases the improvement can go over 10%. But I have never seen files like this in the wild, which is why I’m more inclined to not include this feature in the next release.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bennetng on 2023-12-11 07:19:46
Quote
As for the float version, the improvement in float files that were derived from 32-bit integer files is about 0.5%, and these are probably not that common. Except maybe files directly converted from 32-bit ADCs? The situations where it makes a big difference are files with mantissas truncated to less than 24 bits, and in these cases the improvement can go over 10%. But I have never seen files like this in the wild, which is why I’m more inclined to not include this feature in the next release.
Recently, "32-bit float ADCs" are somewhat popular among field recorders, but they are basically stacking multiple traditional fixed point ADCs together using a floating point DSP after digitization, and the floating point math will output float data which cannot be optimized. Search for patent US9654134 for one of these implementations, as well as the files below for some examples:
https://www.sounddevices.com/sample-32-bit-float-and-24-bit-fixed-wav-files/

Other vendors like Tascam and Zoom also sell floating point recorders. I don't have sample files to try out but I guess the recorded files cannot be optimized either. So not including this optimization at this moment could be the correct decision.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: Porcus on 2023-12-11 16:43:31
Warning: musing aloud that possibly does nothing but expose my ignorance.
But has the odd chance of being relevant when one introduces special treatment of 32-bit PCM in a compatibility-challenging manner:

That big third-party implementation does sometimes write 32-bit WavPack files when the source isn't 32-bit.  I think I have only seen it when source is 24-bit integer (then result is stored as 32-bit integer too ... I think?) - so in that case it is maybe precisely what is being addressed, to the extent that the wasted-bits treatment hasn't fixed it already?
Possibly relevant is how float signals aren't always treated gracefully by applications (WavPack nor Wave nor AIFC) - does that affect their conversion to/from .wv in a way that one might want to address?  If one still considers to give float an alternative treatment that some applications might treat in a lossy way, then ... should it offer to normalize volume in a way that proofs against bad behaviour (with in-between-frames correction)? 
(To be honest I don't really understand what --normalize-floats does, even.)

I have a hunch that it can be summarized into "problem originates somewhere else, should be solved there", but my excuse for a stupid question is that 32-bit particularities are to be addressed anyway.
Title: Re: Improve compression efficiency by treating 32-bit fixed point as float.
Post by: bryant on 2023-12-20 23:43:45
Quote
That big third-party implementation does sometimes write 32-bit WavPack files when the source isn't 32-bit.  I think I have only seen it when source is 24-bit integer (then result is stored as 32-bit integer too ... I think?) - so in that case it is maybe precisely what is being addressed, to the extent that the wasted-bits treatment hasn't fixed it already?
Possibly relevant is how float signals aren't always treated gracefully by applications (WavPack nor Wave nor AIFC) - does that affect their conversion to/from .wv in a way that one might want to address?  If one still considers to give float an alternative treatment that some applications might treat in a lossy way, then ... should it offer to normalize volume in a way that proofs against bad behaviour (with in-between-frames correction)? 
(To be honest I don't really understand what --normalize-floats does, even.)
Well, what’s really being addressed here is not the 32-bit peculiarities, which really is just Cool Edit’s decision to exploit WAV file format redundancies to make their own format, but is improving the compression of some 32-bit data that wasn’t anticipated when I implemented the “wasted-bits” strategy. Fortunately, in the integer case, I do efficiently handle all cases of a fixed number of wasted bits, so the 24-bit stored as 32-bit works great, and shouldn't be any different even with --optimize-32bit. The case that wasn’t handled efficiently was a variable number of wasted bits which happens when your source was at some point floating point.
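
The difference between a fixed and a variable number of wasted bits can be seen with a few hand-picked samples. The Python sketch below is only an illustration, not WavPack code: it counts the zeroed LSBs of 24-bit values padded to 32 bits, and of 32-bit integers made by scaling float32 values to full scale. The helper names and the sample values are made up for this example.

```python
import struct

def trailing_zeros(x: int) -> int:
    """Number of zeroed LSBs in a 32-bit sample (32 for a zero sample)."""
    return 32 if x == 0 else (x & -x).bit_length() - 1

def float_to_int32(value: float) -> int:
    """Round to float32 precision, then scale to 32-bit integer full scale."""
    f32 = struct.unpack("<f", struct.pack("<f", value))[0]
    return int(f32 * 2**31)  # exact: a 24-bit mantissa times a power of two

def block_wasted_bits(samples) -> int:
    """Zeroed LSBs shared by every sample, i.e. what one constant shift can remove."""
    return min(trailing_zeros(s) for s in samples)

padded_24bit = [s << 8 for s in (1234567, -7654321, 42, -3)]
from_float = [float_to_int32(v) for v in (0.7, -0.3, 0.1, 0.01)]

print("24-in-32, shared by block:", block_wasted_bits(padded_24bit))
print("from float, shared by block:", block_wasted_bits(from_float))
print("from float, per sample:", [trailing_zeros(s) for s in from_float])
```

For the padded 24-bit data every sample has at least 8 zeroed LSBs, so one shift per block captures all of them. For the float-derived data the loud samples have more zeroed LSBs than the quiet ones, so a constant shift is limited by the quietest sample and the rest of the redundancy is left on the table.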

The --normalize-floats may not have been ideally named, but fortunately it’s harmless and won’t do anything unless it’s applicable, which is just with WavPack files created from those same silly Cool Edit WAVs. All float files from everywhere else are normalized to +/- 1.0 already, even though they may have values that exceed that (one of float’s advantages in fact).