Topic: WavPack 5.5.0 Release Candidate (Read 29871 times)

Re: WavPack 5.5.0 Release Candidate

Reply #25
For wiki etc, and since trying to create a 257 channel .wav is a hassle and I'd anyway have to ask to get the "officially supported":

(1) I see the WavPack front page has changed the description of channel count to "up to at least 256 channels".
 * Does that mean you have removed the 256 channel cap to accommodate that BrainWavPack application
and if so:
 * Would a "reasonably accurate" description be that WavPack supports 256 channels but will accept higher channel count provided that other limitations are satisfied (not all tested) - up to a hard definite maximum of 4096?

(2) "integer sampling rates up to 1 GHz" it says. I have gotten it to work to 4 Gi - 1.
 * Is it kinda the same as I suggested above, officially 1 GHz but won't object until 4 GiHz?
Just because "supports any .wav sampling rate (i.e. integers between 0 and 4 GiHz)" is easier to explain than "you see there is this thing called 'long integer' which is why you can cover signals you won't even call audio ...".

Re: WavPack 5.5.0 Release Candidate

Reply #26
For wiki etc, and since trying to create a 257 channel .wav is a hassle and I'd anyway have to ask to get the "officially supported":

(1) I see the WavPack front page has changed the description of channel count to "up to at least 256 channels".
 * Does that mean you have removed the 256 channel cap to accommodate that BrainWavPack application
and if so:
 * Would a "reasonably accurate" description be that WavPack supports 256 channels but will accept higher channel count provided that other limitations are satisfied (not all tested) - up to a hard definite maximum of 4096?
The official support remains 256 channels, which seems high enough to me for all but the most esoteric brain-wave type stuff.

The format and library have supported 4096 channels for a while, but this has not been well tested, and there has not been a way to even create such files. With this release I have added an undocumented option, --raw-pcm-ex, that lifts the limitation for experimentation purposes (and for the brain-wave guys).

But here is just one example of how this can manifest in weirdness: if you unpack one of these files to another uncompressed format (like WAV), the resulting WAV file will not be compatible with WavPack. So maybe in the future, after some more testing, I will lift the 256 channel limit everywhere. Or maybe have some sort of --chill-out option for that (the alternative always reminds me of the airport).

Quote
(2) "integer sampling rates up to 1 GHz" it says. I have gotten it to work to 4 Gi - 1.
 * Is it kinda the same as I suggested above, officially 1 GHz but won't object until 4 GiHz?
Just because "supports any .wav sampling rate (i.e. integers between 0 and 4 GiHz)" is easier to explain than "you see there is this thing called 'long integer' which is why you can cover signals you won't even call audio ...".
Here again the format and library limit is 2 Gi - 1, so I suspect that's what you got to work. And of course since the WAV header has another unsigned 32-bit field "bytes per second", its real limit varies with channel count and sample depth. Oof.
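To put numbers on the "bytes per second" point, here is a small Python sketch. It assumes only that the WAV byte-rate field is an unsigned 32-bit value, as stated above, and that samples are padded to whole bytes; the function name is made up for illustration and has nothing to do with WavPack's own code.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit header field can hold


def max_wav_sample_rate(channels, bits_per_sample):
    """Highest integer sample rate whose byte rate still fits in 32 bits.

    Hypothetical helper: assumes each sample occupies a whole number of bytes.
    """
    bytes_per_sample = (bits_per_sample + 7) // 8
    bytes_per_frame = channels * bytes_per_sample
    return UINT32_MAX // bytes_per_frame


if __name__ == "__main__":
    for channels, bits in [(1, 8), (1, 16), (2, 16), (2, 24)]:
        rate = max_wav_sample_rate(channels, bits)
        print(f"{channels} ch / {bits} bit: up to {rate} Hz")
```

So only 8-bit mono could use the whole 32-bit sample rate range; 16-bit stereo already runs out just under 1 GiHz.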

I've had at least one CVE around wacky sample rates and so have gotten a little shy in this area, which may explain why I set the --raw-pcm limit to 1 GHz. It certainly could have been 2 GHz, and you can get that with WAV (disregarding the other issue above), but 1 GHz is easy to remember and seems like a reasonable upper limit for most sampled physical phenomena.

The cool formats (AIFF and CAF) support floating-point values and I really should have done that, and maybe will in the future. Handy when specifying 0.01 Hz for barometric data, for example, but then I need to add months and years to the "duration" display. You can see how this can be complicating in unpredictable ways.

Sorry I couldn't just answer "yes" or "no" for these and thanks again for your valuable updates to the Wiki!   :)

Re: WavPack 5.5.0 Release Candidate

Reply #27
Edit: So with the last few wiki updates, I think I have removed some errors and not introduced new ones? https://wiki.hydrogenaud.io/index.php?title=WavPack&diff=35791&oldid=35777

Here again the format and library limit is 2 Gi - 1, so I suspect that's what you got to work.
Looks like my memory had gotten too much pro-WavPack bias after everything else working just as well with WavPack as with anything else. So here are the corner cases where some animals are outpacking WavPack, I think:
* 2Gi fails for WavPack. (Also ffmpeg cannot handle it.)
* 4Gi - 1 is accepted by Monkey's and OptimFROG and also by MPEG-4 ALS which with -v returns this fun output:
Code:
Audio format : int / 16 bit / -1 Hz / 1 ch
Bit rate     : -0.0 kbit/s
Playing time : -100000000.0 sec
PCM file size: 200000104 bytes
ALS file size: 11466700 bytes
Compr. ratio : 17.442 (5.73 %)
Average bps  : 0.917
Average rate : -0.0 kbit/s

Processing took 7.59 sec (-13168290.8 x real-time)

The cool formats (AIFF and CAF) support floating-point values
Quick and pigsty-dirty idea for those who have allocated 4 bytes and declared it a long signed integer variable for something inherently positive: use the sign to indicate that it is float  :))


Re: WavPack 5.5.0 Release Candidate

Reply #28

Perhaps Peter will chime in with more context.


I'll have a look at this again to see if I can get it working with more supported/standard APIs

Re: WavPack 5.5.0 Release Candidate

Reply #29
Thanks. If it is not too complicated, it would be interesting to have a browser-based decoding speed benchmark with user-supplied WavPack files too.

Re: WavPack 5.5.0 Release Candidate

Reply #30
And of course since the WAV header has another unsigned 32-bit field "bytes per second", its real limit varies with channel count and sample depth.
My pathetic SATA SSD is not fast enough to handle this data rate, but then the file can only be 1 second long and therefore not so useful either. So WAV is basically useless for this purpose. Luckily some of the "best" audio formats like DSD2048 are still below 100 MHz :))

Re: WavPack 5.5.0 Release Candidate

Reply #31
The cool formats (AIFF and CAF) support floating-point values
Quick and pigsty-dirty idea for those who have allocated 4 bytes and declared it a long signed integer variable for something inherently positive: use the sign to indicate that it is float  :))

Haha, yeah, that would work! At first I thought you would only get 31 bits for the float value, but it would actually be the negation of the sample rate. Clever! Unfortunately I can't add it to WavPack that way because it wouldn't be backward compatible. I would need to have the integer version remain (and be as close as possible to the actual value) and then have another field for the floating point that old decoders would never see.

And what's crazy is that in AIFF it's actually an 80-bit extended precision float! I wonder what language back then (1988) supported that (it's poorly supported now). They tried to cover all the possibilities but never considered anyone would need over 4 GB. Anyway, for CAF they went back to a more reasonable 64-bit double.
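For anyone curious what that 80-bit field looks like in practice, below is a hedged Python sketch of decoding it. It assumes the usual extended-precision layout (1 sign bit, 15-bit exponent with bias 16383, 64-bit mantissa with an explicit integer bit, big-endian); the function name is invented for illustration and this is not taken from any WavPack source.

```python
import math
import struct


def decode_extended80(raw):
    """Decode a big-endian 80-bit extended float (10 bytes) to a Python float.

    Illustrative helper only; infinities and NaNs are rejected rather than decoded.
    """
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    sign_and_exponent, mantissa = struct.unpack(">HQ", raw)
    sign = -1.0 if sign_and_exponent & 0x8000 else 1.0
    exponent = sign_and_exponent & 0x7FFF
    if exponent == 0 and mantissa == 0:
        return 0.0
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN is not a usable sample rate")
    # The mantissa is an integer with the binary point after its top bit,
    # hence the extra 63 in the exponent adjustment.
    return sign * math.ldexp(mantissa, exponent - 16383 - 63)


if __name__ == "__main__":
    # 44100 Hz as it would appear in such a field.
    print(decode_extended80(bytes.fromhex("400EAC44000000000000")))
```

Converting the result to a 64-bit double can drop mantissa bits in general, which is harmless for realistic sample rates.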

Re: WavPack 5.5.0 Release Candidate

Reply #32
And of course since the WAV header has another unsigned 32-bit field "bytes per second", its real limit varies with channel count and sample depth.
My pathetic SATA SSD is not fast enough to handle this data rate, but then the file can only be 1 second long and therefore not so useful either. So WAV is basically useless for this purpose. Luckily some of the "best" audio formats like DSD2048 are still below 100 MHz :))

4 GB per second is crazy fast. I'm not sure PC RAM can go that fast.

My Tektronix oscilloscope is up to 8 GB per second (2 GS/s * 4-ch * 8-bit) but only goes for 2.5K samples (i.e., just over 1 µs).

Re: WavPack 5.5.0 Release Candidate

Reply #33
RAM can go that fast - most storage, however, would find it rather challenging.

Re: WavPack 5.5.0 Release Candidate

Reply #34
And what's crazy is that in AIFF it's actually an 80-bit extended precision float! I wonder what language back then (1988) supported that (it's poorly supported now).
Probably Pascal. It was a standard data type on the 68k FPU and in Apple's mathematics libraries.

Re: WavPack 5.5.0 Release Candidate

Reply #35
Concerning why put AIFF off until WavPack can import AIFF metadata:
WavPack does not universally have that capability with WAVE either. (E.g. here, Mp3tag sees a COMMENT tag in non-ID3 RIFF.)

So ... porcine ignorant question time again, but as far as WavPack cannot universally import WAVE RIFF metadata (that it stores perfectly) and WavPack cannot universally import metadata from DSD files (that it stores perfectly) - is there anything in particular about AIFF except getting the --import-id3 to work [which, anyway, it doesn't do for all ID3]?

... well there might be, of course, not related to metadata, but ...
And what's crazy is that in AIFF it's actually an 80-bit extended precision float!
... the choir demands 64-bit float "just in case" they would ever encounter such a file  ;D

Re: WavPack 5.5.0 Release Candidate

Reply #36
Concerning why put AIFF off until WavPack can import AIFF metadata:
WavPack does not universally have that capability with WAVE either. (E.g. here, Mp3tag sees a COMMENT tag in non-ID3 RIFF.)

So ... porcine ignorant question time again, but as far as WavPack cannot universally import WAVE RIFF metadata (that it stores perfectly) and WavPack cannot universally import metadata from DSD files (that it stores perfectly) - is there anything in particular about AIFF except getting the --import-id3 to work [which, anyway, it doesn't do for all ID3]?
No, the two projects (AIFF and ID3v2.4) are completely orthogonal. For silly historical reasons I was doing them together, but I could certainly do AIFF first and it would just work with ID3v2.3 (like everything else). That may even be the most common ID3 variant...

Quote
And what's crazy is that in AIFF it's actually an 80-bit extended precision float!
... the choir demands 64-bit float "just in case" they would ever encounter such a file  ;D
Don't know if I've ever mentioned this story, but I got a 64-bit float WAV file from a guy who was asking about support for it. The very first thing I tried was to verify that the 64-bit float values were in fact not representable in 32-bit floats, because otherwise this would just be 100% bloat. Well, every sample survived the roundtrip to 32-bit losslessly! I got him to convert it to 32-bit float (with FFmpeg or Foobar2000, I don't remember) and he agreed it sounded the same. I've been doubly dubious of this ever since.
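The check described here is easy to reproduce. A minimal Python sketch, assuming the samples are already available as Python floats (which are 64-bit doubles); the function names are made up for illustration:

```python
import struct


def survives_float32(value):
    """True if a 64-bit float comes back unchanged from a trip through 32 bits."""
    try:
        packed = struct.pack("<f", value)
    except OverflowError:
        # Magnitude too large for a 32-bit float.
        return False
    # NaN compares unequal to itself, so it is reported as not surviving.
    return struct.unpack("<f", packed)[0] == value


def is_pure_bloat(samples):
    """True if every sample is exactly representable as a 32-bit float."""
    return all(survives_float32(s) for s in samples)


if __name__ == "__main__":
    print(is_pure_bloat([0.0, 0.5, -0.25, 1.0]))  # all exact in 32 bits
    print(is_pure_bloat([0.1]))                   # 0.1 as a double is not
```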

At one point I was thinking that it would be fairly easy to implement this in a backward-compatible way by dividing each 64-bit float value into the sum of two 32-bit floats. One could be stored in the regular stream and the "correction" stream would be stored such that old decoders would ignore it (or omitted in lossy mode, obviously). Unfortunately when I looked a little deeper I realized that since the mantissa is more than twice as large on 64-bit floats (52 vs. 23 bits) this won't work.  :(
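The mantissa problem can be seen with a quick experiment. This is only a sketch of the "sum of two 32-bit floats" idea as described above, not anything from WavPack, and the helper names are invented: two 24-bit significands cover at most 48 bits, while a double carries 53.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def split_into_two_float32(value):
    """Split a double into a 32-bit main part and a 32-bit correction."""
    main = to_float32(value)
    correction = to_float32(value - main)
    return main, correction


def split_is_lossless(value):
    main, correction = split_into_two_float32(value)
    return main + correction == value


if __name__ == "__main__":
    print(split_is_lossless(1.5))      # few mantissa bits in use
    print(split_is_lossless(1.0 / 3))  # all 53 bits in use
```

Values with a sparse mantissa survive the split, but one that uses the full width (like 1/3) does not, which is what kills the scheme for lossless use.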

Re: WavPack 5.5.0 Release Candidate

Reply #37
Thanks. If it is not too complicated, it would be interesting to have a browser-based decoding speed benchmark with user-supplied WavPack files too.

I've updated the code so it should now work with Firefox too.
https://www.wavpack.com/WebAssembly/index.htm

If you're interested in the source code etc, that can be found on my GitHub page:

https://github.com/soiaf/WebAssembly-WavPack

I also wrote a very quick (read 'hack') system to show how quick it can decode a file. David was kind enough to also host this on his site at
https://www.wavpack.com/WebAssembly/timer.htm

It's certainly interesting to see how fast it can decode WavPack files, from my very brief tests, Firefox seemed to be the fastest browser, but I don't know if this is better JavaScript or WebAssembly processing (or both!). However maybe others will get different results!


Re: WavPack 5.5.0 Release Candidate

Reply #38
Thank you very much! Here is what I got from a CDDA image I encoded many years ago using WavPack 4 with high setting, I am using i3-12100, 16GB DDR4 on Windows 10 and Firefox:
Code:
It took 27094 milliseconds to decode the file
We decoded 187598460 samples
Another CDDA image encoded with WavPack 5 fast settings:
Code:
It took 10299 milliseconds to decode the file
We decoded 133040880 samples
I repeated the test several times and got similar results.
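To turn those timings into a "times realtime" figure, a small Python sketch follows. It assumes the reported sample counts are per-channel frames at the CDDA rate of 44100 Hz, which the posts do not state explicitly; the function name is made up.

```python
def realtime_factor(samples, elapsed_ms, sample_rate=44100):
    """How many seconds of audio were decoded per second of wall-clock time.

    Assumes `samples` counts frames (one per channel set) at `sample_rate`.
    """
    audio_seconds = samples / sample_rate
    return audio_seconds / (elapsed_ms / 1000.0)


if __name__ == "__main__":
    print(round(realtime_factor(187598460, 27094), 1))  # WavPack 4, high
    print(round(realtime_factor(133040880, 10299), 1))  # WavPack 5, fast
```

Under that assumption the two runs come out at roughly 157x and 293x realtime.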

Re: WavPack 5.5.0 Release Candidate

Reply #39
Thanks @soiaf ... very nice!

And thanks @bennetng for posting your timing results.

It's amazing to me that WavPack can decode in a browser at over 100x realtime! Just for fun I opened up 5 different tabs and had a WavPack file playing in each one and was still under 20% CPU load.

Re: WavPack 5.5.0 Release Candidate

Reply #40
Just for the braintrust ...
There is actually some scientific literature on lossless compression of EEG data - that Google Scholar search returned 1620 hits although not all are relevant. In particular, none of the hits on https://scholar.google.com/scholar?q="lossless+compression"+"eeg"+"wavpack" are ...
(Shorten was used here, as they found a 1999 paper that tested it: https://dr.ntu.edu.sg/bitstream/10356/101141/1/A%20two-dimensional%20approach%20for%20lossless%20EEG%20compression.pdf )

Re: WavPack 5.5.0 Release Candidate

Reply #41
  • fixed: DSD to PCM decimation: small clicks between tracks and tiny DC offset
I just read the other thread about the audible clicks, but out of curiosity, what was the DC offset bug in the previous versions?

 

Re: WavPack 5.5.0 Release Candidate

Reply #42
  • fixed: DSD to PCM decimation: small clicks between tracks and tiny DC offset
I just read the other thread about the audible clicks, but out of curiosity, what was the DC offset bug in the previous versions?

It was a rounding omission in the DSD to PCM code that introduced a negative DC offset, but only 1/2 of a LSB (at 24 bits), so nothing that could be audible (or even measurable except on silent test files).

I happened to see it when I was in there attempting to reduce the transition clicks and fixed it more for “correctness” than anything else.

P.S. This was the offending source line.
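A rounding omission of this kind is easy to illustrate. The Python sketch below assumes the final scaling is a right shift by 4 (divide by 16), matching the `sum >> 4` versus `(sum + 8) >> 4` lines quoted elsewhere in this thread; it is not the actual WavPack code, and the function names are made up.

```python
def truncating_shift(value):
    """Right shift without rounding: always rounds toward minus infinity."""
    return value >> 4


def rounding_shift(value):
    """Add half of the divisor (8) before shifting, so the result is rounded."""
    return (value + 8) >> 4


def mean_error(shift_fn, values):
    """Average difference between the shifted result and the exact quotient."""
    return sum(shift_fn(v) - v / 16 for v in values) / len(values)


if __name__ == "__main__":
    inputs = range(-1600, 1600)  # covers every remainder equally often
    print(mean_error(truncating_shift, inputs))
    print(mean_error(rounding_shift, inputs))
```

The plain shift shows an average error of about minus half an output step, i.e. a small negative DC offset, while adding 8 first brings it close to zero.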

Re: WavPack 5.5.0 Release Candidate

Reply #43
It was a rounding omission in the DSD to PCM code that introduced a negative DC offset, but only 1/2 of a LSB (at 24 bits), so nothing that could be audible (or even measurable except on silent test files).

I happened to see it when I was in there attempting to reduce the transition clicks and fixed it more for “correctness” than anything else.

P.S. This was the offending source line.

Thank you, bryant, for the explanation!

If I want to use WvUnpack to convert DSD to PCM and avoid sample extrapolation, I was considering the following process:

  • append 2 seconds of digital silence to the beginning of the DSF file
  • compress it with WavPack
  • modify the line "*samples++ = sum >> 4;" to "*samples++ = (sum + 8) >> 4;" in the source code (5.4.0)
  • rebuild WvUnpack from modified version 5.4.0
  • unpack the compressed DSF file using the "--wav" option
  • and finally use something like SoX to remove the first 2 seconds (then resample or apply a low-pass filter)

This way, any clicks that may occur will be contained within the added 2 seconds and can be discarded after conversion. Do you see any potential issues with this process?

Re: WavPack 5.5.0 Release Candidate

Reply #44
I’m not sure exactly what you’re trying to do, or avoid, but let me see if I can clarify things.

The issue with WavPack’s DSD to PCM conversion is that because it attempts to preserve the exact length of the DSD file, and the decimation filter requires some history, the first 6 PCM samples do not have the actual correct history to work with. In version 5.4.0 (and before) those samples were created by simply pre-filling the filter with DSD “silence”.

The problem with that was that, unlike PCM where silence is represented with zeros, DSD silence is represented with various repeating patterns of +1 and -1, and when two different “flavors” of such silence are joined (or even the same flavor is joined with itself, but offset) a glitch results. And, of course, if the file wasn’t silent at the beginning then another kind of glitch would be created, although this was less likely to be audible because it would be masked by signal.
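The "two flavors of silence" effect can be shown with a toy example in Python. Everything here is an assumption made for illustration: the bytes 0x69 and 0x55 stand in for two idle patterns, and a 16-bit sliding sum stands in for the real decimation filter, which is certainly more elaborate.

```python
def to_bits(byte_values):
    """Expand bytes into a list of +1/-1 values, most significant bit first."""
    bits = []
    for byte in byte_values:
        for shift in range(7, -1, -1):
            bits.append(1 if (byte >> shift) & 1 else -1)
    return bits


def sliding_sum(bits, width=16):
    """Crude low-pass stand-in: sum of every run of `width` consecutive bits."""
    return [sum(bits[i:i + width]) for i in range(len(bits) - width + 1)]


silence_a = to_bits([0x69] * 8)  # one example idle pattern
silence_b = to_bits([0x55] * 8)  # a different one

if __name__ == "__main__":
    print(set(sliding_sum(silence_a)))              # flat zero
    print(set(sliding_sum(silence_b)))              # flat zero
    print(set(sliding_sum(silence_a + silence_b)))  # not flat at the join
```

Each pattern on its own filters to exact zero, but the windows that straddle the join do not, which is the glitch in miniature.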

The solution for version 5.5.0 was to replace just those 6 samples with a linear extrapolation of the beginning of the PCM audio. In all the audio clips I tried, including those generously supplied by the OP, with both silent and non-silent transitions, this eliminated the audible glitch as far as I could tell. There are still probably samples where it might still be audible, but I suspect that this is better in the vast majority of cases. Unfortunately, that thread’s OP never responded as to whether it fixed his problem, nor did anyone else ever indicate that it did or didn’t make an improvement.
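As a rough picture of what replacing the first 6 samples with a linear extrapolation could look like, here is a Python sketch. It is a guess at the general idea only: it draws a straight line backwards through the first two trusted samples, and the real 5.5.0 code may well choose its slope differently. The function name is hypothetical.

```python
def extrapolate_head(samples, count=6):
    """Return a copy with the first `count` samples replaced by a straight
    line projected backwards from the two samples that follow them.

    Illustrative only; not the method used in the WavPack source.
    """
    if len(samples) < count + 2:
        raise ValueError("need at least count + 2 samples")
    anchor = samples[count]
    slope = samples[count + 1] - anchor
    head = [anchor - slope * (count - i) for i in range(count)]
    return head + list(samples[count:])


if __name__ == "__main__":
    # The first six values are placeholders for the unreliable samples.
    print(extrapolate_head([99, 99, 99, 99, 99, 99, 60, 70, 80]))
```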

So obviously my first question is are you still hearing glitches in some circumstances? Is 5.5.0 better or worse than 5.4.0? Or is this all somewhat academic?

As for an answer to your question, if you want to use WvUnpack to do the DSD to PCM conversion without the extrapolation, the easiest way to do that is to add --skip=6 to the command line to discard those extrapolated samples. That will work perfectly with silent transitions, but of course might introduce a small glitch in non-silent transitions. It also will work perfectly in a stand-alone case where the file is never played gaplessly with the previous file.

Unfortunately, your suggested fix is basically the same thing that I did originally and a glitch may occur where your selected flavor of silence meets the silence of the source file. It is in fact exactly the same except you don’t have to use 2 seconds of silence; 56 bits of silence will do it.

As I type I realize that a more complete fix would have been to first look at the first few bytes of DSD audio and search for a repeating pattern. If one is found then pre-filling the decimation filter with an extrapolation of that would be ideal, and only reverting to the extrapolated PCM when that doesn’t work. Maybe I’ll look at that for the next release.

Does this help and/or make sense?
 


Re: WavPack 5.5.0 Release Candidate

Reply #45
It does help (and makes sense)!

Regarding your questions, version 5.5.0+ is better than 5.4.0, but I did hear a few glitches in specific circumstances. For instance, when transitioning from KV 16 (the files shared by the OP) first movement to the second movement, I could hear a tiny click on the right channel. However, such clicks are not very noticeable.

I used a 2-second buffer because that is typically the pre-gap duration before the first track of a SACD. My intention was to convert some of my DSD/SACD collection to PCM so I could play them on my iPhone. To completely avoid clicks in my use scenario, I was thinking of concatenating all tracks from an album into a single file, converting it to PCM, discarding the first pre-gap, and then splitting the tracks. Since the first pre-gap is discarded, extrapolation is unnecessary in this case. I didn't realize until now that only the first 6 PCM samples would be affected. I did some tests in the past few days, using a modified version 5.4.0 (sum → sum + 8), 5.5.0, and 5.6.0 to convert a DSD album as a whole to PCM. The results were identical after the first pre-gap was trimmed out, so I'm happy with it :-)

Re: WavPack 5.5.0 Release Candidate

Reply #46
Okay, great. Thanks for the feedback!

Yeah, you'll definitely get the best results by leaving the image in the full album version as long as possible and not splitting until after all downsampling and processing (if at all). When I use sacd_extract I always use the "full image" option and embed the cuesheet which Foobar is happy to honor.