
Topic: 7z beats other codecs on 24bit 48khz sample (Read 20742 times)

  • Nystagmus
  • [*][*]
Reply #25
I'm not going to claim which technique(s) I think are better, but it's worth mentioning that the type of music (spectrally, in the amount and size of repetition, and in terms of timbre) really profoundly affects how successful audio compression is. But I do like 7zip as an archiver, and it's pretty darned cool that the Foobar2000 media player can play the contents of 7z archives (at least with the foo addon).

Life is pretty good these days for music and media files.  I recently switched to BandiZip because it has 7zip support but can do a few things that 7zip can't.  BandiZip almost out-7zipped 7-zip itself!  But I still use 7zip because it can do a few things that BandiZip can't.  Occasionally I try out PeaZip for the same reasons.  But 7zip is pretty awesome for being able to open up MHTs, XPIs, XPSs, DMGs (macOS), and setup/installer programs of some types.  I really love bypassing installers and just extracting the contents.  Sometimes that really saves tons of headaches with annoying installers.  At first I couldn't get 7zip to run on Windows 7, but after auto-elevating UAC I got it to work fine again.  Luckily, 7zip is still being developed so it's all good.  I don't use ZIP for compression anymore and RAR is still a bit proprietary-like in some ways, but I use what I just mentioned to open em. 

For the record, I also enjoy using FLAC and WavPack.  They each have some awesome advantages for certain situations.  And I love how they are still supported and hardware support is growing through stuff like RockBox and Linux distros.
  • Last Edit: 02 January, 2014, 05:02:16 PM by Nystagmus
Be a false negative of yourself! 

  • Porcus
  • [*][*][*][*][*]
Reply #26
Since the thread is already bumped:

http://www.squeezechart.com/audio.html


Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.


(A couple of the comments made me think of a particular peak value which disproportionately many of my rips have, namely .999969, occurring on 13 percent of them (compare with about 30 percent for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

  • bryant
  • [*][*][*][*][*]
  • Developer (Donating)
Reply #27
Quote
(A couple of the comments made me think of a particular peak value which disproportionately many of my rips have, namely .999969, occurring on 13 percent of them (compare with about 30 percent for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

I don't think there's any possibility of exploitation there. The maximum allowable 16-bit PCM values are -32768 and +32767, so if you simply call 32768 full scale (1.0), then 32767 becomes 0.999969. I can imagine all kinds of normalization algorithms that would leave the sample values clipped to +/-32767 after converting from a higher resolution master. Unless all the even values are missing, that doesn't give you much to work with.
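The arithmetic is easy to check. Below is a minimal Python sketch of it; the constant and function names are just illustrative, not taken from any ripper or codec.

```python
# 16-bit signed PCM covers -32768 .. +32767.
INT16_MIN = -32768
INT16_MAX = 32767
FULL_SCALE = 32768  # the value treated as 1.0 in this convention


def normalized_peak(sample: int) -> float:
    """Express an integer sample as a fraction of full scale."""
    return abs(sample) / FULL_SCALE


if __name__ == "__main__":
    print(f"{normalized_peak(INT16_MAX):.6f}")  # 0.999969
    print(f"{normalized_peak(INT16_MIN):.6f}")  # 1.000000
```

So a track whose loudest sample is +32767 reports a peak of 0.999969, while one that reaches -32768 reports exactly 1.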

As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked. It's just not very likely that any general purpose algorithm will do well with stereo 16-bit PCM audio by accident (even with a huge dictionary). Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!


  • Porcus
  • [*][*][*][*][*]
Reply #28
Ah, 32767 ... almost facepalming I didn't think of that.


Quote
As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked.


Well, certainly (like WinZip), but I am still surprised that they bother to take it to the extent that they beat everything faster than WavPack. People like yourself and Thomas have spent quite some effort on the audio part, and I wonder if any Stuffit customer would be disappointed if Stuffit simply grabbed some reasonable codec that is already out there under a permissive license ... (then OTOH, for marketing purposes, they maybe do not want anyone pointing out that they charge money for something that compresses .wavs to exactly the size of a .wv plus the file header difference? Of course, the similarity to TAK in speed/compression performance could easily fuel other speculations).



Quote
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Starting from LPCM-optimized WavPack, I presume?

BTW, I did the following comparison out of curiosity, taking the "most FLACable" and "least FLACable" CD rips in my collection and comparing WavPack to FLAC to TAK: http://www.hydrogenaudio.org/forums/index....mp;#entry800823
I have a hunch that you don't listen much to Piaf and Thomas doesn't listen much to Merzbow ;-)
  • Last Edit: 03 January, 2014, 03:20:31 AM by Porcus

  • ktf
  • [*][*][*][*][*]
Reply #29
I used XZ (which uses a compression algorithm very similar to 7-zip's) in the first revision of my lossless audio codec comparison; the results are in the raw data, the CSV file. There's only one album where XZ beats any codec, and that's on mono material with lots of silence. Pretty much everywhere else, XZ doesn't even come close. I guess the mentioned sample was something special, for example quantized in some special way.
  • Last Edit: 03 January, 2014, 03:38:57 AM by ktf
Music: sounds arranged such that they construct feelings.

  • Mangix
  • [*][*][*][*][*]
Reply #30
Quote
Since the thread is already bumped:

http://www.squeezechart.com/audio.html

Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.

(A couple of the comments made me think of a particular peak value which disproportionately many of my rips have, namely .999969, occurring on 13 percent of them (compare with about 30 percent for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

Those general compressors have specialized models that deal with .wav files, typically some context-mixing algorithm with filters. They also operate on large block sizes, so seeking has to be done in a similar manner to mp3 (decoding the whole file). They're also quite slow and unsuitable for real-time playback; WinRK and NanoZip are, anyway.

FLAC usually has a block size of 4096, if I'm not mistaken. TAK, I think, goes up to 16384. Smaller block sizes lose compression in exchange for seekability.

Quote
There's only one album where XZ beats any codec, and that's on mono material with lots of silence.
LZ77 does very well when you have repeating sequences, and silence falls into that category. xz can also apply a "delta" filter to audio data, which probably helps the LZ77 model find the patterns.
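As a rough illustration of the point about silence, here is a small Python sketch that uses the standard-library `lzma` module (the same LZMA family as xz) on two synthetic buffers. The buffer size and the helper name `xz_size` are arbitrary choices for the demo, and digital silence is assumed to be all-zero bytes.

```python
import lzma
import random


def xz_size(data: bytes) -> int:
    """Size in bytes of the data after default .xz compression."""
    return len(lzma.compress(data))


def make_buffers(n: int = 200_000):
    """Return (silence, noise): n zero bytes and n pseudo-random bytes."""
    silence = bytes(n)
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    noise = bytes(rng.getrandbits(8) for _ in range(n))
    return silence, noise


if __name__ == "__main__":
    silence, noise = make_buffers()
    print("silence:", len(silence), "->", xz_size(silence))
    print("noise:  ", len(noise), "->", xz_size(noise))
```

The silent buffer collapses to a tiny fraction of its size, while the noise buffer does not shrink at all; real music sits somewhere in between, which is where audio-specific prediction earns its keep.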
  • Last Edit: 03 January, 2014, 05:19:28 AM by Mangix

  • Porcus
  • [*][*][*][*][*]
Reply #31
As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.

  • Mangix
  • [*][*][*][*][*]
Reply #32
FLAC would probably do better on general files if the Rice coding was replaced with Arithmetic coding or FSE. Actually in the case of the latter, decode speed should improve.
  • Last Edit: 03 January, 2014, 06:22:54 PM by Mangix

  • bryant
  • [*][*][*][*][*]
  • Developer (Donating)
Reply #33
Quote
As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.

WavPack can do this too with "--raw-pcm" and yes, they're generally not too competitive. Switching to 8-bit mono sometimes helps. I think the only practical value of this is finding bugs in the code.

  • bryant
  • [*][*][*][*][*]
  • Developer (Donating)
Reply #34
Quote
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Starting from LPCM-optimized WavPack, I presume?

No, actually starting from scratch and using arithmetic coding. And one of the methods actually was a decent general purpose compressor that beat WinZip on one huge pdf that I had! 

  • thebombzen
  • [*]
Reply #35

Quote
There's only one album where XZ beats any codec, and that's on mono material with lots of silence.

LZ77 does very well when you have repeating sequences. Silence falls into that category. Although xz does use a "delta" filter when applied to audio data. It probably allows the LZ77 model to find the patterns.


XZ can do particularly well if you give it a custom filter chain. I can get pretty good ratios that way (though still not as good as flac -8). Specifically, I use
Code:
xz -vvk --delta=dist=4 --delta=dist=4 --lzma2=dict=128MiB,lc=0,lp=2,pb=2,mode=normal,nice=273,mf=bt4,depth=1024 Audio_file.wav

but I change the dictionary size to the smallest value of the form 2^n or 2^n + 2^(n-1) that's larger than the file I'm compressing, because anything larger is unnecessary and those are the values that XZ supports.
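For anyone who doesn't want to work that out by hand, here is a small Python sketch of the rule. The function name is made up for illustration, the 4 KiB floor is my understanding of the LZMA2 minimum rather than something stated in this thread, and "larger than" is read as "at least as large as" the file.

```python
def smallest_dict_size(file_size: int) -> int:
    """Smallest value of the form 2^n or 2^n + 2^(n-1) that is >= file_size.

    Starts at 4 KiB (2^12), assumed here to be the LZMA2 minimum.
    Does not enforce any upper limit that xz may impose.
    """
    size = 4096
    while True:
        if size >= file_size:
            return size          # plain power of two: 2^n
        middle = size + size // 2
        if middle >= file_size:
            return middle        # halfway step: 2^n + 2^(n-1)
        size *= 2


if __name__ == "__main__":
    mib = 1024 * 1024
    for megabytes in (40, 90, 100):
        best = smallest_dict_size(megabytes * mib)
        print(f"{megabytes} MiB file -> dict={best // mib}MiB")
```

The result can then be dropped into the `dict=` part of the command above, for example `dict=96MiB` for a 90 MiB file.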

Note that with delta you can specify the distance, which is extremely useful: each sample frame is 4 bytes long (16-bit stereo; adjust for other formats), so the corresponding byte of the previous frame is 4 bytes away. Using delta twice improves the ratio further on every audio file I've tried it on, but I don't entirely know why. For some reason, three delta filters consistently perform worse than two, even though two consistently perform better than one. Someone else will have to explain this one to me.
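The same kind of filter chain can be tried from Python, whose standard `lzma` module exposes the delta and LZMA2 filters. This is only a sketch under assumptions: the test signal is a synthetic pair of sine waves rather than real music, the helper names are invented, and preset 6 stands in for the maxed-out settings in the command above, so the sizes it prints say nothing reliable about real rips.

```python
import lzma
import math
import struct


def make_stereo_pcm(n_frames: int = 20000, rate: int = 44100) -> bytes:
    """Synthetic 16-bit little-endian stereo PCM: one sine wave per channel."""
    out = bytearray()
    for i in range(n_frames):
        left = int(12000 * math.sin(2 * math.pi * 440 * i / rate))
        right = int(9000 * math.sin(2 * math.pi * 660 * i / rate))
        out += struct.pack("<hh", left, right)  # 4 bytes per frame
    return bytes(out)


def xz_with_delta(data: bytes, n_delta: int, dist: int = 4) -> bytes:
    """Compress to .xz with n_delta delta filters in front of LZMA2."""
    filters = [{"id": lzma.FILTER_DELTA, "dist": dist} for _ in range(n_delta)]
    filters.append(
        {"id": lzma.FILTER_LZMA2, "preset": 6, "lc": 0, "lp": 2, "pb": 2}
    )
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


if __name__ == "__main__":
    pcm = make_stereo_pcm()
    for n in (0, 1, 2, 3):
        blob = xz_with_delta(pcm, n)
        assert lzma.decompress(blob) == pcm  # the chain is lossless
        print(f"{n} delta filter(s): {len(pcm)} -> {len(blob)} bytes")
```

Swapping `make_stereo_pcm()` for the raw sample data of a real WAV file would be the way to check the "two is better than one or three" observation on your own material.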

Also note the values I'm using for lc, lp, and pb. (The other non-dict values are just max settings.) It's easier to explain if I quote the XZ manpages:
Quote from: man xz


By using lp=2 and pb=2, I set LZMA2 to assume 4-byte alignment for everything, which is exactly what I want for 16-bit stereo samples. I'd change these to 1 for 16-bit mono, 24-bit stereo, and 8-bit stereo. (A 24-bit stereo frame is 6 bytes, which contains only one factor of two, whereas a 16-bit stereo frame is 4 bytes and contains two; that's why the value is lower for 24-bit.) The choice of lc=0, lc=1, or lc=2 is not critical: I tried all three on several samples and got nearly identical results (within .001 in ratio), better or worse depending on the samples.
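The "factors of two" rule can be written down as a tiny Python helper. The function name is hypothetical, and the cap of 4 reflects my understanding of the maximum xz accepts for lp and pb, so treat that limit as an assumption.

```python
def alignment_bits(bits_per_sample: int, channels: int, cap: int = 4) -> int:
    """Suggested lp/pb value: how many times 2 divides the frame size in bytes."""
    frame_bytes = (bits_per_sample // 8) * channels
    bits = 0
    while frame_bytes % 2 == 0 and bits < cap:
        frame_bytes //= 2
        bits += 1
    return bits


if __name__ == "__main__":
    for bps, ch in ((16, 2), (16, 1), (24, 2), (8, 2), (24, 1)):
        n = alignment_bits(bps, ch)
        print(f"{bps}-bit, {ch} channel(s): lp={n},pb={n}")
```

This gives 2 for 16-bit stereo and 1 for 16-bit mono, 24-bit stereo, and 8-bit stereo, matching the values above; 24-bit mono (3-byte frames) comes out as 0.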

So there you have it, your guide on how to better compress your wav files with XZ. Final answer: Use FLAC or any other audio-oriented program. FLAC compresses better and also has much faster compression and decompression; FLAC decompressed around 6x faster than XZ and compressed around 20x faster.