HydrogenAudio

Lossless Audio Compression => Lossless / Other Codecs => Topic started by: Dave_Scream on 2013-08-17 09:55:51

Title: 7z beats other codecs on 24bit 48khz sample
Post by: Dave_Scream on 2013-08-17 09:55:51
(http://cs308716.vk.me/v308716838/ac11/EbJ2D6pvO_g.jpg)

I just lold ))
flac highest  610 -> 377
monkey's insane 610 -> 375
7zip 610 -> 211

Use 7zip guys )))
---
UPD. added Monkeys audio
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Propheticus on 2013-08-17 10:41:45
Have you tried zipping the flac or other codecs? The comparison is not totally fair now, as zip is not an audio codec. It can't be played back directly or seeked in. It needs to be unpacked in full to memory or a temp file before the enclosed data can be played. The other formats can be played from halfway through the file without decoding the whole file first.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Dave_Scream on 2013-08-17 10:56:27
Quote from: Propheticus
Have you tried zipping the flac or other codecs? The comparison is not totally fair now, as zip is not an audio codec. It can't be played back directly or seeked in. It needs to be unpacked in full to memory or a temp file before the enclosed data can be played. The other formats can be played from halfway through the file without decoding the whole file first.

Hello.
Zipping already compressed things is a bad idea)

Yes, you're right, but I was surprised that 7zip, which is used for general-purpose compression, got better results than the most popular audio compression codecs on their highest settings. And 7zip is not just better, it has a ~55 - 75% advantage in file size.

For me, lossless is only for archiving. For listening I have lossy codecs, because the SD card or flash card in my phone doesn't have that much space.

I think people like me, who need lossless for archiving, should think about this)
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Propheticus on 2013-08-17 11:06:05
I can see how this is interesting for purely archiving purposes. It's not that surprising though. The zipping can use much larger blocks and compress patterns throughout the file, while an audio codec must use smaller blocks (sequentially) to be realtime decodable and playable.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Jan S. on 2013-08-17 11:25:11
I fail to see how this is interesting that you found one file where 7z wins.

I tried on an album:
wav: 678MB
7z: 572MB
rar: 373MB
wv: 255MB


I would take any wager on general compressors losing that battle 95%+ of the time.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: lvqcl on 2013-08-17 11:30:13
My test:

WAV 16/44: 569 743 820 bytes
7z LZMA Ultra: 529 341 057 bytes
RAR5 Best: 427 763 395  bytes
FLAC -8:  362 224 323 bytes
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Dave_Scream on 2013-08-17 11:31:11
Quote from: Jan S.
I fail to see how this is interesting that you found one file where 7z wins.

I tried on an album:
wav: 678MB
7z: 572MB
rar: 373MB
wv: 255MB


I would take any wager on general compressors losing that battle 95%+ of the time.

Strange. Maybe it's because I used 24-bit 48 kHz wav input? Maybe lossless codecs are not well optimized for compressing frequencies above 22 kHz, so the pure math of 7z makes it the winner?
here is the link to my test file http://yadi.sk/d/hpGxgr9y8-z1e (http://yadi.sk/d/hpGxgr9y8-z1e)
and here is a spectrogram that shows that frequencies above 22 kHz are present
Title: 7z beats other codecs on 24bit 48khz sample
Post by: TBeck on 2013-08-17 13:54:38
Quote from: Dave_Scream
strange. maybe its because I used 24bit 48khz wav input? and lossless codecs are not optimized good to compress >22khz frequencies? so pure math of 7z make it winner?
here is the link to my test file http://yadi.sk/d/hpGxgr9y8-z1e (http://yadi.sk/d/hpGxgr9y8-z1e)

One possible explanation: Quite few of the possible sample values between the file's minimum and maximum value are actually present in the file. That's nice for general-purpose file compressors.

Some possible reasons:

- Amplification by a quite large factor
- Companded source signal (a-law, u-law etc.)

The resulting sample distribution then contains a lot of holes.

While some audio compressors can detect an amplification by an integer power of 2 (the wasted bits feature), to my knowledge only OptimFrog's experimental mode can take advantage of other transformations.

I tried it. Because this feature currently only works for 16 bit samples, I converted your file to 16 bit / 96 khz (without dithering).

Results:

Optimfrog Normal:  41.87 % (of the uncompressed file size)
Optimfrog Normal -- experimental:  29.46 %

This seems to support my hypothesis.

Title: 7z beats other codecs on 24bit 48khz sample
Post by: saratoga on 2013-08-18 03:51:38
Quote from: Propheticus
I can see how this is interesting for purely archiving purposes. It's not that surprising though. The zipping can use much larger blocks and compress patterns throughout the file, while an audio codec must use smaller blocks (sequentially) to be realtime decodable and playable.


Since 7zip doesn't know about stereo, I expect that for anything but mono audio or two uncorrelated stereo channels, a regular lossless codec will have a pretty big advantage since it can do M/S stereo. 
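
To illustrate the M/S idea, here is a minimal Python sketch of a reversible mid/side transform on integer samples. The function names and sample values are made up for this example, and real codecs choose between several stereo modes per frame, so this only shows the principle: for correlated channels the side signal stays small and is cheap to code.

```python
def ms_encode(left, right):
    """Turn L/R integer samples into (mid, side) lists without losing information."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # average, rounded down
    side = [l - r for l, r in zip(left, right)]         # difference
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode exactly."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)   # restore the bit dropped by the shift
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


# Two strongly correlated channels (hypothetical sample values).
left = [1000, 1003, 998, -1200, -1198]
right = [1001, 1001, 997, -1203, -1197]
mid, side = ms_encode(left, right)
print(side)
```

A byte-oriented compressor such as 7zip never sees this relationship between the channels; it only sees interleaved bytes.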
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Thundik81 on 2013-08-18 10:54:03
Quote from: Jan S.
I fail to see how this is interesting that you found one file where 7z wins.

I tried on an album:
wav: 678MB
7z: 572MB
rar: 373MB
wv: 255MB


I would take any wager on general compressors losing that battle 95%+ of the time.


http://www.squeezechart.com/audio.html (http://www.squeezechart.com/audio.html)
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Nystagmus on 2013-11-10 21:17:57
Although not directly related to this conversation, it may be useful to know that foobar2000 has an add-on component that allows playing sound files embedded in 7z archives without needing to manually decompress them.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: kode54 on 2013-11-11 01:37:49
Yes, and that component decompresses the entire file to memory before playing it.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Seren on 2013-11-11 18:40:19
I was wondering... does it work out better if you compress the wav or the already compressed flac/ape etc. with 7z? I'd love to try it but I have to get some sleep now =(
Title: 7z beats other codecs on 24bit 48khz sample
Post by: xnor on 2013-11-11 19:25:34
Quote from: Seren
I was wondering... does it work out better if you compress the wav or the already compressed flac/ape etc. with 7z? I'd love to try it but I have to get some sleep now =(


Well, one album I tested (metal album, released 2013, with low dynamic range) compressed to:

wav: 484 MB (uncompressed)
flac: 344 MB (level 8)
wav->7z: 468 MB (ultra)
flac->7z: 344 MB  (level 8, ultra)

So basically just a waste of time and resources.

Since compression algorithms make the compressed data more random, further compression can only squeeze out a few more bytes or, if the initial compression was done well, actually increase the size (so a waste of time, resources, and disk space).
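
The same effect is easy to reproduce with any compressor. The sketch below uses Python's standard lzma module on made-up, highly redundant input (not audio): the first pass shrinks it a lot, and a second pass over the already compressed bytes should gain essentially nothing.

```python
import lzma
import random

# Hypothetical, highly redundant input: random picks from a tiny vocabulary.
rng = random.Random(0)
words = [b"flac", b"wav", b"seven", b"zip", b"audio", b"codec", b"lossless"]
original = b" ".join(rng.choice(words) for _ in range(50_000))

once = lzma.compress(original)   # first pass removes the redundancy
twice = lzma.compress(once)      # second pass sees near-random bytes

print(len(original), len(once), len(twice))
```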
Title: 7z beats other codecs on 24bit 48khz sample
Post by: pdq on 2013-11-11 20:41:49
On the other hand, since WinZip can compress with the WavPack compression algorithm, it will compress wav files quite efficiently.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: kode54 on 2013-11-12 01:42:11
And only WinZip can unpack those.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: probedb on 2013-11-12 11:52:15
Also bear in mind 7-zip isn't actively developed, it hasn't had a stable (non alpha/beta) release since 2010.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: xnor on 2013-11-12 13:14:25
Quote from: probedb
Also bear in mind 7-zip isn't actively developed, it hasn't had a stable (non alpha/beta) release since 2010.

Then why did the developer announce a new alpha release in the coming days? 

It's still being actively developed, there's just not much to fix especially in the stable version.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: (Sly) on 2013-11-12 17:28:12
By the way, 7-zip LZMA Ultra uses a 64 MB dictionary, and that helps a lot. Lossless audio codecs cannot work with a dictionary this huge; it would make seeking almost impossible.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: bryant on 2013-11-12 18:24:08
I took a look at this file, and as Thomas says, the high compression is based on missing sample values. The lower 10 bits are essentially wasted as almost every sample is a multiple of 32767 / 32, and since that's not an even power of 2 the algorithms in many lossless compressors that eliminate redundant LSBs don't do anything. I have no guess how this file got like this, but it essentially has only 14 bits of resolution.
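
A rough way to see why the usual wasted-bits check misses this: the Python sketch below builds two synthetic 24-bit sample sets from the same 14 bits' worth of levels, one scaled by exactly 2**10 and one scaled by 32767 / 32. The data and the helper name `wasted_bits` are invented for illustration and are not taken from any codec's source.

```python
def wasted_bits(samples):
    """Number of low-order bits that are zero in every sample."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0   # all-zero input: nothing to measure
    return (combined & -combined).bit_length() - 1


levels = range(-8192, 8192)   # 14 bits' worth of distinct levels

# Case 1: scaled by exactly 2**10, which a wasted-bits check handles.
shifted = [k << 10 for k in levels]

# Case 2: scaled by 32767 / 32 (about 1023.97), not a power of 2.
scaled = [round(k * 32767 / 32) for k in levels]

print(wasted_bits(shifted), wasted_bits(scaled))
print(len(set(scaled)), "distinct values out of", 2 ** 24, "possible")
```

In the second case no low bits are constant, yet only a tiny fraction of the possible 24-bit values ever occurs, which is the kind of sparse distribution a general-purpose compressor can pick up.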

A long time ago I considered trying to take advantage of this to get better compression for real CDs. I seem to remember that about 10% of my CD collection showed some statistical discrepancy that could be leveraged for improved compression, in some cases up to 10%! The first problem was the complexity (there were many variations on how the missing sample values were manifested) and the other thing that bothered me was that I would be taking advantage of something that really shouldn't be there at all with well-mastered material. I would be curious as to how common this is in modern recordings, but this definitely strikes me as an edge-case not worth considering beyond mathematical curiosity.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: probedb on 2013-11-13 08:34:58
Quote from: xnor
Then why did the developer announce a new alpha release in the coming days? 

It's still being actively developed, there's just not much to fix especially in the stable version.


You answered your own question, the clue being the word alpha. I said stable.

Even the latest alpha is dated October 2012. Taking 3 years to get a new release out when it's been in beta is what I'd call not actively developed.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: birdie on 2013-11-13 09:11:17
I smell something extremely fishy here.

Like your album has several bit-identical songs, so 7z takes advantage of its enormous dictionary.

Care to share the album name?
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Kees de Visser on 2013-11-13 10:36:48
Just to make sure: is the test file in 24 bit 96 kHz format ? The spectrogram shows content up to 48 kHz, indicating a sampling rate of 96 kHz, but the topic mentions 48 kHz.
Also, has it been verified that the process is lossless ?
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Juha on 2013-11-13 11:26:02
The test sample wav was probably recorded using standard Audacity, which lets you record/save 24-bit wav files with bits 17-24 filled with zeros.

Title: 7z beats other codecs on 24bit 48khz sample
Post by: 2Bdecided on 2013-11-13 11:32:15
bryant's already explained exactly what's happening - why are you guys still speculating?
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Nystagmus on 2014-01-02 22:01:13
I'm not going to claim which technique(s) I think are better, but it's worth mentioning that the type of music (spectrally, in the amount and size of repetition, and in terms of timbre) really profoundly affects how well audio compression works.  But I do like 7zip as an archiver, and it's pretty darned cool that the Foobar2000 media player can play the contents of 7z archives (at least with the foo add-on).

Life is pretty good these days for music and media files.  I recently switched to BandiZip because it has 7zip support but can do a few things that 7zip can't.  BandiZip almost out-7zipped 7-zip itself!  But I still use 7zip because it can do a few things that BandiZip can't.  Occasionally I try out PeaZip for the same reasons.  But 7zip is pretty awesome for being able to open up MHTs, XPIs, XPSs, DMGs (macOS), and setup/installer programs of some types.  I really love bypassing installers and just extracting the contents.  Sometimes that really saves tons of headaches with annoying installers.  At first I couldn't get 7zip to run on Windows 7, but after auto-elevating UAC I got it to work fine again.  Luckily, 7zip is still being developed so it's all good.  I don't use ZIP for compression anymore and RAR is still a bit proprietary-like in some ways, but I use what I just mentioned to open em. 

For the record, I also enjoy using FLAC and WavPack.  They each have some awesome advantages for certain situations.  And I love how they are still supported and hardware support is growing through stuff like RockBox and Linux distros.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Porcus on 2014-01-02 23:57:24
Since the thread is already bumped:

Quote from: Thundik81
http://www.squeezechart.com/audio.html (http://www.squeezechart.com/audio.html)


Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.


(A couple of the comments made me think of a particular peak value that a disproportionate number of my rips have, namely 0.999969, occurring in 13 percent of them (compare with about 30 percent for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)
Title: 7z beats other codecs on 24bit 48khz sample
Post by: bryant on 2014-01-03 02:13:42
Quote from: Porcus
(A couple of the comments made me think of a particular peak value which disproportionally many of my rips do have - namely,  .999969, occurring on 13 percent (compare with about 30 for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

I don't think there's possibility of exploitation there. The maximum allowable 16-bit PCM values are -32768 and +32767, so if you simply call 32768 full scale (1.0), then 32767 becomes 0.999969. I can imagine all kinds of normalization algorithms that would leave the sample values clipped to +/-32767 after converting from a higher resolution master. Unless all the even values are missing, that doesn't give you much to work with.
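
The arithmetic can be checked in a couple of lines of Python (the constant names are just labels for this example):

```python
FULL_SCALE = 32768     # magnitude of the most negative 16-bit sample, taken as 1.0
MAX_POSITIVE = 32767   # largest positive 16-bit sample

peak = MAX_POSITIVE / FULL_SCALE
print(f"{peak:.6f}")   # prints 0.999969
```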

As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked. It's just not very likely that any general purpose algorithm will do well with stereo 16-bit PCM audio by accident (even with a huge dictionary). Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Title: 7z beats other codecs on 24bit 48khz sample
Post by: Porcus on 2014-01-03 08:19:35
Ah, 32767 ... almost facepalming I didn't think of that.


Quote from: bryant
As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked.


Well certainly (like WinZip), but I am still surprised that they bother to take it to the extent that they beat everything faster than WavPack. People like yourself and Thomas have spent quite some effort on the audio part, and I wonder if any Stuffit customer would be disappointed if Stuffit simply grabbed some reasonable codec that is already out there under a permissive license ... (then OTOH for marketing purposes they maybe do not want to see anyone pointing out that they charge money for something that compresses .wavs to exactly the size of a .wv plus file header difference? Of course the similarity to TAK in speed/compression performance could easily fuel other speculations).



Quote from: bryant
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Starting from LPCM-optimized WavPack, I presume?

BTW, I did the following comparison out of curiosity, taking the "most FLACable" and "least FLACable" CD rip in my collection and compared WavPack to FLAC to TAK: http://www.hydrogenaudio.org/forums/index....mp;#entry800823 (http://www.hydrogenaudio.org/forums/index.php?showtopic=95670&st=100&p=800823&#entry800823)
I have a hunch that you don't listen much to Piaf and Thomas doesn't listen much to Merzbow ;-)
Title: 7z beats other codecs on 24bit 48khz sample
Post by: ktf on 2014-01-03 08:38:35
I've used XZ (which uses a compression algorithm very similar to 7-zip's) in the first revision of my Lossless audio codec comparison (http://www.icer.nl/losslesstest/); it's in the raw data, the CSV file. There's only one album where XZ beats any codec, and that's on mono material with lots of silence. Pretty much everywhere else, XZ doesn't even come close. I guess the mentioned sample was something special, quantized in some special way for example.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Mangix on 2014-01-03 10:14:26
Quote from: Porcus
Since the thread is already bumped:

http://www.squeezechart.com/audio.html (http://www.squeezechart.com/audio.html)


Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.


(A couple of the comments made me think of a particular peak value which disproportionally many of my rips do have - namely,  .999969, occurring on 13 percent (compare with about 30 for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

Those general compressors have specialized models that deal with .wav files, typically some context mixing algorithm with filters. They also operate on large blocksizes, so seeking has to be done in a similar manner to mp3 (decoding the whole file). They're also quite slow and unsuitable for real-time playback. WinRK and NanoZip anyway.

FLAC usually has a blocksize of 4096 if I'm not mistaken. TAK, I think, goes up to 16384. Smaller blocksizes lose compression in exchange for seekability.

Quote
There's only one album where XZ beats any codec, and that's on mono material with lots of silence.
LZ77 does very well when you have repeating sequences. Silence falls into that category. Although xz does use a "delta" filter when applied to audio data. It probably allows the LZ77 model to find the patterns.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Porcus on 2014-01-03 15:49:26
As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: Mangix on 2014-01-03 23:22:41
FLAC would probably do better on general files if the Rice coding was replaced with Arithmetic coding or FSE. Actually in the case of the latter, decode speed should improve.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: bryant on 2014-01-04 19:36:29
Quote from: Porcus
As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.

WavPack can do this too with "--raw-pcm" and yes, they're generally not too competitive. Switching to 8-bit mono sometimes helps. I think the only practical value of this is finding bugs in the code.
Title: 7z beats other codecs on 24bit 48khz sample
Post by: bryant on 2014-01-04 19:40:35
Quote from: bryant
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Quote from: Porcus
Starting from LPCM-optimized WavPack, I presume?

No, actually starting from scratch and using arithmetic coding. And one of the methods actually was a decent general purpose compressor that beat WinZip on one huge pdf that I had! 
Title: 7z beats other codecs on 24bit 48khz sample
Post by: thebombzen on 2014-01-09 22:06:15

Quote from: ktf
There's only one album where XZ beats any codec, and that's on mono material with lots of silence.

Quote from: Mangix
LZ77 does very well when you have repeating sequences. Silence falls into that category. Although xz does use a "delta" filter when applied to audio data. It probably allows the LZ77 model to find the patterns.


XZ can do particularly well if you use a custom filter chain. I can get pretty good ratios (though still not as good as flac -8) when using XZ's ability to use a custom filter chain. Specifically, I use
Code:
xz -vvk --delta=dist=4 --delta=dist=4 --lzma2=dict=128MiB,lc=0,lp=2,pb=2,mode=normal,nice=273,mf=bt4,depth=1024 Audio_file.wav

but I change the dictionary size to be the smallest value of the form 2^n or 2^n + 2^(n-1) that's larger than the file I'm compressing, because anything larger is unnecessary and those are the values that XZ supports.
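
That rule for picking the dictionary size can be sketched as a small Python helper. The function name is invented for this example, and it ignores any minimum or maximum dictionary size that xz itself enforces.

```python
def smallest_dict_size(file_size: int) -> int:
    """Smallest value of the form 2^n or 2^n + 2^(n-1) that exceeds file_size."""
    n = 1
    while True:
        # Candidates come out in ascending order: 2, 3, 4, 6, 8, 12, 16, 24, ...
        for candidate in (1 << n, (1 << n) + (1 << (n - 1))):
            if candidate > file_size:
                return candidate
        n += 1


# Example: the 569 743 820 byte WAV from earlier in the thread.
size = smallest_dict_size(569_743_820)
print(size, size // (1024 * 1024), "MiB")
```

For that file the helper lands on 2^29 + 2^28 bytes, i.e. 768 MiB, since 512 MiB is just too small.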

Note that with delta, you can specify the distance, which is extremely useful because each stereo sample pair is 4 bytes long (16-bit stereo, adjust for other formats), so the corresponding byte is 4 bytes away. Using delta twice improves the ratio further in every audio file I've tried it on, but I don't entirely know why. For some reason, three delta filters consistently perform worse than two, even though two consistently perform better than one. Someone else will have to explain this one to me.

Also note the values I'm using for lc, lp, and pb. (The other non-dict values are just max settings.) It's easier to explain if I quote the XZ manpages:
Quote from: man xz


By using lp=2 and pb=2, I set LZMA2 to assume 4-byte alignment for everything, which is exactly how I want it for 16-bit stereo samples. I'd change these to 1 for 16-bit mono, 24-bit stereo, and 8-bit stereo. (24-bit stereo only contains one factor of two, but 16-bit stereo contains two; that's why it's lower for this one). The choice of lc=0, lc=1, or lc=2 is not critical: I tried all three on several samples and got nearly identical results (as in within .001 ratio), better or worse depending on the samples.

So there you have it, your guide on how to better compress your wav files with XZ. Final answer: Use FLAC or any other audio-oriented program. FLAC compresses better and also has much faster compression and decompression; FLAC decompressed around 6x faster than XZ and compressed around 20x faster.