HALAC (High Availability Lossless Audio Compression)

I'm new to this forum, and I am glad to find such a specialized forum on audio. I am the author of the lossless image codec HALIC (High Availability Lossless Image Compression), which offers a good compression ratio at high speed. This time I would like to introduce my work called HALAC (High Availability Lossless Audio Compression).

In the past (2018-2019) I worked on lossless audio compression, but I could not bring that work together into a finished codec. Now I have a little time, and I think I have developed a fast codec. So far I have worked on 16-bit, 2-channel audio data (.wav). Higher bit depths and more channels can be added if necessary; the approach stays the same.

HALAC, like HALIC, focuses on a reasonable compression ratio and high processing speed. The achievable compression ratio for audio data is usually limited, so I wanted a solution that works faster in exchange for a concession of a few percent.

I used a quick estimation (prediction) stage with ANS (FSE) entropy coding. I don't know if there are other codecs using ANS, but the majority use Rice coding. In my tests, Rice coding (my own implementation) is somewhat slower (0.6x - 0.7x the speed) but compresses better (by 1% - 2%). The speed loss in Rice coding comes from calculating the adaptive parameter. I am happy with ANS right now because speed is more important to me. That said, I do not think I am using ANS to its full efficiency yet.
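To make the Rice coding side of that comparison concrete, here is a minimal Python sketch. It is not HALAC's or FLAC's implementation: the function names are made up for illustration, the bitstream is a plain string of "0"/"1" characters, and the parameter is chosen by brute force per block, which is only one possible way of adapting it.

```python
def zigzag(n: int) -> int:
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(u: int) -> int:
    """Inverse of zigzag()."""
    return (u >> 1) if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(residuals, k: int) -> str:
    """Rice-code each residual: unary quotient, then k low bits."""
    bits = []
    for r in residuals:
        u = zigzag(r)
        bits.append("1" * (u >> k) + "0")                 # unary part with terminator
        if k:
            bits.append(format(u & ((1 << k) - 1), f"0{k}b"))  # remainder in k bits
    return "".join(bits)


def rice_decode(bits: str, k: int, count: int):
    """Decode `count` residuals written by rice_encode() with the same k."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                          # skip the "0" terminator
        rem = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((q << k) | rem))
    return out


def best_rice_parameter(residuals, max_k: int = 14) -> int:
    """Brute-force search for the k that minimizes the coded length in bits."""
    def cost(k):
        return sum((zigzag(r) >> k) + 1 + k for r in residuals)
    return min(range(max_k + 1), key=cost)


residuals = [0, -1, 3, -7, 12, 2, -2, 5]
k = best_rice_parameter(residuals)
bits = rice_encode(residuals, k)
print(k, len(bits), rice_decode(bits, k, len(residuals)) == residuals)
```

The parameter search is the extra per-block work that an ANS table-based coder does not need in this form, which is consistent with the speed difference described above.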

No GPU or SIMD is used, and the current version is single-threaded. I can add a multithreading option in the next version. I couldn't compile a Linux version because my Linux machine broke down. I tried to find a middle way by working with different music genres.
Below are the comparisons (from the original WAV, 16 bit, 2 channel, 44100 Hz) with FLAC, ALAC and WavPack (Pazera_Free_Audio_Extractor ver. 2.11).

Test Machine (2012): i7 3770k, 3.9 GHz, 16 GB RAM, 256 GB SSD
Encode Usage: halac_encode.exe input.wav out.halac
Decode Usage: halac_decode.exe out.halac original.wav








Re: HALAC (High Availability Lossless Audio Compression)

Reply #1
Speed wise your codec is quite impressive. I ran some simple tests here and it seems to compress a tiny bit worse than FLAC 1.4.3 in mode -4 but clearly better than FLAC in mode -3.
But in compression speed it beats even FLAC mode -0 and TAK -p0. And seems to beat them all in decoding speed too.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #2
Very nice. tl;dr is it essentially LPC with ANS for the residual? If you could release source or compile for Linux that would be great, I have trouble running .NET mono crap through wine.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #3
I wonder how much of the speed benefit comes from the codec seemingly lacking any safety checks.

I compressed a WAV with metadata and the decoded file seems to have copied the header from the original file but is missing the metadata chunks, so it's a bit invalid length wise.

And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #4
Very nice. tl;dr is it essentially LPC with ANS for the residual? If you could release source or compile for Linux that would be great, I have trouble running .NET mono crap through wine.
Yes, linear prediction is used, and then two FSE coders are used to encode the residuals. Once my Linux machine is set up again, I will add executable files. It is not open source at the moment, but that may be reconsidered in the future. Right now it is very new, and I have things to improve.

There are also different predictors that I don't use. Some are good at high entropy and some at low entropy. As I said, we are still at the beginning.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #5
I wonder how much of the speed benefit comes from the codec seemingly lacking any safety checks.

I compressed a WAV with metadata and the decoded file seems to have copied the header from the original file but is missing the metadata chunks, so it's a bit invalid length wise.

And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.

What you are describing, i.e. data integrity checking, can be achieved with a fast hash function (wyhash, xxhash...). This will not have much effect on speed, because these functions are extremely fast. However, no one had made such a request in my previous work. If it is necessary, I will add it in the next version, no problem. Thanks a lot...

In addition, handling metadata is a simple detail for later. We usually do not compress it (a few kilobytes).

Re: HALAC (High Availability Lossless Audio Compression)

Reply #6
The numbers posted suggest 3x to 5x faster than FLAC, and I get nothing of that kind. Though it is fast indeed! The decoding speeds are outright impressive given how FLAC is the fastest thing we ever saw ... yet. (Only recently did the -0 encoding speeds improve.)

I ran it on the corpus in my signature, and it is on par with fastest FLAC --no-md5. On a RAM disk (I use Passmark OSFmount because it is mounting software too) I had to restrict myself down to 4 albums (all classical music, this is just a brief test). After a few runs, I can report figures like these:

Encoding:
10.1 sec for flac -0r0 --no-md5 --totally-silent
10.4 sec for HALAC
11.0 sec for flac -1 --no-md5
14.4 sec for flac -0
16.1 sec for TAK -p0

Decoding:
13.8 for HALAC
16.6 for flac on the "-0r0 --no-md5" files
18.5 for TAK -p0


Sizes are impressive at this speed! File sizes in bytes for the full "signature" corpus; all FLAC and ALAC figures have had tags and padding removed:
13 360 205 283 for FLAC -0r0
12 772 828 991 for FLAC -1
12 393 304 500 for HALAC <---------- that's between ffmpeg's ALAC and refalac
12 032 168 423 for FLAC -5


FLAC 1.4.3 win32, i5-1135-G7

Re: HALAC (High Availability Lossless Audio Compression)

Reply #7
@Porcus;
Thank you very much for the test.
The results I have obtained with different converters (Pazera, Human, fre:ac) are almost the same. I don't know your results when MD5 is active. As I mentioned in my previous answers, I have just started this work. Although my history with data compression is long, I worked on audio compression only for a certain time and then took a long break. We can probably come up with better things with your valuable ideas.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #8
flac.exe will write MD5 unless you invoke that "undocumented" option, so all the presets are with MD5. The "14.4" seconds encoding time using "-0" was with MD5 - and also with the -r3 that tries to partition into 8/4/2/1, and which matters like 0.1 to 0.2 seconds.

Actually, --totally-silent switches off the console output and helps 0.2 to 0.3 on those numbers.
Also I can speed up flac.exe slightly by using a larger block size. flac.exe uses 1152 samples per block for the fixed-predictors presets, I doubt that would have been selected today. So some more timings on the RAM disk:
9.5 seconds encoding -0fr0 -b3072 --totally-silent --no-md5 (that switches off MD5) - and the --totally-silent actually helps a few tenths too.
14.6 seconds decoding the same, also with --totally-silent
13.1 encoding -0fr0 -b3072 --totally-silent (that is with MD5)
18.2 decoding the ones with MD5.

All times are medians of "a few". So MD5 takes another three and a half seconds. Quite significant in percentage terms.
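For reference, the MD5 overhead can be worked out from the four -b3072 timings quoted above. This small sketch only redoes that arithmetic; the variable names are mine.

```python
# Median timings in seconds, copied from the -0fr0 -b3072 runs above.
enc_no_md5, enc_md5 = 9.5, 13.1
dec_no_md5, dec_md5 = 14.6, 18.2

enc_overhead = round(enc_md5 - enc_no_md5, 1)   # extra seconds spent on MD5 when encoding
dec_overhead = round(dec_md5 - dec_no_md5, 1)   # extra seconds spent on MD5 when decoding

enc_pct = 100 * enc_overhead / enc_no_md5       # overhead relative to the no-MD5 run
dec_pct = 100 * dec_overhead / dec_no_md5

print(f"encode: +{enc_overhead} s ({enc_pct:.0f}%), decode: +{dec_overhead} s ({dec_pct:.0f}%)")
```

The relative cost is higher on the encoding side only because the no-MD5 encode is the shorter of the two baseline runs.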


Now, switching corpus to four metal albums instead, still on the RAM disk:
8.9 & 12.4 encoding & decoding HALAC
12.3 & 16.8 encoding & decoding flac at -0fr0 -b3072 --totally-silent --no-md5
15.3 & 18.3 encoding & decoding TAK at -p0

So material does matter. Impressive.


Re: HALAC (High Availability Lossless Audio Compression)

Reply #10
Interesting, and what about MLAC ?
https://hydrogenaud.io/index.php/topic,125201.0.html
In fact, instead of writing this, you can offer us files that we can run and test (under your own topic). Or you can share some of your results from there. You shouldn't expect others to do this. Then, those who are interested in the subject can perform different tests and give you feedback.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #11
And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.

I prepared version 0.2.0 to give a quick answer to this request. I added a special hash check for each block. What I want to show is that the hash operations have no effect on HALAC's speed.

However, it should be noted that random changes to compressed files can make the archive impossible to open, because they can damage the special headers of the independent blocks or break the tANS stream. There are many situations like this. In the current version, a warning message is shown if a location within the sample data is changed directly.

In fact, the most robust way to do this is to hash the whole input file and the whole decoded file and compare them. However, producing a hash for the whole file at once will increase memory consumption (even if the speed does not change). This is not aesthetic. But I can find a more practical way in the next version. This is not about data compression.
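A per-block check of the kind described in this post could look roughly like the following Python sketch. The block layout (4-byte length, 4-byte checksum, payload) is a made-up illustration and not HALAC's format, and zlib.crc32 from the standard library stands in for a faster hash such as xxhash or wyhash.

```python
import struct
import zlib


def pack_block(payload: bytes) -> bytes:
    """Prefix a compressed block with its length and a checksum (hypothetical layout)."""
    return struct.pack("<II", len(payload), zlib.crc32(payload)) + payload


def unpack_blocks(stream: bytes):
    """Yield (payload, ok) for every block; ok is False when the checksum mismatches."""
    pos = 0
    while pos < len(stream):
        length, stored = struct.unpack_from("<II", stream, pos)
        pos += 8
        payload = stream[pos:pos + length]
        pos += length
        yield payload, zlib.crc32(payload) == stored


stream = pack_block(b"a" * 100) + pack_block(b"b" * 100)
damaged = bytearray(stream)
damaged[-1] ^= 0x01                       # flip one bit in the last block's payload
print([ok for _, ok in unpack_blocks(bytes(damaged))])
```

As the post points out, this only catches damage inside the payload. A flipped bit in the length field would desynchronize the reader, so a real format would also need to protect or resynchronize on the block headers.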


Re: HALAC (High Availability Lossless Audio Compression)

Reply #12
Your reply seems to be rather dismissive of integrity checking: "This is not about data compression"

It is a central part of any lossless data compression. 

You should consider changing your attitude on this if you want to achieve wider acceptance. We are not academics. We want to know if we have a corruption. It is more important than the speed you chase.

Still, good luck with this.

Triza

Re: HALAC (High Availability Lossless Audio Compression)

Reply #13
EDIT: After posting that integrity checking isn't imperative for testing, ... ahem. Please check the following two. One outcome worse than the other.


Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)

Also if one is interested in comparing the residual compression algorithm the Rice code one can do an "everything else equal" by forking off reference FLAC and then (ab)using residual coding method 10 or 11, plugging in your own and ...
... and please change the fourCC from "fLaC" to "hLaC" or something, so the FLAC devs don't get error reports when files are found in the wild.

By the way, does anyone run a Celeron CPU?

Re: HALAC (High Availability Lossless Audio Compression)

Reply #14
Your reply seems to be rather dismissive of integrity checking: "This is not about data compression"

It is a central part of any lossless data compression. 

You should consider changing your attitude on this if you want to achieve wider acceptance. We are not academics. We want to know if we have a corruption. It is more important than the speed you chase.

Still, good luck with this.

Triza


Data integrity is of course very important. Nobody can underestimate it. Otherwise, we cannot talk about losslessness. What I wanted to say is that this check is separate from the basic stages of data compression: estimation, entropy coding, content modeling, error correction, dictionaries ...

And I wanted to point out that this check will not affect HALAC's working speed. Thanks for the comment.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #15
EDIT: After posting that integrity checking isn't imperative for testing, ... ahem. Please check the following two. One outcome worse than the other.


Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)

Also if one is interested in comparing the residual compression algorithm the Rice code one can do an "everything else equal" by forking off reference FLAC and then (ab)using residual coding method 10 or 11, plugging in your own and ...
... and please change the fourCC from "fLaC" to "hLaC" or something, so the FLAC devs don't get error reports when files are found in the wild.

By the way, does anyone run a Celeron CPU?

Really thank you for your attention.
The files you sent contain extremely hard transitions. I have never worked with such data before, and some limits must have been exceeded. No problem.
In addition, my current minimum block size is 24 KB, so I have not taken precautions for smaller inputs; I usually work with files of several MB. Such deficiencies and mistakes are to be expected at this stage. I will handle them in the next update.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #16
Data integrity is important but the way flac does it is wasteful IMO. I'm partial to the idea of grouping frames ala GOP in video encoding, there's a number of potential efficiency benefits to doing this one of them being easily able to do a single stronger checksum that applies to the entire GOP instead of dozens of weak checksums per frame. The neatest way I came up with to do this is for the seektable to be mandatory and a seektable entry details at least the GOP's sample count, file size and integrity checksum for integrity and fast seeking. The main downside to this is that a failed integrity check would resolve to an entire GOP instead of a single frame, but given the rarity of a failure and the likely desire to bench the file when it has errors, not a big downside IMO.

...
Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)
...
I do so hate optional components in a spec, the amount of flac files you buy that don't have MD5 is startling. I suggest a fast mandatory checksum for the full audio stream (stick it in a footer to still allow streaming), you can still have MD5 as an optional checksum if it's necessary for those that have some legacy reason to keep it around.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #17
The files you sent contain extremely hard transitions. I have never worked with such data before, and some limits must have been exceeded. No problem.
Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.

Edit: https://filebin.net/k4e4cpi7oro14a52 contains the segment from 33 to 45 seconds. You can hear the texture changes, with enough structure there for FLAC to compress away nine percent - and that is by using fixed predictors only. We have a pretty good idea why: FLAC can change Rice parameter during the frame (the -r switch in the reference encoder), and so exploit redundancy in the ultra-short term. That's a rabbit hole for you to dive into.
(Actually, the reason this track does so well using fixed predictors compared to estimated LPC, is a kinda-bad-but-very-rarely-limiting design choice where the partition must be bigger than the LPC order.)
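The effect of changing the Rice parameter within a frame can be illustrated with a small cost model in Python. This is a simplified sketch, not the FLAC encoder: it splits the residuals into equal partitions, picks the cheapest parameter for each by brute force, and charges an assumed 4 bits per partition for storing the parameter. The residual values are invented to mimic a quiet passage followed by a loud one.

```python
def zigzag(n: int) -> int:
    """Map a signed residual to a non-negative integer."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def rice_cost(values, k: int) -> int:
    """Bits needed to Rice-code `values` with parameter k."""
    return sum((zigzag(v) >> k) + 1 + k for v in values)


def best_cost(values, max_k: int = 14) -> int:
    """Cheapest Rice cost over all parameters 0..max_k."""
    return min(rice_cost(values, k) for k in range(max_k + 1))


def partitioned_cost(residuals, order: int, param_bits: int = 4) -> int:
    """Cost with 2**order equal partitions, each with its own parameter."""
    parts = 1 << order
    size = len(residuals) // parts      # assumes the length divides evenly
    return sum(
        param_bits + best_cost(residuals[i * size:(i + 1) * size])
        for i in range(parts)
    )


# Invented residuals: a quiet half followed by a loud half.
residuals = [1, -1, 0, 2] * 64 + [900, -1200, 700, -1500] * 64
single = partitioned_cost(residuals, 0)
split = partitioned_cost(residuals, 1)
print(single, split)
```

With one parameter for the whole frame, either the quiet half pays for unused low bits or the loud half pays for long unary runs. Two partitions let each half use a parameter that fits it, which is the short-term redundancy the post refers to.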


In addition, my current minimum block size is 24 KB, so I have not taken precautions for smaller inputs; I usually work with files of several MB. Such deficiencies and mistakes are to be expected at this stage. I will handle them in the next update.
You might want to allow for what in FLAC is verbatim subframes. Storing the samples unencoded. That is also an easy way to handle too short files.
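A verbatim fallback of this kind can be sketched in a few lines of Python. Everything here is an assumption for illustration: the one-byte flag values are hypothetical, and zlib stands in for the real codec, since neither HALAC's nor FLAC's bitstream is reproduced.

```python
import random
import struct
import zlib

VERBATIM, CODED = 0, 1   # hypothetical one-byte block flags


def encode_block(samples) -> bytes:
    """Store the block coded, or verbatim when coding does not make it smaller."""
    raw = struct.pack(f"<{len(samples)}h", *samples)   # 16-bit little-endian PCM
    coded = zlib.compress(raw)                         # stand-in for the real codec
    if len(coded) >= len(raw):
        return bytes([VERBATIM]) + raw
    return bytes([CODED]) + coded


def decode_block(block: bytes):
    """Undo encode_block(), whichever branch was taken."""
    flag, body = block[0], block[1:]
    raw = body if flag == VERBATIM else zlib.decompress(body)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


rng = random.Random(0)
noise = [rng.randrange(-32768, 32768) for _ in range(2048)]   # incompressible
silence = [0] * 2048                                          # highly compressible
tiny = [1, 2, 3]                                              # too short to benefit

for name, block in (("noise", noise), ("silence", silence), ("tiny", tiny)):
    encoded = encode_block(block)
    print(name, "verbatim" if encoded[0] == VERBATIM else "coded", len(encoded))
```

The same branch covers both cases raised in the thread: material that does not compress, and inputs too short for the normal block machinery. The worst-case expansion is bounded by the flag byte.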

Re: HALAC (High Availability Lossless Audio Compression)

Reply #18
Data integrity is important but the way flac does it is wasteful IMO. I'm partial to the idea of grouping frames ala GOP in video encoding, there's a number of potential efficiency benefits to doing this one of them being easily able to do a single stronger checksum that applies to the entire GOP instead of dozens of weak checksums per frame. The neatest way I came up with to do this is for the seektable to be mandatory and a seektable entry details at least the GOP's sample count, file size and integrity checksum for integrity and fast seeking. The main downside to this is that a failed integrity check would resolve to an entire GOP instead of a single frame, but given the rarity of a failure and the likely desire to bench the file when it has errors, not a big downside IMO.

...
Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)
...
I do so hate optional components in a spec, the amount of flac files you buy that don't have MD5 is startling. I suggest a fast mandatory checksum for the full audio stream (stick it in a footer to still allow streaming), you can still have MD5 as an optional checksum if it's necessary for those that have some legacy reason to keep it around.
Thank you for your valuable ideas. But why is a hash function like MD5 still preferred? There are new and faster options available.
https://github.com/rurban/smhasher

Re: HALAC (High Availability Lossless Audio Compression)

Reply #19
The files you sent contain extremely hard transitions. I have never worked with such data before, and some limits must have been exceeded. No problem.
Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.
Edit: https://filebin.net/k4e4cpi7oro14a52 contains the segment from 33 to 45 seconds. You can hear the texture changes, with enough structure there for FLAC to compress away nine percent - and that is by using fixed predictors only. We have a pretty good idea why: FLAC can change Rice parameter during the frame (the -r switch in the reference encoder), and so exploit redundancy in the ultra-short term. That's a rabbit hole for you to dive into.
(Actually, the reason this track does so well using fixed predictors compared to estimated LPC, is a kinda-bad-but-very-rarely-limiting design choice where the partition must be bigger than the LPC order.)
In addition, my current minimum block size is 24 KB, so I have not taken precautions for smaller inputs; I usually work with files of several MB. Such deficiencies and mistakes are to be expected at this stage. I will handle them in the next update.
You might want to allow for what in FLAC is verbatim subframes. Storing the samples unencoded. That is also an easy way to handle too short files.
This is the first time I have heard the term "noise artist". However, I will look at the relevant data when I find time.

HALAC is currently using a single predictor and does not contain error correction. I have more adaptive predictors, but I stay away because they are a little slow for now. Error correction and different predictors significantly affect the compression ratio. Likewise, Rice coding has a significant effect on the compression ratio. I will be really happy if I can solve the speed problem here.

The problem with small-sized data stems from a detail I missed.

And the ideas of the forum members here are really very valuable to me.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #20
why is a hash function like MD5 still preferred?
File "fingerprints" across codecs. Compress a WAVE file with FLAC, WavPack, OptimFROG and TAK, and they store the same audio checksum. I can use my fave player to tell that two files contain the same audio, even without decoding, because the codecs provide for players to display the MD5.

So yes, there are options around that are "better" except that MD5 is so established.
Which means, IMHO:
* If you want to protect individual blocks - which is a good idea, for then you can detect and mute individual corrupted blocks - then don't use MD5, use something faster. Then you can allow for verification without decoding (WavPack, OptimFROG and Monkey's offer that), which is much faster than decoding anyway.
* But once that is in place, it hardly makes sense to offer a full-stream checksum which isn't MD5.
* WavPack, OptimFROG and TAK have MD5 optional. FLAC has it "optional" in the sense that it can be put to zero if you don't know it (say, an encoder that has to pass it on on-the-fly without the ability to go back and change file headers afterwards - FLAC has this info in the beginning of the file) - but the reference implementation does "officially not" write files without MD5, it is an option that isn't in the documentation. Apparently, there is no known string that is MD5-ed to zero.
I see @cid42 objects to "optional" components, but heck, MD5 is so useful that there are retrofit-hacks to provide for it. (This uses ffmpeg, which offers more algorithms.) Have one with all zeroes and that is optional enough.


Some fine print on what codecs do:
FLAC is foremost an audio compressor and will compute the same MD5 whether or not the source was AIFF or WAVE. WavPack is foremost a file compressor and, when it added support for big-endian source format(s) (CAF before AIFF actually) it chose to stick to the audio as in the source. Also Monkey's Audio used MD5 on the encode, not on the PCM.
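The audio "fingerprint" idea can be tried out with the Python standard library. The sketch below hashes only the PCM frames of a WAV file, so header and metadata chunks do not affect the result. It reads in chunks, so memory use stays constant regardless of file length. The function name is mine, and the claim that the digest matches what FLAC stores is, to my understanding, only safe for plain little-endian 16-bit PCM input; other sample formats may be handled differently by each codec.

```python
import hashlib
import io
import struct
import wave


def pcm_md5(wav_file, chunk_frames: int = 65536) -> str:
    """MD5 of the raw PCM frames in a WAV file, ignoring all other chunks."""
    digest = hashlib.md5()
    with wave.open(wav_file, "rb") as w:
        while True:
            frames = w.readframes(chunk_frames)
            if not frames:
                break
            digest.update(frames)       # incremental update, no full-file buffer
    return digest.hexdigest()


# Build a small 16-bit stereo WAV in memory to demonstrate.
raw = struct.pack("<8h", 0, 0, 100, -100, 200, -200, 300, -300)
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(raw)
buf.seek(0)

print(pcm_md5(buf))
```

pcm_md5 also accepts a file path, since wave.open takes either a path or a file object.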

Re: HALAC (High Availability Lossless Audio Compression)

Reply #21
This is the first time I have heard the term "noise artist". However, I will look at the relevant data when I find time.
https://en.wikipedia.org/wiki/Noise_music, and in particular the Japanese scene.

For something that is heavily distorted but doesn't sound like Merzbow: Someone linked to this free album. https://aylwin.bandcamp.com/album/farallon . Discussed here: https://hydrogenaud.io/index.php/topic,122179.msg1014245.html#msg1014245


HALAC is currently using a single predictor and does not contain error correction. I have more adaptive predictors, but I stay away because they are a little slow for now.
flac.exe -l0 uses fixed predictors up to the fourth-order difference. IIRC, shorten would use fixed predictors up to third order.
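For readers unfamiliar with fixed predictors: the residual of an order-n fixed predictor is simply the n-th order difference of the signal, and the decoder undoes it with fixed integer coefficients. A minimal Python sketch, with function names of my own choosing and a hand-made test signal:

```python
# Predictor coefficients for orders 0..4, applied to the most recent samples first.
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_residual(samples, order: int):
    """Return (warm-up samples, residual) for a fixed predictor of the given order."""
    res = list(samples)
    for _ in range(order):
        res = [b - a for a, b in zip(res, res[1:])]   # one more order of differencing
    return list(samples[:order]), res


def fixed_restore(warmup, residual, order: int):
    """Rebuild the samples from the warm-up values and the residual."""
    coeffs = FIXED_COEFFS[order]
    out = list(warmup)
    for r in residual:
        prediction = sum(c * out[-1 - i] for i, c in enumerate(coeffs))
        out.append(prediction + r)
    return out


# A quadratic ramp: second differences are constant, third differences vanish.
samples = [4 * n * n + 3 * n + 3 for n in range(7)]
for order in range(5):
    warmup, residual = fixed_residual(samples, order)
    print(order, residual)
```

Since no coefficients have to be estimated or transmitted, an encoder can try all five orders cheaply and keep the one with the smallest residual.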

To be frank, I will be surprised if you will get much of a userbase, since FLAC is good enough already, but you could surely impress the enthusiasts if you move the boundaries of what we thought feasible. That is about like TAK's status: hardly anyone uses it for their music collections, but it did change our perception on what a fast asymmetric codec was even able to do.

Also, things do depend on builds and optimization and architecture. Reference FLAC prioritizes compatibility. Here, for example:
https://hydrogenaud.io/index.php/topic,123025.msg1029768.html#msg1029768
We have done a lot of testing back and forth on FLAC, and gotten closer to some idea about what works in what circumstances.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #22
To be frank, I will be surprised if you will get much of a userbase, since FLAC is good enough already, but you could surely impress the enthusiasts if you move the boundaries of what we thought feasible.
I don't know how much more improvement can be done about audio compression. I look at the subject we call boundaries in terms of compression ratio/speed.
Maybe I can follow a process similar to HALIC. The following table shows the compressed sizes, compression speeds and decompression speeds from a test made with random images. I did not show memory consumption, but other formats consume hundreds of MB of memory on a single core during these operations. Image codecs usually differ from audio codecs at this point. However, HALIC handles these operations with only a few MB.
https://www.dpreview.com/sample-galleries/7416430458/dji-inspire-3-sample-gallery/

From my experience with HALIC, I simply felt tired. Even if it is quite superior to other alternatives, that doesn't mean much. So I took a break and went back to audio for a change. If I have time, I will push myself a little more on it.

Re: HALAC (High Availability Lossless Audio Compression)

Reply #23
I don't know how much more improvement can be done about audio compression. I look at the subject we call boundaries in terms of compression ratio/speed.
Remember that it is a 3D space of (compression ratio, encoding speed, decoding speed), so the Pareto frontier is a bit more complicated.
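The three-dimensional Pareto frontier can be computed directly: a codec setting is on the frontier if no other setting is at least as good on all three axes and strictly better on one. The Python sketch below uses invented numbers for four hypothetical settings A to D, not measurements from this thread, with lower being better on every axis.

```python
def dominates(a, b) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(codecs: dict) -> list:
    """Names of the entries that no other entry dominates."""
    return sorted(
        name
        for name, metrics in codecs.items()
        if not any(
            dominates(other, metrics)
            for other_name, other in codecs.items()
            if other_name != name
        )
    )


# Hypothetical (size, encode seconds, decode seconds); lower is better.
codecs = {
    "A": (100, 10, 12),
    "B": (95, 14, 13),
    "C": (101, 11, 13),   # worse than A on every axis
    "D": (90, 40, 40),
}
print(pareto_front(codecs))
```

A setting that looks dominated in a two-axis plot of size against encoding time can still be on the frontier once decoding time is included, which is why a single ratio/speed curve does not tell the whole story.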

Some brief history: Back twenty years ago, it was a common belief that if you wanted heavy compression, you needed a complicated symmetric algorithm that took much CPU to decode. Monkey's Audio was the go-to choice for those who wanted something notably smaller than FLAC. OptimFROG was even more that way - and its heaviest modes were likely to get the upper hand on La, which came out in a couple of betas and a buggy foobar2000 component that the author didn't bother to fix. Also WavPack - which is a late 1990s codec that later has been beefed up - was symmetric.

Then came TAK. The green curves at http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20CDDA%20sources.pdf .

And WavPack and later OptimFROG started to do additional encoding processing that only minimally affects decoding speed. You see the WavPack "-x4" curve - and you see that the two leftmost OptimFROG triangles are pretty much above each other in the decoding speed diagram, that is because its --preset 0 does not utilize anything such.
And FLAC improved. Quite a lot. The way the reference encoder works - with variable LPC coefficients - is by a rough estimation on several alternatives (like, history length); then it picks the one that scores best, and calculates that all the way to the bottom.

What can be improved? High resolution: http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20all%20hi-res%20sources.pdf
High sampling rate makes the audio include lots of strange things - including "nothings". And reference flac's guesstimation approach isn't handling high sampling rate material equally well. Also, there are other estimation methods that might be tried. This is practical engineering though - nothing that utilizes a newer encoding method than (partitioned) Rice (= Golomb power-of-two).

And multichannel: http://www.audiograaf.nl/losslesstest/Lossless%20audio%20codec%20comparison%20-%20revision%206%20-%20multichannel.html
As far as I remember from the TAK author, he uses some smart heuristic to get a reasonable channel (de)correlation matrix.


Re: HALAC (High Availability Lossless Audio Compression)

Reply #24
Hello to everybody and sorry for off-topic.

Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.

Typo: it is not "Veneorology" but "Venereology".
Porcus, your statement is interesting to me, but I can't believe it until I run my own tests.
I have the reissue of this album by Irond Records. Are we talking about this track:
I Lead You Towards Glorious Times (live)
5:29.800 (14 544 180 samples)
MD5 for WAV PCM without metadata is: 4a83c56b646c2907723490819ba0741b

Can you please privately provide me with your track if it differs from mine?
And can you also tell me your best FLAC options for this track?
Thanks!