HydrogenAudio

Lossless Audio Compression => Lossless / Other Codecs => Topic started by: Hakan Abbas on 2023-12-31 21:12:23

Title: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2023-12-31 21:12:23
I'm new to this forum, and I'm glad to find such a specialized forum on audio. I am the author of the lossless image codec HALIC (High Availability Lossless Image Compression) (https://encode.su/threads/4025-HALIC-(High-Availability-Lossless-Image-Compression)), which offers a good compression ratio at high speed. This time I would like to introduce my work called HALAC (High Availability Lossless Audio Compression).

In the past (2018-2019), I worked on lossless audio compression, but I could not bring that work together. Now I have a little time, and I think I have developed a fast codec. I worked on 16-bit, 2-channel audio data (.wav). Higher bit depths and more channels can be added if necessary; the approach stays the same.

HALAC, like HALIC, focuses on a reasonable compression ratio and high processing speed. The achievable compression ratio for audio data is usually limited, so I wanted a solution that works faster at the cost of a few percent in size.

I used a fast prediction step with ANS (FSE). I don't know whether other audio codecs use ANS, but the majority use Rice coding. In my tests, Rice coding (my own implementation) is slower (0.6x - 0.7x the speed) but compresses 1% - 2% better. The speed loss with Rice coding comes from calculating the adaptive parameter. I am happy with ANS right now because speed is more important to me. I also do not think I am using ANS at full efficiency yet.
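To make the trade-off above concrete, here is a minimal Python sketch of Rice coding with an adaptive parameter. It is not HALAC's or FLAC's code: the zigzag mapping, the bit-string output and the running-mean update rule are illustrative assumptions, chosen only to show where the per-sample parameter calculation comes from.

```python
def zigzag(n: int) -> int:
    """Map a signed residual to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)

def unzigzag(u: int) -> int:
    """Inverse of zigzag()."""
    return (u >> 1) if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(value: int, k: int) -> str:
    """Rice code as a bit string: unary quotient, a '0' stop bit, then k low bits."""
    low = format(value & ((1 << k) - 1), f"0{k}b") if k else ""
    return "1" * (value >> k) + "0" + low

def rice_decode(bits: str, k: int) -> int:
    """Decode one Rice codeword produced by rice_encode()."""
    q = bits.index("0")                      # length of the unary part
    low = bits[q + 1 : q + 1 + k]
    return (q << k) | (int(low, 2) if k else 0)

def encode_adaptive(residuals):
    """Encode residuals, re-estimating k before every sample.

    The update rule (exponential running mean, k = floor(log2(mean)))
    is a hypothetical choice for illustration.
    """
    out, mean = [], 0.0
    for r in residuals:
        k = max(0, int(mean).bit_length() - 1)
        u = zigzag(r)
        out.append((k, rice_encode(u, k)))
        mean = 0.9 * mean + 0.1 * u          # this per-sample work is the speed cost
    return out
```

A real decoder would re-derive k from the same running state instead of storing it next to each codeword; the pairs are returned here only to keep the sketch easy to check.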

Neither GPU nor SIMD is used, and the current version is single-threaded. I can add a multithreaded option in the next version. I couldn't compile a Linux version because my Linux machine broke down. I tried to find a middle ground by testing with different music genres.
Below are the comparisons (from original WAV, 16-bit, 2-channel, 44100 Hz) with FLAC, ALAC and WavPack (Pazera_Free_Audio_Extractor ver. 2.11).

Test Machine (2012): i7 3770K, 3.9 GHz, 16 GB RAM, 256 GB SSD
Encode Usage: halac_encode.exe input.wav out.halac
Decode Usage: halac_decode.exe out.halac original.wav

(https://encode.su/attachment.php?attachmentid=10971&d=1704053753)

(https://encode.su/attachment.php?attachmentid=10972&d=1704053768)

(https://encode.su/attachment.php?attachmentid=10973&d=1704053780)

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-01-01 08:18:09
Speed wise your codec is quite impressive. I ran some simple tests here and it seems to compress a tiny bit worse than FLAC 1.4.3 in mode -4 but clearly better than FLAC in mode -3.
But in compression speed it beats even FLAC mode -0 and TAK -p0. And seems to beat them all in decoding speed too.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: cid42 on 2024-01-01 11:31:35
Very nice. tl;dr is it essentially LPC with ANS for the residual? If you could release source or compile for Linux that would be great, I have trouble running .NET mono crap through wine.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-01-01 11:39:52
I wonder how much of the speed benefit comes from the codec seemingly lacking any safety checks.

I compressed a WAV with metadata and the decoded file seems to have copied the header from the original file but is missing the metadata chunks, so it's a bit invalid length wise.

And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-01 12:14:17
Very nice. tl;dr is it essentially LPC with ANS for the residual? If you could release source or compile for Linux that would be great, I have trouble running .NET mono crap through wine.
Yes: linear prediction, and then FSE in two parts is used to encode the residuals. Once my Linux machine is set up again, I will add executable files. It is not open source at the moment, but that can be reconsidered in the future. Right now it is very new and I have things to improve.

There are also different predictors that I don't use. Some are good on high-entropy data and some on low-entropy data. As I said, we are still at the beginning.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-01 12:22:04
I wonder how much of the speed benefit comes from the codec seemingly lacking any safety checks.

I compressed a WAV with metadata and the decoded file seems to have copied the header from the original file but is missing the metadata chunks, so it's a bit invalid length wise.

And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.

What you are talking about, i.e. data integrity checking, can be achieved with a fast hash function (wyhash, xxhash...). This will not have much effect on speed, because they work extremely fast. However, no one had made such a request in my previous work. If this is necessary, I will add it in the next version, no problem. Thanks a lot...

In addition, handling metadata is a simple detail for a later step. We usually do not compress it (a few kilobytes).
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-01 12:34:48
The numbers posted suggest 3x to 5x faster than FLAC, and I get nothing of that kind. Though it is fast indeed! The decoding speeds are outright impressive given how FLAC is the fastest thing we ever saw ... yet. (Only recently did the -0 encoding speeds improve.)

I ran it on the corpus in my signature, and it is on par with fastest FLAC --no-md5. On a RAM disk (I use Passmark OSFmount because it is mounting software too) I had to restrict myself down to 4 albums (all classical music, this is just a brief test). After a few runs, I can report figures like these:

Encoding:
10.1 sec for flac -0r0 --no-md5 --totally-silent
10.4 sec for HALAC
11.0 sec for flac -1 --no-md5
14.4 sec for flac -0
16.1 sec for TAK -p0

Decoding:
13.8 for HALAC
16.6 for flac on the "-0r0 --no-md5" files
18.5 for TAK -p0


Sizes are impressive at the speed! File sizes for the full "signature" corpus, all FLAC and ALAC figures have had tags and padding removed
13 360 205 283 for FLAC -0r0
12 772 828 991 for FLAC -1
12 393 304 500 for HALAC <---------- that's between ffmpeg's ALAC and refalac
12 032 168 423 for FLAC -5


FLAC 1.4.3 win32, i5-1135-G7
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-01 13:13:46
@Porcus;
Thank you very much for the test.
The results I have obtained with different converters (Pazera, Human, fre:ac) are almost the same. I don't know what your results are when MD5 is active. As I mentioned in my previous answers, I have just started this work. Although my history with data compression is long, I worked on audio compression only for a limited time and then took a long break. We can probably achieve better things with your valuable ideas.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-01 16:53:21
flac.exe will write MD5 unless you invoke that "undocumented" option, so all the presets are with MD5. The "14.4" seconds encoding time using "-0" was with MD5 - and also with the -r3 that tries to partition into 8/4/2/1, which matters like 0.1 to 0.2 seconds.

Actually,  --totally-silent switches off the console output and helps 0.2 to 0.3 on those numbers.
Also I can speed up flac.exe slightly by using a larger block size. flac.exe uses 1152 samples per block for the fixed-predictors presets, I doubt that would have been selected today.  So some more timings on the RAM disk:
9.5 seconds encoding -0fr0 -b3072 --totally-silent --no-md5 (that switches off MD5) - and the --totally-silent actually helps a few tenths too.
14.6 seconds decoding the same, also with --totally-silent
13.1 encoding -0fr0 -b3072 --totally-silent (that is with MD5)
18.2 decoding the ones with MD5.

All times are medians of "a few". So MD5 takes another three and a half seconds. Quite significant in percentage terms.
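
As a quick check of the "three and a half seconds" figure, the arithmetic can be spelled out from the four -b3072 timings listed above. This is only a sketch of the calculation; the variable names are made up for the example. It comes out at about 3.6 s in both directions, roughly 38% on top of the encode time and 25% on top of the decode time.

```python
# Timings in seconds, copied from the post above (classical corpus, RAM disk).
enc_no_md5, enc_md5 = 9.5, 13.1
dec_no_md5, dec_md5 = 14.6, 18.2

# Absolute cost of MD5 in each direction.
enc_overhead = enc_md5 - enc_no_md5
dec_overhead = dec_md5 - dec_no_md5

# Relative cost, measured against the run without MD5.
enc_pct = 100 * enc_overhead / enc_no_md5
dec_pct = 100 * dec_overhead / dec_no_md5

print(f"encode: +{enc_overhead:.1f} s ({enc_pct:.0f}%)")
print(f"decode: +{dec_overhead:.1f} s ({dec_pct:.0f}%)")
```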


Now, switching corpus to four metal albums instead, still on the RAM disk:
8.9 & 12.4 encoding & decoding HALAC
12.3 & 16.8 encoding & decoding flac at -0fr0 -b3072 --totally-silent --no-md5
15.3 & 18.3 encoding & decoding TAK at -p0

So material does matter. Impressive.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: forart.eu on 2024-01-01 17:28:09
Interesting, and what about MLAC ?
https://hydrogenaud.io/index.php/topic,125201.0.html
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-01 20:00:08
Interesting, and what about MLAC ?
https://hydrogenaud.io/index.php/topic,125201.0.html
In fact, instead of writing this, you can offer us files that we can run to test(under your own topic). Or you can share some of your results from there. You shouldn't expect others to do this. Then, those who are interested in the subject can perform different tests and give you feedback.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-02 21:06:18
And I randomly altered bits in the encoded binary data as a test, decoder didn't notice any issues but of course the decoded file is no longer bit-identical to the original.

I prepared the V.0.2.0 version to give a quick answer to this request. I added a special Hash control for each block. What I want to show is that the hash operations have no effect on the "HALAC" in terms of speed.

However, it should be noted that random changes to compressed files can make the archive impossible to open, because they can damage the headers of the independent blocks or break the tANS state. There are many situations like this. In the current version, a warning message is shown if a region that corresponds to the sample data is changed directly.

In fact, the most robust way to do this is to hash both the input file and the decoded file as a whole and compare them. However, producing a hash for the whole file at once will increase memory consumption (even if the speed does not change). This is not elegant. But I can find a more practical way in the next version. This is not about data compression.
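
The per-block check described above could look roughly like the following Python sketch. It is not HALAC's implementation: `zlib.crc32` stands in for the fast hashes mentioned in the thread (wyhash, xxhash), which are not in the Python standard library, and the block size and container layout are hypothetical.

```python
import zlib

BLOCK = 4096  # hypothetical block size in bytes

def pack(data: bytes):
    """Split data into blocks and store a checksum next to each block."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return [(zlib.crc32(b), b) for b in blocks]

def verify(packed):
    """Return the indices of blocks whose stored checksum no longer matches."""
    return [i for i, (crc, block) in enumerate(packed) if zlib.crc32(block) != crc]
```

Because each block carries its own checksum, memory use stays at one block, and a mismatch points at a specific block instead of the whole file.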

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Triza on 2024-01-02 23:16:49
Your reply seems to be rather dismissive of integrity checking: "This is not about data compression"

It is a central part of any lossless data compression. 

You should consider changing your attitude on this if you want to achieve wider acceptance. We are not academics. We want to know if we have a corruption. It is more important than the speed you chase.

Still, good luck with this.

Triza
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-03 00:03:58
EDIT: After posting that integrity checking isn't imperative for testing, ... ahem. Please check the following two. One outcome worse than the other.


Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)

Also if one is interested in comparing the residual compression algorithm the Rice code one can do an "everything else equal" by forking off reference FLAC and then (ab)using residual coding method 10 or 11 (https://xiph.org/flac/format.html#residual), plugging in your own and ...
... and please change the fourCC from "fLaC" to "hLaC" or something, so the FLAC devs don't get error reports when files are found in the wild.

By the way, does anyone run a Celeron CPU?
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-03 06:34:07
Your reply seems to be rather dismissive of integrity checking: "This is not about data compression"

It is a central part of any lossless data compression. 

You should consider changing your attitude on this if you want to achieve wider acceptance. We are not academics. We want to know if we have a corruption. It is more important than the speed you chase.

Still, good luck with this.

Triza


Data integrity is of course very important. Nobody can underestimate it. Otherwise, we cannot talk about losslessness. What I want to say here is that this check is separate from the basic stages of data compression, such as prediction, entropy coding, content modeling, error correction, dictionaries ...

And I wanted to point out that this control will not have an effect on Halac's working speed. Thanks for the comment.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-03 06:45:08
EDIT: After posting that integrity checking isn't imperative for testing, ... ahem. Please check the following two. One outcome worse than the other.


Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)

Also if one is interested in comparing the residual compression algorithm the Rice code one can do an "everything else equal" by forking off reference FLAC and then (ab)using residual coding method 10 or 11 (https://xiph.org/flac/format.html#residual), plugging in your own and ...
... and please change the fourCC from "fLaC" to "hLaC" or something, so the FLAC devs don't get error reports when files are found in the wild.

By the way, does anyone run a Celeron CPU?

Really thank you for your attention.
The files you send contain extreme hard transitions. I've never worked with such data before. Certain limits must have been exceeded. No problem.
In addition, my current minimum block size is 24 KB. So I don't think I'm taking precautions for a smaller dimension. I usually work with MB files. Such deficiencies and mistakes will of course be. I will handle them in the next update.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: cid42 on 2024-01-03 09:40:01
Data integrity is important but the way flac does it is wasteful IMO. I'm partial to the idea of grouping frames ala GOP in video encoding, there's a number of potential efficiency benefits to doing this one of them being easily able to do a single stronger checksum that applies to the entire GOP instead of dozens of weak checksums per frame. The neatest way I came up with to do this is for the seektable to be mandatory and a seektable entry details at least the GOP's sample count, file size and integrity checksum for integrity and fast seeking. The main downside to this is that a failed integrity check would resolve to an entire GOP instead of a single frame, but given the rarity of a failure and the likely desire to bench the file when it has errors, not a big downside IMO.
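
The mandatory-seektable idea above can be sketched as a small data structure. Everything here is hypothetical and only follows the description in the post: the field names, the 8-byte BLAKE2b digest as the "stronger checksum", and the linear seek are assumptions, not part of any existing format.

```python
from dataclasses import dataclass
import hashlib

@dataclass
class SeekEntry:
    sample_count: int   # samples stored in this group of frames
    byte_size: int      # encoded size of the group in bytes
    checksum: bytes     # one checksum covering the whole group

def build_seektable(groups):
    """groups: list of (sample_count, encoded_bytes), one item per group."""
    return [
        SeekEntry(n, len(blob), hashlib.blake2b(blob, digest_size=8).digest())
        for n, blob in groups
    ]

def seek(table, target_sample):
    """Return (group_index, byte_offset, first_sample) for the group holding target_sample."""
    offset = first = 0
    for i, entry in enumerate(table):
        if target_sample < first + entry.sample_count:
            return i, offset, first
        offset += entry.byte_size
        first += entry.sample_count
    raise IndexError("target sample is past the end of the stream")

def group_is_intact(entry, blob):
    """Check one group against its seektable entry before decoding it."""
    return hashlib.blake2b(blob, digest_size=8).digest() == entry.checksum
```

The same table serves both purposes named in the post: byte offsets for fast seeking and one checksum per group for integrity.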

...
Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)
...
I do so hate optional components in a spec, the amount of flac files you buy that don't have MD5 is startling. I suggest a fast mandatory checksum for the full audio stream (stick it in a footer to still allow streaming), you can still have MD5 as an optional checksum if it's necessary for those that have some legacy reason to keep it around.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-03 09:53:53
The files you send contain extreme hard transitions. I've never worked with such data before. Certain limits must have been exceeded. No problem.
Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.

Edit: https://filebin.net/k4e4cpi7oro14a52 contains the segment from 33 to 45 seconds. You can hear the texture changes, with enough structure there for FLAC to compress away nine percent - and that is by using fixed predictors only. We have a pretty good idea why: FLAC can change Rice parameter during the frame (the -r switch in the reference encoder), and so exploit redundancy in the ultra-short term. That's a rabbit hole for you to dive into.
(Actually, the reason this track does so well using fixed predictors compared to estimated LPC, is a kinda-bad-but-very-rarely-limiting design choice where the partition must be bigger than the LPC order.)
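
Why changing the Rice parameter inside a frame helps can be seen from a bit-count sketch. This is a simplified Python model, not the FLAC encoder: it only counts bits, assumes the residuals are already mapped to unsigned values, and charges 4 bits per partition for the parameter.

```python
def rice_bits(values, k):
    """Bits needed to Rice-code unsigned values with parameter k."""
    return sum((v >> k) + 1 + k for v in values)

def best_k(values, max_k=14):
    """Parameter with the lowest bit count for this run of values."""
    return min(range(max_k + 1), key=lambda k: rice_bits(values, k))

def partitioned_bits(values, partitions, param_bits=4):
    """Total bits when the frame is cut into equal partitions, each with its own k."""
    size = len(values) // partitions
    total = 0
    for p in range(partitions):
        part = values[p * size:(p + 1) * size]
        total += param_bits + rice_bits(part, best_k(part))
    return total
```

For a frame whose first half is quiet and second half is loud, a single parameter has to compromise, while two partitions can each pick their own. That is the short-term redundancy the post refers to.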


In addition, my current minimum block size is 24 KB. So I don't think I'm taking precautions for a smaller dimension. I usually work with MB files. Such deficiencies and mistakes will of course be. I will handle them in the next update.
You might want to allow for what in FLAC is verbatim subframes. Storing the samples unencoded. That is also an easy way to handle too short files.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-03 18:23:45
Data integrity is important but the way flac does it is wasteful IMO. I'm partial to the idea of grouping frames ala GOP in video encoding, there's a number of potential efficiency benefits to doing this one of them being easily able to do a single stronger checksum that applies to the entire GOP instead of dozens of weak checksums per frame. The neatest way I came up with to do this is for the seektable to be mandatory and a seektable entry details at least the GOP's sample count, file size and integrity checksum for integrity and fast seeking. The main downside to this is that a failed integrity check would resolve to an entire GOP instead of a single frame, but given the rarity of a failure and the likely desire to bench the file when it has errors, not a big downside IMO.

...
Integrity checking isn't imperative for testing of course, it can be added when you finalize the file format. There are some very fast checksum algorithms around that can be used for block-level checking. (For the full audio stream, then it's kinda pointless to use anything but MD5. Better make it optional than use something else I think.)
...
I do so hate optional components in a spec, the amount of flac files you buy that don't have MD5 is startling. I suggest a fast mandatory checksum for the full audio stream (stick it in a footer to still allow streaming), you can still have MD5 as an optional checksum if it's necessary for those that have some legacy reason to keep it around.
Thank you for your valuable ideas. But why is a hash function like MD5 still preferred? There are new and faster options available.
https://github.com/rurban/smhasher
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-03 18:49:07
The files you send contain extreme hard transitions. I've never worked with such data before. Certain limits must have been exceeded. No problem.
Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.
Edit: https://filebin.net/k4e4cpi7oro14a52 contains the segment from 33 to 45 seconds. You can hear the texture changes, with enough structure there for FLAC to compress away nine percent - and that is by using fixed predictors only. We have a pretty good idea why: FLAC can change Rice parameter during the frame (the -r switch in the reference encoder), and so exploit redundancy in the ultra-short term. That's a rabbit hole for you to dive into.
(Actually, the reason this track does so well using fixed predictors compared to estimated LPC, is a kinda-bad-but-very-rarely-limiting design choice where the partition must be bigger than the LPC order.)
In addition, my current minimum block size is 24 KB. So I don't think I'm taking precautions for a smaller dimension. I usually work with MB files. Such deficiencies and mistakes will of course be. I will handle them in the next update.
You might want to allow for what in FLAC is verbatim subframes. Storing the samples unencoded. That is also an easy way to handle too short files.
For the first time I hear the word "Noise Artist". However, I will look at the relevant data when I find time.

HALAC is currently using a single predictor and does not contain error correction. I have more adaptive predictors, but I stay away because they are a little slow for now. The error correction and different predictors significantly affect the compression ratio. Likewise, Rice coding has a significant effect on the compression rate. I will be really happy if I can solve the speed problem here.

The problem with small files stems from a detail I missed.

And really the ideas of the forum members here are very valuable to me.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-03 19:17:38
why is a hash function like MD5 still preferred?
File "fingerprints" across codecs. Compress a WAVE file with FLAC, WavPack, OptimFROG and TAK, and they store the same audio checksum. I can use my fave player to tell that two files contain the same audio, even without decoding, because the codecs provide for players to display the MD5.

So yes, there are options around that are "better" except that MD5 is so established.
Which means, IMHO:
* If you want to protect individual blocks - which is a good idea, for then you can detect and mute individual corrupted blocks - then don't use MD5, use something faster. Then you can allow for verification without decoding (WavPack, OptimFROG and Monkey's offer that), which is much faster than decoding anyway.
 * But once that is in place, it hardly makes sense to offer a full-stream checksum which isn't MD5.
 * WavPack, OptimFROG and TAK have MD5 optional. FLAC has it "optional" in the sense that it can be put to zero if you don't know it (say, an encoder that has to pass it on on-the-fly without the ability to go back and change file headers afterwards - FLAC has this info in the beginning of the file) - but the reference implementation does "officially not" write files without MD5, it is an option that isn't in the documentation. Apparently, there is no known string that is MD5-ed to zero. (https://www.reddit.com/r/cryptography/comments/jj8lg1/is_there_a_known_value_whose_md5_hash_is_0/)
I see @cid42 objects to "optional" components, but heck, MD5 is so useful that there are retrofit-hacks to provide for it (https://foobar.hyv.fi/?view=foo_audiomd5). (This uses ffmpeg, which offers more algorithms (https://ffmpeg.org/ffmpeg-formats.html#streamhash-1).) Have one with all zeroes and that is optional enough.


Some fine print on what codecs do:
FLAC is foremost an audio compressor and will compute the same MD5 whether or not the source was AIFF or WAVE. WavPack is foremost a file compressor and, when it added support for big-endian source format(s) (CAF before AIFF actually) it chose to stick to the audio as in the source. Also Monkey's Audio used MD5 on the encode, not on the PCM.
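
The difference between hashing the audio and hashing the file can be illustrated with Python's standard `wave` and `hashlib` modules. This is a sketch under assumptions: it hashes only the PCM frames of a WAVE file, skipping the header and any other chunks. For 16-bit WAVE input this is expected to correspond to the kind of audio checksum discussed above, but it should be checked against the codec's own reported value before relying on it.

```python
import hashlib
import wave

def pcm_md5(wav_file) -> str:
    """MD5 of the PCM frames only, ignoring the RIFF header and other chunks.

    wav_file may be a path or a binary file-like object.
    """
    md5 = hashlib.md5()
    with wave.open(wav_file, "rb") as w:
        while chunk := w.readframes(65536):
            md5.update(chunk)
    return md5.hexdigest()
```

Two files with identical samples but different tags or padding give the same value here, while an MD5 of the whole file would differ.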
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-03 19:39:41
For the first time I hear the word "Noise Artist". However, I will look at the relevant data when I find time.
https://en.wikipedia.org/wiki/Noise_music, and in particular the Japanese scene.

For something that is heavily distorted but doesn't sound like Merzbow: Someone linked to this free album. https://aylwin.bandcamp.com/album/farallon . Discussed here: https://hydrogenaud.io/index.php/topic,122179.msg1014245.html#msg1014245


HALAC is currently using a single predictor and does not contain error correction. I have more adaptive predictors, but I stay away because they are a little slow for now.
flac.exe -l0 uses fixed predictors up to the fourth-order difference. IIRC, shorten would use fixed predictors up to third order.
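
For reference, a fixed predictor of order n leaves the n-th finite difference of the signal as its residual, so the whole family can be sketched in Python as repeated differencing. This is an illustration, not the FLAC encoder; in particular the selection rule (smallest sum of absolute residuals) is a simplification, and the warm-up samples are simply left out of the comparison.

```python
def fixed_residual(samples, order):
    """Residual of the order-n fixed predictor, i.e. the n-th finite difference.

    The first `order` samples are warm-up and have no residual.
    """
    res = list(samples)
    for _ in range(order):
        res = [b - a for a, b in zip(res, res[1:])]
    return res

def best_fixed_order(samples, max_order=4):
    """Pick the order whose residual has the smallest sum of absolute values."""
    costs = {
        order: sum(abs(r) for r in fixed_residual(samples, order))
        for order in range(max_order + 1)
    }
    return min(costs, key=costs.get)
```

Order 0 is the signal itself, order 1 is x[n] - x[n-1], order 2 is x[n] - 2x[n-1] + x[n-2], and so on up to order 4.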

To be frank, I will be surprised if you will get much of a userbase, since FLAC is good enough already, but you could surely impress the enthusiasts if you move the boundaries of what we thought feasible. That is about like TAK's status: hardly anyone uses it for their music collections, but it did change our perception on what a fast asymmetric codec was even able to do.

Also, things do depend on builds and optimization and architecture. Reference FLAC prioritizes compatibility. Here, for example:
https://hydrogenaud.io/index.php/topic,123025.msg1029768.html#msg1029768
We have done a lot of testing back and forth on FLAC, and gotten closer to some idea about what works in what circumstances
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-04 08:46:18
To be frank, I will be surprised if you will get much of a userbase, since FLAC is good enough already, but you could surely impress the enthusiasts if you move the boundaries of what we thought feasible.
I don't know how much more improvement can be done about audio compression. I look at the subject we call boundaries in terms of compression ratio/speed.
Maybe I can follow a process similar to HALIC. The following table shows the compressed sizes, compression speeds and decompression speeds from a test made with random images. I did not show memory consumption, but other formats consume hundreds of MB of memory on a single core during these operations. Image codecs usually differ from audio codecs on this point. However, HALIC handles these operations with only a few MB.
https://www.dpreview.com/sample-galleries/7416430458/dji-inspire-3-sample-gallery/

X X X
From my experience with HALIC, I simply felt tired. Even when a codec is clearly superior to the alternatives, that doesn't mean much. So I took a break and returned to audio for a change. If I have time, I will put some more effort into it.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-04 14:48:16
I don't know how much more improvement can be done about audio compression. I look at the subject we call boundaries in terms of compression ratio/speed.
Remember that it is a 3D space of (compression ratio, encoding speed, decoding speed), so the Pareto frontier is a bit more complicated.
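
A codec setting is on that frontier when no other setting is at least as good on all three axes and strictly better on one. A minimal Python sketch follows; the names and numbers in the usage note are made up for illustration and are not measurements from this thread.

```python
def dominates(a, b):
    """True if a is no worse than b on every axis and better on at least one.

    Each point is (size_ratio, encode_seconds, decode_seconds); lower is better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the entries that no other entry dominates."""
    return {
        name: p
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    }
```

For example, with hypothetical settings `{"A": (0.60, 7, 12), "B": (0.60, 9, 12), "C": (0.55, 30, 13), "D": (0.58, 40, 14)}`, B is dominated by A and D by C, so the front is A and C: one fast setting and one small one.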

Some brief history: Back twenty years ago, it was a common belief that if you wanted heavy compression, you needed a complicated symmetric algorithm that took much CPU to decode. Monkey's Audio was the to-go for those who wanted something notably smaller than FLAC. OptimFROG was even more that way - and its heaviest modes were likely to get the upper hand on La, which came out in a couple of betas and a buggy foobar2000 component that the author didn't bother to fix. Also WavPack - which is a late 1990s codec that later has been beefed up - was symmetric.

Then came TAK. The green curves at http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20CDDA%20sources.pdf .

And WavPack and later OptimFROG started to do additional encoding processing that only minimally affects decoding speed. You see the WavPack "-x4" curve - and you see that the two leftmost OptimFROG triangles are pretty much above each other in the decoding speed diagram, that is because its --preset 0 does not utilize anything such.
And FLAC improved. Quite a lot. The way the reference encoder works - with variable LPC coefficients - is by a rough estimation on several alternatives (like, history length); then it picks the one that scores best, and calculates that all the way to the bottom.

What can be improved? High resolution: http://www.audiograaf.nl/losslesstest/revision%206/Average%20of%20all%20hi-res%20sources.pdf
High sampling rate makes the audio include lots of strange things - including "nothings". And reference flac's guesstimation approach isn't hitting high sampling rate material equally good. Also, there are other estimation methods that might be tried. This is practical engineering though - nothing that utilizes a newer encoding method than (partitioned) Rice (= Golomb power-of-two).

And multichannel: http://www.audiograaf.nl/losslesstest/Lossless%20audio%20codec%20comparison%20-%20revision%206%20-%20multichannel.html
As far as I remember from the TAK author, he uses some smart heuristic to get a reasonable channel (de)correlation matrix.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Skymmer on 2024-01-04 21:17:29
Hello to everybody and sorry for off-topic.

Japanese noise artist Merzbow, the Veneorology album. Tested because its extreme bitrate, especially track 3. Monkey's Audio cannot get down to WAVE size. Neither can TAK, though by an inch. FLAC - out of subset! - can out-compress OptimFROG.

Mistype: Not the Veneorology but Venereology
Porcus, your statement is interesting to me but I can't believe it until I make my own tests.
I have the reissue of this album by Irond Records. Are we talk about the track:
I Lead You Towards Glorious Times (live)
5:29.800 (14 544 180 samples)
MD5 for WAV PCM without metadata is: 4a83c56b646c2907723490819ba0741b

Can you please privately provide me your track if it differs from mine?
And also tell me your best FLAC option for this track?
Thanks!
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-04 22:06:38
Porcus, your statement is interesting to me but I can't believe it until I make my own tests.
I have the reissue of this album by Irond Records. Are we talk about the track:
I Lead You Towards Glorious Times (live)
5:29.800 (14 544 180 samples)

I Lead You Towards Glorious Times yes. I have the Relapse edition. Don't know if Irond is the same, even if number of samples matches (... could be offset). Try the following and see what you get, and see if FLAC's MD5 is D651DE4246A4D5CA64385A91EE9241C5. If not we can exchange files.
flac -2er15 -b16384 --lax
If I afterwards give metaflac --remove-all --dont-use-padding, then I get it down to
51914702 bytes using flac 1.1.4 through 1.4.2
51914703 bytes (one byte more) using flac 1.4.3.

Here are sizes with other codecs. https://hydrogenaud.io/index.php/topic,122040.msg1010086.html#msg1010086

(Reasons why "MD5 for WAV PCM without metadata" is not reliable, are (1) more than one version of the WAVE file format, and (2) even if you don't see metadata in a tagger, you need to know the size of any JUNK chunk.)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Skymmer on 2024-01-04 22:33:09
My MD5 is 03813C565FD01FF1D26F8DA56D42A2CB and result is 51 913 118
Writing you a PM...
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-04 22:50:30
So, to all who wonder about whether there are several versions of this monster track: these differ in offset only.

Skymmer's has its track boundary starting 664 samples before mine does:

Discarded 664 trailing samples containing non-null values from file #1.
Discarded 664 leading samples containing non-null values from file #2.
No differences in decoded data found within the compared range.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-08 19:04:02
Hello again. I found some time and made some updates.
* Working with small size files
* Working with data containing excessive noise (high entropy)
* Stronger data verification / hashing
* A faster mode

I would like to thank Porcus for making me see some of the points I've missed.
Two small files in Merzbow - Veneorology can now be coded and decoded smoothly. Of course, the compressed size is a bit too much. But in the next stage, we can prevent the increase in size by coding the data as it is for such cases. No problem.
Merzbow - Veneorology
mrz-point1.wav      17,718 bytes
mrz-point1.halac    19,823 bytes
mrz-point2.wav      35,358 bytes
mrz-point2.halac    38,793 bytes
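The idea of "coding the data as it is" when compression would expand it can be sketched as a one-byte flag per block. This is not HALAC's format: zlib stands in for the real coder, and the `RAW`/`PACKED` flags and function names are made up for the illustration.

```python
import os
import zlib

RAW, PACKED = 0, 1  # hypothetical one-byte block flags


def encode_block(block: bytes) -> bytes:
    """Store the block verbatim whenever the coder fails to shrink it."""
    packed = zlib.compress(block, 1)
    if len(packed) >= len(block):
        return bytes([RAW]) + block
    return bytes([PACKED]) + packed


def decode_block(data: bytes) -> bytes:
    flag, body = data[0], data[1:]
    if flag == RAW:
        return body
    if flag == PACKED:
        return zlib.decompress(body)
    raise ValueError("unknown block flag")


noise = os.urandom(4096)  # high-entropy input, like harsh noise
print(len(encode_block(noise)) - len(noise))  # worst-case growth is the flag byte
```

With this scheme the worst case costs one byte per block instead of a few percent of the file.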

I also prepared an even faster version for speed enthusiasts. The encoder is around 30% faster and the decoder around 10% faster.
Sean Paul - Full Frequency - 2014
Wave          525,065,800 bytes
Flac-0        408,370,552 bytes
Halac-fast    396,199,439 bytes
Flac-1        395,413,493 bytes
Flac-2        394,448,381 bytes
Flac-3        391,537,281 bytes

Note: I haven't prepared my Linux machine yet.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-09 20:42:07
I finally prepared compilations for Linux. Just in case, I also made a static compilation.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-10 01:24:50
Tested.
New corpus: track 2 from each of the 38 CDs in my signature, to make it fit a RAM disk, 3h44m of CDDA.  Found out later that the Mahler symphony makes for 1/4 of the total duration.  Should just have deleted that.  Presenting ratios with and without that one, but kept the timings with all.
All FLAC decoding/encoding with 1.4.3 Win32.  With --totally-silent on top everything, it does not spend time spamming the console (which could be significant at these speeds!)
All times are median of at least 3.

6.7 is an impressive figure eh?

w/o Mahler   all 38 tracks   encoding   decoding
60.058%      56.30%          6.7s       11.6s      HALAC FAST
60.063%      55.55%          8.9s       11.6s      FLAC -0b3072 -r0 --no-md5 --totally-silent
59.76%       55.36%          9.6s       12.5s      FLAC -0  --no-md5 --totally-silent
58.10%       54.03%          9.8s       11.9s      FLAC -1b3072 --no-md5 --totally-silent    <---- that is a "smart" joint stereo/dual mono
56.60%       52.59%          10.0s      13.3s      HALAC
56.76%       52.49%          12.4s      12.7s      FLAC -3 --no-md5 --totally-silent <---- that's variable-coefficient LPC but dual mono.
54.951%      51.08%          14.3s      17.4s      TAK at p0 with no WAVE metadata (and no MD5)
54.948%      51.05%          30 s       13.2s      FLAC --no-md5 --totally-silent
HALAC isn't so happy about Mahler. Dropping it makes HALAC FAST beat the fastest FLAC at size, and makes HALAC beat FLAC -3 at size. TAK -p0 isn't so happy about that track either, but it is usually so good that it is a mild surprise to see flac -5 beat it at size - they trade about even blows here.

MD5 would add around 4 seconds to FLAC times.
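The "median of at least 3" timing can be sketched as below. This is an illustrative helper, not the script used for the figures above. The function name and the `warmups` parameter are assumptions.

```python
import statistics
import time


def median_seconds(task, runs=3, warmups=0):
    """Run task() a few times and return the median wall-clock duration in seconds."""
    for _ in range(warmups):
        task()  # untimed runs, e.g. to fill caches before measuring
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

In practice `task` would be something like a `subprocess.run` call of the encoder on the corpus. The median is less sensitive than the mean to a single slow outlier run.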
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-10 08:45:02
Thank you for the new test, Porcus. I would be glad if I can get information about the operating system and processor you use.
In most cases, Halac-Fast is better than FLAC-0 in the compression ratio. Because I especially set it like that. Of course, there is always space for exceptions.

And once again, these tests are performed using "Ram Disk". In terms of speed, even HALAC(not fast) is always faster than FLAC-0 in normal systems. With ordinary SSDs, HALAC will perform much better in slow and normal systems. My tests aside, I know this from different people who have done numerous tests. In order to address everyone, I do my tests especially with i7 3770k (2012). When I compile with AVX2 supported, I can get a little better results. However, HALAC cannot be used in processors without AVX2 support.

And I added a quick data verification system with V.0.2.0, especially because it was persistently requested. With V.0.2.1, I made it stronger, at the cost of twice as many operations, and made it permanent. I do not want to disrupt the mechanism again to remove it or make it optional.

These hashing steps involve a very large number of operations. On a RAM disk their cost shows up more clearly, whereas on slow and normal systems it is barely noticeable. In this test the other codecs do not perform data verification, so HALAC is a little behind. Without data verification, HALAC on a RAM disk would be slightly faster.

Since there is no one else who shares the test results made with a normal system that does not use Ram Disk, I may not be fully understood. There are necessarily testers. I wonder if I'm doing something really bad or harmful, I'm in doubt.

For example, there is a test with an old system:
https://encode.su/threads/4180-hac(high-availabilitation-lossless-audio-compression)?p=81748&viewFull=1#post81748
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-10 19:55:42
I'm not sure what you mean here. Rough SSD results will follow, but they are less reliable.

Thank you for the new test, Porcus. I would be glad if I can get information about the operating system and processor you use.
Windows 11, same i5-1135G7 as in reply 6. Comparison with yours: https://www.cpubenchmark.net/compare/2vs3830/Intel-i7-3770K-vs-Intel-i5-1135G7
It is not in a laptop, it is in a fanless NUC. With 16 GB RAM (as you). When I used RAM disk, I allocated 4 to that.

In most cases, Halac-Fast is better than FLAC-0 in the compression ratio. Because I especially set it like that. Of course, there is always space for exceptions.
Tested the full 38 CDs in my signature, confirms it:
13301342004 bytes for flac -0 -b3072 (encoded with --no-md5 --totally-silent, that's for the timing quoted below), after running metaflac --remove-all --dont-use-padding to get it more comparable ... and to time it.
13224920370 bytes for HALAC_FAST
12739556906 bytes for flac -1 -b3072 --no-md5 --totally-silent
12636918110 bytes for flac -3 --no-md5 --totally-silent
12393304804 bytes for HALAC
12032168564 bytes for flac -5 --no-md5 --totally-silent (-5 is default, and it brute-forces stereo decorrelation)

And once again, these tests are performed using "Ram Disk". In terms of speed, even HALAC(not fast) is always faster than FLAC-0 in normal systems. With ordinary SSDs, HALAC will perform much better in slow and normal systems.
RAM disk seems to give more consistent results. This computer seems to do the following oddity when I loop encoding and then decoding onto the SSD: first run is usually slow, then timing improves, until the CPU starts to throttle.
It isn't the same issue when I test heavier settings. They need to be left overnight anyway, so I can run a couple of warm-ups to stabilize CPU temperature before the actual timing.

You mention you use AVX2. The official flac build does not use AVX2, so I took up the Rarewares 64-bit build that gave best results at https://hydrogenaud.io/index.php/topic,123025.msg1029768.html#msg1029768 for the following.
Encoding using -0fb3072 --no-md5 --totally-silent *.wav (the "f" overwrites, just like HALAC does) and decoding using -df --no-md5 --totally-silent *.flac . This is 38 CDs (one .wav file per CD), on SSD, timed with 7-bench timer64.exe
109 / 93 / 93 / 86 / 86 for five successive FOR-looped encodings. Using "-1" instead of "-0" gives about the same.
159 / 165 / 161 / 171 / 167 for five successive FOR-looped decodings.
Significant variation, I'd say.
But merely writing out the FLAC files takes quite a lot of time too. Running metaflac --remove-all --dont-use-padding takes some 35 to 72 seconds. That just strips out everything except STREAMINFO from the file headers, writes new header and copies the stream into a new file that is written. The big variation there (indicating something going on with allocation and priority in that SSD) is even bigger than in encoding/decoding.

But, now, going to flac -3f --no-md5 --totally-silent *.wav , only three rounds:
100 / 93 / 91 for encoding; 118 / 118 / 109 for decoding.
This would be a giant "WTF!" hadn't it been that it has been observed before that -3 can decode faster than -0. But still it is a surprise.


HALAC then. Since that doesn't support wildcards, and timer64 isn't happy about FOR loops, I put a FOR loop in a .bat file and ran timer64 on that:
153 / 145 / 124 / 115 / 122 for five successive FOR-looped encodings.
297 / 235 / 225 / 221 / 208  for five successive FOR-looped decodings.
I mean, this isn't reliable for anything but establishing that it is slower than FLAC. Encoding speed is around flac.exe -5 --no-md5 --totally-silent. -5 is the default.

Rough HALAC FAST times:
96 then down to 78 for encoding. 155 then down to 134 for decoding.
Not too reliable, but good enough to verify it encodes very fast indeed.



And I added a quick data verification system with V.0.2.0, especially because it was persistently requested.
On the encode stream, so that you can later implement verify without full decode?
Per block? "All" competing formats have something like that, except a couple that suck ... (ALAC has nothing, and TTA claims to offer it, but I know no implementation that actually uses it.)

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: cid42 on 2024-01-10 21:01:15
Thank you for the Linux binaries, particularly the static binary; I did need that. This laptop is still on Ubuntu 20.04, so the glibc you compiled with is too new for it; also, the static version should work on non-glibc systems like Alpine Linux. Another example that does apply to me is that I rent a Linux server and cannot update the installed libraries, but they do allow running uploaded binaries. glibc is compatible in such a way that if you want to make a highly portable non-static binary you need to use an ancient toolchain; Prime95, for example, uses a very old CentOS to compile (from memory 4?). Personally I wouldn't bother and would just provide the static binary.

Here's some basic testing. Old M.2 SSD, Ubuntu 20.04, laptop with no user input during testing, skylake 6700HQ, fast static Linux binaries.

Code: [Select]
$ time for f in */*.wav;do halac_enc_2.0.1 "$f" "$f.halac";done

real 0m11.848s
user 0m10.116s
sys 0m1.615s

$ time for f in */*.wav;do ~/Downloads/flac143/flac-1.4.3/src/flac/flac --no-md5 --totally-silent -0 "$f";done

real 0m17.009s
user 0m12.440s
sys 0m3.622s


$ time for f in */*.halac;do halac_dec_2.0.1 "$f" "$f.wav";done

real 0m18.359s
user 0m16.082s
sys 0m2.232s

$ time for f in */*.flac;do ~/Downloads/flac143/flac-1.4.3/src/flac/flac --totally-silent -d "$f" -o "$f.fdec";done

real 0m17.172s
user 0m13.083s
sys 0m3.842s

Sizes:
2916907812 wav
2017463945 flac
2016873005 flac stripped of metadata
1994734822 halac
1979508435 sum of best (best of flac/halac unstripped)

I find the large difference in sum of best interesting, both codecs have decent swings in their favour.
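From the totals above, "sum of best" comes out 15,226,387 bytes below the HALAC total and 37,955,510 bytes below the unstripped FLAC total. How such a figure is formed can be sketched as below. The per-file sizes in the example are made up, since the thread gives only totals, and the function name is hypothetical.

```python
def sum_of_best(sizes_by_codec):
    """sizes_by_codec maps codec -> {filename: bytes}; take the smallest size per file."""
    per_codec = list(sizes_by_codec.values())
    common = set(per_codec[0])
    for sizes in per_codec[1:]:
        common &= set(sizes)  # only files that every codec encoded
    return sum(min(sizes[name] for sizes in per_codec) for name in common)


# Made-up sizes, only to show the mechanics.
example = {
    "flac": {"a.wav": 100, "b.wav": 250},
    "halac": {"a.wav": 120, "b.wav": 200},
}
print(sum_of_best(example))  # 100 + 200
```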

edit: Late entry with slac:

Slac to compare to the above:

Code: [Select]
Encode
real 0m27.496s
user 0m24.736s
sys 0m2.417s

Decode
real 0m19.158s
user 0m16.150s
sys 0m2.807s

Size: 1983908820
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: bryant on 2024-01-10 21:08:07
I have been following this thread with some interest because five years ago I created a codec called SLAC along these lines. It was intended to be as simple as reasonably possible so that it could be easily understood. It was also assumed that it would also be very fast, but it never really lived up to that, at least compared to FLAC.

However, seeing this thread, I decided to take another look and see if I could find something obvious that was slowing it down. I ended up making a couple improvements that give a significant speedup, especially for decoding, which is now significantly faster than FLAC (at least on the system I tried). The compression is approximately equivalent to FLAC -1 when using the -j2 (slowest) encode option.

Experimental branch on GitHub (https://github.com/dbry/slac/tree/experimental-speedup)

I don’t know if these codecs are anything more than a technical curiosity, but they may have some use in embedded systems (like QOA (https://phoboslab.org/log/2023/04/qoa-specification)). I would like to try creating a lossy version of SLAC that might compete with it.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 08:25:59
Porcus, the tests you do are really nice. And I trust your results. What I want to say is that HALAC's performance will be better in normal cases, except the use of "Ram Disk". In addition, HALAC also has extra processes for data verification. Flac does not make any verification in the tests. But we're at the beginning of the road and there are things to do.

I do not use AVX2. I think I was misunderstood. I could enable the -mavx2 option at compile time so that the compiler can apply some optimizations automatically, but I do not enable it. None of the current HALAC versions is compiled with AVX2, and no manual SIMD optimization has been done either.
In fact, if I used AVX2, I could switch to rANS, which would significantly speed up decoding. I do not compile with AVX2 in order to address a wider audience in HALIC.

On the encode stream, so that you can later implement verify without full decode?
Per block? "All" competing formats have something like that, except a couple that suck ... (ALAC has nothing, and TTA claims to offer it, but I know no implementation that actually uses it.)
HALAC makes data verification for each block. Each block is now 6 kb. And yes, if desired, warnings can be given without complete decodes and incompatible blocks can be shown. Of course, this will contain some side information. However, these are subsequent aesthetic add-ons. I have to focus on speed and compression right now.
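Per-block verification that reports which blocks are damaged can be sketched as follows. HALAC's actual hash is not described in this thread, so CRC-32 from zlib is used purely as a stand-in. "6 kb" is assumed to mean 6 KiB and the function names are hypothetical.

```python
import zlib

BLOCK = 6 * 1024  # assumption: "6 kb" read as 6 KiB


def block_checksums(data: bytes, block: int = BLOCK):
    """One checksum per fixed-size block (the last block may be shorter)."""
    return [zlib.crc32(data[i:i + block]) for i in range(0, len(data), block)]


def bad_blocks(data: bytes, expected, block: int = BLOCK):
    """Indices of blocks whose checksum no longer matches the stored one."""
    actual = block_checksums(data, block)
    return [i for i, (got, want) in enumerate(zip(actual, expected)) if got != want]


original = bytes(range(256)) * 100       # 25,600 bytes -> 5 blocks
stored = block_checksums(original)       # side information kept with the stream
damaged = bytearray(original)
damaged[13000] ^= 0xFF                   # flip one byte inside block 2
print(bad_blocks(bytes(damaged), stored))
```

The side information here is 4 bytes per block. A longer hash costs more space per block but lowers the chance of an undetected error.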
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: fooball on 2024-01-11 08:38:03
What I want to say is that HALAC's performance will be better in normal cases, except the use of "Ram Disk".
Why???

RAM disk should have a much higher performance than other forms of storage, if nothing else then simply because it is not accessed over a serial interface.  So why is performance worse?  Is it because you are performing other operations while waiting for disk accessed to complete?
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 08:51:13
Thank you for the Linux binaries, particularly the static binary; I did need that. This laptop is still on Ubuntu 20.04, so the glibc you compiled with is too new for it; also, the static version should work on non-glibc systems like Alpine Linux.
I'm running on Linux Mint. However, I could not fully optimize for HALAC in Linux. I just compiled that much.
I also did a test with SLAC and am sharing screenshots to make it convincing, with both the i7 3770k and the Ryzen 7 5825u. With the faster processor and SSD, SLAC is close to HALAC in decode time. However, HALAC-fast is faster.

In the meantime, something caught my attention during my tests. Some files fail verification after being decoded by SLAC, so data seems to be lost. The problem sometimes shows up in one file and then in a different one. I don't know what causes it. Quite interesting. I added the images in the next post. Doesn't SLAC use data verification?

Sean Paul - Full Frequency - 2014
i7 3770k
HALAC: 383,545,116 bytes (2.985 sec, 3.263 sec)
SLAC: 396,073,220 bytes (6.484 sec, 5.344 sec)
HALAC-fast: 396,199,439 bytes (1.896 sec, 2.844 sec)

Ryzen7 5825u
HALAC: 383,545,116 bytes (1.914 sec, 2.576 sec)
SLAC: 396,073,220 bytes (4.266 sec, 2.591 sec)
HALAC-fast: 396,199,439 bytes (1.306 sec, 2.244 sec)

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 09:01:49
I have been following this thread with some interest because five years ago I created a codec called SLAC along these lines. It was intended to be as simple as reasonably possible so that it could be easily understood. It was also assumed that it would also be very fast, but it never really lived up to that, at least compared to FLAC.
Hello David.
When I check after the decode process, some files have differences. I came across this situation by chance. I'm adding the images.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 09:47:43
What I want to say is that HALAC's performance will be better in normal cases, except the use of "Ram Disk".
Why???

RAM disk should have a much higher performance than other forms of storage, if nothing else then simply because it is not accessed over a serial interface.  So why is performance worse?  Is it because you are performing other operations while waiting for disk accessed to complete?
I work with minimum things as much as possible for everyone to use the things I have developed. We can say that the biggest bottleneck is hard drives. And all normal users use them. Therefore, I need to particularly take care of the operations on disk access. In other words, it is also very important how a codec uses a fixed disk efficiently. We should not disable this part.

If this constraint is eliminated using RAM disk, Halac does not work badly, but others can close the difference a little. So there is no disadvantage for HALAC. However, since we have to think of normal users in comparisons, I say these in order to achieve more accurate results. Porcus is doing really great tests on his own system and I like it. It has a great accumulation. However, it is nice to see the results from other users.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-11 10:02:20
Yeah, it has been found that file handling had a certain impact on speed.
But IIRC, the differences were not consistently this or that direction ... @cid42 was that you? Corrections?

In addition, HALAC also has extra processes for data verification. Flac does not make any verification in the tests.
That's not correct, FLAC does indeed have two checksums in each frame. They are mandatory in the format. https://xiph.org/flac/format.html#frame_footer , and right above that: there is a CRC-8 for the frame header.

If reference flac encounters mismatch or corruption during decoding, it will throw an error - and then it will exit, unless -F if given. If -F is given, it will keep decoding, throw more errors if any (recent FLAC will mute the frame if it can, older will discard).
Example output from one byte corrupted:
Code: [Select]
1secCORR.flac: *** Got error code 2:FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH
*** Got error code 0:FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC


1secCORR.flac: ERROR while decoding data
               state = FLAC__STREAM_DECODER_ABORTED

The MD5 on the entire stream is atop of that. If no MD5 is present, reference flac will upon decoding give WARNING (which isn't an "error" unless you have used the --warnings-as-errors option).
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-11 10:40:59
SLAC might be the fastest decoding of the entire bunch?
130-ish seconds decoding. Encoding takes 170-ish for j0 and j1, 300-ish for j2. Some variability in it though.

Corpus: Same 38 CDs, same SSD. 64-bit encoded j0 slightly faster, so I base on that. I see I should have used -q to get it comparable with HALAC and with flac.exe's --totally-silent, in case console spam is slowing it down.

Putting the sizes into it - and including flac's full stereo decorrelation mode -2 (with -b3072 since I used that for the others)
13301342004   flac -0 -b3072 --no-md5 --totally-silent
13224920370   HALAC_FAST
13133326710   SLAC -j0
12798968972   SLAC -j1
12763832678   SLAC -j2
12739556906   flac -1 -b3072 --no-md5 --totally-silent
12710706259   flac -2 -b3072 --no-md5 --totally-silent
12636918110   flac -3 --no-md5 --totally-silent
12393304804   HALAC
12032168564   flac -5 --no-md5 --totally-silent

Note the benefits from FLAC's stereo decorrelation scheme. At -2 and -5 up, it calculates left, right, mid, side - but it isn't restricted to storing either left+right or mid+side. It can also store left+side or side+right, i.e. "side + whatever is smallest of the three others".
-0 is dual-mono only.
-2 calculates all four for every frame. Saves 590 MB over -0.
-1 has the "-M" switch which does the following: Only every now and then, it calculates all four and selects stereo decorrelation. It keeps this strategy for some frames (calculating only two for those!) and then it does all four again. Gets within 29 MB of -2 that way, i.e. it reaps 95 percent of the benefits for dirt cheap.
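The "side + whatever is smallest of the three others" choice can be sketched like this. The cost function is a crude proxy chosen for illustration (sum of absolute first differences) and is not the estimator the reference encoder uses. The function names are hypothetical.

```python
def cost(channel):
    """Crude size proxy: first sample plus absolute first differences."""
    return abs(channel[0]) + sum(abs(b - a) for a, b in zip(channel, channel[1:]))


def choose_stereo_mode(left, right):
    """Evaluate the four channel pairings and return the cheapest one."""
    side = [l - r for l, r in zip(left, right)]
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    c_left, c_right, c_mid, c_side = cost(left), cost(right), cost(mid), cost(side)
    modes = {
        "left+right": c_left + c_right,  # dual mono
        "left+side": c_left + c_side,
        "side+right": c_side + c_right,
        "mid+side": c_mid + c_side,
    }
    best = min(modes, key=modes.get)
    return best, modes


# Nearly identical channels: the side channel is tiny, so a side-based mode wins.
print(choose_stereo_mode([0, 1000, -1000, 1000], [1, 1001, -999, 1001])[0])
```

An "-M"-style shortcut would call `choose_stereo_mode` only every few frames and reuse the winning pairing in between, which is where the speed comes from.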
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 11:04:16
I did not know that. Thanks for information.
However, the CRC-8 can only take 256 values. And it is very fast. Given the size and value range of a frame, this does not provide definite security. It is a very superficial measure and collisions are inevitable. I think that's why MD5 is used.
We cannot compare this with the situation in HALAC. I think a much more robust and faster solution for each frame/block is more accurate.
SLAC might be the fastest decoding of the entire bunch?
Does SLAC do data verification? Or did you look at my previous posts?

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-11 13:24:06
I did not know that. Thanks for information.
However, the CRC-8 can only take 256 value.
CRC-16 in the frame footer. On the entire frame excluding the CRC-16 itself, but including the frame header.

Yes there is also a CRC-8 on the (*counting up*) 64 to 128 bits long frame header. If that is wrong, then the decoder can just ditch the entire frame without even trying to decode it.
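For reference, a bitwise sketch of the two checksums, assuming the parameters given in the FLAC format document: CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07) and CRC-16 with polynomial x^16 + x^15 + x^2 + 1 (0x8005), both initialized with 0 and processed most significant bit first. Real decoders use table-driven versions. The function names here are made up.

```python
def crc_msb_first(data: bytes, width: int, poly: int) -> int:
    """Bitwise CRC, MSB first, zero initial value, no final XOR."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = 0
    for byte in data:
        reg ^= byte << (width - 8)
        for _ in range(8):
            if reg & top:
                reg = ((reg << 1) ^ poly) & mask
            else:
                reg = (reg << 1) & mask
    return reg


def flac_crc8(header: bytes) -> int:
    """Checksum over the frame header."""
    return crc_msb_first(header, 8, 0x07)


def flac_crc16(frame: bytes) -> int:
    """Checksum over the whole frame up to, but excluding, the CRC-16 itself."""
    return crc_msb_first(frame, 16, 0x8005)


print(hex(flac_crc8(b"123456789")), hex(flac_crc16(b"123456789")))
```

A random corruption slips past the CRC-16 with probability of roughly 1 in 65,536 per frame, compared with 1 in 256 for the header-only CRC-8.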


As for SLAC I saw that you had posted issues - which are above my skills, I never tried it before yesterday.
Running a rough timing and a few dir or du commands, that is more like me.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: cid42 on 2024-01-11 15:13:12
flac's frame hashing format is messy, but it does provide "good enough" hashing. It's somewhat more than the 16 bits of crc if you checked fully because there's hell of a lot of structure in there, you mostly know the expected values in the header not that they're needed for the most part anyway, and any corruption in the residual likely results in a different frame size which will cause a desync on the next frame read. Not ideal and not something you'd choose in a greenfield project, but good enough as the status quo. Here's a discussion on flac's strong design decisions to make sync performant, it's related as making sync performant is where a lot of the structure comes from: https://hydrogenaud.io/index.php/topic,123569.0.html

Yeah, it has been found that file handling had a certain impact on speed.
But IIRC, the differences were not consistently this or that direction ... @cid42 was that you? Corrections?
OTOH I had a nightmare of a time getting consistent results benchmarking flac using a nearly-full NTFS HDD on Linux, tl;dr unfixable fragmentation not ideal. It could be mitigated by changing the size of vbuf, basically the fread/fwrite cache size. Be warned meandering thread: https://github.com/xiph/flac/issues/585

Personally I don't really see the benefit of including disk access when it comes to comparing the optimal state of codecs. It's important in the end, but all of these codecs do serial I/O on small chunks of data so they could all be optimised essentially the same way. If there are differences that show up in benchmarking that's just a minor implementation inefficiency that could be fixed down the line, worth noting but IMO not worth showing up in comparison results.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: bryant on 2024-01-11 17:59:53
I have been following this thread with some interest because five years ago I created a codec called SLAC along these lines. It was intended to be as simple as reasonably possible so that it could be easily understood. It was also assumed that it would also be very fast, but it never really lived up to that, at least compared to FLAC.
Hello David.
When I check after the decode process, some files have differences. I came across this situation by chance. I'm adding the images.
[attach type=image]28554[/attach][attach type=image]28556[/attach][attach type=image]28558[/attach]
Thanks for testing! I did these speedups using only a couple 2h19m files, and so it's certainly possible that an edge-case bug crept in. I would certainly be surprised if this exists in the master branch. And no, there is no error checking.

Where can I find one of the files that fails, like the Shun Ward track? Thanks!
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-11 18:10:38
flac's frame hashing format is messy, but it does provide "good enough" hashing. It's somewhat more than the 16 bits of crc if you checked fully because there's hell of a lot of structure in there, you mostly know the expected values in the header not that they're needed for the most part anyway, and any corruption in the residual likely results in a different frame size which will cause a desync on the next frame read.
Good point. Already by the first 33 bits of a frame, the chance of random bits being valid is like 1:2^19 or something. Of course, that happens like ... hm, once per day of encoded music, as per back of envelope, assuming that everything between frame header and footer "behaves uniformly random".
That is, a decoder that just sits there waiting for a valid frame header will about once per day get something that looks like the beginning of a valid header, but isn't - and then every 256th day get one that will pass the CRC-8.
But if we know it is a real FLAC stream, the decoder will with overwhelming chance get to an actual frame header first, and then it is saved until it drops out. So even if the transmission is so bad that it drops out every second, then it's gonna take like seven years for it to sync into a false frame first. And if we suppose then that it doesn't check whether it decodes out of the bit depth specified (say with CDDA, the spurious bits don't evaluate to integers outside [-32768, 32767]) - or for some reason doesn't do that (say if it detects a spurious verbatim subframe) then it has to wait all the way until the frame footer to find out. And if it doesn't buffer until it comes to what it thinks is the footer, you will have 1/10th second of static after having waited for seven years listening to something that skips every second.
Call me when you are done!

Oh wait ... [sic!] ... There are subframe headers too, and then also some bits and combinations are reserved. 5/128 chance of passing through I think? For one subframe. Expected number of channels by random is 42/11 I think?


Not ideal and not something you'd choose in a greenfield project, but good enough as the status quo. Here's a discussion on flac's strong design decisions to make sync performant, it's related as making sync performant is where a lot of the structure comes from: https://hydrogenaud.io/index.php/topic,123569.0.html

Ah, and with a mention of https://wiki.multimedia.cx/index.php/Marian%27s_A-pac
I seemed to recall there was an ancient codec using second-differences only. From 1998. Should be possible to do that fast ... ?
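A second-differences-only predictor is indeed cheap. The sketch below shows the general technique, not A-pac's actual format: each sample is predicted by linear extrapolation from the two previous ones, the same formula as FLAC's fixed predictor of order 2. The function names are made up.

```python
def second_diff_encode(samples):
    """Residual e[n] = x[n] - 2*x[n-1] + x[n-2], with x[-1] = x[-2] = 0."""
    residuals = []
    prev1 = prev2 = 0
    for x in samples:
        residuals.append(x - 2 * prev1 + prev2)
        prev2, prev1 = prev1, x
    return residuals


def second_diff_decode(residuals):
    """Exact inverse of second_diff_encode."""
    samples = []
    prev1 = prev2 = 0
    for e in residuals:
        x = e + 2 * prev1 - prev2
        samples.append(x)
        prev2, prev1 = prev1, x
    return samples


print(second_diff_encode([10, 20, 30, 40, 50]))  # [10, 0, 0, 0, 0]
```

It needs only two additions and a shift per sample, and smooth signals leave small residuals for the entropy coder.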

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-11 19:00:11
Thanks for testing! I did these speedups using only a couple 2h19m files, and so it's certainly possible that an edge-case bug crept in. I would certainly be surprised if this exists in the master branch. And no, there is no error checking.

Where can I find one of the files that fails, like the Shun Ward track? Thanks!
I uploaded 2 sample wav files. You can try it with these. In addition, it will be better to have a data verification mechanism.

{links removed}


MOD removed links.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: bryant on 2024-01-11 20:10:47
Thanks for testing! I did these speedups using only a couple 2h19m files, and so it's certainly possible that an edge-case bug crept in. I would certainly be surprised if this exists in the master branch. And no, there is no error checking.

Where can I find one of the files that fails, like the Shun Ward track? Thanks!
I uploaded 2 sample wav files. You can try it with these. In addition, it will be better to have a data verification mechanism.

{links removed}
Thanks!

I was not able to verify the problem with the first file (12.Legacy), however I was able to verify corruption with the second one (hiphop). The problem only seems to occur on the Windows executables, which makes me think it's something to do with mingw. Anyway, it should not be difficult to track down.

As for the data verification, I agree that having this is absolutely required for a real file format (WavPack has checks on both compressed and uncompressed bitstreams, plus an [optional] whole-file MD5). However that's not what SLAC is. It's just a simple lossless compression library with a demo program. It doesn't even have a file format; the demo program just concatenates the compressed frames (and so there's no seeking or tag support either). Some applications would not require data integrity checks, for example if the frames were sent in network packets that are CRC protected, so this is something that I think belongs in the client using the library to match their needs.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: bryant on 2024-01-12 04:00:50
I found the issue with SLAC 0.4 and have uploaded updated executables (now 0.41).

Thanks again for pointing this out! Turns out it was our old friend uninitialized memory, which explains why it worked sometimes. I also fixed a couple things uncovered with UBSAN, but they shouldn't have caused any trouble.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-12 11:55:59
I found the issue with SLAC 0.4 and have uploaded updated executables (now 0.41).

Thanks again for pointing this out! Turns out it was our old friend uninitialized memory, which explains why it worked sometimes. I also fixed a couple things uncovered with UBSAN, but they shouldn't have caused any trouble.

It's okay, David. I'm glad the problem was fixed quickly.
Actually, I suspected a bug of that kind, because its behavior made it fairly obvious. This is something I do often too.

Right now I have to take a break from HALAC for a few weeks for a different project. After that, I'm thinking of preparing a Multithread(autothread) version just like in HALIC.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-01-12 13:15:29
Multithread(autothread) version
FLAC got that so recently it is not in an official release (yet).
Turned out not to be so straightforward:
https://hydrogenaud.io/index.php/topic,123248.0.html
https://hydrogenaud.io/index.php/topic,124437.0.html
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-12 19:56:07
FLAC got that so recently it is not in an official release (yet).
Turned out not to be so straightforward:
https://hydrogenaud.io/index.php/topic,123248.0.html
https://hydrogenaud.io/index.php/topic,124437.0.html
Multi-core processors are now in our lives. If the data that needs to be processed is in large chunks, it is advantageous in terms of speed to send them to the cores by fragmenting them. However, if the processed data are small or normal in size, it is more efficient to distribute each of them to different cores.
I could not understand what kind of problems there were in distributing data blocks processed independently of each other to the cores. Unfortunately, I couldn't read all of the written information on the links you sent.
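Distributing independently coded blocks to cores can be sketched as below. zlib stands in for the audio coder and the function names are hypothetical. The sketch leaves out the issues a real format has to deal with, such as writing frames in order while streaming, or state shared across blocks.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_blocks(data: bytes, block_size: int = 1 << 16, workers: int = 4):
    """Split the input into independent blocks and compress them in parallel."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, blocks))  # map() keeps block order


def decompress_blocks(packed, workers: int = 4) -> bytes:
    """Decode the blocks in parallel and join them in their original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, packed))


data = bytes(range(256)) * 2000
packed = compress_blocks(data)
print(len(packed), decompress_blocks(packed) == data)
```

For many small files the same pool can be fed whole files instead of blocks, which matches the "distribute each of them to different cores" case.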
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-26 21:09:04
Hello. After a short break, I was able to prepare the 0.2.3 version. Since there were some structural changes, I did not publish version 0.2.2.

In summary, in this version:
* Normal mode and quick mode were combined (quick mode is selected with the -fast parameter).
* The structure was made suitable for multithreading.
* If the data is corrupted, the decoder can report the approximate location of the errors.
* Both encode and decode speed increased, especially on slow disks. The decode improvement is more pronounced.
* There is an improvement in the compression ratio of normal mode.

If there are no problems, I will prepare version 0.2.4 with multithreading in the coming days, and I will try to prepare Linux versions. However, as far as I have noticed, there is a small loss of performance on Linux. It is probably due to the compiler settings; I do all my work on Windows.

The tests performed with 2 different processors are as follows.
Code: [Select]
Intel i7 3770k
Sean Paul - 525,065,800 Size   Encode Decode
HALAC 0.2.3 Normal 383,196,885 3.250 3.297
HALAC 0.2.1 Normal 383,545,116 3.422 3.687

HALAC 0.2.1 Fast    396,199,439 2.391 3.266
HALAC 0.2.3 Fast    396,205,305 2.328 3.062
FLAC 1.4.3 Level 0 412,011,684 3.325 3.808   (flac -0b3072 -r0 --no-md5 --totally-silent)

Busta Rhymes - 829,962,880 Size Encode Decode
HALAC 0.2.3 Normal 575,791,554 5.224 5.125
HALAC 0.2.1 Normal 579,556,894 5.500 5.625

HALAC 0.2.1 Fast    600,200,683 3.797 5.140
HALAC 0.2.3 Fast    600,209,993 3.766 4.880
FLAC 1.4.3 Level 0 636,691,981 5.179 5.830
Code: [Select]
Amd Ryzen 7 5825u
Sean Paul - 525,065,800 Size    Encode Decode
HALAC 0.2.3 Normal 383,196,885 1.886 2.454
HALAC 0.2.1 Normal 383,545,116 1.883 2.568

HALAC 0.2.1 Fast    396,199,439 1.276 2.263
HALAC 0.2.3 Fast    396,205,305 1.308 2.161
FLAC 1.4.3 Level 0 412,011,684 1.730 2.066

Busta Rhymes - 829,962,880 Size Encode Decode
HALAC 0.2.3 Normal 575,791,554 2.997 3.851
HALAC 0.2.1 Normal 579,556,894 3.018 3.980

HALAC 0.2.1 Fast    600,200,683 2.039 3.504
HALAC 0.2.3 Fast    600,209,993 2.066 3.371
FLAC 1.4.3 Level 0 636,691,981 2.698 3.170   (flac -0b3072 -r0 --no-md5 --totally-silent)

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: freethesound on 2024-01-27 09:53:41
@Hakan Abbas the latest version, 0.2.4 is flagged with "Virus detected" by Chrome. I know it's a false positive, but, maybe you can do smth to fix it - there were no issues with previous versions.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-27 10:29:02
Thank you very much for the information. I have only uploaded version 0.2.3 so far, so I think that is the one you mean. However, I do not understand why this happened. I tested it with different antivirus software and nothing was flagged. I would be glad if you could tell me how to solve this problem.

https://virusscan.jotti.org/en-US/filescanjob/wy4730h5u8
https://virusscan.jotti.org/en-US/filescanjob/37ht3qv9i9
https://www.virustotal.com/gui/file/911b0ca772bc883e4cca76fa094bc25882833f5c55fee757ebea31efb3b2c548
https://www.virustotal.com/gui/file/0687ab319f052bb6298116f7b1cbba7a03f03e071c29bed3f5a6677fc87baa2f
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: freethesound on 2024-01-27 10:44:43
I've attached the message.

The good news is that Firefox doesn't report it.

I've found a form, i don't know if it helps:
https://safebrowsing.google.com/safebrowsing/report_error/?hl=en

You can check for more info ;)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-27 11:07:55
Thank you so much, you're great.
I filled out the form and sent it. I hope it works. And hopefully, I won't have to report all my subsequent executable files to google.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: freethesound on 2024-01-27 11:21:20
thanks for developing useful software 👍
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-01-31 14:09:24
Yes, HALAC 0.2.4 version is ready...
In this version, we can now use as many threads as we want. It is enough to set the "-mt=" parameter to the number of threads we want. This value may be between 1 and 1024 for now. The "-mt=1" option works as a single thread. The results and graphics are below. Since the biggest bottleneck in multithreaded processing is the disk read and write phase, high-performance disks give much better results. When choosing the multithread parameter, you may get the highest performance with a value above the number of hardware threads in the system. For example, for a system with 16 threads, "-mt=32" is quite suitable.

With HALAC, memory consumption during multithreaded processing is also very low. I did the tests with 5 large WAV files to see the effect of multithreading. 4 of these files are albums consisting of different music pieces. With normal-sized WAV files, the operations are already very fast, so they finish without loading the processor much.

Note: Data starting with "LIST" in the header section of some WAV files is stored. However, some files also carry other metadata at the end of the file. I have not handled that at this stage, so in such a case the decoder may give a warning. If possible, work with files that do not contain this data.
Code: [Select]
WAV DATA1	 671,670,064
WAV DATA2 525,065,228
WAV DATA3 801,695,096
WAV DATA4 829,962,044
WAV DATA5 128,434,392
TOTAL 2,956,826,824
Windows 10 x64, AMD Ryzen 3700X (8 core, 16 thread), 500 gb Samsung 970 EVO Plus, Corsair 16 gb DDR4 2133

                         Enc.     Dec.    Mem.
HALAC 0.2.4 Th1          8.67    10.72    2.4 mb
HALAC 0.2.4 Fast Th1     5.70    10.75    2.4 mb
HALAC 0.2.4 Th2          5.62     7.14    4.1 mb
HALAC 0.2.4 Fast Th2     4.48     7.14    4.1 mb
HALAC 0.2.4 Th4          3.36     4.48    6.5 mb
HALAC 0.2.4 Fast Th4     2.95     4.41    6.5 mb
HALAC 0.2.4 Th8          2.62     3.30   14.1 mb
HALAC 0.2.4 Fast Th8     2.33     3.42   14.1 mb
HALAC 0.2.4 Th16         2.09     2.87   24.8 mb
HALAC 0.2.4 Fast Th16    1.91     2.96   24.8 mb
HALAC 0.2.4 Th32         1.78     2.79   51.0 mb
HALAC 0.2.4 Fast Th32    1.76     2.74   51.0 mb
HALAC 0.2.4 Fast Th64    1.70     2.67   93.5 mb
HALAC 0.2.4 Th64         1.68     2.70   93.5 mb
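
The encode times in the table above can be turned into speedup figures. A small sketch using the normal-mode encode numbers from that table; the variable and function names are only for illustration:

```python
# Normal-mode encode times in seconds, copied from the table above, keyed by -mt value.
encode_seconds = {1: 8.67, 2: 5.62, 4: 3.36, 8: 2.62, 16: 2.09, 32: 1.78, 64: 1.68}


def speedups(times):
    """Speedup of each setting relative to the single-thread time."""
    base = times[1]
    return {threads: base / seconds for threads, seconds in times.items()}


for threads, factor in speedups(encode_seconds).items():
    print(f"-mt={threads:<3} {factor:.2f}x")
```

On these numbers most of the gain arrives by 8 threads and the steps after 16 are small, which would fit the remark that disk reading and writing becomes the bottleneck.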
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-02 18:10:58
Fired it at the file posted in https://hydrogenaud.io/index.php/topic,124862.msg1033901.html#msg1033901 .

On that particular signal, the normal mode produced ~15% bigger file than "fast", which in turn was nearly twice as big as flac -0.

I don't know whether your format can switch per frame between fast and normal, but at some stage in the development it could be worth trying what flac -m does for stereo decorrelation: every now and then it exhausts the methods, compares, and sticks to the best for some frames before it does that job again.
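
A rough sketch of the "try every stereo method and keep the best" idea, not FLAC's actual code. It assumes the four usual channel pairings and uses a crude cost proxy (sum of absolute first-order differences) in place of a real predictor and entropy coder; all names are illustrative:

```python
def residual_cost(channel):
    """Crude cost proxy: sum of absolute differences between neighbouring samples."""
    return sum(abs(b - a) for a, b in zip(channel, channel[1:]))


def best_stereo_mode(left, right):
    """Evaluate each channel pairing for one frame and return the cheapest one."""
    side = [l - r for l, r in zip(left, right)]
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    candidates = {
        "left/right": (left, right),
        "left/side": (left, side),
        "right/side": (right, side),
        "mid/side": (mid, side),
    }
    costs = {name: residual_cost(a) + residual_cost(b)
             for name, (a, b) in candidates.items()}
    return min(costs, key=costs.get), costs
```

Running this on every frame is the exhaustive variant; running it only every few frames and reusing the winner in between trades a little compression for speed.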
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-02 19:12:05
Hello Porcus.
I made a simple experiment with the file you sent. In fact, I just directed the entropy encoder a little more accurately. I can automate this. However, for now, it does not show the same performance on other files, because it is only a very quick build tuned for this file.

Metadata at the end of the WAV file is not used.
00 - Dan Worrall - I Won The Loudness War (fragments).wav : 15,211,216
FLAC Level - 8 : 869,772
HALAC-Normal-Custom : 613,673

What I want to point out here is that the arithmetic coding approach can really make a significant difference when it comes to simple (low-entropy) data. In most cases I didn't need it, because music data usually could not be compressed below 50%. However, I have not yet focused on the compression ratio. When the time comes, we can talk about it. Since I'm a little tired after the multithread work, I'm resting a bit now.

I especially wonder about your tests on Multithread ...
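
To illustrate the point about low-entropy data, here is a small sketch comparing the zeroth-order empirical entropy of a residual stream with the length of a plain Rice code. It is only an illustration under simple assumptions (memoryless model, zigzag mapping for signed values, no table or header overhead); it says nothing about HALAC's actual coder:

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Total zeroth-order entropy in bits: a floor for a memoryless model."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(c * math.log2(c / total) for c in counts.values())


def zigzag(n):
    """Map signed integers to non-negative ones: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * n if n >= 0 else -2 * n - 1


def rice_bits(samples, k):
    """Length of a Rice code: unary quotient, one stop bit, k remainder bits."""
    return sum((zigzag(s) >> k) + 1 + k for s in samples)


def best_rice_bits(samples, max_k=15):
    """Shortest Rice encoding over all parameters 0..max_k."""
    return min(rice_bits(samples, k) for k in range(max_k + 1))
```

A Rice code spends at least one bit per sample, so on a frame that is almost entirely zeros it cannot go below that, while an arithmetic or ANS coder can approach the entropy figure.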
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-05 07:12:51
I felt the need to clarify a point. A GUI (Graphical User Interface) is only a tool that provides ease of use. I have previously made GUI converters and players/viewers for the codecs I have developed (with Qt (http://qt.io)). I added the screenshots. With them I could run experiments quite fast by changing the parameters visually, and see the compression results and processing time clearly.

Similarly, I can develop a new cross-platform Converter/Player for HALAC. This is a much easier and more fun process than working on HALAC itself. But it's a little early for that now.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: fooball on 2024-02-05 09:14:44
BMP, PNG, TIFF ? ? ? ?
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-05 10:16:14
They are screenshots of the GUIs I developed previously. The ones belonging to HALIC got included too (they were in the same directory). I noticed it a little late, and unfortunately I can't edit the post now.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-09 11:54:39
HALAC 0.2.5 ready. In this version, the following operations are performed.

* Metadata that follows the header section of a WAV file, and metadata at the end of the file, is now stored.
* Both Encoder and Decoder speed increased.
* Some excess data caused by the functioning of the entropy encoder were discarded.
* A more efficient coding was provided for data with a high rate of compression.

I will also share the Linux version and some tests. Launching a new executable for every single file does not seem very efficient. Instead, it seems faster to pass all the files in a directory to a single run of the executable. I can add this feature in the next version.
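
For readers wondering where such metadata sits in a WAV file, here is a minimal sketch of a RIFF chunk walker. It assumes a plain RIFF/WAVE file (no RF64) held in memory; the function name is only illustrative:

```python
import struct


def list_riff_chunks(blob: bytes):
    """Return (chunk_id, payload_offset, payload_size) for each top-level chunk."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12
    while pos + 8 <= len(blob):
        # Each chunk starts with a 4-byte id and a 32-bit little-endian size.
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), pos + 8, size))
        pos += 8 + size + (size & 1)  # payloads are padded to an even length
    return chunks
```

Anything listed before the "data" chunk (for example "LIST") is header-side metadata, and anything listed after it is the trailing metadata mentioned above.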
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-09 14:40:35
OK, a few items:


** The new one. Tested sizes on my signature against the previous HALAC_ENCODE_custom.exe - not timed, my CPU is doing something and will throttle. (All tests done with -mt=7 - I don't know if they are supposed to create bit-identical streams?)
* 5.6 percent smaller files. Smallest number on harpsichord (1.5%), biggest on other classical
* -fast are pretty much the same size - 0.06 percent smaller files in total, and 32 of 38 files improve and none of them are getting much bigger. 0.33 percent smaller to 0.26 percent bigger.
All "percents" of old HALAC - not "percentage points"


** Wishlist for testing, would save me minutes every now and then:
Wildcards to handle *.wav (encoding, to generate filenamewithoutextension.halac) and *.halac for decoding.  A -y switch for "yes to overwrite".


** Timing test ... my CPU needs to be left doing nothing, and it hasn't for a week. Might come up.


** GUI: Sure neat, but for a high-speed thing ... wouldn't one rather pipe audio through it. Anyway: If you have a GUI that can load several files, you would likely send one file to each CPU thread?
You can maybe recycle one that is known here since long - that was customized for a whole bunch of formats:
http://web.archive.org/web/20120103153525/http://members.home.nl/w.speek/index.htm
The WavPack front-end lives on, can be downloaded from WavPack.com. The FLAC front-end was rewritten and FOSS'ed: https://sourceforge.net/projects/flacfrontend/ 
(I've had issues with the FLAC front-end, but apparently those were about how ancient flac.exe could misbehave on full drive.)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-09 18:56:00
Thank you for your interest in the subject, Porcus. I'm looking forward to seeing your tests.

I made a GUI (frontend) supporting many different formats for HALIC (for my own tests), because I had to be able to see the processing time and compression results clearly, and to view the images. I accidentally uploaded those screenshots in my previous post. However, I was really surprised that codecs such as FLAC and WavPack have such weak interfaces.

With the GUIs I have developed, you can work both one file at a time and with multiple threads. There is no problem in terms of speed, usability and modernity, and they are cross-platform. However, I don't know if this will really make a difference.

Since each track of music is around 30-50 MB, the real performance of Multithread is not seen in this test. And I added Linux Static executable files.
Code: [Select]
Windows 10 x64, AMD Ryzen 3700X (8 core, 16 thread), 500 gb Samsung 970 EVO Plus, Corsair 16 gb DDR4 2133
Busta Rhymes(20 tracks) - 829,962,880 bytes.

HALAC 0.2.5 Normal -mt=1: 2.976   3.813   574,048,583
HALAC 0.2.4 Normal -mt=1: 3.228   4.063   574,166,967
HALAC 0.1.9 Normal      : 3.296   4.362   579,556,734

HALAC 0.2.5 Normal -mt=2: 2.366   2.716   574,048,583
HALAC 0.2.4 Normal -mt=2: 2.453   2.875   574,166,967

HALAC 0.2.5 Normal -mt=4: 1.641   1.770   574,048,583
HALAC 0.2.4 Normal -mt=4: 1.753   2.030   574,166,967

HALAC 0.2.5 Fast -mt=1: 1.985   3.375   600,068,087
HALAC 0.2.4 Fast -mt=1: 2.179   3.609   600,102,660

HALAC 0.2.5 Fast -mt=2: 1.803   2.539   600,068,087
HALAC 0.2.4 Fast -mt=2: 1.845   2.611   600,102,660

HALAC 0.2.5 Fast -mt=4: 1.358   1.731   600,068,087
HALAC 0.2.4 Fast -mt=4: 1.379   1.743   600,102,660
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-09 21:26:16
Ah, OK. From 0.2.4 to 0.2.5:

Minor improvements for every CD in both normal and fast mode. There are no ID3 chunks in the WAVE files, so the header/footer sizes are small - and only one file per full album.

Normal mode: improvements ranging from .004 percent to .692 percent, median .039, mean .076.
Fast mode: improvements ranging from .002 percent to .032 percent, median .006, mean .008.

Biggest improvement for Miles Davis (contains a lot of mono), and for these classical CDs: Bruckner, Cage, Vaughan Williams. Then James Brown, which also contains a lot of mono.
That's for both modes, although the "biggest" numbers for fast are like 1/20th size.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-10 19:21:14
Thank you very much for your findings, Porcus. The console parameters you asked for are done. I will make a few more adjustments and share it as a new version in the coming days.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-14 19:20:06
HALAC 0.2.6 is ready.
* In this version, I tried to improve the compression ratio in general, while taking care not to reduce the processing speed. The SQUEEZE CHART test data contains a wide variety of music types.
* Overwriting of output files is provided with the "-y" parameter. Otherwise, no action will be taken. (Interestingly, I was looking for errors in the codes because I forgot to use this parameter that I added myself!)
* The "directory usage" option has been added. Since multiple files are processed in series on a single executable file in this way, the speed increase is noticeable. The "*" symbol is important in the form of use. The "*" symbol activates the directory processing mode.

Directory usage examples;
Code: [Select]
halac_encode * *
Converts all "wav" files in the same directory to "halac" files.

Code: [Select]
halac_decode C:/MUSIC/* D:/ABC
Converts all "halac" files in the MUSIC directory to the ABC directory as "wav".

Code: [Select]
Windows 10 x64, AMD Ryzen 3700X (8 core, 16 thread), 500 gb Samsung 970 EVO Plus, Corsair 16 gb DDR4 2133

TEST_AUDIO_SQUEEZE_CHART - 606,527,908
HALAC FAST 0.2.5 -   1 Thread : 1.321, 2.344, 389,279,259
HALAC FAST 0.2.6 -   1 Thread : 1.378, 2.502, 386,065,320
HALAC NORMAL 0.2.5 - 1 Thread : 2.161, 2.815, 372,183,245
HALAC NORMAL 0.2.6 - 1 Thread : 2.222, 2.950, 367,340,182

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-02-14 20:46:17
HALAC crashes on Windows 7 x64 with Core2Duo SSE3 since the inception of this thread (v0.1.9 ... 0.2.6).
All other known encoders work fine. Could you be so kind to do something about that, @Hakan Abbas?

(https://i3.imageban.ru/out/2024/02/14/c1495b9a08116f1c49163c266f608d40.png) (https://i6.imageban.ru/out/2024/02/14/18cffb502b51b4e07f8fd9ac75b79151.png)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-15 04:29:07
Hi Kraeved.
I think the error is about the instruction set. I compile HALAC with -avx because it runs a little faster. I do not use SIMD manually. Therefore, it is normal that it does not work on processors that predate AVX. I have now compiled it for SSE2.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-02-15 06:25:08
@Hakan Abbas, it works now, thank you. Consider adding some file extension to the help screen next to output_file, e.g. output_file.hlac, and perhaps some meaningful error message if that argument is not provided (right now the same help screen is displayed again). Is it not possible to listen to the encoded file yet? So HALAC is more of a proof-of-concept compressor then. A step forward would be to create a playback component for Foobar2000, as in the case of QOA (https://www.foobar2000.org/components/view/foo_qoa).
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-15 10:09:27
I will add the arrangements you mentioned(input/output) in the next version.

As for playing HALAC files, I am thinking of preparing a useful converter and player as a GUI (frontend). I shared some screenshots of my old work in my previous posts. Or I can provide a .dll/.so file for other players to use. In that case nothing would be written to a file: the decoded WAV data written to memory would be available to other audio players. However, since there are already many audio players, some people have said it makes no sense to develop yet another one. I will evaluate these options in the future.

However, I am currently dealing with HALAC's compression and speed. After bringing them to a certain level, we can take care of other things in detail.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-19 07:51:56
In a different forum, detailed tests on HALAC 0.2.6's compression ratio improvement and multithread performance can be found at the following link.

https://encode.su/threads/4180-HALAC(High-Availability-Lossless-Audio-Compression)?p=82158&viewfull=1#post82158

I would be happy if the dear members of this forum did different tests as well (before I switch to 24-bit audio). After all, hydrogenaud.io is a forum specializing in audio.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-20 00:25:36
Compared sizes of 0.2.4, 0.2.5 and 0.2.6 on the same 38 CDs.
0.2.6 does better overall, but not on the heavier part of the corpus, where it was worst on 8 of ten albums (both in normal and fast).

"&" to separate "normal & fast":

* 0.2.4 was never smallest.  0&0 of the 38&38 albums. It compressed worst on 21&21 and in total, but not on the ten heavier albums.
* 0.2.5 was smallest on 18&17 albums, and on the heavier corpus (beating 0.2.6 by 0.17&0.23 percent). It never compressed worst.
* 0.2.6 was smallest on 20&21 albums, and on the two other corpi: beating 0.2.5 by 0.77&0.19 percent at classical and 0.24&0.54 percent on the "other" genres.

For the ten heavy synth/rock/metal albums,
8+7: 0.2.6 was worst and 0.2.5 best.
0+1: 0.2.6 was worst and 0.2.4 best, although all within 0.015 percent.

I realize that the above is percent and the following are percentage points ... anyway you will get the idea.

Most significant impact from 0.2.6 - on all these, 0.2.5 and 0.2.4 were about the same:
From testing on FLAC, the two shortest ones - the Jordan Rudess EP and Wovenhand - could make big differences. So they did here: > 2.1 percentage points on normal mode, > 1.8 percent in fast.
Then a few where you got 0.8 to 1 percentage point-ish improvement:
Bach (Organ), Bruckner, Mozart, Vivaldi (normal; fast was ~half of that)
In fast only: Jan Johansson and Kraftwerk


Don't know what to take home from this.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-20 10:44:20
Thanks for your concern, Porcus.
According to the latest tests, the speed of "HALAC-Normal" and "HALAC-Fast" remained almost constant. But there have been some improvements in compression ratios. I should point out that trying to increase the compression ratio more when working at these speeds is an extremely difficult thing. However, there is still some more space that can be compressed. I'm trying to close this gap without compromising on speed.

I didn't get any feedback from here about Multithread. So I understand that in this case the tests I have done and my results have been accepted. There are already tests that have been done by others on the link in the previous post. HALAC works quite efficiently depending on the number of threads.

Below is a test performed by a different source ( @a902cd23 (https://encode.su/members/3334-a902cd23)). Even if the thread count is 24, it seems that the processor does not get a full load. In this case, faster results can be achieved by increasing the number of threads.
Code: [Select]
Intel 13700k on ramdisk, 16 core, 24 thread
F540AC6E.wav : 1,062,989,800 bytes

          Normal encode        Fast encode          Normal decode        Fast decode
thread 1  Elapsed: 0:00:02,74  Elapsed: 0:00:01,61  Elapsed: 0:00:04,01  Elapsed: 0:00:03,26
thread 2  Elapsed: 0:00:01,92  Elapsed: 0:00:01,07  Elapsed: 0:00:02,34  Elapsed: 0:00:02,05
thread 3  Elapsed: 0:00:01,41  Elapsed: 0:00:00,80  Elapsed: 0:00:01,61  Elapsed: 0:00:01,42
thread 4  Elapsed: 0:00:00,93  Elapsed: 0:00:00,70  Elapsed: 0:00:01,24  Elapsed: 0:00:01,14
thread 5  Elapsed: 0:00:00,79  Elapsed: 0:00:00,62  Elapsed: 0:00:01,08  Elapsed: 0:00:00,93
thread 6  Elapsed: 0:00:00,68  Elapsed: 0:00:00,55  Elapsed: 0:00:00,93  Elapsed: 0:00:00,86
thread 7  Elapsed: 0:00:00,62  Elapsed: 0:00:00,52  Elapsed: 0:00:00,85  Elapsed: 0:00:00,76
thread 8  Elapsed: 0:00:00,66  Elapsed: 0:00:00,49  Elapsed: 0:00:00,78  Elapsed: 0:00:00,70
thread 9  Elapsed: 0:00:00,65  Elapsed: 0:00:00,48  Elapsed: 0:00:00,76  Elapsed: 0:00:00,71
thread 10 Elapsed: 0:00:00,64  Elapsed: 0:00:00,48  Elapsed: 0:00:00,77  Elapsed: 0:00:00,70
thread 11 Elapsed: 0:00:00,64  Elapsed: 0:00:00,47  Elapsed: 0:00:00,71  Elapsed: 0:00:00,67
thread 12 Elapsed: 0:00:00,60  Elapsed: 0:00:00,46  Elapsed: 0:00:00,68  Elapsed: 0:00:00,63
thread 13 Elapsed: 0:00:00,60  Elapsed: 0:00:00,44  Elapsed: 0:00:00,69  Elapsed: 0:00:00,62
thread 14 Elapsed: 0:00:00,57  Elapsed: 0:00:00,46  Elapsed: 0:00:00,65  Elapsed: 0:00:00,60
thread 15 Elapsed: 0:00:00,53  Elapsed: 0:00:00,40  Elapsed: 0:00:00,63  Elapsed: 0:00:00,57
thread 16 Elapsed: 0:00:00,54  Elapsed: 0:00:00,40  Elapsed: 0:00:00,63  Elapsed: 0:00:00,57
thread 17 Elapsed: 0:00:00,52  Elapsed: 0:00:00,42  Elapsed: 0:00:00,60  Elapsed: 0:00:00,58
thread 18 Elapsed: 0:00:00,51  Elapsed: 0:00:00,40  Elapsed: 0:00:00,59  Elapsed: 0:00:00,57
thread 19 Elapsed: 0:00:00,48  Elapsed: 0:00:00,38  Elapsed: 0:00:00,58  Elapsed: 0:00:00,55
thread 20 Elapsed: 0:00:00,45  Elapsed: 0:00:00,40  Elapsed: 0:00:00,55  Elapsed: 0:00:00,53
thread 21 Elapsed: 0:00:00,47  Elapsed: 0:00:00,41  Elapsed: 0:00:00,53  Elapsed: 0:00:00,53
thread 22 Elapsed: 0:00:00,44  Elapsed: 0:00:00,37  Elapsed: 0:00:00,52  Elapsed: 0:00:00,53
thread 23 Elapsed: 0:00:00,42  Elapsed: 0:00:00,37  Elapsed: 0:00:00,54  Elapsed: 0:00:00,53
thread 24 Elapsed: 0:00:00,41  Elapsed: 0:00:00,39  Elapsed: 0:00:00,51  Elapsed: 0:00:00,51

428 705 658  F540AC6E.hal
507 805 612  F540AC6E.fast.hal
(https://encode.su/attachment.php?attachmentid=11122&d=1708165533)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-20 13:23:35
According to the latest tests, the speed of "HALAC-Normal" and "HALAC-Fast" remained almost constant. But there have been some improvements in compression ratios.
Yeah, except for metal music it seems - then it became worse.
It is not just about "dense music with high bitrate", because the Mahler's big orchestra symphony did improve.

I didn't get any feedback from here about Multithread.
I have not yet been able to run (reliable) speed tests, sorry.

Question: I understand that -mt=7 means seven "worker" threads + one "bookkeeping" thread? My CPU is 4 cores 8 threads.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-20 18:07:35
HALAC's multithread mode works a little differently. In fact, the thread numbers do not exactly reflect the threads. My previous work HALIC (https://encode.su/threads/4025-HALIC-(High-Availability-Lossless-Image-Compression)) is much more greedy (autothread). It works at full capacity, but it also consumes a lot of system memory. I wanted something much less memory-hungry for HALAC, because it should be able to work even on the lowest-end systems. And I really liked the result.

HALAC redirects the data to be compressed to the specified number of threads in sequence. I say sequentially because the read/write operations from the file should be sequential for security reasons. In this case, since HALAC is already very fast, some threads may finish their tasks early. This also leaves a little rest allowance for the processor.

If this is not desired, that is, if we want to use the processor at full capacity as much as possible, we may need to set the thread number above the normal one. However, on very powerful processors and disks, this may not be necessary. We have seen this in some tests.

If we think in terms of the processor above (i7 13700k, 16 core, 24 thread), only about 60% of the processor has been used. So there is still some capacity left to use. For example, different results occurred on 5 different computers that I use: the i7 3770k (4-core, 8-thread) performed best at 16 threads, the Ryzen 7 3700x (8-core, 16-thread) performed better at 64 threads, and the i7 9700k (8-core, 8-thread) performed better at 32 threads. You can look for the most efficient setting on the computer you use. In the normal case, it is enough to set it to the processor's number of threads.

However, it should be remembered that the data to be tested should be one piece and large in order to get accurate results. 1 gb and more.
Code: [Select]
Ryzen 7 3700x(8-core, 16-thread), ~3 gb wav data
HALAC-Normal 16 thread encode time : 2.845
HALAC-Normal 128 thread encode time: 2.326
Left image is 16 Threads and right image is 128 Threads
(https://encode.su/attachment.php?attachmentid=11126&d=1708173754)(https://encode.su/attachment.php?attachmentid=11127&d=1708173766)
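
One way to picture "sequential read and write, parallel compression" is the sketch below. It is an assumption-laden illustration, not HALAC's scheduler: zlib stands in for the codec, the 4-byte length prefix is a made-up framing, and max_in_flight is a hypothetical knob that bounds memory use:

```python
import io
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def _write_block(dst, payload):
    dst.write(len(payload).to_bytes(4, "little"))  # hypothetical length prefix
    dst.write(payload)


def compress_stream(src, dst, block_size=1 << 20, workers=4, max_in_flight=8):
    """Read blocks in order, compress them in parallel, write them in the same order."""
    pending = deque()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            block = src.read(block_size)  # reads stay sequential
            if not block:
                break
            pending.append(pool.submit(zlib.compress, block))
            if len(pending) >= max_in_flight:
                # Wait for the oldest block so output order and memory stay bounded.
                _write_block(dst, pending.popleft().result())
        while pending:
            _write_block(dst, pending.popleft().result())


source = io.BytesIO(bytes(range(256)) * 5000)
target = io.BytesIO()
compress_stream(source, target, block_size=100_000, workers=3, max_in_flight=2)
```

In a scheme like this, workers that finish early sit idle while the writer waits for the oldest block, which is one plausible reason why asking for more threads than the processor has can raise utilization.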
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-24 07:36:01
Finally had the computer free to run a rough multithreading timing. Encoding, normal mode. i5-1135G7, 8 threads on 4 cores, fanless computer.

* Graphed: median over three runs, and then the bars are standard deviation among the three.

* Run "from hot": three warm-ups discarding the timing, and then timed three. That might not have been enough on this computer:
Because -mt=10 was run first, the CPU might not have started throttling until 0.2.6 at -mt=9.

* Order: FOR threads in (10, ..., 0) <--10 first
  DO FOR versions 0.2.4, 0.2.5, 0.2.6 in that order
  DO encode the 38 CDs six times (deleting *.halac in between)


But it is hard to interpret without knowing how much it actually takes from each thread.

Sooo ... if I were to do this in fast mode, I don't even know if the relevant is to run it over a long time first to reach some thermal equilibrium numbers, or to let it cool down for a couple of minutes in between?
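
For anyone wanting to repeat this kind of measurement, a minimal sketch of the procedure described above (untimed warm-ups, then timed runs summarized by median and standard deviation). The function name and defaults are illustrative, and it does nothing about thermal throttling:

```python
import statistics
import time


def benchmark(func, warmups=3, runs=3):
    """Call func warmups times untimed, then time it runs times."""
    for _ in range(warmups):
        func()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # Median is less sensitive to one slow run than the mean.
    return statistics.median(timings), statistics.stdev(timings)
```

Here func would be whatever launches one full encode of the corpus; with the defaults it is called six times in total.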
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-24 11:34:34
Thank you, Porcus.

You just need to use 0.2.6. It is more stable and does not differ much from others in terms of speed. (mt=0 and mt=1 are the same)
Since HALAC is extremely fast and each track of music is processed separately, the processes end quickly. Therefore, it is not possible to go much further when processing music tracks with an average size of 40-50 MB. In other words, the processed data fragments remain small. In my tests, the total processor usage is around 30% with an average speedup of 2x. This is in line with your results. In such cases, it is much more efficient to send each music track to a separate thread.

Therefore, it is more appropriate to test on one piece of large music (1gb and over). If possible, you can make one or several CDs into one track and do the tests that way. And you can force HALAC up to 16 or even 32 threads on your processor.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-02-24 11:54:07
Ah. 1 and 0 are the same, and the likely reason they took more time run with 0.2.4 and 0.2.5 is that the processor was still cooling down after -mt=2.
If so: I see the first timed 0.2.6 -mt=1 run was down in time while the last 0.2.5 was not, so it took six to nine of those runs to reach a thermal equilibrium where it would clock up again. That's 10 to 15 minutes. Of course, cooling down idle would be quicker. But this is what I get for a fanless computer. I got what I paid for, and that was not reliable timing ...

Therefore, it is more appropriate to test on one piece of large music (1gb and over).
Up to 4 GB works in WAVE, but for later maybe support the RF64 and BW64 extensions to the WAVE format, to allow > 4 GB?
https://hydrogenaud.io/index.php/topic,123862.msg1036523.html#msg1036523 , replies forty-something: BW64 is set to obsolete RF64.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-24 12:58:35
Yes, the WAV format has a 4 GB restriction. However, the size of the file does not matter much for HALAC. To support larger files, I can use the libbw64 (https://github.com/ebu/libbw64) library in later versions. For now, you can perform your tests with WAV files below 4 GB.
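
The 4 GB restriction comes from the 32-bit size fields in the RIFF header. A quick sketch of how much audio that allows, ignoring the small header overhead; the function name is only for illustration:

```python
def max_wav_seconds(sample_rate=44100, channels=2, bits_per_sample=16):
    """Longest PCM stream whose byte count still fits in a 32-bit size field."""
    max_bytes = 2**32 - 1
    bytes_per_second = sample_rate * channels * (bits_per_sample // 8)
    return max_bytes / bytes_per_second


print(f"{max_wav_seconds() / 3600:.2f} hours of 16-bit stereo at 44.1 kHz")
```

So several CDs joined into one track still fit comfortably, which is enough for the multithread tests suggested above.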
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-02-24 18:17:48
@Hakan Abbas, by the way, I tried to use HALIC (https://encode.su/threads/4025-HALIC-(High-Availability-Lossless-Image-Compression)/page3?s=e1f839a8328e5437a8d2a9f98e01d2d0) of yours, and it refused to work on my end for the same reason (https://hydrogenaud.io/index.php/topic,125248.msg1039342.html#msg1039342) as HALAC. It remains a mystery to me how you managed to talk through 77 pages up to version 0.7.1 and ignore the pool of users without modern processors, whereas WEBP and JXL have no such restrictions and serve well. Perhaps you'll find a way to upload a binary there that does not require AVX instructions set.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-02-24 19:25:08
@Kraeved Codecs developed by certain teams (Google, Cloudinary, Apple, AOM), such as WebP, AVIF, HEIF and JXL, work by choosing according to the processor architecture (SSE, AVX, AVX2). Of course, this option could have been added, but it was slightly behind in the order of priority. Because it really wasn't easy for me (speed/ratio/memory/mt) to cope with such powerful image codecs.

HALIC can run a little faster when compiled in AVX2 mode. However, despite requests, I did not do this, in order to support slightly older architectures. I thought AVX (2011) would be enough. Until now, no such request had come from anyone but you. Still, it only takes me a few minutes to prepare a version for older processors. You can get the SSE2 version I compiled for HALIC from the link below.
https://github.com/Hakan-Abbas/HALIC-High-Availability-Lossless-Image-Compression-/releases

HALIC is by far the best Lossless Image Codec according to "F_Score (universal score) (https://gdcc.tech/rules/)".
F_Score = C+2·D+(S+F)/10⁶
"Here, C and D are respectively the total compression and decompression execution time (in seconds), S is the total compressed size in bytes, and F is the submission packet size."
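As a minimal sketch, the quoted formula translates directly into code. The function and parameter names are illustrative; since every term is a cost (time or size), a lower score is presumably better:

```c
/* F_Score = C + 2*D + (S + F) / 10^6
   c_seconds, d_seconds: total compression / decompression time in seconds
   compressed_bytes:     total compressed size S in bytes
   packet_bytes:         submission packet size F in bytes */
double f_score(double c_seconds, double d_seconds,
               double compressed_bytes, double packet_bytes)
{
    return c_seconds + 2.0 * d_seconds
         + (compressed_bytes + packet_bytes) / 1e6;
}
```

Note that decompression time is weighted twice as heavily as compression time, and one megabyte of output costs as much as one second of encoding.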
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: mudlord on 2024-03-01 08:00:52
Quote
It remains a mystery to me how you managed to talk through 77 pages up to version 0.7.1 and ignore the pool of users without modern processors, whereas WEBP and JXL have no such restrictions and serve well. Perhaps you'll find a way to upload a binary there that does not require the AVX instruction set.

AVX has existed since Sandy Bridge, a 2011 CPU. I truly don't see how that's a problem.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: ktf on 2024-03-01 08:37:47
Quote
AVX has existed since Sandy Bridge, a 2011 CPU.

Yes, but Intel also launched Tremont in 2021, which still lacked it. These CPUs are still actively being sold today. You could encounter them in HTPCs, which are certainly being used to decode audio.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-03-01 11:35:40
Quote
AVX has existed since Sandy Bridge, a 2011 CPU. I truly don't see how that's a problem.

“There are more things in Heaven and Earth, Horatio, than are dreamt of in your philosophy” (Hamlet 1:5).

As Shakespeare accurately observed several centuries ago, the world is much more complicated than it seems. Are you aware that thousands of people in the UK are dying from the cold (https://www.theguardian.com/commentisfree/2020/feb/27/dying-cold-europe-fuel-poverty-energy-spending)? Please note, this is not a former African colony or a place where the Americans brought democracy with bombs and coups. Also, do you know that the wealthy author of Game of Thrones writes in a DOS editor (https://www.bbc.com/news/technology-27407502)? The point is that upgrading equipment is not only a challenge for some people (more about that in 'Design for the real world (https://www.goodreads.com/book/show/190560.Design_for_the_Real_World)' the book by an engineer Victor Papanek), it may not be a priority at all, since it works well enough for current needs and is in no hurry to become a waste (more about that in 'Samsara (https://www.youtube.com/watch?v=Viz6NJEpTvI)' the docufilm by Ron Fricke). Think about MD5 hashing algorithm, which is still used to verify audio data, while the progress, whatever that means, suggests using xxHash and Blake3. And what is AVX? It is an optimization to calculate underlying math faster and a way to beat the competitors in a synthetic benchmark, not a solution per se, whereas the world needs solutions first, i.e. widely accessible tools to ease the burden. FLAC and WavPack are two good examples of how an intellectual breakthrough can go hand in hand with humanitarian values.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-03-01 13:48:21
It is kinda up to the author/developer to define goals, and sure if the purpose is to impress enthusiasts with the fastest possible encoding and decoding then why leave weapons on the table. And this has been impressive yeah.

But uses for an ultra-lightweight codec developed in 2024 - what would that be?
Suggestion: canned audio with low-power CPU on chip? In which case, sure post optimized compiles, but at least make one that works with "nothing sophisticated"?


FWIW, Intel's most recent Atom launch - one year ago - were these two CPUs, without AVX: 6W dual core (https://ark.intel.com/content/www/us/en/ark/products/230380/intel-atom-x6214re-processor-1-5m-cache-1-40-ghz.html) and 9W quad core (https://ark.intel.com/content/www/us/en/ark/products/230379/intel-atom-x6416re-processor-1-5m-cache-1-70-ghz.html). More likely to be used in products like this (https://uk.insight.com/en_GB/shop/product/12NH0000UK/LENOVO/12NH0000UK/Lenovo-ThinkEdge-SE10--USFF--Atom-x6214RE--4-GB--SSD-64-GB-SSD-256-GB/) than anything else, sure.
(No AVX ... but no nothing?)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Octocontrabass on 2024-03-01 17:59:48
Quote
(No AVX ... but no nothing?)

Those two Atom CPUs have Tremont cores, which support all the usual SSE instructions (minus AMD's SSE4A), plus the new GFNI SSE instructions. One of those new instructions, GF2P8AFFINEQB, can accelerate all kinds of calculations. I wouldn't be surprised if HALAC is already using its AVX counterpart.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-03-01 20:02:10
Code: [Select]
Busta Rhymes - 20 tracks - 829,962,880 bytes
Amd Ryzen 5825u, Encode and Decode Times
HALAC Normal AVX2 : 2.309 s, 3.621 s
HALAC Normal AVX  : 2.445 s, 3.429 s
HALAC Normal SSE2 : 3.084 s, 3.494 s
HALAC Normal SSE  : 3.130 s, 3.497 s

HALAC Fast AVX2 : 1.451 s, 2.780 s
HALAC Fast AVX  : 1.511 s, 2.995 s
HALAC Fast SSE2 : 1.756 s, 3.045 s
HALAC Fast SSE  : 1.751 s, 3.048 s
I hadn't done such a test before, but it was an interesting experience. According to these results, different instruction sets can have small positive and negative effects. You can try the SSE2 and AVX versions. However, the compiler's automatic vectorization alone, without manual SIMD code, did not make a significant difference.

For example, "HALAC Normal AVX2" lagged behind the others in the decode process. This is not normally expected. Probably, HALAC can be further accelerated by manual SIMD optimization. I also intend HALAC for wide use, not a narrow niche. My work on HALAC continues. It is only 4 months old.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-03-22 20:17:47
Ran HALAC through one of my stupid-signals tests over at https://hydrogenaud.io/index.php/topic,125607.0.html .  Not the upsamples, but the pitch shifted ones.
All are "nominally CDDA", but pitch shifted (preserving tempo) - created to emulate a lot of the same situation, signals lacking the top octave to the top 3+ octaves. The legend is the max frequency.  For more description, see that thread.

It is hard to represent this without it turning into a mess, but a log scale of file sizes gives reasonably close to straight lines. Numbers are compression ratio relative to WAVE:
(https://i.imgur.com/ZXiJYtz.png)
You see that as the signals miss treble, HALAC fast is losing out - only moderately though - until "overtaken" for last spot by ffmpeg's WavPack encoder. HALAC normal keeps up against FLAC's fixed predictor scheme for ~an octave pitch shift, but not at all for two - but then, the fastest FLAC is surprisingly good.

So here I do everything relative to FLAC -0b4096 -r0; this is not log scale. 1.5 means 50 percent larger than FLAC, 0.75 means 25 percent smaller.
(https://i.imgur.com/P8VlUnK.png)
Horizontal lines means it "keeps its ratio to FLAC's fixed predictor size". You see HALAC normal does that in the beginning - and the simplest TAK and the most brutally slow ALS do that in the end.

This is obviously not WavPack-friendly material. Although nowhere near the upsamples, the impacts are huge: changing from WavPack to TAK usually doesn't halve the file size - but even still, there are so many that cluster up around fixed-predictor FLAC in the end.

(Why I used those extra parameters on -0? -b4096 for an apples to apples comparison to the same block size that the others use. -r0 to make sure I didn't get any too good results from that clever partitioning trick.)

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-03-23 07:30:44
This was a good test, Porcus. You are really doing this sport ;) I have had to take a bit of a break from HALAC lately because of a heavy workload elsewhere.

HALAC does not use a fixed estimator yet. That's why it can sometimes act quite aggressively. This can be seen in your tests.
Fixed estimators give really good results in special cases. For example, the sample wav file at https://hydrogenaud.io/index.php/topic,125248.msg1038859.html#msg1038859 can be compressed down to 397 kb with just one fixed estimator. It is really very difficult to get down to this size with other methods.

However, in the general case, the effect of fixed estimators is not noticeable. That's why I didn't do calculations specifically for fixed estimators, because every intervention brings an extra processing load. I try to be as careful as possible not to compromise the speed, because in ultra-fast situations our hands are tied most of the time.

A version with a slightly better speed/compression ratio balance may come soon. In this version I may also use some fixed estimators for special cases. The new entropy coding phase that I will add in this version will be quite important. I just want to be absolutely sure about some things.
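To illustrate what a fixed estimator is, here is a minimal sketch of FLAC-style fixed predictors of order 0 to 2 with a simple per-block order choice. This is an assumption made for illustration: the thread does not say which estimators HALAC would use, and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Residual of sample i under a fixed polynomial predictor of the given
   order (0..2), as used by FLAC's fixed subframes. Requires i >= order. */
int32_t fixed_residual(const int16_t *x, size_t i, int order)
{
    switch (order) {
    case 0:  return x[i];
    case 1:  return (int32_t)x[i] - x[i - 1];
    default: return (int32_t)x[i] - 2 * (int32_t)x[i - 1] + x[i - 2];
    }
}

/* Sum of absolute residuals over a block: a cheap stand-in for coded size.
   The first `order` warm-up samples are skipped, which slightly favours
   higher orders on very short blocks. */
uint64_t fixed_cost(const int16_t *x, size_t n, int order)
{
    uint64_t cost = 0;
    for (size_t i = (size_t)order; i < n; i++) {
        int32_t r = fixed_residual(x, i, order);
        cost += (uint64_t)(r < 0 ? -(int64_t)r : r);
    }
    return cost;
}

/* Pick the order (0..2) with the smallest cost for this block. */
int best_fixed_order(const int16_t *x, size_t n)
{
    int best = 0;
    uint64_t best_cost = fixed_cost(x, n, 0);
    for (int order = 1; order <= 2; order++) {
        uint64_t c = fixed_cost(x, n, order);
        if (c < best_cost) { best_cost = c; best = order; }
    }
    return best;
}
```

The coefficients are constants, so nothing has to be estimated or stored per block beyond the order itself. The extra pass over the block to compare costs is the kind of added processing load mentioned above.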

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-04-04 08:33:02
I have shared these results before, but since it was in the "News" title, the results of HALAC were removed.
Code: [Select]
AMD Ryzen 3700x (8 core, 16 thread)
Single Part Wav: 2,577,542,764 bytes

WAVPACK Normal Thread-1: 34.84 s, 30.45 s, 1,617,702,120 bytes // wavpack.exe --threads=1 input output
WAVPACK Normal Thread-12: 9.22 s,  5.88 s, 1,617,400,684 bytes // wavpack.exe --threads=12 input output
WAVPACK Fast Thread-1: 29.01 s, 26.29 s, 1,652,567,206 bytes // wavpack.exe --threads=1 -f input output
WAVPACK Fast Thread-12: 7.74 s,  5.55 s, 1,652,389,896 bytes // wavpack.exe --threads=12 -f input output

HALAC Normal Thread-1:  8.571 s, 11.580 s, 1,669,302,550 bytes // halac.exe input output -mt=1
HALAC Normal Thread-12: 2.225 s,  2.553 s, 1,669,302,550 bytes // halac.exe input output -mt=12
HALAC Normal Thread-32: 1.789 s,  2.253 s, 1,669,302,550 bytes // halac.exe input output -mt=32
HALAC Fast Thread-1:  5.095 s, 9.895 s, 1,755,209,521 bytes // halac.exe input output -fast -mt=1
HALAC Fast Thread-12: 2.074 s, 2.547 s, 1,755,209,521 bytes // halac.exe input output -fast -mt=12
HALAC Fast Thread-32: 1.959 s, 2.132 s, 1,755,209,521 bytes // halac.exe input output -fast -mt=32
I can run HALAC faster in multithread mode (autothread). Like HALIC, it only consumes a little more memory. Below is the processor usage for 12 threads. HALAC puts much less load on the processor.
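The scaling in the table can be summarised with two small helpers (names are illustrative). For the Normal-mode encode times above, 8.571 s against 2.225 s is a speed-up of about 3.85x on 12 threads; WavPack Normal goes from 34.84 s to 9.22 s, about 3.78x.

```c
/* Speed-up of a multithreaded run relative to the single-thread run. */
double speedup(double t_single_s, double t_multi_s)
{
    return t_single_s / t_multi_s;
}

/* Parallel efficiency: speed-up divided by the number of threads used. */
double efficiency(double t_single_s, double t_multi_s, int threads)
{
    return speedup(t_single_s, t_multi_s) / threads;
}
```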

XX
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-04-04 10:51:15
Quote
I have shared these results before, but since it was in the "News" title, the results of HALAC were removed. I can run HALAC faster in multithread mode (autothread). Like HALIC, it only consumes a little more memory. Below is the processor usage for 12 threads. HALAC puts much less load on the processor.

Or the results were removed because HALAC is still the intellectual fun of its creator, and not a solution for the masses. Why don't the latter include this encoder in their comparisons, is it because they are unaware of its existence? But how can you miss an encoder whose executable name, including the extension, is capitalized? The trouble is that encoded files cannot be listened to, let alone edited, they can only be stored. The album of your favorite artist first needs to be unpacked, i.e. to take up a significant free space. A place in the hall of fame is earned not by charts illustrating your mathematical agility, but by bringing relief to our earthly vale of tears, as @bryant of WavPack has been doing for years. And the longer you stay at an academic distance, away from the daily torments and hopes of the masses, the more the masses become convinced that HALAC is nothing more than a touring exhibition of exotic animals — they can be brighter and fluffier than domestic cats and dogs, but they are picky about local food and can hardly be trained, so you can only take pictures with them as a souvenir.

(https://i1.imageban.ru/out/2024/04/04/ca918a23ec669a278565bf9d6551dd25.jpg)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-04-04 19:58:59
@Kraeved; you have written some really nice things. However, it is quite easy for me to prepare a player, converter or plug-in for HALAC. It is true that I have received a lot of feedback on this issue. There are things that I have prepared earlier on the previous pages.

The priority for me at this stage is the balance between compression ratio and speed. To me, this is the most important part of a codec. Once I have done this as well as I can, I can do the other things while sipping my coffee. I can even hand those stages over to my students. But dealing with the algorithmic part of the work is really no fun. You can be sure of that.

I had to give HALAC a break for a while because I've been working on a different project for the last month. In the coming weeks, I need to start where I left off and see what I can do.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-04-23 14:11:03
Yes, after a little break, HALAC version 0.2.7 is ready. I need to warm up a little more to continue where I left off.
In this version, small structural changes and a small compression ratio improvement were made. But more importantly, decoding is now available as a DLL. I also had to prepare an audio player using this DLL because it was requested so often. This player can play .wav files and .halac files encoded with version 0.2.7 of HALAC. In fact, many other audio formats can also be played natively, but I haven't activated them at this stage. The player is suitable for cross-platform use; only the DLL/SO loading code needs to be changed. If necessary, I will try to prepare a .so version for Linux.
https://github.com/Hakan-Abbas/HALAC-Audio-Player

The player is released as open source. If desired, other audio players can also integrate HALAC using this DLL. However, since HALAC is still in development, there may be structural changes in each version.
Code: [Select]
// Dll Function Prototypes //
typedef char* (*EXPORT_WAVFunc)(const char*, unsigned short); // Return .wav file to memory. Parameters -> "filename" and "thread count"
typedef unsigned int (*EXPORT_SIZEFunc)(const char*); // Return .wav file size. Parameters -> "filename"
typedef void (*EXPORT_DELETEFunc)(); // Delete .wav file from memory.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: mudlord on 2024-04-25 19:20:36
Quote
Or the results were removed because HALAC is still the intellectual fun of its creator, and not a solution for the masses.

Again I will post: Does it even matter if that is the case? There are plenty of cases where I have programmed something that is 100% not intended for general public use under a FOSS license, because I truly believe nothing of value will be gained from it being open source, which, time and time again, is exactly the case for me. Zero value is gained from code being FOSS, at all; zero people even used said code, only the binary form. And even then it's sketchy at best.

Hakan has 100% right to do whatever they want, including using AVX and co: it's their project, not yours.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-04-25 21:14:46
@mudlord, Hakan understood me perfectly, but you didn’t, alas. Okay, I can elaborate.

HALAC is designed for people. It is constantly compared with other tools that people use in order to highlight its superiority. It was never limited to the author’s fun, internal use, or, say, participation in the Informatics Olympiad (https://en.wikipedia.org/wiki/International_Olympiad_in_Informatics).

When you make a project for people, then people inevitably provide feedback. It ranges from admiration to frustration, and in between there are questions and suggestions. Thanks to this feedback, projects develop further. If this is not obvious to you, then look at issue trackers and forums of the well-known apps. For example, Peter added OGG chapters (https://hydrogenaud.io/index.php/topic,125475.0.html) support to Foobar2000 because I asked about it and then he recompiled it because other users complained about a crash while calculating ReplayGain (https://hydrogenaud.io/index.php/topic,125795.0.html) values on systems without AVX. If you are not ready to process feedback or even do not need it, then make it clear in the first place. For example, disable the issue tracker on Github. It's that simple.

Hakan and I talked about priorities: a) SSE2/3 version enables more users to benefit from the tool and b) users need not only to encode, but also to listen to the material, otherwise the benefits of the tool are more visible on the charts than in real life. The author agreed that there was a reasonable grain in this appeal.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-04-25 22:33:16
Very initial HALAC input component for foobar2000: https://foobar.hyv.fi/foo_input_halac.fb2k-component (https://foobar.hyv.fi/foo_input_halac.fb2k-component).

Couple of warnings:
The input library is very basic and only supports loading files by file path, and the path is given in ANSI codepage so characters needing unicode won't work.

As the library forces decoding the entire file to memory large files will need a lot of RAM. But the library won't be able to handle very big files as it addresses size with 32-bit integer.

The library doesn't seem to have any error checking. Feeding it a path that doesn't exist makes the host program (foobar2000) crash. Trying to load corrupted/invalid HALAC file makes the host program crash.

Edit: and based on the DLL name I believe AVX is required. And since there is only 64-bit DLL available this version of the component only works on 64-bit foobar2000.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: mudlord on 2024-04-26 07:21:19
Quote
HALAC is designed for people.

At what point did the author explicitly say so?
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-04-26 09:24:48
Every comment is valuable to me. @mudlord , @Kraeved
As of now, HALAC is still in the development phase. Apart from HALAC, the newest lossless codec is close to 20 years old. As I find time, I try to improve the balance between compression ratio and speed. I can see that there is a gap here.
When my improvements are over, it can be a codec that everyone can use. However, there are a lot of structural changes at the moment, and this is not very suitable for general use. Still, with 0.2.7, files can now be encoded, decoded and listened to, albeit experimentally.

@Case
Thank you so much for foobar2000, Case. I have no desire to develop a new player. I just wanted to show that HALAC files can be listened to. So I prepared a very simple DLL.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-04-26 11:38:27
Best option for API would be not to rely on filenames at all. Most libraries allow setting callback functions - you just give pointer to simple functions for reading and seeking and other features API might need. Another option would be to use memory pointers for data exchange.

Not relying on filenames would for example mean that foobar component would automatically get support for playing HALACs over internet and from archives. And features like full file buffering or prebuffering parts of future tracks would work.
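A minimal sketch of what such a callback-based interface could look like, with a memory buffer as the host-side source. Every name here is hypothetical and only illustrates the idea; it is not HALAC's API or foobar2000's.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical I/O interface a decoder could accept instead of a file name. */
typedef struct {
    size_t (*read)(void *user, void *dst, size_t n); /* returns bytes read */
    int    (*seek)(void *user, uint64_t pos);        /* 0 on success */
    void   *user;                                    /* opaque host state */
} stream_callbacks;

/* Host-side example source: data that is already in memory. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} mem_source;

size_t mem_read(void *user, void *dst, size_t n)
{
    mem_source *m = (mem_source *)user;
    size_t left = m->size - m->pos;
    if (n > left) n = left;          /* short read at end of data */
    memcpy(dst, m->data + m->pos, n);
    m->pos += n;
    return n;
}

int mem_seek(void *user, uint64_t pos)
{
    mem_source *m = (mem_source *)user;
    if (pos > m->size) return -1;    /* refuse to seek past the end */
    m->pos = (size_t)pos;
    return 0;
}

/* Library-side example: fetch n bytes at an offset using only the callbacks. */
size_t read_at(const stream_callbacks *io, uint64_t pos, void *dst, size_t n)
{
    if (io->seek(io->user, pos) != 0) return 0;
    return io->read(io->user, dst, n);
}
```

With this shape the library never touches paths or code pages. The host can back the same two functions with a file, an archive entry or a network stream.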

And partial decoding is of course very important for realtime playback. Decoding entire track in advance not only requires way too much memory, but it can also mean a long delay for track changes potentially breaking gapless playback.

For playback use it would also make sense to have some way to report the audio data specs to the player and just give the audio data to it. My component now includes parser for WAV, RF64, BW64 and W64 formats just in case such things pop out of HALAC so that it can play them. It's not nice to outsource these things for the player.

Oh yeah, and you should specify what calling convention the functions use. Now they seem to depend on compiler defaults.
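One common way to pin the convention down is a macro in the public header, applied to every exported function and function-pointer type. A minimal sketch follows; the macro and type names are made up, and only the `__cdecl` keyword itself is a real Windows compiler keyword. On 64-bit Windows there is effectively a single convention, so this matters mostly for 32-bit builds.

```c
/* Hypothetical macro: spell out the calling convention on Windows and
   expand to nothing elsewhere. */
#if defined(_WIN32)
#  define HALAC_CALL __cdecl
#else
#  define HALAC_CALL
#endif

/* Function-pointer type with the convention stated explicitly. */
typedef unsigned int (HALAC_CALL *halac_size_fn)(const char *filename);

/* Stand-in implementation so the typedef can be exercised without the DLL. */
unsigned int HALAC_CALL example_size(const char *filename)
{
    (void)filename;
    return 44u;
}

/* A host calls through the pointer with the same convention on both sides. */
unsigned int call_through_pointer(halac_size_fn fn, const char *filename)
{
    return fn(filename);
}
```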
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-04-26 11:41:50
I added the SSE2 version of DLL to Github.

@Case, please, update your Foobar2000 component accordingly.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-04-26 13:45:29
The foobar component (https://foobar.hyv.fi/foo_input_halac.fb2k-component) now includes both SSE2 and AVX libraries and it will use the one supported by the CPU.
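How the component actually chooses is not described here. As a minimal sketch of the general idea, assuming GCC or Clang on x86 for the detection builtins (other targets simply report no support in this sketch), with hypothetical names throughout:

```c
typedef enum { VARIANT_NONE, VARIANT_SSE2, VARIANT_AVX } simd_variant;

/* Prefer the most capable build that the CPU can run. */
simd_variant pick_variant(int has_sse2, int has_avx)
{
    if (has_avx)  return VARIANT_AVX;
    if (has_sse2) return VARIANT_SSE2;
    return VARIANT_NONE;
}

/* Detection through compiler builtins; the result then decides which
   library file the host would load. */
simd_variant detect_variant(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return pick_variant(__builtin_cpu_supports("sse2"),
                        __builtin_cpu_supports("avx"));
#else
    return VARIANT_NONE;
#endif
}
```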
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Kraeved on 2024-04-26 14:14:39
Foobar2000 2.1.4 with Case's component (442 550 bytes) crashed upon adding HALAC file to the playlist.

Source PCM WAV 44.1 kHz 16 bit stereo was compressed using HALAC 0.2.6 SSE2 (https://hydrogenaud.io/index.php/topic,125248.msg1039348.html#msg1039348).

Code: [Select]
$ halac_encode_v.0.2.6_mt_sse2.exe sacrifice01.wav sacrifice.halac

$ halac_decode_v.0.2.6_mt_sse2.exe sacrifice.halac sacrifice02.wav

$ xxhsum.exe *.wav
0ebdb0d5ad93a94e  sacrifice01.wav
0ebdb0d5ad93a94e  sacrifice02.wav
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-04-26 14:26:24
That is what I warned about. The error handling leaves a lot to be desired and causes the host program to crash too. The demo HALAC Player crashes similarly. And the 0.2.7 decoder exe fails to decode the file too, creates a zero byte wav file.

If this was a final piece of decoder library with no hope of getting improvements, I could isolate it in a different process. But I hope things will improve so that such drastic measures aren't necessary. The isolation layer adds more complexity and also introduces a performance penalty.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-04-26 17:14:25
@Kraeved; I have already mentioned that the Player only works with HALAC 0.2.7. However, I did not signal this with an error message. For the next version, I will add this check to both the Decoder and the DLL. OK.

Thank you very much for your great suggestions, Case. Such comments are really necessary.
Quote
Best option for API would be not to rely on filenames at all.

DLL/SO will not deal with file operations. OK.

Quote
And partial decoding is of course very important for realtime playback.

DLL/SO should only take memory address, frame number and data length. OK.

Quote
My component now includes parser for WAV, RF64, BW64 and W64 formats just in case such things pop out of HALAC so that it can play them.

DLL/SO should give us the necessary information such as header (channel count, bit rate...) and metadata. OK.

Quote
Oh yeah, and you should specify what calling convention the functions use.

This should be more important especially for Windows. OK.

I will add them in the next version. To repeat, these are the usual problems in the software process and solutions can be produced quickly. However, I need to stay focused on the core work (speed, ratio, prediction, entropy ...).
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: mudlord on 2024-04-26 19:10:45
Quote
When my improvements are over, it can be a codec that everyone can use.

Yep. I tried explaining to Kraeved that sometimes up until that point, a project is not for everyone *or anyone* at all, and that it's purely a prototype. Like I have some personal projects which still have things cooking or I don't feel are up to any sort of standard that I feel comfortable to be public, so things like widespread vectorization/threading or even GPU support are not there yet, so I stick to SSE4/AVX2/NEON/GL4.6 until that time comes, because I am more focused on getting things working *at all* rather than any semblance of widespread appeal.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Case on 2024-05-01 11:36:25
Since there is still no updated decoder DLL I sandboxed the process and added support for 32-bit foobar2000 while at it. Since the DLLs are only 64-bit the 32-bit component won't be able to decode anything unless the OS is also 64-bit.
And I listed the component on my component page: https://foobar.hyv.fi/?view=foo_input_halac (https://foobar.hyv.fi/?view=foo_input_halac)
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-01 17:53:46
Quote
Since there is still no updated decoder DLL I sandboxed the process and added support for 32-bit foobar2000 while at it. Since the DLLs are only 64-bit the 32-bit component won't be able to decode anything unless the OS is also 64-bit.
And I listed the component on my component page: https://foobar.hyv.fi/?view=foo_input_halac (https://foobar.hyv.fi/?view=foo_input_halac)

Thank you very much for your interest in the topic and for what you have done. You are great.
I have uploaded 32-bit builds of the Encoder and Decoder for version 0.2.7 to Github, in SSE2 and AVX variants. The 32-bit version may show a slight loss of speed during the encode stage.

I'm working through the notes I have taken for the 0.2.8 DLL version (Windows/Linux). And since the DLL will be independent of the file and the file path, the Player will now have multiple language support.

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: MihaiPopa12346 on 2024-05-06 15:17:20
Could you release a new version with a "-high" argument which gets a bit higher compression ratio than default but 25-50% slower compression and decompression speed?
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-06 20:44:50
Quote
Could you release a new version with a "-high" argument which gets a bit higher compression ratio than default but 25-50% slower compression and decompression speed?
X
The graph above shows the change in compression ratio since the first version of HALAC. HALAC is a speed-oriented project, and speed is the last thing I want to compromise on. The lossless compression ratio of audio data is really limited in most cases.

SQUEEZE CHART is an archive that also contains different types of music used in audio compression tests. It gives an idea in a general sense. Since the first version, there has been an improvement of about 1% in the compression ratio at the same speeds (encode has been slightly faster). At extremely high speeds, this is really not bad. Depending on the current situation, it is a little difficult to predict how much further progress can be made.

HIGH mode has been requested by different people before. I will focus on the compression ratio in later versions because, as I've already mentioned, there is a little more room for improvement in this regard.

I started to get interested in the compression ratio with version 0.2.6. However, I had to enter the Player and DLL topic in accordance with the incoming requests. Now, as soon as I have time, I am trying to complete the new dynamic library I have developed for HALAC in a flexible and error-free way. Changes are also made to the file structure and working style in accordance with incoming requests.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-05-06 21:03:04
There are situations where encoding speed is more important than decoding speed, and vice versa so out of curiosity: Does your methodology allow you to spend more encoding effort to decrease decoding load, without the trade-off being as skewed as "more brute-forcing"?

Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-07 07:44:17
HALAC's encode speed is slightly better than its decode speed (I mentioned the small improvement in encode speed in v0.2.7 in my previous post). In fact, this situation is normal for HALAC. Because the encode speed is extremely fast, the decode speed seems to lag behind, but that is not really the case.

The encode process takes big data and compresses it to make it small. The decode process, on the other hand, takes compressed data and produces a larger data output. This is a disadvantage. The other problem is dependency. There is usually a dependency on the previous data in the decode process. One code cannot be passed to another without being decoded. In other words, some operations cannot be parallelized. Therefore, a bottleneck may occur at this stage.
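The dependency can be seen in a minimal sketch with a first-order delta predictor. This is an illustration only, not HALAC's actual scheme, and the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Encoder side: every residual depends only on the input samples, so the
   iterations are independent of each other. */
void delta_encode(const int16_t *x, int32_t *res, size_t n)
{
    for (size_t i = 0; i < n; i++)
        res[i] = (int32_t)x[i] - (i ? x[i - 1] : 0);
}

/* Decoder side: sample i needs the already reconstructed sample i-1, so
   this straightforward loop carries a dependency between iterations. */
void delta_decode(const int32_t *res, int16_t *x, size_t n)
{
    int32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        prev += res[i];
        x[i] = (int16_t)prev;
    }
}
```

The encoder loop can be vectorised or split across threads as written. The decoder loop cannot be split in this simple form, and entropy decoding adds a similar serial dependency on top of it.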

Some codecs (especially image codecs) try to relieve the decode stage by performing more operations during the encode stage. In other words, they hand most things to the decoder ready-made, because decode speed is more important for them. This approach also helps to increase the compression ratio, as more possibilities and situations can be evaluated at the encode stage, and approaches such as content modeling can also be used.

This kind of approach could also be applied to HALAC. In the "-high" mode, maybe we will see something like this. But I think it would then be no different from other codecs. My goal is to make as few concessions on speed as possible.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: MihaiPopa12346 on 2024-05-07 15:11:43
Also: In the new version [0.2.8], add a lossy mode that uses quantization like FSLAC, Quite OK Audio (QOA), WavPack lossy and lossyWAV. Still make that as fast as possible.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-07 20:35:55
Lossy compression (audio/image) is about discarding some data that is outside of human perception, and this clearly requires various tricks. As long as people don't notice, there is no problem. It is not as rigid as lossless compression, and we can act more flexibly. However, lossy formats are generally not preferred in professional areas and are for everyday use.

The main problem here is that the interpretation of the output can be in many different ways and quite relative. So it is very difficult to come to a definite conclusion in terms of quality.

When we carefully examine the deformations on the image, we can see them with the naked eye. However, it is not easy to analyze changes in the original sound without high-quality audio equipment. At least it's not easy for me.

Since there are higher bit values (24/32) in the audio data, there may also be more noise that needs to be cleared. I can evaluate this issue in later versions. But for now, there are more priority issues. Therefore, a little more time is needed for the lossy option.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-05-07 21:11:59
You could of course have a quick look at whether HALAC could be used in conjunction with LossyWAV. Then those who are interested in lossy (+ correction files) could play around with it and see if something interesting comes up near the speed/quality frontier. I surely agree with you (though it is your say and not mine) that creating a new lossy codec need not be next week's project.

BTW, @Nick.C the LossyWAV developer: Your profile (https://hydrogenaud.io/index.php?action=profile;u=42400) page has the ancient LossyWAV 1.3.0 development thread as your homepage ... did anyone notice in the 1/8th of a century since it was locked?   :))
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-08 07:54:14
Thanks for the suggestion, Porcus.
In the future, I can review the studies and academic publications on lossy audio compression and do a different study that focuses on high speed again.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: MihaiPopa12346 on 2024-05-09 05:11:08
Note: lossyWAV and HALAC don't match. If they match, it would be "lossyHALAC", a new format. 8)

If you compress lossyWAV-encoded files with HALAC, it gets stuck in an endless loop, creating a VERY BIG file (tens or hundreds of MBs, GBs, or even TBs, until you stop the program). :P

If you compress normal WAV files with HALAC, it's fine, just as with other lossless compressors. :D
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-09 13:46:27
HALAC can only compress WAV files at the moment. I just heard about the LossyWAV format a few days ago. This is happening because of the shift in header information. No problem, it can be easily handled.
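
A minimal sketch of how such a header shift could be handled, assuming the cause is an extra chunk sitting ahead of the `data` chunk so that the audio no longer starts at the usual byte 44. Instead of relying on a fixed offset, the reader walks the RIFF chunks until it reaches `data`. The name `find_data_chunk` is hypothetical, and this is not HALAC's actual code.

```python
import struct


def find_data_chunk(wav_bytes: bytes) -> tuple[int, int]:
    """Return (payload_offset, payload_size) of the 'data' chunk in a RIFF/WAVE buffer."""
    if len(wav_bytes) < 12 or wav_bytes[0:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    pos = 12  # first chunk header follows 'RIFF' + size + 'WAVE'
    while pos + 8 <= len(wav_bytes):
        chunk_id, chunk_size = struct.unpack_from("<4sI", wav_bytes, pos)
        if chunk_id == b"data":
            return pos + 8, chunk_size
        # Skip this chunk; RIFF pads odd-sized chunks with one extra byte.
        pos += 8 + chunk_size + (chunk_size & 1)

    raise ValueError("no 'data' chunk found")
```

With a plain 16-byte `fmt ` chunk and nothing else, the returned offset is 44; any additional chunk before `data` simply moves it further along, and the loop is guaranteed to terminate at the end of the buffer.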

As far as I have seen and understood, LossyWAV does not compress directly. I think it prepares the data by making it lossy so that it can be compressed better, because the data obtained at the end of the process is the same size as the input. Only the content differs, and that content compresses better afterwards. However, the processing speed was below 10 MB/s even in standard mode (i7 3770k). This figure does not include the load of the subsequent compression step.
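
A toy illustration of that idea, not LossyWAV's actual algorithm: it clears a fixed number of low-order bits in every sample, whereas the real tool decides adaptively how much it can remove. The test signal is synthetic, `zlib` is only a stand-in for a lossless compressor, and all function names are hypothetical.

```python
import math
import random
import struct
import zlib


def make_test_signal(n=8192, seed=0):
    """Synthetic 16-bit mono signal: a 440 Hz sine plus low-level noise."""
    rng = random.Random(seed)
    return [
        int(12000 * math.sin(2 * math.pi * 440 * i / 44100)) + rng.randint(-200, 200)
        for i in range(n)
    ]


def drop_low_bits(samples, bits):
    """Round each sample to a multiple of 2**bits, clamped to the int16 range."""
    step = 1 << bits
    highest = 32768 - step  # largest multiple of step that fits in int16
    return [min(((s + step // 2) >> bits) << bits, highest) for s in samples]


def pack16(samples):
    """Serialize samples as little-endian signed 16-bit PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)


original = make_test_signal()
reduced = drop_low_bits(original, bits=6)

raw_a, raw_b = pack16(original), pack16(reduced)
print(len(raw_a), len(raw_b))  # raw sizes are identical
print(len(zlib.compress(raw_a)), len(zlib.compress(raw_b)))  # second is smaller
```

The raw PCM stays exactly the same size, as described above; only the compressed size changes, because the low-order bits no longer carry noise.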
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Porcus on 2024-05-09 14:16:01
https://wiki.hydrogenaud.io/index.php?title=LossyWAV ;  IDK if HALAC could (easily) be adapted to 512 bytes per block.

My idea was more this: if one wants to throw a bone to those who want a lossy version of HALAC, then making it work with LossyWAV would let those users work out settings at their preferred bit rates and see how it compares to other compressors.
Since the compression part is lossless (the lossiness is in the LossyWAV pre-processor), there would be no considerations about audio quality between compressors - speed and size would still be comparable in an apples to apples way.
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-09 20:18:44
I produced an output with LossyWAV using the default settings. The new data seems to be quite suitable for compression: it has made the 16-bit data almost 8-bit. However, when I listened to the lossy music, I didn't hear much difference.
Of course, the lossless HALAC in its current form is not suited to this data. However, good compatibility can be achieved with a small change.
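
One possible form such a change could take, sketched purely as an assumption and not as HALAC's planned design: for each block, count how many low-order bits are zero in every sample, store that count, and pass the shifted-down samples to the lossless coder. The 512-sample block size and all function names are hypothetical.

```python
def wasted_bits(block):
    """Number of low-order bits that are zero in every sample of the block."""
    combined = 0
    for s in block:
        combined |= s
    if combined == 0:
        return 0  # all-zero block: nothing to shift
    # Isolate the lowest set bit; its position is the shared zero-bit count.
    return (combined & -combined).bit_length() - 1


def split_blocks(samples, block_size=512):
    """Yield (shift, shifted_samples) for each block of the input."""
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        shift = wasted_bits(block)
        yield shift, [s >> shift for s in block]


def join_blocks(blocks):
    """Inverse of split_blocks: restore the original samples exactly."""
    out = []
    for shift, shifted in blocks:
        out.extend(s << shift for s in shifted)
    return out
```

Because every sample in a block is a multiple of `2**shift`, the right shift loses nothing and `join_blocks` restores the input bit for bit, so the coder itself stays lossless while working on smaller values.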
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: MihaiPopa12346 on 2024-05-09 20:19:56
Post the new version [0.2.8 maybe] of the encoder and decoder with lossyWAV support (lossyHALAC output format).
Title: Re: HALAC (High Availability Lossless Audio Compression)
Post by: Hakan Abbas on 2024-05-09 21:12:31
I can't make any firm promises for 0.2.8, but I might add LossyWAV support in the next version. And since an option such as "LossyHALAC" would require dedicated work, I can think about that later.