
SRLA: A lossless audio codec focused on decode speed and compression rate

(Sorry for my imperfect English; I am new to this forum.)

For the past few years, I have been trying to develop a new lossless audio codec. Why? I think no codec has achieved both a high decoding speed (like FLAC) and a high compression ratio (like Monkey's Audio). On this point, I have great respect for TAK. As we all know, TAK is the best lossless audio codec ever. However, TAK has been closed source for many years. I wish TAK were open source, but I think that would be difficult...

Therefore, I decided to create a lossless audio codec myself. The new codec, "SRLA", is designed to combine a high decoding speed with a high compression ratio. I selected linear predictive coding (LPC) as the predictor and recursive Golomb-Rice coding (with grateful thanks to WavPack and TTA) as the entropy coder.
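To illustrate the LPC side of this design, here is a minimal sketch of an integer predictor whose residual can be inverted exactly. It is not SRLA's code: the function names, the fixed order-2 coefficients and the shift value are assumptions chosen only for the example.

```python
def lpc_residual(samples, coeffs, shift):
    """Return prediction residuals for a list of integer samples.

    coeffs[i] weights samples[n - 1 - i]; the weighted sum is scaled
    down by an arithmetic right shift so everything stays in integers.
    """
    order = len(coeffs)
    residual = []
    for n, x in enumerate(samples):
        if n < order:
            # Warm-up samples have no full history; pass them through.
            residual.append(x)
            continue
        acc = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        residual.append(x - (acc >> shift))
    return residual


def lpc_reconstruct(residual, coeffs, shift):
    """Invert lpc_residual exactly (the decoder side)."""
    order = len(coeffs)
    samples = []
    for n, r in enumerate(residual):
        if n < order:
            samples.append(r)
            continue
        acc = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        samples.append(r + (acc >> shift))
    return samples


# Hypothetical order-2 predictor: 2*x[n-1] - x[n-2], with no scaling.
signal = [0, 3, 7, 12, 18, 25, 33, 40, 46, 51]
coeffs, shift = [2, -1], 0
res = lpc_residual(signal, coeffs, shift)
restored = lpc_reconstruct(res, coeffs, shift)
```

The residual values are much smaller than the samples, which is what the entropy coder then exploits. A real encoder would fit the coefficients per block instead of fixing them.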

I know this is a Tower of Babel [1]. But I want to have this codec evaluated in a public space (here!). The links below provide the source code and executable binaries for Windows/Mac/Linux. Please feel free to send any questions, suggestions, or bug reports!

Source code: https://github.com/aikiriao/SRLA
Executable binary for Windows/Mac/Linux: https://github.com/aikiriao/SRLA/releases/tag/v0.0.7

Currently, SRLA supports up to 8ch and 8-24bit PCM wav files only. Self-evaluation results are available at https://github.com/aikiriao/SRLA/tree/main/evaluation. I used part of the RWC Music Database (211 CD-quality tracks) for evaluation.

[1] https://codecs.multimedia.cx/2010/11/why-lossless-audio-codecs-generally-suck/

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #1
So as they say, TAK starts where FLAC ends ... so let's compare your -m 0 with flac -8p, even if it surely is possible to jack up flac beyond subset.

I used the basic version without optimizations (I think that's comparable to flac? I used flac at 32-bit), with --no-checksum-check, and flac with --no-md5; I also removed all metadata from the .flac files.

11958618022 flac -8p, with all metadata removed, that is larger than srla:
11877862240 bytes for srla;  took 38 minutes to encode 38 CDs (to file), that is slower than flac.

Then decoding time, from SSD file to NUL:
243 seconds for srla
154 seconds for flac

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #2
We have already seen this before...
ahem, HALAC...
 ::)

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #3
This one aims at TAK.
HALAC aims at the fastest FLAC settings: https://hydrogenaud.io/index.php/topic,125248.msg1037617.html#msg1037617

HALAC has a news value: it uses a residual compression algorithm which wasn't even around when the established lossless codecs were fixed.

Actually, offering different choices for that part is a possible way to achieve better compression. Try several and pick the best. FLAC allows for several variations over the same method. ALS has the choice of two distinct methods.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #4
We have already seen this before...
ahem, HALAC...
 ::)
;)

HALAC aims at the fastest FLAC settings: https://hydrogenaud.io/index.php/topic,125248.msg1037617.html#msg1037617

HALAC has a news value: it uses a residual compression algorithm which wasn't even around when the established lossless codecs were fixed.
SRLA's encode speed is too slow for practical use. I can't estimate what I could achieve if that much processing time were also available to HALAC. But HALAC certainly doesn't want to compromise on speed.
HALAC is no longer just a speed-oriented audio codec. With v0.3 it is both faster and denser. "-fast" mode offers the same compression ratio as "-normal" mode. "-normal" mode can be thought of as "FLAC -5". The decoding process is OK. Some optimisations are currently in progress.
I tried to test SRLA on the same computer. It runs on my i7 3770k processor but does not compress (the compression rate does not go below 99). It worked on the Ryzen 3700x I tried before. Unfortunately, I could not repeat the tests.

Busta Rhymes (2002) 829,962,880 bytes
Intel i7 3770k, 16 gb ram, 240 gb ssd


HALAC 0.1.9          579,556,734   4.66 s
HALAC 0.2.9 NORMAL   574,192,159   4.34 s
HALAC 0.3.x NORMAL   564,867,337   3.79 s
---------------
HALAC 0.2.9 FAST     594,237,502   2.94 s
HALAC 0.3.x FAST     575,928,016   3.17 s
---------------
FLAC -0              625,873,229
FLAC -3              603,491,036
FLAC -4              564,940,949
FLAC -5              563,259,688
FLAC -8              558,665,363

And lossyWAV results are really much better.
HALAC 0.2.9          350,671,413   4.31
HALAC 0.3.x          291,406,314   3.19
HALAC 0.3.x FAST     306,536,269   2.83

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #5
Out of curiosity: do these new lossless codecs support 32bit floating point depth?

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #6
Out of curiosity: do these new lossless codecs support 32bit floating point depth?

Currently, SRLA supports up to 8ch and 8-24bit PCM wav files only.


Float is quite a different beast to handle if you want it lossless.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #7
Out of curiosity: do these new lossless codecs support 32bit floating point depth?
Float is quite a different beast to handle if you want it lossless.
I'm going off-topic here, sorry. I've found that people are more focused on the compression ratio, even if it's a very small difference, so I've turned my work with HALAC in this direction. And that's why I've unfortunately delayed the 24/32 bit support a bit.
Let's take a look at GDCC 2021. The top 3 and bottom 6 lines represent 32-bit floating-point telemetry data (astro0_2048x1489x5x32f ... astro2_2048x1489x5x32f and tele6_432x128x8x32f ... tele11_432x128x8x32f). In fact, since the data in this example are 3D, I can say that they have a more complex structure than audio data.

https://encode.su/threads/3701-GDCC-21-T2-Multispectral-16-bit-image-compression?p=71529&viewfull=1#post71529

As can be seen, the good results of HALIC MR 0.6 compared to other codecs are obtained on 32-bit floating-point values. Given this, I don't think floating-point data will be a problem for HALAC with audio.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #8
Thank you very much for all the comments!
So as they say, TAK starts where FLAC ends ... so let's compare your -m 0 with flac -8p, even if it surely is possible to jack up flac beyond subset.
Porcus, thank you for your quick testing! Your results closely match mine. In my environment (Intel i7-1260P), the AVX2 decoder is about 1.5 times faster than the normal one.

This one aims at TAK.
Absolutely yes, SRLA aims to be a TAK alternative. Thanks for filling the gap in my introduction.

SRLA's encode speed is too slow for practical use.
Thank you for testing the binary, Hakan! And I am surprised by HALAC's speed! I have missed your achievements (I have not seen this forum for a long time!).

Out of curiosity: do these new lossless codecs support 32bit floating point depth?
Float is quite a different beast to handle if you want it lossless.
Thank you for your comments on this! Currently, there are no plans for SRLA to support floating-point data because, like FLAC, SRLA is designed for consumer audio.

I'm going off-topic here, sorry. I've found that people are more focused on the compression ratio, even if it's a very small difference, so I've turned my work with HALAC in this direction.
This is true where speed is needed, such as in the game and embedded industries. And yes, audio data is much smaller than assets such as movies, textures, etc.
But I think the compression ratio is still important for codecs. Imagine compressing all of the CD tracks humanity has ever made. In that situation, it would be appropriate to use high-compression archivers such as La, OptimFROG, and Sac, but they are only moderate in speed. For practical use, we should search for the best point on the Pareto front of the performance curve.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #9
@aikiriao
Are you planning to support multithreaded enc/decoding? FLAC will support it in upcoming version https://hydrogenaud.io/index.php/topic,124437.0.html


Busta Rhymes (2002) 829,962,880 bytes
Intel i7 3770k, 16 gb ram, 240 gb ssd


HALAC 0.1.9          579,556,734   4.66 s
HALAC 0.2.9 NORMAL   574,192,159   4.34 s
HALAC 0.3.x NORMAL   564,867,337   3.79 s
---------------
HALAC 0.2.9 FAST     594,237,502   2.94 s
HALAC 0.3.x FAST     575,928,016   3.17 s
---------------
Are these results with 1x thread or multiple threads?


Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #11
Thank you for testing the binary, Hakan! And I am surprised by HALAC's speed! I have missed your achievements (I have not seen this forum for a long time!).
Thank you so much @aikiriao.
The HALAC 0.3 decoder is not yet optimized for speed, because I am developing a new Rice-derived coder (when I have time). And since the 20 files in the test set are not large, the effect of threading is not very noticeable.

Busta Rhymes (2002) 829,962,880 bytes
AMD RYZEN 3700X, 16 gb ram, 512 gb fast ssd

SRLA AVX2                            550,867,800   77.26 s   6.73 s
HALAC 0.3.x NORMAL (AVX)             564,867,337    2.51 s   5.47 s
HALAC 0.3.x FAST (AVX)               575,928,016    1.86 s   5.59 s
HALAC 0.3.x NORMAL (AVX) 16 Thread   564,867,337    0.74 s   1.01 s
HALAC 0.3.x FAST (AVX) 16 Thread     575,928,016    0.66 s   1.00 s

And LossyWAV results:
SRLA AVX2                  556,035,887
HALAC 0.3.x NORMAL (AVX)   291,406,314
HALAC 0.3.x FAST (AVX)     306,536,269


Are these results with 1x thread or multiple threads?
Single-thread (1x) results.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #12
I just tested this "codec" right now.

-m 0 and -m 17 (for some reason the encoder doesn't allow -m 18, even when -h mentions that -m values are from 0 to 18) are almost the same in compression rate... but -m 17 is much slower. At least decoding is quite fast, while filesize is a bit smaller than with flac -8, for the few samples I tested, 16 and 24 bit ones.
I didn't try any other encoding options... no idea how they could affect the final result.

Two additional notes:
The player seems to use the same CPU percentage as the encoder/decoder, and it doesn't play 24-bit SRLA files.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #13
So this is an LPC thing, which can bisect blocks if that improves - a feature that TAK got from ALS.
But it uses a double-Rice-parameter approach for the residual.  I wonder, if one just took FLAC and implemented that residual encoding method - just to get an apples to apples approach (and then one could do that to the block bisection strategy too!) - what would one gain?

Or "better": choice between residual coding methods, per subframe.
FLAC has that. Choice of two. (With room for two more, never going to be used.)
I don't solicit spreading files that identify as "fLaC", but that is an easy change if one wants to compare what makes differences. ( @Hakan Abbas , that goes for you as well: that residual coding method made a news value.)
I don't know about that BGMC method that ALS uses - it seems to offer choice per file, not per block.
But it would be interesting to see what methods work.
Heck, one could even take a FLAC file, keep the predictor, and encode the residual with N methods and report sizes ...
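The experiment described above (keep the predictor, encode the same residual with N methods, report sizes) could be sketched like this. It only counts bits and does not read FLAC files. The two cost functions are assumptions: plain Rice with one parameter, and one possible two-parameter recursive Rice layout, which is not claimed to match SRLA's bitstream. Inputs are residuals already mapped to unsigned integers.

```python
def rice_cost(block, k):
    """Bits needed to Rice-code every value: unary quotient, stop bit, k raw bits."""
    return sum((u >> k) + 1 + k for u in block)


def recursive_rice_cost(block, k1, k2):
    """Bits for a two-parameter code: values below 2**k1 take 1 + k1 bits,
    larger ones pay an escape bit and are Rice-coded with k2."""
    total = 0
    for u in block:
        if u < (1 << k1):
            total += 1 + k1
        else:
            total += 2 + ((u - (1 << k1)) >> k2) + k2
    return total


def compare_methods(block, max_k=15):
    """Brute-force the best parameters per method and report the bit counts."""
    params = range(max_k + 1)
    return {
        "rice": min(rice_cost(block, k) for k in params),
        "recursive_rice": min(
            recursive_rice_cost(block, k1, k2) for k1 in params for k2 in params
        ),
    }


block = [3, 5, 4, 6, 2, 7, 5, 4]  # made-up unsigned residuals
result = compare_methods(block)
```

With this particular layout, setting k1 equal to k2 gives exactly the plain Rice cost, so the two-parameter result can never be larger before side information is counted. A fair comparison would also add the bits needed to store the parameters and the method choice.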


Anyway, attached is something that apparently does not benefit from SRLA's residual encoding method. I knew that WavPack doesn't shine on this (TTA, predictably, does badly when one strolls off the beaten path.) It is the difference between two ditherings: a mono recording was mastered to CD with channels individually dithered, and I extracted the side channel (i.e. the difference): it is like "too loud dither".
Because SRLA doesn't seem to handle mono, I made a two-identical-channels stereo .wav out of it.

A 5-second clip is attached. I got FLAC down to 51 205 bytes; 53 824 using flac -1 and then removing all padding and metadata. ALS could beat FLAC, at 47 639. 52 024 for TAK -p4, with or without m or e. 59 100 for WavPack -hx4.
SRLA sizes then ...
86 205  -m 2, -m 3, -m 8, -m 9, -m 12, -m 13, -m 16, -m 17
86 261  -m 4, -m 5
86 647  -m 0, -m 1
88 017  -m 6, -m 7
91 007  -m 10, -m 11
97 323  -m 14, -m 15
Beaten by TTA at 85 943. Looks like there is a tuning job to do.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #14
I don't solicit spreading files that identify as "fLaC", but that is an easy change if one wants to compare what makes differences.
I mean, if the following is made into a file, don't use the "fLaC" ...
But the following should be doable:
* Take a FLAC file - say, CDDA, using only the 4-bit Rice parameter
* Traverse all subframes. For each, record the size, encode the residual using a different method, record the size.
Differences will be "too much apples to apples" since the reference flac encoder brute-force picks dual mono / left-side / mid-side / right-side.
And sure there could be a different predictor that fits the other residual method better, so one could also do - with an LPC framework with at most 32 steps history:
* Calculate another encoder's preferred predictor vector. Do same thing.

Also this could be used to evaluate the impact of the frame-splitting technique against what @cid42 did over in this thread, including taking on board a variable block size technique from someone's old thesis I linked to here: https://hydrogenaud.io/index.php/topic,123248.msg1020379.html#msg1020379

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #15
@aikiriao
Are you planning to support multithreaded enc/decoding? FLAC will support it in upcoming version https://hydrogenaud.io/index.php/topic,124437.0.html
Yes, I will. Each SRLA block is independent, so I think it will not be hard to implement. Thank you for pointing that out.
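As a rough illustration of why independent blocks make this easy, here is a sketch that encodes blocks in a thread pool. zlib stands in for a real per-block audio encoder, and encode_block, encode_parallel and the block size are hypothetical names and values, not part of SRLA.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def encode_block(block_bytes):
    """Placeholder for a per-block audio encoder; zlib is only a stand-in."""
    return zlib.compress(block_bytes)


def encode_parallel(pcm_bytes, block_size, workers=4):
    """Split the input into fixed-size blocks and encode them concurrently."""
    blocks = [pcm_bytes[i:i + block_size]
              for i in range(0, len(pcm_bytes), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the output is deterministic.
        return list(pool.map(encode_block, blocks))


data = bytes(range(256)) * 64  # 16384 bytes of dummy input
encoded = encode_parallel(data, block_size=4096)
decoded = b"".join(zlib.decompress(chunk) for chunk in encoded)
```

Because no block depends on the state of another, the only coordination needed is to write the results back in order.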

By the way, what has happened since LINNE?
Thank you for the mention of LINNE. I hadn't imagined this codec being discussed in this forum.
LINNE is a proof-of-concept implementation of predictor cascading, as used in MPEG-4 ALS and La. Existing codecs achieve high compression through cascaded adaptive filters, but they are CPU-intensive. So I tried cascading LPC predictors instead. LINNE's predictor model is equivalent to a ResNet, so it is of academic interest. It did improve the compression rate, but at the same time the decoding speed dropped due to the predictor's complexity. Hence, I tried to keep the structure simple, which led to the SRLA implementation.

The HALAC 0.3 decoder is not yet optimized for speed. Because I am developing a new Rice-derived coder(when I have time).
Thank you for the performance comparison (I plan to compare with HALAC soon). I was surprised by the Rice coding speed again. I wonder whether HALAC's LPC order is set reasonably small.

But it uses a double-Rice-parameter approach for the residual.  I wonder, if one just took FLAC and implemented that residual encoding method - just to get an apples to apples approach (and then one could do that to the block bisection strategy too!) - what would one gain?
Yes, I employ recursive Golomb-Rice coding, which has two parameters. WavPack and TTA also employ it, updating the parameters sample by sample. That is great for the compression rate but time-consuming, so I compute optimal parameters for small blocks instead. And yes, the block bisection strategy follows FLAC.
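For readers unfamiliar with the idea, here is a minimal sketch of one way to define a two-parameter recursive Rice code, using bit strings for clarity. It is an assumption for illustration and is not claimed to match the bitstream of SRLA, WavPack or TTA; the function names and the parameters k1 and k2 are hypothetical.

```python
def zigzag(v):
    """Map a signed residual to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (v << 1) if v >= 0 else ((-v << 1) - 1)


def unzigzag(u):
    return (u >> 1) if (u & 1) == 0 else -((u + 1) >> 1)


def raw_bits(value, width):
    """Fixed-width binary string; empty when width is 0."""
    return format(value, "0{}b".format(width)) if width > 0 else ""


def encode_value(u, k1, k2):
    """Values below 2**k1 get a '0' flag plus k1 raw bits; larger values get
    a '1' escape and the remainder is Rice-coded with the second parameter."""
    if u < (1 << k1):
        return "0" + raw_bits(u, k1)
    u -= 1 << k1
    quotient = u >> k2
    return "1" + "1" * quotient + "0" + raw_bits(u & ((1 << k2) - 1), k2)


def decode_value(bits, pos, k1, k2):
    """Return (value, next_position)."""
    if bits[pos] == "0":
        pos += 1
        u = int(bits[pos:pos + k1], 2) if k1 else 0
        return u, pos + k1
    pos += 1
    quotient = 0
    while bits[pos] == "1":
        quotient += 1
        pos += 1
    pos += 1  # skip the '0' terminator
    rem = int(bits[pos:pos + k2], 2) if k2 else 0
    return (1 << k1) + (quotient << k2) + rem, pos + k2


def roundtrip(values, k1, k2):
    """Encode signed residuals to one bit string and decode them again."""
    bits = "".join(encode_value(zigzag(v), k1, k2) for v in values)
    pos, out = 0, []
    for _ in values:
        u, pos = decode_value(bits, pos, k1, k2)
        out.append(unzigzag(u))
    return bits, out
```

Storing k1 and k2 once per small block, rather than adapting them after every sample, means the decoder does not have to recompute them, which is the speed trade-off described above.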

Or "better": choice between residual coding methods, per subframe.
This is partially done in SRLA. It uses (one-parameter) Rice coding if the residual amplitude is small. Thank you for the suggestion about code-switching. I had already tried naive run-length coding when the residuals were almost zero, but got only a minor gain. It seems worthwhile to try BGMC (or other range-coding methods) for tiny residuals.

Anyway, attached is something that apparently does not benefit from SRLA's residual encoding method.
Thank you very much for reporting it. I reproduced your result. It seems SRLA lacks handling for monaural-silence signals. I will try to improve it.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #16
I don't solicit spreading files that identify as "fLaC", but that is an easy change if one wants to compare what makes differences. ( @Hakan Abbas , that goes for you as well: that residual coding method made a news value.)
I don't quite understand this part. Which residual coding (ANS/Rice)?

Thanks for the ideas, Porcus. If we're talking about fast compression, we're going to hit a dead end at many points. We're afraid to do a lot of the transformations or predictions we want, because every operation adds to the workload. We should evaluate the results obtained from HALAC accordingly.
In the traditional approach, many operations are tried and the appropriate ones are selected and applied. This does not affect the decoder performance, because the encoder makes the necessary selections and stores them as side information.
As you mentioned, by trying more estimators or transformations, at the cost of a significantly longer encode time, we can improve the compression ratio by another 1% to 5% without changing the decode time. But it is much more important for me to be as fast as possible and still get good compression.

The HALAC 0.3 decoder is not yet optimized for speed. Because I am developing a new Rice-derived coder(when I have time).
Thank you for the performance comparison (I plan to compare with HALAC soon). And I was surprised that the Rice coding speed again. I am wondering if HALAC's LPC order is set to reasonably small
The number of LPC coefficients varies according to the situation/mode (0-12). Rice encoding should normally be even faster than "Optimised Fast Huffman", because, apart from calculating the required "k" parameter, the values only need to be written in binary. By the way, Fast Huffman is faster than ANS in both encoding and decoding. I have not completed my work yet. At the moment, the "k" parameter per block is calculated in a very rough way.
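To make the "k" parameter discussion concrete, here is a small sketch contrasting an exhaustive search with a rough estimate from the block mean. It is not HALAC's method; best_k, rough_k and the sample block are assumptions for illustration, and the inputs are residuals already mapped to unsigned integers.

```python
def rice_bits(u, k):
    """Length of the Rice code for u: unary quotient, one stop bit, k raw bits."""
    return (u >> k) + 1 + k


def best_k(block, max_k=30):
    """Exhaustive search: try every k and keep the one with the fewest total bits."""
    return min(range(max_k + 1),
               key=lambda k: sum(rice_bits(u, k) for u in block))


def rough_k(block):
    """Cheap estimate: roughly floor(log2(mean)), clamped to zero."""
    mean = sum(block) // max(len(block), 1)
    return max(mean.bit_length() - 1, 0)


block = [3, 5, 4, 6, 2, 7, 5, 4]  # made-up unsigned residuals
k_exact = best_k(block)
k_rough = rough_k(block)
total_bits = sum(rice_bits(u, k_exact) for u in block)
```

On this block the two agree, but on skewed residual distributions the rough estimate can be off by one or more, which costs compression while saving encode time.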

And I calculate the success according to the formula below.
Universal Score(gdcc.tech) : C + 2·D + (S+F)/10⁶
Here, C and D are respectively the total compression and decompression execution time (in seconds), S is the total compressed size in bytes, and F is the submission packet size.
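The formula translates directly into a small helper. The function and argument names are my own, and the numbers in the example call are made up rather than taken from any measurement in this thread.

```python
def universal_score(c_seconds, d_seconds, size_bytes, packet_bytes=0):
    """C + 2*D + (S + F) / 10**6, lower is better.

    c_seconds / d_seconds: total compression / decompression time in seconds.
    size_bytes: total compressed size; packet_bytes: submission packet size.
    """
    return c_seconds + 2 * d_seconds + (size_bytes + packet_bytes) / 10**6


# Hypothetical entry: 10 s to encode, 5 s to decode, 3 MB output, 1 MB packet.
score = universal_score(10, 5, 3_000_000, 1_000_000)
```

Decode time is weighted twice as heavily as encode time, and every megabyte of output counts the same as one second of encoding.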

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #17
But it uses a double-Rice-parameter approach for the residual.  I wonder, if one just took FLAC and implemented that residual encoding method - just to get an apples to apples approach (and then one could do that to the block bisection strategy too!) - what would one gain?
Yes, I employ recursive Golomb--Rice coding which has two parameters. It is also employed by WavPack and TTA, which update parameters sample-by-sample. It is great for compression rate but it is time-consuming, so I compute optimal parameters for small blocks.
To a code-illiterate like myself, it sounds like a difference between FLAC and ALAC: FLAC stores the Rice exponent parameter, ALAC infers it from the signal, and then upon decoding it has to be calculated. (That in part explains why ALAC is slower, but not why ALAC isn't better ...)

And yes, the block bisection strategy follows FLAC.
You mean ALS / TAK?
But it could be implemented in FLAC, and I guess it would be welcome if it actually helps.
I suggested that with those apodization functions that only do part of the signal block, it would make sense to try whether the "rejected" part has its own predictor. Better brains than mine were skeptical towards the gains, though.


Anyway, attached is something that apparently does not benefit from SRLA's residual encoding method.
Thank you very much for reporting it. I reproduced your result. It seems to lack handling for monoral-silence signal. I will try to improve it.
Ah, simple as that.
FLAC has a CONSTANT subframe type, and actually it sometimes happens that the signal is constant but nonzero; I've seen =-1. I can only speculate that some recordings have inverted phase by flipping every bit. You can argue that "this shouldn't exist", but apparently it does. Same goes for "redundant LSBs that aren't all zero": WavPack checks for that and sometimes improves enormously ( https://hydrogenaud.io/index.php/topic,121770.msg1024689.html#msg1024689 ) over the competition.
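Both checks are cheap to sketch. The code below is an illustration, not FLAC's or WavPack's implementation: it detects a constant block and counts low bits that are zero in every sample. It does not cover the harder WavPack case of redundant LSBs that are not all zero.

```python
def is_constant(block):
    """True when every sample equals the first one (including nonzero constants)."""
    return len(block) > 0 and all(x == block[0] for x in block)


def wasted_bits(block):
    """Number of low bits that are zero in every sample (0 for an all-zero block)."""
    combined = 0
    for x in block:
        combined |= x  # OR works on negative ints as two's complement
    if combined == 0:
        return 0
    lowest_set = combined & -combined  # isolate the lowest set bit
    return lowest_set.bit_length() - 1


constant_block = [-1, -1, -1, -1]
padded_block = [-8, 16, 24]  # every sample is a multiple of 8
```

An encoder could store a constant block as a single value, and shift out the wasted bits before prediction so the predictor and entropy coder work on smaller numbers.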

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #18
( @Hakan Abbas , that goes for you as well: that residual coding method made a news value.)
I don't quite understand this part. Which residual coding (Ans/Rice)?
ANS. The established codecs are older than ANS, so that brings new elements into the game.


Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #20
ANS. The established codecs are older than ANS, so that brings new elements into the game.
I get it. It's newsworthy when you use ANS, but there's nothing special about using a Rice derivative. Even if the results are better  :)

For audio data, Rice coding usually produces better results than ANS. And it works a little faster. I have been specifically recommended to use Rice. For HALAC, ANS may come into play again in some cases. I'll leave this for later. I think that for a new codec, the results obtained are more important for end users than what is running in the background. That's why I want to use all the options to improve the speed/compression ratio as much as possible and get the final result.

By the way, I don't really know much about how other codecs (Flac, Wavpack...) work. Only general audio compression from academic publications... That's why I took the trouble to develop a new entropy encoder. It's really boring for me to analyse, understand and use the code of other works.

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #21
I get it. It's newsworthy when you use ANS, but there's nothing special about using a Rice derivative. Even if the results are better  :)
That's worth finding out.
Also, even if ANS is a bit slower, an encoder that isn't so obsessed with encoding speed could encode as both and pick the best.
(That's not particular about "ANS"/"Rice". If you want to squeeze droplets out of the FLAC reference encoder, let it try a lot of different possibilities, and it will pick the best - or at least, not very far from.)

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #22
I don't agree that srla's encode speed is too slow for practical use. My laptop has 8 cores with avx512, which is orders of magnitude more powerful than hardware like the Pentium 4 that was contemporary at flac's release. Most consumer hardware is far more powerful than most consumers need 99% of the time, and far underutilised as a result.

If we were talking decode speed then yes, you want portable playback in realtime, but for encode they're just pitching for a different target than halac or flac and that's fine.

 

Re: SRLA: A lossless audio codec focused on decode speed and compression rate

Reply #23
I just released v0.0.8 package:
https://github.com/aikiriao/SRLA/releases/tag/v0.0.8

This release includes some changes made thanks to your comments.
  • Removed experimental presets (thanks to a.ok.in!): Currently, SRLA has the presets -m 0, ..., -m 4.
  • Improvements for pseudo-stereo wav files (thanks to Porcus!):
    In my environment, 5sec_ditheronly_fakestereo.wav is compressed to 58577 bytes with the -m 0 option.