
New FLAC compression improvement

Hi all,

Last summer, I spent some time thinking of ways to improve FLAC's compression within the specification and without resorting to variable blocksizes, which are in the spec but might not be properly implemented in many decoders. I discussed this old post with its author: https://hydrogenaud.io/index.php?topic=106545.50

He wrote a small Python script to explore the idea of an integer least-squares solver; it's included as ils.py. I explored this idea, read a lot of literature, and came up with a different solution.

The current FLAC LPC stage is the classic, textbook method, originally designed for speech encoding. The ILS method SebastianG proposed tries to find a better predictor more directly: the current approach only comes close to a least-squares solution, so solving for one directly could perhaps improve on it.

However, FLAC's entropy encoder doesn't encode 'squared' values; its cost is roughly linear in the residual. That is why I developed an iteratively reweighted least squares solver which, through weighting, converges not on a least-squares solution but on a so-called least absolute deviation solution. A proof-of-concept is attached as irils-calculate-improvement.py. It gives roughly equal results to the current approach on most material, but improves on certain, mostly synthetic material like electronic music.

I even got as far as implementing it in FLAC, which can be found here: https://github.com/ktmf01/flac/tree/irls. However, this implementation doesn't perform as well as the Python proof-of-concept, and as my free time ran out last summer, I haven't been able to improve on it.

Maybe some people here with a love for both FLAC and math might be interested in looking into this. I'll look at it again when I have some time to spare and a mind ready for a challenge :D

I'd love to hear questions, comments, etc.
Music: sounds arranged such that they construct feelings.

Re: New FLAC compression improvement

Reply #1
Well, did you test it in real-world scenarios, and what were the results? Don't just drop this here and go away :)
Error 404; signature server not available.

Re: New FLAC compression improvement

Reply #2
That is a very good question. I probably didn't explain myself well enough in that regard.

The C code on GitHub isn't working properly, I think; it isn't an improvement over the existing FLAC codebase. So I haven't reached any real-world compression gains yet.

However, the Python proof-of-concept is promising, albeit with a caveat. To keep it simple, it only performs a plain Rice bit-count calculation, without partitioning, and it does not do stereo decorrelation. So I can only compare single-channel FLAC files with Rice partitioning turned off. This is why the gains in the proof-of-concept might be larger than what would be achievable with Rice partitioning and stereo decorrelation.
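To give an idea of what that simplified cost function looks like, here is a rough Python sketch of an un-partitioned Rice bit count: a single Rice parameter per block, chosen to minimize the total. This is only an illustration, not the actual code from irils-calculate-improvement.py; the function names are mine.

```python
import numpy as np

def rice_length(residual, k):
    # Map signed residuals to unsigned values (zigzag/folding), then count
    # unary quotient bits + 1 stop bit + k remainder bits per sample.
    residual = np.asarray(residual)
    u = 2 * np.abs(residual) - (residual < 0)
    return int(np.sum((u >> k) + 1 + k))

def best_rice_length(residual, max_k=14):
    # Single Rice parameter for the whole block, no partitioning.
    return min(rice_length(residual, k) for k in range(max_k + 1))

# Compare two candidate predictors by the bits their residuals would take:
rng = np.random.default_rng(0)
print(best_rice_length(rng.integers(-40, 40, 4096)),
      best_rice_length(rng.integers(-25, 25, 4096)))
```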

For most material, my FLAC files got about 0.2% smaller (which would be 0.1% of the original WAV size). In one case, with electronic music (Infected Mushroom - Trance Party), I got an improvement of 1.2% (which would be 0.6% of the original file size).

So, I think this is promising, but I haven't been able to achieve this in C yet.

Music: sounds arranged such that they construct feelings.

 

Re: New FLAC compression improvement

Reply #3
Hi all,

Last week I had some spare time and a mind ready for a challenge, so I took a look at the code (https://github.com/ktmf01/flac/tree/irls). Sadly, it seems it still needs quite a bit of work before it performs as well as current FLAC. I'll share my findings here, so I can read them back later and for anyone who's interested.

How FLAC has worked since the beginning
So, FLAC uses LPC (linear predictive coding) to predict a sample from previous samples, and stores the error/residual with a so-called Rice code. Finding a good predictor is done by modelling the input samples as autoregressive, which means there is some correlation between the current sample and past samples. There is a 'textbook way' of calculating the model parameters (which we take as the predictor coefficients) using the Yule-Walker equations. These equations form a Toeplitz system, which can be solved quickly with Levinson-Durbin recursion. While a few shortcuts are taken, this results in a solution that is close to a least-squares solution of the problem and is very fast to compute.
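For reference, that textbook path looks roughly like this. This is a simplified sketch, not libFLAC's actual code, which also applies a window (apodization) to the block first and quantizes the coefficients to integers:

```python
import numpy as np

def autocorrelation(x, max_lag):
    x = np.asarray(x, dtype=float)
    return np.array([np.dot(x[:len(x) - lag], x[lag:]) for lag in range(max_lag + 1)])

def levinson_durbin(r, order):
    # Solve the Yule-Walker (Toeplitz) equations for an order-p predictor.
    a = np.zeros(order)
    err = r[0]
    for i in range(order):
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(i):
            a[j] = a_prev[j] - k * a_prev[i - 1 - j]
        a[i] = k
        err *= (1.0 - k * k)               # remaining prediction error power
    return a                               # x[n] is predicted as sum(a[j] * x[n-1-j])

# Usage: an 8th-order predictor for one block of samples
# a = levinson_durbin(autocorrelation(samples, 8), 8)
```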

What could be improved about this
While this is all good and fun, the predictor resulting from this process is not optimal, for several reasons. First, the process minimizes the squared error, while for the smallest file size we want the shortest Rice code. Second, as FLAC input is usually audio and not some steady-state signal, the optimal predictor changes over time. It might be better to ignore a (short) part of the input when trying to find a predictor; in other words, it might be better to find a good predictor for half of the signal than a mediocre predictor for all of the signal. Third, minimizing the squared error puts an emphasis on single outlier samples, which messes up the prediction of all other samples, while such an outlier will not fit any predictor at all.

What has been already improved
Ignoring a short part of the signal is exactly what a patch I submitted a few years ago does. It added the partial_tukey and punchout_tukey windows/apodizations, which ignore part of the signal. This is a brute-force approach: the exact same set of windows is tried on every block of samples.
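Roughly, the idea behind those windows can be sketched like this. This is my own reconstruction using SciPy, not libFLAC's window code, and the exact tapering around the ignored region may differ:

```python
import numpy as np
from scipy.signal.windows import tukey

def partial_tukey(blocksize, start, end, alpha=0.5):
    # Tukey (tapered cosine) window over [start, end); zero weight elsewhere,
    # so those samples drop out of the windowed autocorrelation entirely.
    w = np.zeros(blocksize)
    w[start:end] = tukey(end - start, alpha)
    return w

def punchout_tukey(blocksize, start, end, alpha=0.5):
    # Tukey window over the whole block with [start, end) suppressed.
    w = tukey(blocksize, alpha)
    w[start:end] = 0.0
    return w

# e.g. fit a predictor while ignoring the second half of a 4096-sample block:
# windowed = samples * partial_tukey(4096, 0, 2048)
```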

What I have been trying last year
Now, last summer I wanted to try a new approach to finding a predictor. This started out as a different way to find a least-squares solution to the problem (without taking shortcuts), but I came to realize that least squares is not the same as smallest Rice code. As the Rice code takes up the most space in a FLAC file, minimizing it should be the ultimate aim if I want compression gains. I figured that, compared to a least-squares solution for a predictor, a least absolute deviation (LAD for short) solution is more resistant to 'outliers'. As these outliers mess up prediction, I thought this might work well, so I implemented code in FLAC to do this, through iteratively reweighted least squares, or IRLS for short. This code works, but it is very slow and does not (yet) compress better.
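The basic IRLS-for-LAD loop can be sketched in a few lines of Python. This is a simplified illustration of the general technique, not the code in the irls branch:

```python
import numpy as np

def lad_predictor(x, order, iterations=15, eps=1e-6):
    x = np.asarray(x, dtype=float)
    # Linear system: predict x[n] from the previous `order` samples.
    X = np.column_stack([x[order - 1 - j : len(x) - 1 - j] for j in range(order)])
    y = x[order:]
    w = np.ones(len(y))
    for _ in range(iterations):
        # Weighted least squares: solve (X^T W X) a = X^T W y
        Xw = X * w[:, None]
        a = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        r = y - X @ a
        # Reweight by 1/|r| (capped for tiny residuals); iterating this
        # pushes the solution toward minimizing sum(|r|) instead of sum(r^2).
        w = 1.0 / np.maximum(np.abs(r), eps)
    return a
```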

What is wrong with this approach
After reviewing my code last week, I realise the LAD (least absolute deviation) IRLS implementation that I made still isn't minimizing the Rice code. For one, the Rice code in FLAC is partitioned, which means that a large error at the beginning of a block can grow the Rice code length differently than a large error at the end of a block, for example. Also, depending on the Rice parameter, a slightly larger error might not cost any extra bits at all (when the error is smaller than 2^(rice parameter - 1)), it might cost one extra bit (when the error is around 2^(rice parameter)), or it might cost several bits (when the error is much larger than 2^(rice parameter)). I could not find any existing, documented IRLS weighting (a so-called norm) that works with Rice codes.
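To make that concrete, here is a quick sketch of the per-sample Rice cost as a function of the parameter, showing how a larger error can cost zero, one, or several extra bits depending on where it sits relative to 2^k (the signed-to-unsigned mapping is the usual folding/zigzag):

```python
def rice_bits(residual, k):
    # zigzag-map the signed residual, then: unary quotient + stop bit + k LSBs
    u = 2 * abs(residual) - (1 if residual < 0 else 0)
    return (u >> k) + 1 + k

for k in (2, 4, 6):
    print("k =", k, [rice_bits(e, k) for e in (1, 2, 4, 8, 16, 32, 64, 128)])
```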

What I want to improve about it
So, the next step is to improve the IRLS weighting. The weighting procedure should ideally incorporate knowledge of the Rice parameter (as this says something about whether bits can be saved or not), but that parameter is not known during weighting. I think using a moving average* of the residual might be a good way to guess the Rice parameter during the weighting stage. I could also use a partitioned average*, but as the number of Rice partitions is not known beforehand (just like the Rice parameter), the size of the partitions to average over will probably not match the Rice partition size, and we might get strange behaviour at partition boundaries. With a moving average window the problem is similar (choosing the size of the window): the optimal window size correlates with the Rice partition size, which is not known during the weighting stage, but at least a moving average doesn't 'jump' at arbitrary boundaries.
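A sketch of such a moving average (the window size here is an arbitrary guess and will not match the eventual partition size exactly):

```python
import numpy as np

def local_error_scale(residual, window=256):
    # Moving average of |residual| from the previous iteration, used as a
    # stand-in for the (unknown) local Rice parameter during weighting.
    r = np.abs(np.asarray(residual, dtype=float))
    return np.convolve(r, np.ones(window) / window, mode='same')

# A rough local Rice-parameter guess could then be derived from it, e.g.:
# k_local = np.maximum(np.floor(np.log2(local_error_scale(r) + 1)), 0)
```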

Using a moving average*, the weighting procedure can incorporate knowledge about which errors are too large to try to fix and which are too small to bother with. If the error on a sample is much smaller than the moving average it can be ignored, and if it is much larger than the moving average it can be ignored too. Only if the error falls within a certain band should it be weighted such that the next iteration tries to decrease it.

The current IRLS weighting scheme is least absolute deviation. To get from the default least squares to least absolute deviation, the weights are the inverse of the (absolute) residual, 1/|r|. For very small r, the weight is capped at some value; I can make this cap depend on the moving average. Above this value, I will try using a weight of 1/r². The effect I think this will have is sketched in the image I have attached.
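In Python, the weighting I have in mind would look something like this. This is only a sketch: the thresholds and the exact scaling of the 1/r² part are guesses to illustrate the intended shape.

```python
import numpy as np

def irls_weights(r, local_avg, low=0.25, high=4.0):
    # r and local_avg are arrays of the same length; local_avg is the moving
    # average of |r| from the previous iteration.
    a = np.abs(r)
    # LAD weight 1/|r|, capped once the residual drops below a fraction of the
    # local average (small errors are not worth chasing):
    w = 1.0 / np.maximum(np.maximum(a, low * local_avg), 1e-9)
    # Far above the local average, fall off as 1/r^2 so outliers are mostly
    # ignored (scaled so the weight stays continuous at the boundary):
    big = a > high * local_avg
    w[big] = (high * local_avg[big]) / np.square(a[big])
    return w
```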

If this doesn't work, maybe the moving average can instruct the algorithm to ignore part of a block. This is how partial_tukey and punchout_tukey work, but here the method would not be applied brute-force, but with some knowledge of the samples themselves: if the moving average is much larger than the average over the whole block, that part is ignored. This way, short bursts of outliers (for example, the attack of a new tone) can be ignored when modelling the predictor.
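A sketch of that masking idea (the threshold factor and window size are arbitrary):

```python
import numpy as np

def keep_mask(residual, window=256, factor=4.0):
    # True where the local average error stays reasonably close to the block
    # average; False marks bursty regions (e.g. a tone attack) that would be
    # zero-weighted when fitting the predictor.
    r = np.abs(np.asarray(residual, dtype=float))
    local = np.convolve(r, np.ones(window) / window, mode='same')
    return local < factor * r.mean()
```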

*When I say average, I mean the average of the absolute value of the residual from the last iteration.

Music: sounds arranged such that they construct feelings.

 