Skip to main content

Notice

Please note that most of the software linked on this forum is likely to be safe to use. If you are unsure, feel free to ask in the relevant topics, or send a private message to an administrator or moderator. To help curb the problems of false positives, or in the event that you do find actual malware, you can contribute through the article linked here.
Topic: Burrows-Wheeler Transform (Read 4162 times) previous topic - next topic
0 Members and 1 Guest are viewing this topic.

Burrows-Wheeler Transform

As I understand it many codecs still using huffman for entropy coding.  And for good reason I suppose it is free, works reasonably well and has been around for quite some time.  FLAC and Meridan Lossless Packing use Rice coding which is a type of golomb coding if I am not mistaken.  There are other alternatives though.  There is arithmetic coding which is apparently very slow and is using specifically in video codecs in the form of Context Adaptive Binary Coding in h.264.  I am sure there are many other forms of entropy coding but I am particularly interested in the Burrows Wheeler Transform.  It is used in bzip if I am correct and is patent free not to mention it works pretty well.  Also, according to a couple papers I have read say it is reasonably fast; faster than arithmetic and it seems to do fairly well against other entropy coders.  I don't know maybe I am just talking nonsense but it seems to me that most video and audio codecs use a similar set of tools for entropy coding why not try something different?  Maybe I am just talking nonsense, I have not had any formal training in information theory so probably I am.  Pretty much all the information I gave gleaned by google and reading stuff online.  Anyway, I would be interested to know the opinion of people who actually do know what they are talking about.


Burrows-Wheeler Transform

Reply #2
The Burrows-Wheeler transform is not an entropy coding method. It's a context modelling transform. Bzip2 uses Huffman coding for entropy coding.

Menno

Burrows-Wheeler Transform

Reply #3
BWT works best for text because it takes full advantage of the fact that some character orders are more likely, e.g. a "h" is most likely to occur after a "t". Once the BWT is applied, you get rearranged buffer and a sort index, and the beauty of things is that after the transform you will get somewhere inside the buffer a whole string of "hhhhhhhhhhh" which corresponds to 'what comes after "t"s somewhere in the original file'. This of course compresses excellent with huffman and a "move to front" technique.

Burrows-Wheeler Transform

Reply #4
As menno says, the BWT is not a form of entropy encoding. I've tried out the BWT in a few forms for lossless audio compression, however it is not really suited that well to audio. Even if one overcomes such hurdles as audio being 16-bit (versus text being 8-bit) which tends to screw up BWT's mechanisms (as the data is simply too sparse), the methods used by Monkeys Audio, Wavpack, La etc (various forms of adaptive prediction) simply compress much better.

One thing I have tried (somewhat unsuccessfully) but do want to further look into if I ever have the time is using wavelets for lossless audio compression. I think this might have some potential if done right.

And by the way jrj, Ghido is rather secretive about how Optimfrog works, however I don't believe he is using any form of QLFC (his BWT archiver) in it. Of course, if he'd care to clarify this that'd be great .