Topic: How can lossless compression work [because I cannot do basic research]

How can lossless compression work [because I cannot do basic research]

I'd like to store all my CDs onto a drive so I can build a database for them. And as near as I can tell, .flac is the way to go because of its ability to maintain tags and that it's just as good as .wav as far as sound quality.

However, the question is how can .flac be the same quality as .wav since it compresses the recording by around 50%? I thought the reason why quality diminishes was because it is compressed. 




Reply #1
Wav= XYZXYZ
Flac= (XYZ)X2

Same data, half the length.

 

Reply #2
Pardon my ignorance, but what the hell is "XYZXYZ?" Is that your example of data representative of a recording? And if it is, how can it be doubled (X2) and yet still be half the size?

Reply #3
Yes, XYZXYZ stands for audio data. It is not just "doubled"; sorry if my first summary was too short.
The encoder first "divides" (compresses) it, and the decoder then "doubles" the FLAC data back (decompresses it) to WAV before you play it. So /2 is the encoder and X2 is the decoder.

Wav= XYZXYZ
Flac=  (XYZXYZ)/2 (Encoder) then (XYZ)X2 (Decoder)

Edit:
Note: Codec is short for COder/DECoder, as far as I recall.
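
For anyone who prefers code to letters, here is a toy sketch of the XYZXYZ idea in Python. The function names encode_repeats and decode_repeats are made up for this post, and real codecs are far more sophisticated; it only shows that "shorter" and "identical after decoding" can both be true.

```python
from typing import Tuple


def encode_repeats(data: str) -> Tuple[str, int]:
    """Find the shortest unit that rebuilds `data` when repeated."""
    n = len(data)
    for size in range(1, n + 1):
        if n % size == 0 and data[:size] * (n // size) == data:
            return data[:size], n // size
    return data, 1  # only reached for empty input


def decode_repeats(unit: str, count: int) -> str:
    """Undo encode_repeats: repeat the unit `count` times."""
    return unit * count


wav = "XYZXYZ"
flac = encode_repeats(wav)        # the "encoder"
restored = decode_repeats(*flac)  # the "decoder"
print(flac, restored == wav)      # ('XYZ', 2) True
```

Data with no repetition, such as "ABCD", comes back as ("ABCD", 1), so nothing is saved but nothing is lost either.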

Reply #4
Quote
However, the question is how can .flac be the same quality as .wav since it compresses the recording by around 50%? I thought the reason why quality diminishes was because it is compressed.

Have you ever questioned the "quality" of the data in a .zip or .rar archive?

Same exact thing:  FLAC (and all other lossless audio codecs) is basically ZIP/RAR/etc. type compression optimized for PCM audio files (well, and occasionally floating point, eg. WavPack).
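
To make the ZIP comparison concrete, here is a small Python sketch. It runs zlib (the DEFLATE method commonly used inside ZIP files) over a made-up PCM signal. The 440 Hz tone, the sample rate and the variable names are assumptions for illustration, and zlib is only a stand-in here, not what FLAC does internally.

```python
import math
import struct
import zlib

# Hypothetical test signal: one second of a 440 Hz sine, 16-bit mono at 44.1 kHz.
samples = [int(20000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(44100)]
pcm = struct.pack("<%dh" % len(samples), *samples)  # raw little-endian PCM bytes

packed = zlib.compress(pcm, 9)     # "zip" the audio bytes
unpacked = zlib.decompress(packed)  # "unzip" them again

print(len(pcm), len(packed))  # original size vs compressed size
print(unpacked == pcm)        # True: every byte comes back unchanged
```

The comparison on the last line is the whole point: the decompressed bytes are checked for equality with the original, not for "sounding close".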
"Not sure what the question is, but the answer is probably no."

Reply #5
Thanks for clearing that bit up, sauvage78.


Quote
Have you ever questioned the "quality" of the data in a .zip or .rar archive? Same exact thing:  FLAC (and all other lossless audio codecs) is basically ZIP/RAR/etc. type compression optimized for PCM audio files (well, and occasionally floating point, eg. WavPack).


No, I haven't. But then again I don't think I've ever seen a .zip or .rar file compressing as much as 50% either. What's more is that one can't play a recording in those formats without decompressing them first. But apparently one can with a .flac file. Which, I guess, is another thing that confuses me.


Reply #7
Warning: Offtopic general compression answer.

Quote
I don't think I've ever seen a .zip or .rar file compressing as much as 50% either


With classic data compression you will reach very high ratios with text files (.txt/.rtf/.doc/.xls/.odt/.ods). If you try to losslessly compress files that have already been lossily compressed, such as audio/video/images (say .mp3/.avi/.jpg), you will only achieve a poor ratio and it is overkill: there is not much (if anything) left to compress. Executables (.exe) are in the middle. As far as I understand, if done "correctly" they should already be losslessly compressed, but sometimes they are not optimized because they were packed with very fast settings instead of balanced speed vs compression settings.

Personally I use .7z with default settings, and I only compress archived text files (not used daily) and portable software (not .exe).

Reply #8
Quote
But then again I don't think I've ever seen a .zip or .rar file compressing as much as 50% either

Code:
C:\docs>dir
Volume in drive C has no label.
Volume Serial Number is 6CB2-B877

Directory of C:\docs

07/07/2011  10:57 AM    <DIR>          .
07/07/2011  10:57 AM    <DIR>          ..
07/07/2011  10:57 AM             6,607 Hi.7z
07/07/2011  10:56 AM           224,256 Hi.doc
               2 File(s)        230,863 bytes
               2 Dir(s)  27,883,274,240 bytes free

How about 97%?

That's a 30 page Word doc with nothing more than the word "Hi" at the beginning, followed by 30 pages of a single . at the beginning of each line and then spaces for the remainder of the line.

This is an extreme example, but it shows how the degree of compression achieved by *any* lossless format is dependent on the content of the file being compressed.

Edit: A "word problem" example of what sauvage78 just said.
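
The same experiment can be repeated without Word. This is a rough Python sketch, assuming zlib as the compressor and an invented stand-in for the document (a dot followed by spaces on every line); the helper name saving is made up for this post.

```python
import os
import zlib


def saving(data: bytes) -> float:
    """Fraction of the original size removed by zlib at its default level."""
    return 1 - len(zlib.compress(data)) / len(data)


# A dot, then spaces, on every line: roughly the content described above.
repetitive = (b"." + b" " * 79 + b"\n") * 1500
# The same number of random bytes: no pattern for the compressor to exploit.
random_like = os.urandom(len(repetitive))

print(f"repetitive: {saving(repetitive):.1%}")
print(f"random:     {saving(random_like):.1%}")
```

The repetitive input shrinks to a tiny fraction of its size, while the random input does not shrink at all (it can even grow by a few bytes of overhead). Same compressor, different content.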
"Not sure what the question is, but the answer is probably no."

Reply #9
Quote
No, I haven't. But then again I don't think I've ever seen a .zip or .rar file compressing as much as 50% either.


That's surprising.  Most files that are not already compressed shrink by much more than 50% in RAR.  Trying some random office documents and Windows system files, for instance, gives 60-80% compression.  Actually audio is one of the hardest things to compress, hence the need for specialized formats like FLAC (RAR also has special code for audio).

Quote
What's more is that one can't play a recording in those formats without decompressing them first.


Neither can flac. 

Quote
Which, I guess, is another thing that confuses me.


What, that you need a decoder to play FLAC files?  Try bit-streaming a FLAC file to a device that decompresses FLAC versus one that only understands WAV.  You'll find that there's a big difference in output if you use a decoder.

Reply #10
I'll take your word for it, guys!

Quote
R1: What's more is that one can't play a recording in those formats without decompressing them first.
R2: Neither can flac.



I guess what I was getting at is that in order to use a software program, play an audio file, or view an image in a .zip file, one needs to unzip them first then use the appropriate secondary program to execute. It appears that a .flac file will decompress as it's playing. Or did I misunderstand that?


Reply #12
Quote
or view an image in a .zip file, one needs to unzip them first


On the other hand, PNG format is analogous to FLAC: it is compressed and can be viewed directly (without decompressing to BMP).

Reply #13
It is decompressed in real time. The WAV form of the data (the uncompressed data) is what is being played. It is not necessary to decompress the entire file before starting to play; it can be done piece by piece.
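
A rough Python sketch of "piece by piece", using zlib's streaming decoder as a stand-in for a FLAC decoder. The 1 KiB chunk size and the test data are arbitrary assumptions; a real player would hand each decoded piece to the audio output instead of collecting it in a buffer.

```python
import random
import zlib

# Hypothetical "recording": 100,000 bytes drawn from a four-letter alphabet.
original = bytes(random.Random(0).choices(b"abcd", k=100_000))
compressed = zlib.compress(original)

decoder = zlib.decompressobj()
played = bytearray()
pieces = 0
# Feed the decoder 1 KiB of compressed data at a time, as a player reading a file would.
for start in range(0, len(compressed), 1024):
    chunk = decoder.decompress(compressed[start:start + 1024])
    played.extend(chunk)  # a real player would send this chunk to the sound card
    pieces += 1
played.extend(decoder.flush())

print(pieces > 1)                 # True: decoding happened in several steps
print(bytes(played) == original)  # True: the pieces add up to the original
```

At no point does the decoder need the whole compressed file in hand; it only needs the next piece.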

Reply #14
Digg that. Thanks for the info folks!

Reply #15
Lots of data can be compressed losslessly with large compression ratios.  In general, if you are able to tell something about the data, you can generally devise a means of storing the same data in less space. 

Take a simple and easy to understand example:  You have a signal that tells you what color it is every 1/10 of a second.  Say it can be one of three colors (for instance yellow, red, and green).  Given no additional information, this signal would take 2 bits every 1/10 of a second to store.

But say that you know what this signal is: a traffic light that always goes Red -> Green -> Yellow -> repeat.  How could you store it now that you know something about the signal?  Well, you know that it has a specific order, so you could use only 1 bit per sample (every 1/10 sec) to say either that it remained the same (a 0) or that it changed to the next color (a 1).  If you wanted to know what color the light is showing, all you'd need to know is where it started; then you play through the compressed (encoded) signal, count the 1s, and determine which color it should be (decoding/decompressing).

Now, you could take this much further by knowing that traffic lights change infrequently (with respect to 1/10s of a second); say they don't change more than once per second.  Your signal could now give either a 0, meaning there was no change this second, or a 1, meaning there was (following the 1 you would give the binary number of the 1/10 sec in which it changed).

Your signal could now look like this:

(00) 0010101
or: starting with color 00 ((00)), two seconds of the same color (00), and then a change to the next color (1) 5 tenths of a second in (0101, since four bits are needed to number tenths 0 through 9)

The original uncompressed signal would have been 2 bits 10 times per second for 3 seconds to be the same thing.  9 bits vs 60 bits.  It wouldn't surprise me if you could actually do better than what I've listed.
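
Here is one way the traffic light scheme could be written down in Python. This is a sketch under the assumptions stated in the post (fixed Red -> Green -> Yellow order, at most one change per second) plus a few of my own: color 00 is taken to be red, the change position is stored in 4 bits, and the names CYCLE, encode and decode are invented for the example.

```python
CYCLE = ["red", "green", "yellow"]  # fixed order; index 0 ("00") assumed to be red


def encode(samples):
    """samples: one color name per 1/10 s, length a multiple of 10."""
    bits = format(CYCLE.index(samples[0]), "02b")  # 2 bits: starting color
    prev = samples[0]
    for sec in range(0, len(samples), 10):
        change_at = None
        for i, color in enumerate(samples[sec:sec + 10]):
            if color != prev:  # assumed to happen at most once per second
                change_at, prev = i, color
        # 0 = no change this second; 1 + 4-bit tenth index = change
        bits += "0" if change_at is None else "1" + format(change_at, "04b")
    return bits


def decode(bits):
    color = int(bits[:2], 2)
    pos, out = 2, []
    while pos < len(bits):
        if bits[pos] == "0":
            out += [CYCLE[color]] * 10
            pos += 1
        else:
            change_at = int(bits[pos + 1:pos + 5], 2)
            out += [CYCLE[color]] * change_at
            color = (color + 1) % len(CYCLE)  # always advances to the next color
            out += [CYCLE[color]] * (10 - change_at)
            pos += 5
    return out


signal = ["red"] * 25 + ["green"] * 5  # 3 seconds, change 5 tenths into the third
packed = encode(signal)
print(packed, len(packed))       # 000010101 9
print(decode(packed) == signal)  # True
```

Stored naively, the same three seconds take 30 samples at 2 bits each, i.e. 60 bits.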

This concept holds true for image data, text data, and audio data.  Text contains many common patterns that can be referenced, images can take advantage of spatial locality, and audio can take advantage of temporal predictability and similarity between channels.  I'm just naming a few here, but I'm sure that most people can see at least a few ways to take advantage of this sort of knowledge about the data to make the same data take up less space.

Reply #16
Quote
I guess what I was getting at is that in order to use a software program, play an audio file, or view an image in a .zip file, one needs to unzip them first


Really?

From the foobar manual:  "In addition, foobar2000 can also play music directly from compressed ZIP and without requiring the user to extract the files prior to playing."

Likewise, Word documents (.docx) are just zipped XML files.  You don't need to unzip them and read the raw XML.  Word unzips them for you.  Same way players handle flac.

Quote
It appears that a .flac file will decompress as it's playing. Or did I misunderstand that?


ZIP and FLAC work exactly the same way.  You need to decode both before you can see what's inside.  You can choose to use a utility to view what's inside the file (e.g. WinRAR for ZIP or flac.exe for FLAC) or you can just use software that natively understands the format and does it for you (e.g. foobar2000 for both FLAC and ZIP).

Reply #17
FLAC files compress things the way zip files do, as mentioned. MP3s etc first throw away the least audible information, then compress what's left in much the same manner. By throwing away things you can't hear, lossy formats like MP3 achieve much better compression ratios.

There isn't really such a thing as a lossy text compressor, because it would have to throw away the "boring" parts of your document! And as mentioned, text does very well with generic lossless algorithms like RAR and zip.

Reply #18
It appears that a .flac file will decompress as it's playing.

That's correct.


Perhaps semantics, but the flac file itself does not decompress, the flac player/decoder decodes/decompresses the flac file (in memory) to the source audio.

I know greynol understands this, but I state this to avoid confusion for those who don't.

Reply #19
Compression is also helped by Huffman coding, a technique that builds a variable-length code table from the estimated probability of occurrence of each possible source symbol (such as a character in a file). Frequent symbols get short codes and rare symbols get long ones, which minimizes the average number of bits per symbol among codes that give each symbol its own whole number of bits.
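
A compact Python sketch of how such a code table can be built. This is a textbook-style construction using the standard heapq module, not code taken from any particular codec; the function name huffman_code and the sample text are made up for illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return {symbol: bitstring} built from the symbol frequencies in `text`."""
    counts = Counter(text)
    if len(counts) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


text = "aaaaaaaabbbbccd"  # 'a' is common, 'd' is rare
code = huffman_code(text)
encoded = "".join(code[ch] for ch in text)
print(code)
print(len(encoded), "bits instead of", 8 * len(text))  # 25 bits instead of 120
```

The common symbol 'a' ends up with a 1-bit code and the rare 'c' and 'd' with 3-bit codes, which is where the saving comes from.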

Reply #20
Quote
Pardon my ignorance, but what the hell is "XYZXYZ?" Is that your example of data representative of a recording? And if it is, how can it be doubled (X2) and yet still be half the size?


Strong algebra skills brah

Haha but seriously, it's just a matter of data simplification.

Reply #21
It appears that a .flac file will decompress as it's playing.

That's correct.


Perhaps semantics, but the flac file itself does not decompress, the flac player/decoder decodes/decompresses the flac file (in memory) to the source audio.


That literally means the same thing as what you disagree with.

Reply #22
Take a look at the "lossless coding" part of the "perceptual coding tutorial" at www.aes.org/sections/pnw/ppt.htm

Basically, any signal that is not uniformly distributed white noise can be compressed by some amount, without any change whatsoever in the digital data that represents it.

Consider, if we have a 4 level coder (yes, that's small, but let's consider it), and that the probability for the four levels is

.125 (level -1.5)
.375 (level -.5)
.375 (level .5)
.125 (level 1.5)

Now, if we send that as 2 bits, the cost is 2 bits per sample, end of discussion. But now, let us suppose that we send it as follows:

bit pattern 1 (meaning level .5)
bit pattern 01 (meaning level -.5)
bit pattern 001 (meaning level 1.5)
bit pattern 000 (meaning level -1.5)

We have 1*.375 + 2*.375 + 2 * 3 * .125 = .375 + .75 + .75 bits, or an average of 1.875 bits.

So now, just by using the code above, we have reduced the rate from 2 bits/sample to an average of 1.875 bits/sample.
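
The arithmetic is easy to check in a few lines of Python. The dictionaries below simply restate the probabilities and bit patterns from this post; the entropy line is an addition of mine, included only as the usual yardstick for how far any such code could go for these probabilities.

```python
import math

probs = {-1.5: 0.125, -0.5: 0.375, 0.5: 0.375, 1.5: 0.125}
code = {0.5: "1", -0.5: "01", 1.5: "001", -1.5: "000"}

# Average cost: each level's probability times the length of its bit pattern.
average_bits = sum(p * len(code[level]) for level, p in probs.items())

# Shannon entropy of the four-level source, in bits per sample.
entropy = -sum(p * math.log2(p) for p in probs.values())

print(average_bits)  # 1.875
print(entropy)       # a little lower than 1.875
```

The variable-length code already beats the fixed 2 bits/sample, and the entropy shows there is only a little further to go for this distribution.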

If we use more levels and more skew in the probabilities, it gets better.

That's half of how one can reduce bits losslessly.

The other half is not so easy to explain conceptually, but if we find that the spectrum of the signal is not flat, we can "predict" the signal in a way that causes no loss.

In reality, the levels in an audio stream are quite widely varying in probability (and therefore compressible) and the spectra are not flat (more compressibility).
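
For the prediction half, here is a deliberately crude Python sketch: predict each sample as "same as the previous one" and store only the error. The slow sine test signal and the first-order predictor are assumptions chosen for simplicity. Real lossless codecs use more elaborate predictors, but the no-loss argument is the same.

```python
import math

# Hypothetical signal: a slow sine quantised to integers in the 16-bit range.
samples = [int(20000 * math.sin(2 * math.pi * n / 200)) for n in range(1000)]

# Encoder: keep the first sample, then only the prediction error for each later one.
residuals = [samples[0]] + [samples[n] - samples[n - 1] for n in range(1, len(samples))]

# Decoder: run the same predictor and add the stored error back.
restored = [residuals[0]]
for r in residuals[1:]:
    restored.append(restored[-1] + r)

print(max(abs(s) for s in samples))        # peak of the raw signal
print(max(abs(r) for r in residuals[1:]))  # peak of the prediction error: far smaller
print(restored == samples)                 # True: nothing was lost
```

The errors cluster in a much narrower range than the samples themselves, so their probabilities are heavily skewed and the variable-length coding described above has more to work with.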

Question: Is it time for a noiseless compression tutorial talk?
-----
J. D. (jj) Johnston