Digital compression: Louder music = Larger file?

2018-12-04 16:55:49

Apologies if this question is naive or makes various silly assumptions about encoding optimisation!

I've noticed that songs which have been dynamically compressed (or which simply avail of more of the dynamic range of a format) will encode to larger files, irrespective of whether those files are lossy or lossless.

1. Is this a function of how most formats encode amplitude? For example, does it take more bits of data to say that a given sample/series of samples are loud rather than quiet?

2. If so, does this not imply that encoding this information in the 'opposite direction' would yield substantial storage-savings for most contemporary (loud) music?

For example, 16-bit audio contains 65,536 possible integer values per sample. The way file size corresponds with loudness implies (to me!) that silence is encoded as null, whereas max volume (0 dBFS) may be encoded as 65,536. In this way, a louder song would surely require more bits of data than a quiet song, corresponding to a larger file size.

If instead 0 dBFS were null, with silence encoded as 65,536, would that not garner substantial storage savings for consistently loud songs (e.g. peak-limited electronic music)?

I'd be obliged if someone could set me straight, as I suspect I'm missing something obvious here.

Re: Digital compression: Louder music = Larger file?

Reply #1 – 2018-12-04 17:47:35

That seems logical to me, but I don't understand the nitty-gritty details of file compression.

I'm reasonably certain that FLAC is "smart enough" not to waste bits. A 24-bit file that's the result of padding a 16-bit file up to 24 bits (zeros in the 8 least significant bits) will compress to (about) the same size as the original 16-bit file. And I'd expect the same with a super-quiet file that never uses the 8 most significant bits.
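The padded 24-bit case can be sketched in a few lines. This is only an illustration of the idea, not FLAC's actual code: the FLAC format does have a per-subframe "wasted bits" field for this situation, but the function name `wasted_bits` and the sample values here are made up for the example.

```python
def wasted_bits(samples):
    """Count the low-order bits that are zero in every sample of a block."""
    combined = 0
    for s in samples:
        combined |= s  # any bit set in any sample ends up set here
    if combined == 0:
        return 0  # all-silent block: nothing useful to shift out
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count


# Hypothetical 16-bit samples, then the same samples padded to 24 bits.
samples_16 = [0, 1200, -3456, 32767, -32768, 17]
samples_24 = [s << 8 for s in samples_16]

shift = wasted_bits(samples_24)
restored = [s >> shift for s in samples_24]  # what actually needs encoding

print(shift)                    # number of padding bits found
print(restored == samples_16)   # the 16-bit data comes back unchanged
```

An encoder that records the shift once per block and codes only the shifted samples pays nothing for the padding, which is why the padded file ends up about the same size.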

On the other hand if you add dither or if there is other noise at the low-quiet end, the file is "harder" to compress and you'll get a bigger file. (Noise is random data which makes it hard to compress.)

An "over-loud" file that has lots of digital clipping has a lot of identical samples in a row, and that makes it easier to compress.
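A rough way to see both effects (noise is hard, clipping is easy) is to feed raw 16-bit samples to a general-purpose compressor. This sketch uses Python's `zlib` purely as a stand-in; it is not how FLAC works internally, and the two test signals are invented for the example.

```python
import math
import random
import struct
import zlib

N = 44100  # one second of mono audio at 44.1 kHz
rng = random.Random(0)  # fixed seed so the run is repeatable


def clip16(value):
    """Round and hard-clip to the signed 16-bit range."""
    return max(-32768, min(32767, round(value)))


def packed_size(samples):
    """Size in bytes after packing as little-endian 16-bit and deflating."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return len(zlib.compress(raw, 9))


# White noise over the whole 16-bit range: no structure to exploit.
noise = [rng.randint(-32768, 32767) for _ in range(N)]

# A 440 Hz sine pushed 20x past full scale, then hard-clipped,
# so most samples sit at exactly +32767 or -32768.
clipped = [clip16(20 * 32767 * math.sin(2 * math.pi * 440 * n / N))
           for n in range(N)]

raw_size = 2 * N
noise_size = packed_size(noise)
clipped_size = packed_size(clipped)
print(raw_size, noise_size, clipped_size)
```

The noise should barely shrink at all, while the heavily clipped signal collapses to a small fraction of its raw size because of the long runs of identical samples.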

Lossy compression uses some of the same/similar methods as lossless compression in addition to the psychoacoustic-related lossy processing.

Quote

2. If so, does this not imply that encoding this information in the 'opposite direction' would yield substantial storage-savings for most contemporary (loud) music?

I'm not 100% sure what you mean by opposite... But there would be no less information, so it would be no easier to compress. And a 0 dBFS 16-bit WAV file (actually −32,768 to +32,767) doesn't contain just the maximum peaks... It's sampled at multiple points along the waveform, so it contains values in between. And even if you have a pure sine wave, each cycle will be sampled at different points, so every "identical cycle" contains different sampled data. If you're not following that, the Audacity website has a gentle introduction to how digital audio works.
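To make the "no less information" point concrete, here is a small sketch. It assumes one possible reading of the "opposite direction" idea (keep the sign, store 32767 minus the magnitude, so full scale becomes 0 and silence becomes 32767); the names `relabel` and `unlabel` are hypothetical. Because the relabelling is one-to-one, the symbol statistics, and therefore the empirical entropy, do not change.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Empirical zero-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def relabel(sample):
    """Hypothetical 'opposite' code: sign flag plus (32767 - magnitude)."""
    return (sample < 0, 32767 - abs(sample))


def unlabel(code):
    """Inverse of relabel, showing that nothing was lost."""
    negative, flipped = code
    magnitude = 32767 - flipped
    return -magnitude if negative else magnitude


RATE = 44100
# A loud 997 Hz sine: the peaks are near full scale, but the samples
# land all over the waveform, so many different values occur.
loud = [round(30000 * math.sin(2 * math.pi * 997 * n / RATE))
        for n in range(RATE)]
codes = [relabel(s) for s in loud]

h_plain = entropy_bits(loud)
h_flipped = entropy_bits(codes)
distinct = len(set(loud))
print(distinct, h_plain, h_flipped)
```

The two entropy figures match, and the count of distinct values shows that even a "loud" signal is mostly made of in-between values rather than peaks.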

Re: Digital compression: Louder music = Larger file?

Reply #2 – 2018-12-05 08:13:42

If we're looking at lossless compression, it makes sense to consider the data you're compressing as something that is just data.
The more structured similarities that data has, the easier it is to compress.

In terms of digital signals, you have a specific structure and type of data, so having a compression algorithm tailored to that will likely give you better results than using a general-purpose compression algorithm like Huffman, for instance.

In terms of lossy compression, each compression algorithm allows for a certain degradation compared to the original data, such that the result is close to the original, but not an exact duplicate. How much degradation is to be allowed, depends on the application and the data, as it only makes sense in terms of perceived quality.

Regular JPEG compression works quite well with photos (pictures taken with a camera), while it doesn't work too well with hard edges, like the ones you get when you put text on a plain background.

Similar things apply to audio as well. In terms of entropy, the more "random" the signal you're encoding is, the harder it is to compress, in many cases. One lossy solution is to let the decoder generate the noise itself, using a noise generator, so that the encoder only has to parametrize that source, etc.
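The noise-generator idea can be sketched like this. It is a toy, with invented function names and a single RMS level as the only parameter; real codecs that do something along these lines (AAC's perceptual noise substitution is one example) work per frequency band and are far more careful about when to apply it.

```python
import math
import random


def encode_noise_block(block):
    """Encoder side: keep only the RMS level of a noise-like block."""
    return math.sqrt(sum(x * x for x in block) / len(block))


def decode_noise_block(rms, length, seed=1):
    """Decoder side: generate fresh noise and scale it to the stored level."""
    rng = random.Random(seed)
    fresh = [rng.gauss(0.0, 1.0) for _ in range(length)]
    fresh_rms = math.sqrt(sum(x * x for x in fresh) / length)
    return [x * rms / fresh_rms for x in fresh]


# A made-up noise-like block standing in for e.g. tape hiss.
source = random.Random(0)
original = [source.gauss(0.0, 500.0) for _ in range(1024)]

level = encode_noise_block(original)               # 1 number, not 1024
rebuilt = decode_noise_block(level, len(original))

print(level, encode_noise_block(rebuilt))
```

The rebuilt block has the same level as the original but none of the same sample values, which is exactly the trade being made: the result is not a duplicate, only something intended to sound alike.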

Re: Digital compression: Louder music = Larger file?

Reply #3 – 2018-12-05 10:00:43

Quote from: Foobar3030 on 2018-12-04 16:55:49

I've noticed that songs which have been dynamically compressed (or which simply avail of more of the dynamic range of a format) will encode to larger files, irrespective of whether those files are lossy or lossless.

Often the case, yes.

From your questions, I take it you mean that a signal that occupies bits 2...16 (with bit 1 being constantly 0) gives a smaller file than one where "the first bit is moved to the sixteenth position". You propose to reverse the order, so that bit 1 becomes bit 16, bit 2 swaps with bit 15, etc., but simply moving everything up by peak-normalizing is closer to your original observation that volume seems to matter.

Given that: it should not matter to a good algorithm, and it is most likely not the explanation. My educated guess is that it rather works as follows:
- Take a 24-bit digital signal, which is supposed to end up being a 16-bit signal delivered to customers.
- Master #1: Dither down to 16 bits. Up to about 15 bits of accuracy it is the same as the original. You discard 8 bits, which uncompressed would be about 1/3 of the size. In reality, the saving is likely more (because the discarded bits are more like noise).
- Master #2: Dynamically compressed. Think the following way, although it is completely wrong: the 24 bits have a dynamic range of about 144 dB, but you want to instead squeeze it into 96 dB. You cannot completely do that, but "you could try your best" ... and get as little of that "1/3" reduction as you can. Of course, that will harm the dynamics of the music ("loudness war").

In reality, DR limiting does not work this way, but there is still something to the inaccurate picture I tried to draw. For example:
Say you have an 1812 Overture with the cannon shot being 12 dB louder than everything else. So you can limit that cannon shot down by 6 dB. That leaves room for moving everything 1 bit up. That includes 1 bit more from the original 24 - possibly minus something lost from the cannon shot. But that was a very short part of the track and takes up very little (FLAC'ed) space. What you can put into it, is a bit that takes a lot of (FLAC'ed) space.
This is the "loudness war". And yet I have totally omitted whether that limiting introduces additional distortion that is within the next 15 bits ...

Quote from: Foobar3030 on 2018-12-04 16:55:49

I suspect I'm missing something obvious here.

Yes. You are missing (1) that a good algorithm can get rid of the top zero just as easily as the bottom zero, (2) that if you move everything upwards, you are replacing that easily compressible zero with some hard-to-compress bit at the bottom, and (3) all the shit that the loudness war has done to music ;-)
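Point (2) can be checked numerically with a rough sketch. The assumptions are loud ones: the "master" is a made-up tone plus low-level noise, a previous-sample predictor stands in for FLAC's linear prediction, and the empirical entropy of the prediction error stands in for the Rice-coded size. None of this is FLAC's actual code.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Empirical zero-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def residual_cost(master, gain):
    """Quantise at the given gain, predict each sample as the previous
    one, and return the entropy of the prediction error (bits/sample)."""
    pcm = [round(gain * x) for x in master]
    residual = [b - a for a, b in zip(pcm, pcm[1:])]
    return entropy_bits(residual)


RATE = 44100
rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical high-resolution master: a 440 Hz tone plus low-level
# noise, in units of one 16-bit step at gain 1.
master = [1000 * math.sin(2 * math.pi * 440 * n / RATE) + rng.gauss(0.0, 20.0)
          for n in range(20000)]

quiet = residual_cost(master, 1.0)
loud = residual_cost(master, 2.0)  # +6 dB: everything moved up one bit
extra = loud - quiet
print(quiet, loud, extra)
```

The difference should come out close to one bit per sample: the bit gained at the top by moving everything up is paid for with a noise-like, poorly compressible bit at the bottom.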
