
FLAC worst case compression?

Given enough audio data (say, more than 10 seconds), does anyone know what the worst-case compression of FLAC is? Can it be larger than the uncompressed original? I have managed to get it close to the original size (by feeding non-audio data to FLAC).

FLAC worst case compression?

Reply #1
Can it be larger than the uncompressed original?

This is a necessary consequence of lossless compression, so the answer is yes: there are 2^N possible N-bit inputs but only 2^N - 1 bitstrings shorter than N bits, so no lossless encoder can shrink every input, and some inputs must come out larger.

FLAC worst case compression?

Reply #2
I've managed 1407 kbps for white noise at FLAC -5.

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)

FLAC worst case compression?

Reply #3
White noise should be about the worst as there's no pattern to utilize.
Here are my results with 1 minute of stereo noise (as generated by Audacity), exported as WAV, then encoded as FLAC at levels 5 and 8.


Code: [Select]
-rw-rw-r--. 1 Don Don 5298420 Feb 19 15:40 noise8.flac
-rw-rw-r--. 1 Don Don 5298420 Feb 19 15:39 noise.flac
-rw-rw-r--. 1 Don Don 5292044 Feb 19 15:38 noise.wav


Very close: the FLACs are just a few KB bigger, and identical at both levels. FLAC probably figured out it couldn't do anything useful and just stored the audio verbatim, with enough headers to indicate that.
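
A quick back-of-the-envelope check supports that reading. Here is a rough sketch in C (my own illustration; the per-frame overhead figure is an assumption and metadata blocks are ignored), showing that verbatim storage only adds a handful of bytes per frame:
Code: [Select]
#include <stdio.h>

/* Rough estimate (not taken from flac itself): if the encoder stores every
   block verbatim, the only growth over raw PCM is per-frame framing overhead
   plus a little metadata. The per-frame figure is an assumption (frame
   header, CRCs and one subframe header per channel), not a spec-exact value. */
int main(void)
{
    const long sample_rate    = 44100;
    const long seconds        = 60;
    const int  channels       = 2;
    const int  block_size     = 4096;            /* flac's default */
    const int  frame_overhead = 10 + channels;   /* approximate bytes per frame */

    long samples = sample_rate * seconds;
    long frames  = (samples + block_size - 1) / block_size;   /* ceil */

    printf("frames: %ld, framing overhead: ~%ld bytes\n",
           frames, frames * (long)frame_overhead);
    return 0;
}

For a minute of audio that works out to a few KB of overhead, which is in line with the difference between the .wav and .flac sizes above.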

FLAC worst case compression?

Reply #4
My Merzbow "noise genre" sample: 1397 kbps (2ch/16/44.1/FLAC -8)

It is ripped from a standard commercial audio CD, though the music itself is not far from actual white noise. The sample is available here:
http://www.hydrogenaudio.org/forums/index....showtopic=78476

Lossy encodings of it have extremely high Replay Gain peak values. For instance, LAME -V5 produces a peak value of 2.73:
http://www.hydrogenaudio.org/forums/index....st&p=685851

FLAC worst case compression?

Reply #5
Could be flac figured out it couldn't do anything so just stored with no compression and enough headers to indicate that.

This function from FFmpeg might be helpful in determining a worst case.
Code: [Select]
int ff_flac_get_max_frame_size(int blocksize, int ch, int bps)
{
    /* Technically, there is no limit to FLAC frame size, but an encoder
       should not write a frame that is larger than if verbatim encoding mode
       were to be used. */

    int count;

    count = 16;                  /* frame header */
    count += ch * ((7+bps+7)/8); /* subframe headers */
    if (ch == 2) {
        /* for stereo, need to account for using decorrelation */
        count += (( 2*bps+1) * blocksize + 7) / 8;
    } else {
        count += ( ch*bps    * blocksize + 7) / 8;
    }
    count += 2; /* frame footer */

    return count;
}


edit: this would be around 1457 kbps for stereo 44.1kHz 16-bit, not including the file header
edit2: also note that the frame header is variable size, increasing as the frame numbers increase. so in reality it will be slightly less than this overall.
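
For what it's worth, plugging typical CD-audio parameters into that function reproduces the figure in the first edit. A minimal, hypothetical driver (not part of FFmpeg), assuming flac's default block size of 4096:
Code: [Select]
#include <stdio.h>

int ff_flac_get_max_frame_size(int blocksize, int ch, int bps); /* the function above */

int main(void)
{
    int blocksize = 4096;   /* flac's default block size */
    int ch        = 2;
    int bps       = 16;

    int bytes = ff_flac_get_max_frame_size(blocksize, ch, bps);

    /* bits per frame divided by the frame's duration in seconds */
    double kbps = bytes * 8.0 / ((double)blocksize / 44100.0) / 1000.0;

    printf("max frame size: %d bytes -> about %.0f kbps\n", bytes, kbps);
    return 0;
}

With those numbers it prints a max frame size of 16920 bytes, which is about 1457 kbps.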

FLAC worst case compression?

Reply #6
I needed to know the worst-case compression before compressing, so I settled on 1.2x the original size. I also switched from FLAC to WavPack, as I need to handle floating-point audio, but I think the same reasoning applies (WavPack seemed better on the non-audio test files I fed it).

FLAC worst case compression?

Reply #7
If you disable VERBATIM subframes and pick random values for the LPC coefficients, the results can be hilariously poor.  I've gotten perfectly valid FLAC files to come out 10 times larger than the original file.  But under more sane circumstances, there are typically 8-12 bytes per frame header/CRC-16 and about 1 byte per subframe header.  So I suppose the worst case works out to about:

ceil(samples / block_size) * (12 + channels) + (samples * channels * bytes_per_sample)

plus a constant amount of overhead for the various metadata blocks.  Assuming I'm doing my math right.
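
In case it's useful, here is that estimate as a small C sketch (my own illustration of the formula above; it ignores the constant metadata overhead):
Code: [Select]
#include <stdio.h>

/* ceil(samples / block_size) * (12 + channels) + samples * channels * bytes_per_sample,
   ignoring the constant metadata overhead. */
static long long worst_case_bytes(long long samples, int channels,
                                  int bytes_per_sample, int block_size)
{
    long long frames = (samples + block_size - 1) / block_size;  /* ceil */
    return frames * (12 + channels)
         + samples * channels * bytes_per_sample;
}

int main(void)
{
    /* one minute of stereo 16-bit 44.1 kHz audio, block size 4096 */
    long long samples = 60LL * 44100;

    printf("raw PCM:    %lld bytes\n", samples * 2 * 2);
    printf("worst case: %lld bytes\n", worst_case_bytes(samples, 2, 2, 4096));
    return 0;
}

For a minute of CD audio that gives roughly 10,593,000 bytes versus 10,584,000 bytes of raw PCM, i.e. well under one percent of growth.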