Re: Can a file be restored to original format after compression by FLAC?
Reply #6 – 2023-02-23 09:36:09
"I need to ensure the original file is backed up in its original file format and archived."

I'm looking at this with a QA hat on, since you say this is business rather than hobby. May I presume you need verifiable traceability? I assume you have customers who would be very upset if it later turned out the data you had archived could not be restored to the original.

If so, you cannot take it on trust that any form of audio compression will necessarily recreate the original file when decompressed. Any specific example might, but that cannot be extended to the general case unless you are in possession of some kind of auditable specification which states x ≡ decompression(compression(x)) AND y ≡ compression(decompression(y)).

The problem is this: any form of audio file is a representation of the bit streams for the (potentially multiple) sound channels, and the process of creating that representation requires the original bit streams as input. In order to take the audio file and convert it to a new representation (eg MP3 to FLAC), you first have to extract the bit streams from the MP3 and then process them through FLAC. While it might be possible to de-FLAC and obtain the bit streams which originally went into the FLAC, it is not possible to recreate the original MP3 from those bit streams. In symbolic terms:

input.mp3 -> deMP3 -> input.bitstream -> FLAC -> archive.flac -> deFLAC -> output.bitstream -> MP3 -> output.mp3

Supposing you are willing to accept x ≡ deFLAC(FLAC(x)), then input.bitstream ≡ output.bitstream, and that might be sufficient to satisfy your business case. But y ≢ MP3(deMP3(y)), so output.mp3 ≢ input.mp3, and it might be difficult to satisfy your customers that your archive is valid. In the above, substitute any other encoding for MP3.

The question is "is it reversible?". WAV yes, anything else is a definite "maybe", and what certainty is there that even lossless FLAC is bit-perfect reversible?
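
To make the two identities above concrete, here is a minimal Python sketch. Everything in it is a stand-in I have made up for illustration: zlib plays the part of a lossless codec, and toy_encode/toy_decode are a hypothetical container format (a tag plus a payload), not MP3 or FLAC. It only shows one way the second identity can fail, namely when the file carries information that the decoded bit stream does not; a real lossy codec has further reasons to fail.

```python
import zlib


def decode_inverts_encode(encode, decode, x: bytes) -> bool:
    """Check x == decode(encode(x)) for this one input only."""
    return decode(encode(x)) == x


def encode_inverts_decode(encode, decode, y: bytes) -> bool:
    """Check y == encode(decode(y)) for this one encoded file only."""
    return encode(decode(y)) == y


# Stand-in for a lossless codec (assumption: zlib, not FLAC).
def lossless_encode(raw: bytes) -> bytes:
    return zlib.compress(raw, 6)


def lossless_decode(blob: bytes) -> bytes:
    return zlib.decompress(blob)


# Hypothetical toy container: one length byte, a tag, then the payload.
def toy_encode(raw: bytes, tag: bytes = b"") -> bytes:
    return bytes([len(tag)]) + tag + raw


def toy_decode(blob: bytes) -> bytes:
    tag_len = blob[0]
    return blob[1 + tag_len:]  # the tag is discarded, as a decoder would


raw = bytes(range(256)) * 4                       # pretend bit stream
original_file = toy_encode(raw, tag=b"title=demo")  # pretend input file

bitstream_survives = decode_inverts_encode(lossless_encode, lossless_decode, raw)
toy_payload_survives = decode_inverts_encode(toy_encode, toy_decode, raw)
file_survives = encode_inverts_decode(toy_encode, toy_decode, original_file)

print(bitstream_survives, toy_payload_survives, file_survives)
```

With these stand-ins the first two checks pass and the third fails: the payload comes back intact, but re-encoding it does not reproduce the original file because the tag was never part of the bit stream. Note also that each check covers a single input; passing it says nothing about the general case.
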
I'm not talking about "sounds identical"; I'm talking about being able to run a binary compare on two files and verify they are identical. Only if you show that, for each case, output.mp3 ≡ input.mp3 can you keep archive.flac and throw away input.mp3. Even then you would also have to archive the reconstruction process: you could not update the software and assume it still worked the same.

There is no point taking all this risk for an audio file which is already compressed (eg MP3). The storage space gain is too minor to be worth it. What you could do is ZIP it, which has the benefit of preserving file properties (eg the original file name) in an auditable package (and ZIP is demonstrably reversible).

On the other hand, there are significant storage space gains to be made for uncompressed audio. As discussed in previous posts, there may be better alternatives than FLAC... but whatever you use you will want to obtain some kind of documentary evidence that your archival process does recreate the original file. And then ZIP it.
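
As a rough illustration of the kind of evidence I mean, the Python sketch below zips a file, reads it back out of the archive, and confirms byte-for-byte identity by comparing SHA-256 digests. The function names (sha256_of, zip_and_verify, demo) and the file names are my own inventions for the example, and the "input.mp3" it creates is just placeholder bytes. It reads the restored copy into memory in one go, so treat it as a starting point rather than a finished archival tool.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def zip_and_verify(source: Path, archive: Path) -> str:
    """Zip `source`, restore it from the archive, and compare digests.

    Returns the SHA-256 of the original so it can be logged as evidence.
    Raises ValueError if the restored bytes differ.
    """
    expected = sha256_of(source)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=source.name)  # keeps the original file name
    with zipfile.ZipFile(archive) as zf:
        restored = zf.read(source.name)
    if hashlib.sha256(restored).hexdigest() != expected:
        raise ValueError(f"{source.name}: restored bytes differ from the original")
    return expected


def demo() -> bool:
    """Run the check on a throwaway file with placeholder content."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "input.mp3"
        source.write_bytes(bytes(range(256)) * 64)  # not real MP3 data
        digest = zip_and_verify(source, Path(tmp) / "input.zip")
        return digest == sha256_of(source)


if __name__ == "__main__":
    print(demo())
```

Keeping the returned digest alongside the archive gives you something to audit against later, including after a software update.
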