Idea for a Corruption-Tolerant Very-Low-Bitrate Audio Codec
2017-09-20 09:11:25

Hi, everybody.

I have been developing a (pretty useless, I know) audio encoder that stores data in images. This isn't "steganography", as it doesn't try to hide the data. It just uses images as its storage medium. I'm just doing this as a hobby. (Well, originally, I used this to send small ZIP files losslessly via Facebook, because Facebook is free in my country, I don't have paid internet, and I need to send small files. Later on, I developed an ADPCM-based lossy codec so it can also store audio lossily in images.)

Here is the latest version if you want to try it out, but please don't bother.
https://sites.google.com/site/orthographiccube/home/software/bitmapper2

(Version 3 is coming soon, with a complete rewrite of the program from C# to Java + JavaFX [it looks so gorgeous, and it works faster!]. Currently, only the lossless data storage option is implemented in version 3, but the original lossy audio codecs should be implemented soon enough.)

It works, but the dilemma is that while the program saves PNG images (call it the FLAC [or WavPack, depending on your preference] of the image formats), the audio data should be able to survive when the PNG is converted to JPEG (the MP3 of the image formats). The audio data does survive the lossy conversion, and thanks to the ADPCM-based codec it uses, the distortion to the data is kept to a minimum. Audible rumble and noise are present, but fortunately, the primitive DC offset correction in the decoder keeps this tolerable.

Anyway, the real problem is that (for example) Facebook shrinks images if they are too large when uploaded. Resizing the image destroys the file (image) header pixels and loses audio samples, making the image useless. So keeping the image small enough is the first priority.
I achieved this by adding mid-side stereo coding, which produces two separate images (the "mid" channel is playable on its own but gives mono audio), and by limiting the encoded audio to less than 4m30s at 8-bit, 16 kHz.

The problem is, well, 8-bit isn't that bad, it's just a bit of noise, but I really need to do something about the 16 kHz sampling rate limitation (not a limitation imposed by my program, but rather by the image size allowed by, for example, Facebook). So I want to store more samples while retaining the image size, or maybe create more images for one audio file. I am trying my best to avoid creating more images, since that makes the format inconvenient for distribution, and I want the images to be incrementally decodable (that is, you don't need to have ALL the images to be able to hear the whole audio: just one image should provide decent-sounding audio, and having more and more images gradually improves the audio quality).

Currently, the program encodes a sample as a kind of ADPCM sample, with 5 different possible values per sample, one sample per pixel. This is not that bad: if JPEG alters a pixel too much, a stored value of 4 could become 3, but the data is decoded in an "analog electronics" manner, so it doesn't produce corrupt audio, just a bit of a "click" like a vinyl record (it isn't that audible, really). Using only two possible values per pixel would make sure that JPEG cannot alter the value enough for a 1 to become a 0 (or vice versa), but that's like 1-bit DPCM, the variant used by the Nintendo Entertainment System, and we all know how bad that sounds.

So I have several ideas in mind. I'm thinking of using more images (resulting in 4 images, so it saves stereo audio for 0 to 8 kHz and mono for 8 kHz to 16 kHz), OR maybe there is some way I can use ADPCM with only 2 or 1 bits per sample, or maybe a completely new codec that will not result in corruption when a pixel gets modified... I dunno.
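To make the trade-off between 5 values and 2 values per pixel concrete, here is a minimal sketch of the multi-level-per-pixel idea. It is not the actual program. The evenly spaced 8-bit gray mapping, the nearest-level decoding, and all function names (`gray_levels`, `encode_pixel`, `decode_pixel`, `error_margin`) are assumptions made purely for illustration.

```python
def gray_levels(levels):
    """Evenly spaced 8-bit gray values, one per sample code (assumed mapping)."""
    return [(2 * i * 255 + (levels - 1)) // (2 * (levels - 1))
            for i in range(levels)]


def encode_pixel(code, grays):
    """Store a sample code (0 .. levels-1) as a gray value."""
    return grays[code]


def decode_pixel(gray, grays):
    """Recover the code whose gray value is closest to the (possibly altered) pixel."""
    return min(range(len(grays)), key=lambda c: abs(grays[c] - gray))


def error_margin(grays):
    """Largest pixel error that is guaranteed to decode to the same code."""
    smallest_gap = min(b - a for a, b in zip(grays, grays[1:]))
    return (smallest_gap - 1) // 2


FIVE = gray_levels(5)   # five values per pixel, as in the current codec
TWO = gray_levels(2)    # two values per pixel, the 1-bit alternative

if __name__ == "__main__":
    print("5 levels:", FIVE, "tolerates +/-", error_margin(FIVE))
    print("2 levels:", TWO, "tolerates +/-", error_margin(TWO))
    # A small JPEG-like disturbance keeps the code; a larger one slips by one step.
    print(decode_pixel(encode_pixel(4, FIVE) - 20, FIVE))
    print(decode_pixel(encode_pixel(4, FIVE) - 40, FIVE))
```

Under these assumptions, the pixel error a code can absorb shrinks as levels are added, which is the same tension described above: more values per pixel carry more information but sit closer together, so a JPEG disturbance is more likely to push a pixel into the neighbouring value.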
Does anybody have any ideas? (I'll make sure to credit you if I ever release the software to the public.)

Thanks!