Then the idea of an encoding algorithm came into my head: just try to keep the difference between the signals at or below a level defined by the user. Thus, the audio quality is simply measured by the volume of the difference between the signals, and this difference is nothing but the distortion produced by the encoder.
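That loop can be sketched in a few lines. This is only a toy, with uniform requantization standing in for a real lossy codec and an invented integer "quality ladder":

```python
import numpy as np

def fake_codec(x, bits):
    # stand-in for a real lossy codec: uniform requantization to `bits` bits
    step = 2.0 / (2 ** bits)
    return np.round(x / step) * step

def encode_to_target(x, target_db):
    # try progressively higher "quality" until the residual (the signal
    # difference) drops to the user-defined level or below
    for bits in range(4, 17):
        err = x - fake_codec(x, bits)
        err_db = 10 * np.log10(np.mean(err ** 2) / np.mean(x ** 2) + 1e-30)
        if err_db <= target_db:
            return bits, err_db
    return bits, err_db

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 50_000)                  # stand-in for audio samples
bits, err_db = encode_to_target(x, target_db=-45.0)
```

With a -45 dB target this settles on the lowest quality setting whose residual stays under the threshold; a real encoder would do the same search over its own quality knob.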
Of course, this algorithm is much slower than a direct encode.
The problem with this approach is that audibility has very little to do with absolute volume, due to masking. An error at -20 dBFS might be inaudible if it's masked, and an error at -40 dBFS very audible if it's not. So what codecs usually do when they get to step 4 is compute masking thresholds and weight the error by how audible it will be. In this case you will likely find that huge error signals are often highly tolerable, while small error signals often are not.
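The comparison codecs actually make is a noise-to-mask ratio, not an absolute level check. A toy illustration (every number below is made up; real masking thresholds come from a psychoacoustic model, per frequency band and per frame):

```python
# Toy noise-to-mask ratio (NMR): the error is judged against a per-band
# masking threshold, not against absolute volume.
error_db = {"low": -20, "mid": -40}   # coding-error level per band, dBFS
mask_db  = {"low": -15, "mid": -60}   # hypothetical masking thresholds, dBFS

audible = {}
for band in error_db:
    nmr = error_db[band] - mask_db[band]   # > 0 dB: error pokes above the mask
    audible[band] = nmr > 0
```

Here the louder error (-20 dBFS) ends up masked while the quieter one (-40 dBFS) is audible, which is exactly the point about absolute volume being a poor proxy.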
I've run many tests of lossy encoders and realized that they cannot satisfy me in this role. WavPack hybrid is much better, but it is not flexible, and you never know which distortions you'll get on the output.
Quote: "just try to keep the difference between the signals at or below a level defined by the user. Thus, the audio quality is simply measured by the volume of the difference between the signals, and this difference is nothing but the distortion produced by the encoder."

All that is needed is for some audio developers to get interested in this idea and implement it as a computer program.
Quote: "The problem with this approach is that audibility has very little to do with the absolute volume due to masking."
However (and IIRC), WavPack Lossy does not use a psychoacoustic model, so this might loosely apply.
Delay one sound by a few milliseconds. This will make no audible difference, but when you subtract the signals you will get a huge difference file.
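This is easy to check numerically. A pure tone stands in for the audio here, and the 2 ms delay is arbitrary (a circular shift is used as a simple stand-in for a true delay):

```python
import numpy as np

sr = 44_100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)        # one second of a 440 Hz tone
y = np.roll(x, int(0.002 * sr))        # same tone, shifted by ~2 ms

# difference power relative to signal power, in dB
diff_db = 10 * np.log10(np.mean((x - y) ** 2) / np.mean(x ** 2))
```

For this tone the difference lands only a few dB below the signal itself, enormous next to a transparency target like -45 dB, even though the two files sound identical.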
If I understand you correctly, audio developers already implemented this idea four decades ago.
Well, I do not see any software that uses a maximum allowed audio error level as an input parameter.
But who can guarantee that this masking will work, and that the difference will not be audible on every input signal? The whole idea is not about audibility; it is about using the minimum bitrate for maximum mathematical closeness of the output audio to the input audio, just pure calculation, which seems to be the only guarantee here.
LossyWAV is commonly discussed here and I lamented not including it shortly after posting.
Please show me a lossy algorithm with no psychoacoustic model that beats one with a psychoacoustic model, where the metric is how low you can go in average bitrate while achieving transparency or near-transparency on non-contrived test samples.
You just need to reduce the bit depth of the audio signal by an amount equivalent to the difference (= noise) you're willing to accept. You'll get 6 dB more noise per extra bit dropped. Lower bit depth means a lower bitrate when losslessly encoded. So, use any audio editor that allows you to change the bit depth, then use almost any lossless codec on the result: job done.
For a smarter way of doing it, take a look at lossyWAV.
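The 6 dB-per-bit rule is easy to verify with a toy requantizer (white noise stands in for audio; the helper names are made up):

```python
import numpy as np

def requantize(x, bits):
    # uniform quantizer over [-1, 1): 2**bits levels
    step = 2.0 / (2 ** bits)
    return np.round(x / step) * step

def noise_db(x, bits):
    return 10 * np.log10(np.mean((x - requantize(x, bits)) ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100_000)             # stand-in for an audio signal
delta = noise_db(x, 15) - noise_db(x, 16)   # noise added by dropping one bit
```

`delta` comes out at roughly 6 dB, matching the rule of thumb (theoretically 20·log10(2) ≈ 6.02 dB per bit).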
Who can guarantee that this "maximum mathematical closeness" will work?
Also, please don't insult our intelligence by suggesting that we must try all possible input signals before rejecting the assertion that this idea will do better than established practice, which is built on well-established knowledge, when you have not even offered any evidence supporting your concept.
If this isn't about audibility then I completely fail to see the point.
You can change the maximum allowed error level simply by altering the number of bits allocated per sample in an uncompressed context. With an appropriate codec, you can even use fractional numbers of bits per sample. Then you can compress the result losslessly for a further reduction in file size.
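One way to read "fractional bits per sample" is a quantizer whose level count is not a power of two, so the effective bit depth log2(levels) can land between integers. A toy sketch (level counts and helper names invented for illustration):

```python
import math
import numpy as np

def quantize_levels(x, levels):
    # uniform quantizer over [-1, 1) with an arbitrary number of levels
    step = 2.0 / levels
    return np.round(x / step) * step

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 50_000)

results = {}
for levels in (256, 362, 512):        # 8.0, ~8.5, and 9.0 effective bits
    y = quantize_levels(x, levels)
    err_db = 10 * np.log10(np.mean((x - y) ** 2) / np.mean(x ** 2))
    results[round(math.log2(levels), 2)] = err_db
```

Each extra half-bit buys roughly 3 dB of error reduction, consistent with the 6 dB-per-bit rule.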
so I think he's interested in highly compressed audio, whereas lossyWAV is going to be about 2x that bitrate for good results.
We use psychoacoustic models precisely because we don't have an exact mathematical description of the auditory system; otherwise lossy compression would be deterministic and "just pure calculation" (well, more or less... still more complex than sums and subtractions, anyway).
And it makes perfect sense: if you didn't consider the input level, a quiet signal would sound worse after your coding than a loud but otherwise identical signal.
Quote from: saratoga on 07 March, 2013, 02:25:49 PM
"so I think hes interested in highly compressed audio, whereas lossy wav is going to be about 2x that bitrate for good results."

No, the encoder can use as much bitrate as it needs for the maximum allowed signal difference. As a substitute for lossless, I would accept a difference of approximately -45 dB or lower, if it were efficient enough.
The point is reducing file size without any audible loss of quality on all possible inputs, with a 100% guarantee. That means there will be no more killer samples at all. Every user will choose his own allowed-distortion level, depending on the sensitivity of his ears, and he will know exactly what he gets.
I do not know exactly how it will work, but I want to try it, because established practice does not work well enough. All we do is blindly play with bitrates, believing that we get some quality out of them, and when we find one more killer sample, we realize it was just a belief.
* And this sentiment takes us back to your previous ideas about VBR encoding, wherein you were also effectively demanding that people create an encoder that can guarantee transparency to everyone at a single setting. That wasn’t viable, either.