questions about lossless codecs


questions about lossless codecs

2021-07-01 17:37:46

Hi all,
I have 2 questions to pose about lossless codecs, mostly to indulge my curiosity:
1. How likely is it for a lossless codec to go rogue and become lossy during an encode/transcode? I take it this is unheard of unless something really really wrong happens. The only reason I'm asking is because I'm just paranoid I guess. If lossiness does happen, could I hear weird audio artifacts on playback, or would there just be errors with encoding/decoding?

2. In the past, I've been tempted to save some space by converting some of my files to a lossy format. I'm very hesitant to do this, and would rather just buy more hard drive space tbh, but for the time being I'll pretend that isn't an option and indulge my curiosity.

My plan, if I needed to settle for lossy, would be to start with something like lossyWAV, WavPack or OptimFROG DualStream, since they aren't perceptual codecs. I've read more about them than I've used them, but as far as I understand, any distortion they add would just be quantization noise, so the most challenging part is deciding where to allow that noise to happen and how much. My impression is that even if you don't use a correction file, at high enough qualities the level of added noise is minimal, bordering on undetectable without comparing to the lossless copy.

What I am wondering is this: are these codecs ensuring a certain level of integrity per quality level, or is it possible that things could really break with unusual audio signals? I'm envisioning a scenario where normal music sounds great, but my low level 24/96 recordings get super noisy after some amplification or some pitch shifting for sound design. Or that chiptunes or other unusual sounds will be really broken because the encoder doesn't know what to make of it.

Reading the OptimFROG website suggests that, so long as I pick an archival quality, I should be safe even in these circumstances, since the integrity the codec seeks to provide goes beyond perceptual. I'd of course do some testing of my own before committing to one of these, but yeah, I'm curious about your thoughts.

Re: questions about lossless codecs

Reply #1 – 2021-07-01 21:48:51

Proven lossless codecs fail to be lossless only on faulty hardware or software, that is, if something corrupts them. It's very likely that you would notice many different bizarre things happening on such systems, and it's not really possible to tell how a corrupt file will sound: it may be only partially decodable, it may have a short glitch, or a short silence, or something else. If something like that happens, the most reliable approach is to run a full file verify of the whole library (on a good system).
The hybrid modes of lossless codecs make bigger files than perceptual lossy codecs, but they are more predictable with things like chiptunes: they tend to simply add noise instead of the weird artifacts of perceptual codecs.
If you're that paranoid, go full-blown lossless...

Re: questions about lossless codecs

Reply #2 – 2021-07-02 16:21:52

Problem 1: faulty RAM is mentioned, but there are also cases where the user or some application uses the encoder wrongly. Also, if the file is already faulty, you would most often want to keep a copy you have not touched. I have "half a file" due to a crash during ripping that I didn't discover at the time; it will be re-ripped if the CD shows up again ... but until then I want everything to give errors, rather than having a FLAC that pretends everything is OK.

So here is a case where I've had problems with CDDA streams as FLAC:

* I've had the FLAC front-end eat my files: https://hydrogenaud.io/index.php?topic=99803.msg921798#msg921798 . For some reason it would not produce files with audio, but it would still delete the source file and replace the old file with something with no music in it.

None of the text below applies to the CD format, but you should take care if you go for more exotic inputs than, say, 96/24:

* If the target format cannot accommodate the audio stream, you can get lossy outcomes that could even be quite bad.
(The reason I say this does not apply to the CD format, is that I know no lossless audio codec that cannot handle this!)
Here is one case where you will be in trouble: converting a 32-bit floating-point file to FLAC. FLAC does not support floating-point, so there is really no way to fit the audio into the box. Then what happens? Some applications - including older versions of foobar2000 - would, without warning, reduce it to something that fits into a .flac, and that could even introduce clipping: https://hydrogenaud.io/index.php?topic=85943.0 .
That thread also shows that WavPack can do strange things to exceptionally malformed wav files. Indeed, if your input file is malformed, it might be that application X makes it sound OK while converter Y cannot handle the error.

* And, "odd" formats that "only" WavPack supports: You should use WavPack itself for that, rather than going by way of other applications.
- first and foremost DSD. If you try to convert using something else, you will get decode-to-pcm-and-then-encode. WavPack itself knows the DSD stream.
- but also, there are different 32-bit formats. WavPack can handle them in a way that ensures you get your original .wav back.
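To make the floating-point case above concrete, here is a minimal Python sketch of what "reducing to something that fits" can do. It is not how foobar2000 or any particular converter works; the 32767 scale factor and the helper names are assumptions picked for illustration. The point is only that float samples beyond full scale cannot survive a trip through an integer format.

```python
def float_to_pcm16(samples):
    """Convert float samples (nominal range -1.0..1.0) to 16-bit integers.

    Anything beyond full scale is clipped, which is where the loss happens.
    The 32767 scale factor is an illustrative assumption.
    """
    out = []
    for x in samples:
        scaled = round(x * 32767)
        out.append(max(-32768, min(32767, scaled)))
    return out


def pcm16_to_float(samples):
    """Map 16-bit integers back to floats using the same assumed scale."""
    return [s / 32767 for s in samples]


# The last two samples exceed full scale, which is legal in a float file.
original = [0.25, -0.5, 1.0, 1.5, -2.0]
restored = pcm16_to_float(float_to_pcm16(original))

# Indices where the round trip changed the sample by more than rounding error.
clipped = [i for i, (a, b) in enumerate(zip(original, restored))
           if abs(a - b) > 1e-4]
print("samples damaged by the round trip:", clipped)
```

In-range samples come back with only tiny rounding differences, while the over-full-scale ones come back flattened to about ±1.0, so the result is no longer the original file.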

Problem 2 is not mine, I keep lossless copies anyway.

Re: questions about lossless codecs

Reply #3 – 2021-07-03 11:28:33

I do recommend FLAC
It is lossless
Perfectly taggable including custom tags
MD5 checksum. You can always check if the file has become corrupted.
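
As a rough illustration of the MD5 point: FLAC keeps an MD5 of the unencoded audio in its STREAMINFO metadata block, so a decoder can compare what it decoded against what the encoder saw. The Python sketch below only reads that stored value from the start of a stream. It assumes the stream begins directly with the `fLaC` marker (no prepended tag), the function name is made up, and the "file" in the demo is a synthetic header rather than a real encode.

```python
import hashlib
import io


def read_streaminfo_md5(stream):
    """Return the 16-byte MD5 stored in the STREAMINFO block of a FLAC stream."""
    if stream.read(4) != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker)")
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated metadata block header")
    block_type = header[0] & 0x7F              # low 7 bits: block type
    length = int.from_bytes(header[1:4], "big")
    if block_type != 0 or length != 34:        # STREAMINFO is type 0, 34 bytes
        raise ValueError("first metadata block is not a valid STREAMINFO")
    body = stream.read(34)
    if len(body) != 34:
        raise ValueError("truncated STREAMINFO block")
    return body[18:34]                         # MD5 is the last 16 bytes


# Synthetic demo: a fake header whose MD5 field matches made-up "audio" bytes.
pcm = bytes(range(256)) * 4                    # stand-in for decoded audio
stored = hashlib.md5(pcm).digest()
header = b"fLaC" + bytes([0x80]) + (34).to_bytes(3, "big") + bytes(18) + stored

md5_in_file = read_streaminfo_md5(io.BytesIO(header))
print("intact:   ", md5_in_file == hashlib.md5(pcm).digest())

damaged = bytearray(pcm)
damaged[10] ^= 0x01                            # flip a single bit in the audio
print("corrupted:", md5_in_file == hashlib.md5(bytes(damaged)).digest())
```

In practice you would not do this by hand: the reference encoder's test mode (`flac -t`) decodes the file and performs the comparison for you.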

Re: questions about lossless codecs

Reply #4 – 2021-07-03 16:01:19

And FLAC is the most compatible too ... well with reservations for what Apple are doing. But ALAC in MP4 is inferior on pretty much every parameter except Apple support.

Then I use WavPack for what FLAC cannot handle. WavPack is less supported, slightly slower, but can compress slightly better. It also has better support for images with embedded cuesheets.

(So longer story rambled ... before I got any files that FLAC cannot handle, I used WavPack for oddballs like rips with pre-emphasis; that's some serious junk ... but if I should happen to accidentally delete the tags, I know that it is a different codec and a different profile. Though I admit I could instead have used say flac 1.2.0 or something for those, and it was kinda "yes an excuse to use WavPack because I like it". I am actually just as impressed over TAK.)
I don't really need the compatibility for those tracks: they play wrong (the pre-emph without the tags) or not at all (the floating-point etc.)

Re: questions about lossless codecs

Reply #5 – 2021-07-04 14:38:46

Wait, so the WavPack decoder applies proper decoding if the tracks have a pre-emphasis tag?

Re: questions about lossless codecs

Reply #6 – 2021-07-04 15:11:46

2 - Yes, you can experiment. IMO you're looking at an average bitrate of 400k or more. If you need much below this, I suggest a perceptual codec at a very high setting, where it is 'less' perceptual, say 256 ~ 320k. Let's say 320k. Additionally, a lossless encoding should be kept. Offline drives are easy, even if numerous.
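
For a sense of what those bitrates mean in disk space, here is a small back-of-the-envelope calculation in Python. The only inputs are the bitrates mentioned above plus uncompressed CD audio (44.1 kHz, 16-bit, stereo); the labels and the helper name are just for illustration, and real lossless files will land somewhere below the uncompressed figure depending on the music.

```python
def megabytes_per_hour(kbps):
    """Storage for one hour of audio at a given bitrate in kbit/s, in MB (10^6 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1_000_000


cd_pcm_kbps = 44100 * 16 * 2 / 1000   # uncompressed CD audio, about 1411 kbit/s

for label, kbps in [("CD PCM (uncompressed)", cd_pcm_kbps),
                    ("hybrid at ~400k", 400),
                    ("perceptual at 320k", 320)]:
    print(f"{label:22s} {kbps:7.1f} kbit/s  {megabytes_per_hour(kbps):6.1f} MB/hour")
```

So the gap between a 400k hybrid file and a 320k perceptual file is modest next to what either saves over uncompressed audio.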

With lossyWAV & DualStream a VBR 'model' decides how much noise is added and where it goes - usually into the silent bits; otherwise it's masked by louder bits. At higher settings it's buried in the noise floor. DualStream should give the highest quality per bitrate - objectively anyway. WavPack doesn't yet apply a VBR estimation, instead working like CBR. However, the noise moves in infinite steps up and down according to the signal, and this usually gives good results, especially at higher bitrates like 400k. The noise is applied to the least important bits, the noise floor etc., without the precision of lossyWAV. At bitrates of 400k and above this becomes less of an issue IMO, and at some point they converge; WavPack may even exceed lossyWAV objectively, as lossyWAV is limited to 6 dB steps while WavPack has infinite.
WavPack also uses noise shaping by default, which works pretty well. It is 'dynamic', staying in the 'shadow' of the signal. This can be overridden to off with -s0, or to a fixed value like -s0.5, which shapes the noise higher in frequency and helps with sharp transients with lots of HF content. These types of signals can pose problems. In most cases these custom overrides aren't required at all, though. Joint stereo is used as fixed, or as needed if using the -x switch. It can be turned off with -j0, but there is usually no need to bother with this.
So using a sufficient bitrate, say 400k, and high-quality switches should give you what you are after. Something like -b400hx4 is a good starting point and likely all that is needed.

DualStream offers the highest quality per bitrate compared to WavPack and lossyWAV, in addition to a quality mode. It's also slower, but you can use --mode fast --quality 5 and get exactly the same lossy file, as VBR will simply tweak the bitrate and retain the exact quality. The encoding and decoding will be faster. The downside is that playback is currently limited to PC / notebook. But it's something to consider if you're OK with that.
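
The "6 dB steps" remark can be checked with a toy experiment: each whole bit removed doubles the quantization step, and 20*log10(2) is about 6.02 dB. The Python sketch below discards a fixed number of low bits from a 16-bit test tone and measures the added noise. This is only an illustration under simple assumptions (fixed bit count for the whole signal, a 997 Hz tone at 48 kHz, made-up function names); it is not how lossyWAV, WavPack or DualStream actually decide where the noise goes.

```python
import math


def drop_lsbs(sample, bits):
    """Round an integer sample to the nearest multiple of 2**bits."""
    step = 1 << bits
    return round(sample / step) * step


def noise_db(bits, n=48000):
    """RMS level, in dB relative to 16-bit full scale, of the error added
    to a test tone when `bits` low bits are discarded."""
    err_sq = 0.0
    for i in range(n):
        s = round(20000 * math.sin(2 * math.pi * 997 * i / 48000))
        e = drop_lsbs(s, bits) - s
        err_sq += e * e
    return 20 * math.log10(math.sqrt(err_sq / n) / 32768)


levels = {bits: noise_db(bits) for bits in (4, 5, 6)}
for bits, db in levels.items():
    print(f"{bits} bits removed: noise at {db:6.1f} dB")

# Each additional bit should raise the noise by roughly 6 dB.
steps = [levels[5] - levels[4], levels[6] - levels[5]]
print("increase per bit:", [round(s, 1) for s in steps])
```

A scheme that can only remove whole bits therefore moves its noise floor in jumps of about 6 dB, which is the limitation being contrasted with WavPack's finer-grained adjustment.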

Re: questions about lossless codecs

Reply #7 – 2021-07-04 20:51:25

Quote from: itisljar on 2021-07-04 14:38:46

Wait, so the WavPack decoder applies proper decoding if the tracks have a pre-emphasis tag?

No, Porcus meant that he tagged pre-emphasis rips as such (presumably for playback with foo_deemph) and encoded only those as WavPack and everything else as FLAC, as a means to easily identify them in case all the tags in their files were erased by user error or bug.

Re: questions about lossless codecs

Reply #8 – 2021-07-19 13:34:49

Quote from: Cynic on 2021-07-04 20:51:25

as a means to easily identify them in case all the tags in their files were erased by user error or bug.

Yes, this. With "or bug" in parentheses, as the user (myself) is the biggest risk.

Re: questions about lossless codecs

Reply #9 – 2021-07-20 15:48:46

Quote from: shadowking on 2021-07-04 15:11:46

With lossywav & Dualstream a vbr 'model' decides how much and where the noise goes [...]

Replies such as this one are more like a quick tutorial, and it is in times like these I wish there were some sort of bookmarking in here. As that isn't the case, this goes straight into my Pocket feed - properly tagged - so I can refer to it later on.

Thanks (again) SK.
I'm starting to sound repetitive, I know, but you've notoriously been putting a lot of effort (and time) into your testing, so thanks where it's due, IMO.
