Topic: Reasonable handling of non-compliant source files in lossless audio compressors - Topic derived from FLAC-git Releases (Co...

Re: Reasonable handling of non-compliant source files in lossless audio compressors

Reply #50
Quote
On to the wrong.wav, after having looked at it ... I don't know the WAVE specification, but https://learn.microsoft.com/en-us/windows/win32/api/mmeapi/ns-mmeapi-waveformat doesn't say anything about which information takes precedence in case of inconsistencies. Here we have nChannels=1, nBlockAlign=4 and wBitsPerSample=16, so if we assume for the sake of discussion that no more than one of these is wrong: is there anything that says this file should be interpreted as
* mono with 4 bytes per sample, i.e. wBitsPerSample is wrong and should be replaced by 32?
* 4 bytes per block and 16 bits per sample, i.e. nChannels is wrong and should be replaced by 2?
* mono with 16 bits per sample, i.e. nBlockAlign is wrong and should be replaced by 2?

There is a consistent interpretation (i.e., assuming no values are "wrong"), which is mono 16-bit samples in 32-bit containers. That's obviously kind of weird, but 24-bit samples in 32-bit containers are a thing, and this is how WavPack interprets it, because I initially thought it would be a good idea to handle those cases. Unfortunately, since there are no files like this and the spec kinda says that it's not a thing (and doesn't say how the sample would be justified if it were), this "feature" doesn't get exercised and doesn't work now (if it ever did).
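
The competing readings of that header can be enumerated mechanically. The following is only a sketch under assumptions: the function name and the returned labels are made up for illustration, and it is not how any of the encoders mentioned here actually decide. It takes the three header fields and lists what each "only one field is wrong" repair would give, plus the unpacked-container reading.

```python
def candidate_readings(channels: int, block_align: int, bits: int) -> dict:
    """List ways to reconcile nChannels, nBlockAlign and wBitsPerSample.

    Hypothetical helper for illustration only.
    """
    if channels <= 0 or bits <= 0:
        raise ValueError("nChannels and wBitsPerSample must be positive")
    bytes_per_sample = (bits + 7) // 8  # packed size of one sample
    if block_align == channels * bytes_per_sample:
        return {"consistent": (channels, block_align, bits)}

    # Each tuple is (nChannels, nBlockAlign, wBitsPerSample) after the repair.
    readings = {"fix nBlockAlign": (channels, channels * bytes_per_sample, bits)}
    if block_align % channels == 0:
        container_bits = 8 * block_align // channels
        readings["fix wBitsPerSample"] = (channels, block_align, container_bits)
        if container_bits > bits:
            # No field is "wrong": samples sit in larger containers.
            readings["unpacked"] = f"{bits}-bit samples in {container_bits}-bit containers"
    if block_align % bytes_per_sample == 0:
        readings["fix nChannels"] = (block_align // bytes_per_sample, block_align, bits)
    return readings


# The header values quoted above for wrong.wav.
for label, reading in candidate_readings(1, 4, 16).items():
    print(label, "->", reading)
```

For nChannels=1, nBlockAlign=4, wBitsPerSample=16 this yields all three repairs from the list plus the unpacked reading, which is the point: the header alone cannot pick a winner.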

Quote
As far as I can tell,
* flac, TAK and OptimFROG reject it
* Monkey's roundtrips it bit-exactly to the original
* WavPack is fooled by it, into unpacking it to something else
* ffmpeg thinks it is ten seconds long, so I assume it disregards nBlockAlign, although I didn't bother to listen
* foobar2000 seems to do the same thing when playing it.

I will start rejecting these as well, rather than trying to make my interpretation above work correctly.

Another reminder to always use the -v (verify) option when encoding with WavPack (it catches this one of course).   ;D


Reply #51
Quote
There is a consistent interpretation (i.e., assuming no values are "wrong"), which is mono 16-bit samples in 32-bit containers.

Oh. Yes this is old WAVE. And, *sighs*, my memory isn't the best ... This thread: https://hydrogenaud.io/index.php/topic,121447.msg1008257.html#msg1008257
Anyway, a stereo 16 in 32 from that collection attached, and it completely beats me why that works with FLAC/WavPack/OptimFROG and the mono doesn't ... rookie mistakes, moi?

Reply #52
Fixing broken WAV files with SoX
https://langdoc.github.io/2017-05-28-sox-trick.html

Code:
PS E:\download> .\sox junglede.wav junglede.flac
E:\download\sox.exe WARN wav: Premature EOF on .wav input file
PS E:\download> .\sox --ignore-length junglede.wav ignore.flac
PS E:\download> dir *.flac|select length,name

Length Name
------ ----
 44203 ignore.flac
371537 junglede.flac

The fun thing is, without using --ignore-length, the encoded flac file is even larger than the input .wav file, in both Rarewares and NetRanger's versions.
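
SoX's "Premature EOF" warning suggests that the header announces more audio than the file contains, which is the situation --ignore-length is meant for (it makes SoX read until end of file instead of trusting the header length). Assuming that is what is wrong with junglede.wav, which the thread does not confirm, a check for that kind of mismatch could look like the sketch below. The function name is hypothetical and this is not SoX's code.

```python
import struct


def data_chunk_shortfall(wav: bytes) -> int:
    """Return how many bytes the 'data' chunk claims beyond the end of the file.

    0 means the declared length fits. Illustrative helper, not part of any tool.
    """
    if wav[:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(wav):
        chunk_id, size = struct.unpack_from("<4sI", wav, pos)
        body = pos + 8
        if chunk_id == b"data":
            return max(0, body + size - len(wav))
        pos = body + size + (size & 1)  # RIFF chunks are padded to even length
    raise ValueError("no data chunk found")
```

A nonzero result means a reader that trusts the header will run off the end of the file, while a reader that ignores the declared length will simply stop at the real end.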

Reply #53
Quote
On to the wrong.wav
...
As far as I can tell,
flac, TAK and OptimFROG reject it
...

Add refalac to the list, and thanks Bryant for the attention.

Reply #54
Quote
Add refalac to the list, and thanks Bryant for the attention.

Although qaac/refalac's own WAV parser rejects wrong.wav, refalac will try libsndfile when it is available, and libsndfile happily loads it.
In the case of qaac, the file will be loaded by CoreAudio's ExtAudioFile API even when libsndfile is not available.

Reply #55
Thanks for the tips. I extracted refalac.exe from foobar's free encoder package without using other libraries, so didn't know about these details.

Reply #56
Well, it's just a shortcoming of how qaac/refalac handles input files. It just tries each handler one by one until it succeeds.
Usually it works, but occasionally behavior becomes less predictable to the user.
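
The "try each handler until one succeeds" pattern can be sketched as follows. All names here (strict_parser, lenient_parser, open_with_fallback) are hypothetical stand-ins and not qaac/refalac's real code; the sketch only shows why a file rejected by the first reader can still get through.

```python
from typing import Callable, Dict, List, Tuple

Header = Dict[str, int]


class Unsupported(Exception):
    """Raised by a handler that cannot open the input."""


def strict_parser(header: Header) -> str:
    # Hypothetical strict reader: insists on packed samples.
    expected = header["nChannels"] * ((header["wBitsPerSample"] + 7) // 8)
    if header["nBlockAlign"] != expected:
        raise Unsupported("nBlockAlign does not match nChannels * bytes per sample")
    return "packed PCM"


def lenient_parser(header: Header) -> str:
    # Hypothetical lenient reader: trusts nBlockAlign as given.
    return f"{header['nBlockAlign']} bytes per block"


def open_with_fallback(
    header: Header, handlers: List[Tuple[str, Callable[[Header], str]]]
) -> Tuple[str, str]:
    """Return (handler name, result) from the first handler that accepts the input."""
    errors = []
    for name, handler in handlers:
        try:
            return name, handler(header)
        except Unsupported as exc:
            errors.append(f"{name}: {exc}")
    raise Unsupported("; ".join(errors))
```

With both handlers installed the wrong.wav header is accepted by the lenient one; with only the strict one it is rejected, so the outcome depends on which libraries happen to be available.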

Reply #57
Quote
On to the wrong.wav
...
As far as I can tell,
flac, TAK and OptimFROG reject it
...
Add refalac to the list, and thanks Bryant for the attention.

I have fixed this in WavPack with this commit by rejecting such WAV files.

While I was fixing this I looked at other formats and see that CAF also has this redundancy and, unlike WAV, the format documentation clearly states that having unpacked samples is valid and gives a detailed example of the 24-bit sample in 32-bit container case (values are shifted "up" for both BE and LE).

I created a few sample files like this and found that WavPack handled them fine, except for not verifying that the proper bytes were zeroed. I tried other programs, and FFmpeg and Audacity (via libsndfile) both fail on these files. The only other program that handled them correctly was (no surprise) @nu774's Foobar2000 CAF component. Good job!
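
To make the "shifted up" layout concrete, here is a small sketch of a 24-bit sample stored high-aligned in a 32-bit container for both byte orders, plus a check that the padding bits are zero. The function names are invented for illustration, and this is a reading of the description above rather than code from WavPack or from the CAF documentation.

```python
import struct


def pack_24_in_32(sample: int, big_endian: bool) -> bytes:
    """Store a signed 24-bit sample high-aligned in a 32-bit word."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    word = sample << 8  # shift "up": the low 8 bits become zero padding
    return struct.pack(">i" if big_endian else "<i", word)


def padding_is_zero(data: bytes, big_endian: bool, valid_bits: int = 24) -> bool:
    """Check that the unused low bits of every 32-bit container are zero."""
    count = len(data) // 4
    order = ">" if big_endian else "<"
    words = struct.unpack(f"{order}{count}I", data[: count * 4])
    mask = (1 << (32 - valid_bits)) - 1
    return all(word & mask == 0 for word in words)


print(pack_24_in_32(0x123456, big_endian=True).hex())   # padding byte comes last
print(pack_24_in_32(0x123456, big_endian=False).hex())  # padding byte comes first
```

The value is shifted the same way in both cases; only the position of the zero byte within each 4-byte group changes with the byte order.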

If anyone is interested, I can attach them to the thread.
 

Reply #58
Quote
I have fixed this in WavPack with this commit by rejecting such WAV files.

Thanks. Cool Edit / Audition 24.0 float files use a block align of 4 bytes and 24 bits per sample in the header. Will these files be affected? Here are some sample files saved with Audition 1.5:
https://hydrogenaud.io/index.php?action=dlattach;topic=114816.0;attach=22034
[edit]Warning to innocent lurkers: Some of these files are not safe to play outside of Cool Edit / Audition, beware of loud noise!

Reply #59
Yes, thanks for reminding me of these! The 24.0 float files are indeed broken now. Will need to come up with a more complete fix.

Reply #60
Perhaps only accept type 1 files with 24 bits per sample and a block align of 4 bytes per channel when -a is being used. Otherwise signal analysis on the data chunk would be required to guess the audio encoding format, for example, whether it is really unpacked 24-bit int.

This should be relevant:
https://hydrogenaud.io/index.php/topic,121447.msg1009022.html#msg1009022

Reply #61
I have updated this again with this commit.

For now I've decided to re-allow unpacked samples in WAV (they are already valid in CAF). However, I now reject the file if any of the required padding bits are not zero, and suggest the --pre-quantize option to "fix" the files. If it's appropriate, I also suggest the Adobe Audition / Cool Edit option.

I realize that these files are probably invalid WAVs, but if the data is consistent (i.e., all the correct bytes are zero) then why not? I also prefer the behavior that if someone tries to compress an Audition 24.0 float file without the -a flag, they'll get an error message with a reminder about the option instead of generating a corrupt WavPack file (which obviously was less than ideal).
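
The decision flow described in this post can be sketched roughly as below. This is not WavPack's actual code: the function name and messages are invented, and it assumes little-endian samples that are high-aligned in their containers (the thread notes the WAV spec does not say how such samples would be justified, so that part is a guess borrowed from the CAF layout).

```python
def classify_wav_samples(bits: int, bytes_per_sample: int, audio: bytes,
                         adobe_flag: bool = False) -> str:
    """Decide how to treat samples whose container may be larger than wBitsPerSample."""
    container_bits = 8 * bytes_per_sample
    if container_bits == bits:
        return "accept: packed samples"
    if container_bits < bits:
        return "reject: container smaller than sample"

    audition_like = bits == 24 and bytes_per_sample == 4
    if adobe_flag and audition_like:
        return "accept: treat as Audition 24.0 float"

    # ASSUMPTION: little-endian, high-aligned, so padding is the low bits.
    mask = (1 << (container_bits - bits)) - 1
    for i in range(0, len(audio) - bytes_per_sample + 1, bytes_per_sample):
        if int.from_bytes(audio[i:i + bytes_per_sample], "little") & mask:
            hint = "try --pre-quantize"
            if audition_like:
                hint += " or, for Audition / Cool Edit float files, -a"
            return "reject: padding bits not zero; " + hint
    return "accept: unpacked samples"
```

Under this sketch a 16-bit-in-32-bit file only passes if half of its bytes are zero, and a 24.0 float file compressed without -a gets an error that mentions the option rather than silently producing garbage.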

Reply #62
And so, the validity in Reply 50 of that test file ... with that interpretation?


(BTW, I am not much of a fan of overmoderation, but @korth as mod: This thread has become more about peculiar WAVE files than a wishlist for one particular encoder/decoder. Or maybe that is wishful thinking on my part. Maybe it does no harm staying in this subforum.)

Reply #63
Yes, the header is valid with that interpretation. However, the audio does not have the appropriate NULL padding (which would be half the bytes) and so you get an error message, which is more informative than "unsupported WAV format".

I agree that this thread has become about lossless compressors' handling of non-conforming WAV files in general, not just FLAC's handling of one particular non-conforming file. Perhaps a rename and move to General Audio or Lossless / Other Codecs would do?

Reply #64
Quote
I have updated this again with this commit.

Thanks. The treatment looks very comprehensive, so even uninformed users will be alerted by the error messages and discouraged from creating problematic files.

Reply #65
The design goal of Monkey's Audio is to perfectly restore files to their original state.  All the header, data, and footer should be included.  I suppose you could argue whether an audio compressor really needs to do this, but it was one of my design goals.

If anyone finds an exception to this, please provide a sample and I'll fix it.

Thanks.