Topic: foo_audiomd5 mismatches, APE and MP3s

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #25
It's an APE-demuxer-specific fix, and a pretty minor issue, as it's only about returning pure APE bitstream packets, not actual decoded frames.
Do you have MP3 and/or FLAC sample(s) with similar issues?

 

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #26
@Foobar3031, do you have any offending MP3s? I don't have access to those files at the moment.

FLAC is front-tagged, but ...
Look at the attachment with ID3v1. Which is not supposed to be a thing, so I am not saying it should be fixed - it is kinda dangerous to crop off the end of a file just because it looks wrong? (At least, check whether it is the end indeed.)
We know that such files do float around, because of that checkbox in EAC. (The attachment was not produced with EAC, but with a quick and dirty file concatenation.) Anyway,
ffmpeg -i short.flac -map 0:a -c copy -f md5 -
returns different values before and after removing the tags.

I did not get the same "error" by appending ID3v1 to WavPack.
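
A rough way to repeat this comparison from a script, assuming ffmpeg is on the PATH: the sketch below wraps the command quoted above and strips a trailing ID3v1 tag only after checking that the last 128 bytes really begin with "TAG". The function names are made up for illustration; the md5 muxer is assumed to print a line of the form MD5=<hex> on stdout.

```python
import subprocess


def md5_command(path):
    """Build the ffmpeg invocation from the post for an arbitrary file."""
    return ["ffmpeg", "-i", path, "-map", "0:a", "-c", "copy", "-f", "md5", "-"]


def parse_md5(stdout_text):
    """Pick the hash out of the md5 muxer's 'MD5=<hex>' line, if present."""
    for line in stdout_text.splitlines():
        if line.startswith("MD5="):
            return line[4:].strip()
    return None


def packet_md5(path):
    """Run ffmpeg and return the MD5 of the copied audio packets."""
    result = subprocess.run(md5_command(path), capture_output=True,
                            text=True, check=True)
    return parse_md5(result.stdout)


def has_id3v1(data):
    """ID3v1 is a fixed 128-byte block at the very end, starting with 'TAG'."""
    return len(data) >= 128 and data[-128:-125] == b"TAG"


def strip_id3v1(data):
    """Drop the tag only if it is verifiably the final 128 bytes."""
    return data[:-128] if has_id3v1(data) else data
```

Writing strip_id3v1(...) of a file's bytes to a copy and calling packet_md5 on both the original and the copy should show whether the demuxer is hashing the tag along with the audio.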

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #27
Apologies for the delay, chaps.

Using Porcus' suggested website, I've uploaded five problem MP3s. If you'd like to take a look, you might be able to spot whatever formatting/file structure hijinks are going on. Some info:

  • Files 1 + 2 use APEtags and have the same issue we saw with the .APE files. The MD5s mismatch from the moment they are first added, and no amount of tinkering seems to change that.
  • Files 3, 4 + 5 are more interesting: They are tagged in ID3v2.3 and 2.4, and if you generate an MD5 and let the plugin tag the files, a verification scan will produce a mismatch. However: If you make ANY alteration to the tags using Foobar's inbuilt properties panel - such as deleting the MD5 tag - the file ceases to be problematic. Any subsequent re-run of the MD5 generation and verification will produce a match.

One idea I had is that Foobar itself is correcting/reorganising/standardising problem metadata whenever ID3 tags are modified using its inbuilt tool, but that perhaps the plugin is not doing the same when it adds an AUDIOMD5 tag. Is this possible?

Relatedly: I've now moved my entire MP3 library over to ID3v2.4 and also used the 'optimize file layout' function in case that resolves anything non-standard lurking about. Redoing the MD5 tags, I've found that all files now verify without issue. It's a solution, for sure, but not to a problem we can as yet pin down.  :(

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #28
MP3 files 3, 4 and 5 have garbage appended at the end, with bytes like "LYRICSBEGIN...". What software added such garbage?

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #29
Quote
MP3 files 3, 4 and 5 have garbage appended at the end, with bytes like "LYRICSBEGIN...". What software added such garbage?

Unclear, this is how they were when I found them. Doing a bit of digging, it seems this tag may be part of the Lyrics3v2 format, an official extension of ID3v1 that "resides between the audio and the ID3 tag". That might explain why FFMPEG is interpreting it as audio data; I'm assuming changes to the main body of the ID3 tags are causing subtle changes to this lead-on region, or the padding between them?

When I remove ID3v2.3 and ID3v2.4 in MP3tag, the program suddenly recognises the presence of Lyrics3v2, but Foobar doesn't. Perhaps others here have similar MP3s in their collections?

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #30
First time I've heard about this format. It's a little strange to store the size of the tag in ASCII.
Support for detecting and ignoring such ranges could be added, now that I know what kind of "garbage" it is and how to handle it.
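
For anyone who wants to check their own files, here is a minimal sketch of such detection, based on the published Lyrics3v2 layout: the block starts with "LYRICSBEGIN" and ends with a six-digit ASCII size followed by "LYRICS200", where the size counts everything from "LYRICSBEGIN" up to (but not including) the size field. The function name is hypothetical, and the sketch assumes the block sits directly before the ID3v1 tag or at the very end of the file.

```python
def find_lyrics3v2(data):
    """Return (start, end) byte offsets of a Lyrics3v2 block, or None."""
    end = len(data)
    # Step over a trailing ID3v1 tag (128 bytes beginning with "TAG").
    if end >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    # Need room for "LYRICSBEGIN" (11) + size field (6) + "LYRICS200" (9).
    if end < 26 or data[end - 9:end] != b"LYRICS200":
        return None
    size_field = data[end - 15:end - 9]
    if not size_field.isdigit():
        return None
    # The ASCII size excludes the 6-digit field and the closing marker.
    start = end - 15 - int(size_field.decode("ascii"))
    if start < 0 or data[start:start + 11] != b"LYRICSBEGIN":
        return None
    return start, end
```

A hash of the audio could then skip the bytes in data[start:end]; whether that matches what any particular demuxer does is not something this sketch claims.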

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #31
Indeed, I've also come across a post on Reddit that suggests this lyrics extension can play havoc with Replaygain scanners -- seems they too are interpreting the region as part of the audio data.  :o

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #32
Quote
MP3 files 3, 4 and 5 have garbage appended at the end, with bytes like "LYRICSBEGIN...". What software added such garbage?
* Apparently Winamp, at least with some plugin ... https://id3.org/Lyrics3 for the first version of it.
* Here's the author's own software, MP3 Manager, at archive.org: https://web.archive.org/web/20040103085829/www.mpx.cz/mp3manager/tags.htm
* Maybe some code that is useful: https://github.com/squell/id3 . But since you can code, a chunk that is not a valid MPEG frame and of the form LYRICSBEGIN blah blah blah [LYRICSEND|LYRICS200] is enough of a giveaway?
* ExifTool can apparently read it.

Quote
Relatedly: I've now moved my entire MP3 library over to ID3v2.4 and also used the 'optimize file layout' function in case that resolves anything non-standard lurking about. Redoing the MD5 tags, I've found that all files now verify without issue. It's a solution, for sure, but not to a problem we can as yet pin down.  :(

Huh. Did you use fb2k to migrate to ID3v2.4? I migrated to v2.3, that solved the problem for me (all tags in front).

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #33
Quote
Huh. Did you use fb2k to migrate to ID3v2.4? I migrated to v2.3, that solved the problem for me (all tags in front).

Yes that's right, I used Foobar's 'MP3 tag types...' and 'optimize file layout' features, one after another. I then deleted any existing MD5s and started afresh.

I did a bit of digging and it seems that ID3v2.4 also defaults to the start of files, though can still work at the end -- by using the optimize command, it should re-locate everything towards the start. Combined with stripping out any ID3v1 (including Lyrics3, I assume) and APEtags, the MD5 tool now works flawlessly.
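
To illustrate why front-placed ID3v2 tags are easy for a reader to step over, here is a minimal sketch that computes how many bytes a leading ID3v2 tag occupies. It relies on the ID3v2 header layout (the bytes "ID3", two version bytes, a flags byte, then a four-byte syncsafe size with seven usable bits per byte); the function name is hypothetical.

```python
def id3v2_size(data):
    """Total bytes occupied by a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    major_version = data[3]
    flags = data[5]
    size_bytes = data[6:10]
    # Each size byte must have its top bit clear (syncsafe integer).
    if any(b & 0x80 for b in size_bytes):
        return 0
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    # ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    footer = 10 if (major_version == 4 and flags & 0x10) else 0
    return 10 + size + footer
```

The first MPEG frame would be expected at offset id3v2_size(data), so the tag length is declared up front rather than having to be guessed from the end of the file.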

Interestingly, two other common formats (Musepack, Wavpack) also use APEtags, but don't run into the same issues. I can only assume they have their own ways of demarcating audio data from any surrounding metadata, or perhaps that ffmpeg is better at separating it out with them than with MP3 and APE itself.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #34
Hopefully the tag parsing gets improved in all source trees of ffmpeg.

Quote
Indeed, I've also come across a post on Reddit that suggests this lyrics extension can play havoc with Replaygain scanners -- seems they too are interpreting the region as part of the audio data.  :o

That post is so weird. It doesn't say the Lyrics tags mess with scanning. It claims that the existence of a Lyrics tag makes foobar2000 ignore ReplayGain data and pretend the file has none. If that is or was true, I don't understand how it can be.

PS: I liked Lyrics tag, I had support for it in my old command line tagger "Tag" many years ago. Simple like ID3v1 but supported longer fields.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #35
I grepped through one of my backup folders and found the attached file. Because it was already tagged by fb2k at some point in time (201 ...), I tagged it again now, hence the <AUDIOMD5> tag. No issue as is, with ID3v2.3, Lyrics3v2 and ID3v1.

But a mismatch can be created as follows: In fb2k, set MP3 tag types to "anything with APE": With or without ID3v1, with or without ID3v2 - and if so, with or without override 2.3 vs 2.4.

With APEv2 tags, any new MD5 scan&tag will cause a mismatch, which is why I conjecture(d) that ffmpeg reads into tags.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #36
Yes, the MP3 demuxer will happily consume APE tags, as APE is not meant for MP3?!

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #37
I believe the APEv2 tagging spec is designed to be format-independent; it's the default not just for Monkey's Audio, but for Musepack and Wavpack.

I'd guesstimate about 10% of all MP3s I come across have been formatted with it too -- as it turns out, rather unfortunately!  :))

https://mutagen-specs.readthedocs.io/en/latest/apev2/apev2.html

Quote
An APE tag at the end of a file (strongly recommended) must have at least a footer, an APE tag in the beginning of a file (strongly unrecommended) must have at least a header.

When located at the end of an MP3 file, an APE tag should be placed after the last frame, just before the ID3v1 tag (if any).
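
Going by that placement rule, a trailing APEv2 tag can be found from its footer. The sketch below assumes the 32-byte footer layout from the linked spec ("APETAGEX", then little-endian version, tag size, item count and flags, then 8 reserved bytes), with the size covering the items plus the footer but not the optional header. The function name is hypothetical, and it only handles a tag sitting directly before ID3v1 or at the very end of the file.

```python
import struct


def find_apev2(data):
    """Return (start, end) byte offsets of a trailing APEv2 tag, or None."""
    end = len(data)
    # Step over a trailing ID3v1 tag (128 bytes beginning with "TAG").
    if end >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    if end < 32 or data[end - 32:end - 24] != b"APETAGEX":
        return None
    version, size, item_count, flags = struct.unpack(
        "<4I", data[end - 24:end - 8])
    # Bit 29 set means this block is a header, not a footer.
    if flags & (1 << 29):
        return None
    # Bit 31 set means a 32-byte header precedes the items.
    total = size + (32 if flags & (1 << 31) else 0)
    start = end - total
    if size < 32 or start < 0:
        return None
    return start, end
```

As with the other tag types, the range data[start:end] flags itself clearly enough that a hashing tool could exclude it.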

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #38
APE tags have been used in MP3 for over twenty years
IIRC, Mp3gain first used it for undo information, thinking that nothing else would touch that. But nah.

As for ffmpeg reading past the end of the audio:
Given that everybody should know there are several possible tag formats appended; when you are getting past the last MPEG frame and there are only such chunks left, is there any reason to play them at all? At worst they end up as noise at hundreds of dB.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #39
Yeah, the MP3 format is so old/ugly that it makes it hard to ignore garbage added to the end of a file.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #40
Mycroft, will your earlier fix for the .APE files feed through to ffmpeg builds? I'm not sure if librempeg is a separate project -- if it is, are there builds that can be transplanted in place of ffmpeg for the purposes of Case's MD5 plugin?

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #41
I don't know the MP3 internals very well, but isn't an MPEG frame supposed to start with eleven 1's, and if there are no such left in the file, ... why even treat it as audio? Or are there too often fake frame headers (and if applicable, CRCs)? Anyway, all four tag types flag themselves quite clearly.

FWIW, here is a bug ticket indicating it is known that ffmpeg reads Lyrics3 and "assumes these are still audio data": https://trac.ffmpeg.org/ticket/7879

Quote
Mycroft, will your earlier fix for the .APE files feed through to ffmpeg builds? I'm not sure if librempeg is a separate project -- if it is, are there builds that can be transplanted in place of ffmpeg for the purposes of Case's MD5 plugin?

I think @mycroft somewhere here at HA posted an "ffmpeg build" with librempeg?

librempeg has fixed a FLAC decoding bug that the ffmpeg developers haven't looked at - not then and not after it was fixed. Two days ago, ffmpeg still couldn't decode that file.

Re: foo_audiomd5 mismatches, APE and MP3s

Reply #42
Yes, it's still here: https://github.com/rorgoroth/mingw-cmake-env/releases/tag/latest

If you somehow push/bribe the ffmpeg devs, you could get both fixes into ffmpeg.

MP3 is a very raw, low-level format, so the eleven 1's can't help much. And the demuxer's job is harder because it cannot easily read bit by bit like a decoder.
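
To make the "eleven 1's" point concrete, here is a small sketch of a frame-header plausibility check: the 11 sync bits plus the reserved values of the version, layer, bitrate and sample-rate fields in the 4-byte MPEG audio header. The function name is hypothetical, and this is not a claim about how ffmpeg's demuxer actually works.

```python
def looks_like_frame_header(b):
    """Cheap plausibility test on the first 4 bytes of a candidate frame."""
    if len(b) < 4:
        return False
    # Eleven sync bits: all of byte 0 and the top three bits of byte 1.
    if b[0] != 0xFF or (b[1] & 0xE0) != 0xE0:
        return False
    version = (b[1] >> 3) & 0x3      # 1 is reserved
    layer = (b[1] >> 1) & 0x3        # 0 is reserved
    bitrate = (b[2] >> 4) & 0xF      # 15 is invalid
    sample_rate = (b[2] >> 2) & 0x3  # 3 is reserved
    return (version != 1 and layer != 0
            and bitrate != 0xF and sample_rate != 3)
```

A typical MPEG-1 Layer III header such as FF FB 90 00 passes, and the tag markers "TAG" and "LYRICSBEGIN" fail. So does a UTF-16 byte-order mark followed by text (FF FE 41 00), however, which shows how easily non-audio bytes can pass a header-only check.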