Detecting MP3s re-encoded from a lower bitrate

Is there a way to tell if an MP3 was re-encoded from a lower bitrate MP3 to a higher one?  For example, if I took a 192kb MP3 and transcoded it to a 320kb MP3, while it'd be stupid, could one then take the 320kb MP3 and determine that it's not making "good use" of that bitrate because the source was already compromised?  I've tried taking a 128kb MP3 and transcoding it to VBR with high quality settings, figuring maybe it'd make a low bitrate VBR because the 128kb MP3 didn't have enough resolution to warrant selecting high bitrates in the VBR, but it still uses some pretty high bitrates.  I mention that because I thought a good test would be to take the MP3 and re-encode it to VBR, and whatever the median bitrate was would probably indicate whether the source was another MP3 or a CD.  But it doesn't seem that's so.

Any ideas?  I'm mostly curious because I have a large collection of old stuff and I think I re-encoded a lot of it and I don't know which I did and which I didn't.  I don't want to re-rip everything.

In that same vein, how detrimental is it to take a 320kb MP3 and convert to HQ VBR to save some space?  My new workflow is to rip to FLAC and then transcode to VBR MP3, but I have some older 320kb MP3s and I'd like to be consistent.  If it's going to hurt enough, though, I'll just leave them (supposing they really are 320kb MP3s and not some transcode that I did back when I was dumber and thought it'd help).

Thanks for any ideas/tips.

Reply #1
Quote
I've tried taking a 128kb MP3 and transcoding it to VBR with high quality settings, figuring maybe it'd make a low bitrate VBR because the 128kb MP3 didn't have enough resolution to warrant selecting high bitrates in the VBR, but it still uses some pretty high bitrates....

Encoding to MP3 does more than just "remove" information. Audio quality is not like a liquid that you can simply pour out, ending up with a partially unused container that you can replace with a smaller and more efficient one. You essentially have new data, and when you encode that at VBR, the encoder may determine that a specific part requires more bits to encode than the original CBR MP3 spent on it.

Quote
In that same vein, how detrimental is it to take a 320kb MP3 and convert to HQ VBR to save some space?

You'll have to test that yourself, with your own ears. For peace of mind, I'd just leave them. You know it's not going to get better.


Reply #2
Quote
an MP3 was re-encoded from a lower bitrate MP3 to a higher one?

Adding to what dhromed said, there is only one clue (other than your ears) that lets you guess that a file may be such a transcode: look at the spectral plot of the file. If it shows a very low low-pass cutoff, e.g. all frequencies above 16 kHz are cut, despite being encoded at a high bitrate with a recent encoder, chances are that the source was an inferior lossy format.
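
The spectral check can be roughed out in code. What follows is only a minimal sketch in Python, assuming you already have a block of decoded PCM samples as floats. The names `magnitude_spectrum`, `estimate_cutoff_hz` and `tones`, the -50 dB floor, and the naive DFT are all illustrative choices, not taken from any particular tool. Looking at a spectrogram in an audio editor is still the more practical route.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Hann-windowed DFT magnitudes for bins 0..n/2 (naive O(n^2) DFT)."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    # Precompute the n complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * j / n) for j in range(n)]
    return [abs(sum(x * twiddle[(k * i) % n] for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]

def estimate_cutoff_hz(samples, sample_rate, floor_db=-50.0):
    """Highest frequency whose magnitude is within floor_db of the peak.

    The threshold is an assumption for illustration; real material
    needs averaging over many blocks and a tuned threshold.
    """
    mags = magnitude_spectrum(samples)
    threshold = max(mags) * 10 ** (floor_db / 20)
    top_bin = max(k for k, m in enumerate(mags) if m >= threshold)
    return top_bin * sample_rate / len(samples)

def tones(freqs_hz, sample_rate=44100, n=1024):
    """Synthetic test signal: a sum of equal-amplitude sine waves."""
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(n)]

# A signal with nothing above 15.5 kHz versus one reaching 19 kHz.
print(estimate_cutoff_hz(tones([1000, 8000, 15500]), 44100))
print(estimate_cutoff_hz(tones([1000, 8000, 15500, 19000]), 44100))
```

On these synthetic tones the first estimate lands near 15.5 kHz and the second near 19 kHz. Real music is far less tidy, so treat the number as a hint, not proof of a transcode.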

Quote
In that same vein, how detrimental is it to take a 320kb MP3 and convert to HQ VBR to save some space?

I would do MP3->MP3 transcodes only if quality is not a priority, because it is very likely that you'll get a noticeable deterioration. It's not worth the few kbps you might save in this case. I only ever transcode from high bitrate MP3 to low bitrate for audiobooks in a bloated CBR bitrate (e.g. more than twice the target bitrate), where I don't mind a few artifacts.

Reply #3
Thanks guys... I'll have to try the spectral plot thing and see.  I was just surprised; I figured that if I took a low bitrate MP3 file, the resulting WAV couldn't have any more "information" that would require a higher bitrate frame.

In the same vein, how does it make its choice on bitrate size per frame?  Anyone know?

Reply #4
Quote
In the same vein, how does it make its choice on bitrate size per frame? Anyone know?
I don't know if it will answer your question, but there is quite a bit of information at mp3-converter.com, and there are some links to references on the LAME website.  Other than that, you might have to study the LAME source code.

Quote
I was just surprised; I figured that if I took a low bitrate MP3 file, the resulting WAV couldn't have any more "information" that would require a higher bitrate frame.
I don't understand that either.  Maybe it's a "safety factor"???

The decoded WAV file does actually contain more data than the MP3, but it's "fill-in" or useless, redundant information.  (E.g., a 24-bit/96 kHz file that was up-sampled from a 16-bit/44.1 kHz file contains more data than the original file, but nothing new.)
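
To put rough numbers on the data-versus-information point, here is a bit of plain arithmetic in Python. The helper name is made up for illustration, and the 128 kbps figure is just the nominal CBR rate used as an example earlier in the thread.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_quality = pcm_bitrate_bps(44_100, 16, 2)   # 1_411_200 bits/s
upsampled = pcm_bitrate_bps(96_000, 24, 2)    # 4_608_000 bits/s
mp3_128 = 128_000                             # bits/s, nominal CBR rate

print(cd_quality / mp3_128)    # the WAV carries about 11x the data of the MP3
print(upsampled / cd_quality)  # about 3.27x the data of the 16/44.1 file
```

In both cases the larger file holds more bits, yet nothing that was not already implied by the smaller one.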

Reply #5
Yes, I realize the WAV has "extra" resolution that can't be used by the MP3 since it's already lost.  This is why it really puzzles me.  Maybe you're right: if I tell LAME to create a "super high quality" VBR MP3, maybe it just decides not to use any frames below a certain point, just in case.  Maybe I'll look at the source, if for nothing else than to see whether there's a way I can look at a resulting WAV, tell what the "most packed" VBR would be, and then encode at that rate.

In fact, maybe that's another question.  Anyone know of a tool that can look at a waveform and say "the highest VBR rate that'd ever be needed in this file is x, and most of the frames would need no more than y"?  I'd be curious whether something like that gave similar results on a 128k MP3->WAV result.
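
No such tool is named in this thread. What can be done fairly easily is to read back the per-frame bitrates that an encoder actually chose for a finished VBR file, which gives the "highest" and "typical" figures after the fact. A minimal sketch in Python, assuming MPEG-1 Layer III frames only; the function names are made up for illustration, and a real tool would also have to deal with MPEG-2/2.5 frames, free-format bitrates, and header frames such as Xing/Info.

```python
import statistics

# MPEG-1 Layer III bitrate table in kbit/s (index 0 = free format, 15 = invalid).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]

def skip_id3v2(data):
    """Return the offset just past a leading ID3v2 tag, or 0 if there is none."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # Tag size is stored as four 7-bit ("synchsafe") bytes.
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        return 10 + size
    return 0

def frame_bitrates(data):
    """List the bitrate (kbit/s) of each MPEG-1 Layer III frame found in data."""
    pos = skip_id3v2(data)
    rates = []
    while pos + 4 <= len(data):
        b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
        # 11 sync bits, MPEG-1 (11), Layer III (01); the CRC bit may be 0 or 1.
        is_mpeg1_layer3 = b0 == 0xFF and (b1 & 0xFE) == 0xFA
        kbps = BITRATES_KBPS[b2 >> 4]
        hz = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
        if not is_mpeg1_layer3 or kbps is None or hz is None:
            pos += 1  # not a usable header here, keep scanning
            continue
        padding = (b2 >> 1) & 0x01
        rates.append(kbps)
        pos += 144 * kbps * 1000 // hz + padding  # jump to the next frame
    return rates

def summarize(data):
    """Return (highest, median) frame bitrate in kbit/s."""
    rates = frame_bitrates(data)
    return max(rates), statistics.median(rates)
```

Usage would be along the lines of `summarize(open("some_file.mp3", "rb").read())`, where the file name is a placeholder. As the replies in this thread explain, the resulting figures describe what the encoder decided to spend, not how much "real" resolution the source had.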

Reply #6
Quote
Yes, I realize the WAV has "extra" resolution that can't be used by the MP3 since it's already lost.  This is why it really puzzles me.  Maybe you're right, if I tell LAME to create a "super high quality" VBR MP3, maybe it just decides not to use any frames below a certain point just in case.

Well, not really. Let's see how the encoder works:
- first it transforms the samples from the time domain to the frequency domain. This is a simple linear transform, essentially lossless (although in the case of MP3, not completely lossless);
- then it tries to simplify the structure of the resulting frequency vector by applying a psychoacoustic model and reducing the precision with which some frequency components are represented (the amount of simplification is defined by the preset quality level or bitrate limit);
- then it uses this simplified structure to encode the frequency vector with a compact low-entropy code.

Now, what happens when the stream is decoded and encoded again? Indeed, the information that is lost cannot be restored. This means that the simplified frequency structure mentioned above is still there (at least to a certain extent). But the encoder does not look for it. It does not assume that the frequency vectors were already quantized once. Thus it does not look for possible quantization patterns, and it does not exploit such patterns, if any are present. Instead it applies the entire perceptual encoding cycle all over again. And from the perceptual model's point of view, the fact that the frequency vectors were previously quantized does not help much.

To exploit the fact that the stream was already encoded once, the encoder would need to use an entirely different analysis model.

 

Reply #7
So what you're saying is that a low source bitrate doesn't stop the next encode from choosing higher bitrates in an attempt at a better encode.  I could take a 32k MP3 and transcode to VBR and still it's possible during analysis the encoder would say "I could really use a 192k frame here?"  That the bitrate in the original file doesn't result in a quality loss measurable by the next encode?

Reply #8
Quote
I could take a 32k MP3 and transcode to VBR and still it's possible during analysis the encoder would say "I could really use a 192k frame here?"

The encoder doesn't know the original bitrate. It doesn't know anything at all about the original file. It just sees a set of samples, as decoded from the original MP3.

Reply #9
In addition to alexeysp’s good technical description, think of it this way: The encoder uses trickery to reduce the disk space required to store the audio, and this involves masking frequencies the user can’t (in an ideal world!) hear, but the process necessitates changing the audio and thereby adding noise, etc. The end result is a waveform that is both different from the source file and quite possibly would seem more complex to the encoder on a second pass.

As has been said, encoders are not perfect and invariably cannot determine that the waveform was compressed once and act accordingly—that is, they are not idempotent—and this is why generation loss due to transcoding is a potential problem.
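
The idempotency point can be illustrated with a toy that has nothing MP3-specific in it. The sketch below is only an analogy in Python: a plain uniform quantizer stands in for the lossy step, the step sizes 0.3 and 0.2 are arbitrary, and all names are made up. Re-quantizing on exactly the same grid changes nothing, but as soon as the second pass uses a different grid (even a finer one), error from both passes accumulates.

```python
def quantize(values, step):
    """Snap each value to the nearest multiple of step (toy lossy step)."""
    return [round(v / step) * step for v in values]

def rms_error(reference, approx):
    """Root-mean-square difference between two equally long sequences."""
    total = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return (total / len(reference)) ** 0.5

original = [i * 0.01 for i in range(-100, 101)]

first_gen = quantize(original, 0.3)    # "low quality" first encode
same_grid = quantize(first_gen, 0.3)   # second pass on the identical grid
other_grid = quantize(first_gen, 0.2)  # second pass on a finer, different grid
direct = quantize(original, 0.2)       # finer grid applied to the original

print(same_grid == first_gen)           # True: same grid, nothing changes
print(rms_error(original, direct))      # smallest error
print(rms_error(original, first_gen))   # larger
print(rms_error(original, other_grid))  # largest: loss from both passes
```

A real encoder never gets the "same grid" case in practice, for the reasons given elsewhere in this thread, so a transcode to a nominally higher quality ends up further from the original than the lossy source already was.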

Reply #10
Quote
I could take a 32k MP3 and transcode to VBR and still it's possible during analysis the encoder would say "I could really use a 192k frame here?" That the bitrate in the original file doesn't result in a quality loss measurable by the next encode?

The encoder starts with a maximum resolution for each block. It then selectively reduces the resolution of certain parts of the spectrum (according to the perceptual masking criteria) until it reaches the specified bitrate or quality limit. The encoder will not try to reduce the bitrate further than necessary, even if the frequency components already have lower "intrinsic" resolution. It simply does not perform this sort of analysis.
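
A toy version of that loop, for the bitrate-limited case only. This is a sketch in Python under heavy assumptions: the cost model in `bits_needed`, the starting step, the growth factor and the sample block are all invented for illustration, and there is no psychoacoustic model at all. It is not how LAME or any other real encoder is implemented.

```python
def bits_needed(coeffs, step):
    """Hypothetical cost model: 1 bit per coefficient plus its magnitude bits."""
    return sum(1 + abs(round(c / step)).bit_length() for c in coeffs)

def fit_to_budget(coeffs, bit_budget, start_step=0.001, growth=1.25):
    """Start at the finest step and coarsen only until the block fits the budget."""
    if bit_budget < len(coeffs):
        raise ValueError("budget must allow at least 1 bit per coefficient")
    step = start_step
    while bits_needed(coeffs, step) > bit_budget:
        step *= growth
    return step, bits_needed(coeffs, step)

# A made-up block of frequency coefficients and its coarse "first encode".
block = [0.9, -0.62, 0.41, 0.33, -0.18, 0.07, 0.05, -0.02]
coarse = [round(c / 0.25) * 0.25 for c in block]

print(bits_needed(coarse, 0.25))          # 17 bits would describe this block exactly
step, used = fit_to_budget(coarse, 64)
print(step, used)                         # 0.001 52: it fits at full resolution
```

Given a generous budget, the loop stops as soon as the block fits and spends 52 bits on data that 17 bits could describe exactly, because nothing in it checks whether the input is already coarse.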

While in VBR mode the encoder may analyze block "complexity", it uses different (psychoacoustic) criteria to define complexity. Thus the determined "complexity" does not necessarily correlate with the "intrinsic resolution" induced by the previous quantization pass.

In practice the situation is further complicated by additional factors: first, the subband separation/synthesis process is not lossless, so the frequency quantization structure is not strictly preserved in the decoded stream; second, additional delay and/or padding may be introduced during the encoding/decoding process, so the block boundaries will not match on the first and second encoding passes (this is especially likely to happen if different codec implementations are used for first and second encodings).