Does lossy encoding always produce the same result?

I read on the forum that if I run the same file through a lossy encoder twice, it will produce bit-identical files. Is this true? I thought the algorithm decides what to keep and what to throw out, and that this could differ each time the encoder is run. What about the container it's stored in (e.g. AAC in M4A)? Isn't that going to be different each time too, giving non-identical CRCs? And with VBR, won't the output vary each time the encoder is run? Any input is appreciated; I'm just curious.

Reply #1
Are you asking (as I answered in the original thread) whether LAME(A) = B the first time and LAME(A) = B the second time,
OR are you asking whether LAME(A) = B the first time and LAME(B) = B the second time?

The first is true, but the second is not.

Reply #2
I don't see why any encoder developer would _intentionally_ make the codec non-deterministic.  This would greatly frustrate testing, since it's useful to know that a change which shouldn't have changed the output didn't actually change the output.

Some formats do have some randomness in them. For example, the Ogg container has a random serial number (which also affects the CRCs), but the codec output itself should be the same for identical code running on the same kind of system, with identical input. I would not be inclined to trust software that didn't work this way.
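
To make the container point concrete, here is a rough sketch. It assumes a made-up toy container (a random 4-byte serial followed by the payload), not the real Ogg layout, and `encode_payload` is only a hypothetical stand-in for a deterministic codec.

```python
import os
import zlib

def encode_payload(samples):
    """Hypothetical stand-in for a deterministic codec: same input, same bytes."""
    return bytes((s >> 4) & 0xFF for s in samples)

def wrap_in_container(payload):
    """Toy container (NOT real Ogg): a random 4-byte serial, then the payload."""
    serial = os.urandom(4)
    return serial + payload

samples = list(range(0, 4096, 7))
file_a = wrap_in_container(encode_payload(samples))
file_b = wrap_in_container(encode_payload(samples))

# Whole-file checksums will almost always differ because of the random serial.
print("whole file equal:", zlib.crc32(file_a) == zlib.crc32(file_b))

# The codec payload after the 4-byte header is identical in both files.
print("payload equal:", zlib.crc32(file_a[4:]) == zlib.crc32(file_b[4:]))
```

So a whole-file CRC mismatch on its own does not show that the codec behaved differently; comparing the decoded audio or the payload is the fairer test.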


Reply #3
Anyone interested in testing this? We could upload some mp3, then three or four of us encode it with LAME or something else and compare the CRCs.


Reply #5
Quote
Does lossy encoding always produce the same result?, was: "Encoding lossy leads to?"


No, it does not. Lossless-to-lossy encoding removes some inaudible sounds (which also means degrading the quality), while lossy-to-lossy encoding removes even more, losing further quality.

Reply #6
Read before replying.

The OP is asking whether encoding the same file on two different occasions will produce different results. Yes if one uses different executables, no if one uses the same executables. (Different compiles of the same version can produce minute differences; this is normal and nothing to be concerned about.)

Why would the same executable produce different results every time? Mathematics is deterministic, unless intentionally made otherwise. And why would someone do that?

Reply #7
I could see this happening if an encoder and/or decoder used a pseudo-random noise generator (for dithering or whatever) and seeded the generator from the system clock, but I don't know why you'd want to do that.
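
A quick sketch of that scenario, for illustration only. `quantize_with_dither` is a hypothetical helper, not taken from any real encoder or decoder; it just shows how the seed decides whether two runs match.

```python
import random
import time

def quantize_with_dither(samples, seed):
    """Add uniform dither in [-0.5, 0.5] to each sample, then round to an integer."""
    rng = random.Random(seed)
    return [round(s + rng.uniform(-0.5, 0.5)) for s in samples]

samples = [0.25 * n for n in range(32)]

# Fixed seed: the noise sequence repeats, so every run gives the same output.
fixed_1 = quantize_with_dither(samples, seed=1234)
fixed_2 = quantize_with_dither(samples, seed=1234)
print(fixed_1 == fixed_2)  # True

# Seed taken from the system clock: the output may change from run to run.
clock_seeded = quantize_with_dither(samples, seed=time.time_ns())
print(clock_seeded == fixed_1)
```

With a fixed seed the "random" noise is just another deterministic part of the program, which is why seeding from the clock would be the only way to get run-to-run differences here.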

Reply #8
Quote
Why would the same executable produce different results every time? Mathematics is deterministic, unless intentionally made otherwise. And why would someone do that?

Perhaps if the executable uses a rand() function somewhere, for some reason. (I don't know if any lossy encoder does, but it might well, e.g. for dithering.) Even then, the differences shouldn't be detectable in the audible output (at least, not by human listeners).

Reply #9
I just tested this with Sound Converter on Linux and it did produce the same CRCs. Interesting, thanks.

PS: I just wanted expert answers, greynol; that's why I started the thread. But yeah, thanks everyone, be good, cheerio!
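
For anyone who wants to repeat the CRC check by hand, something along these lines should do. It is only a sketch: the two files written here are stand-ins with made-up names, and in a real test you would point `file_crc32` at the outputs of two separate encoder runs.

```python
import os
import tempfile
import zlib

def file_crc32(path, chunk_size=1 << 16):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

# Stand-in data; replace these two files with real encoder outputs.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "run1.mp3")
    second = os.path.join(tmp, "run2.mp3")
    for path in (first, second):
        with open(path, "wb") as f:
            f.write(b"stand-in for encoder output" * 1000)
    crc_first = file_crc32(first)
    crc_second = file_crc32(second)

print(f"{crc_first:08x} {crc_second:08x} identical={crc_first == crc_second}")
```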

Reply #10
Quote
Perhaps if the executable uses a rand() function somewhere, for some reason. (I don't know if any lossy encoder does, but it might well, e.g. for dithering.) Even then, the differences shouldn't be detectable in the audible output (at least, not by human listeners).

Encoders don't dither. You're thinking of decoders.

Reply #11
I doubt that decoders generate random outputs either. All the differences I've seen between different decoders have been attributed to rounding at the LSB. I've never seen a decoder that delivers inconsistent output when decoding the same file more than once. Perhaps someone can provide an example to the contrary.

Reply #12

Quote
Anyone interested in testing this? Uploading some mp3 and then three or four of us encode it with LAME or something else and check the CRC?


You'll get different results if several people encode on different machines; I qualified my earlier statement with "same kind of system" for a reason. Since most lossy codecs use floating point, permitted differences in compiler behavior can produce slightly different results from some computations. This is a separate matter from run-to-run determinism.
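
One way to see why compiler choices matter: floating-point addition is not associative, so two builds that evaluate the same sum in a different order can legitimately disagree in the last bit. A tiny illustration with ordinary Python doubles (not taken from any codec):

```python
# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False

# Each ordering is still fully repeatable on its own.
print(left_to_right == (0.1 + 0.2) + 0.3)  # True
```

Each ordering gives the same answer every time it is run; the difference only appears between builds that order the operations differently.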

Reply #13
Quote
I doubt that decoders generate random outputs either. All the differences I've seen between different decoders have been attributed to rounding at the LSB. I've never seen a decoder that delivers inconsistent output when decoding the same file more than once. Perhaps someone can provide an example to the contrary.

this -  http://www.hydrogenaudio.org/forums/index....st&p=746382  ?


Reply #15
I agree with the deterministic software comments--in theory there's no reason why you should get a different audio result (beyond "housekeeping" issues not related to the audio data). Unless there's some randomness intentionally built into the encoder, you should always get the same result on the same PC with the exact same "input" file, with the same encoder software, operating with the same settings.

But if you change any of those variables, all bets are off. Different run time libraries, CPUs, operating systems, etc. could have a subtle impact on the floating point math.

And, it's probably obvious, but if you rip a CD twice with lossy encoding, you may get a different result if your rips are not 100% bit-accurate. If the start of the rip, for example, shifts by just one sample, the output files may not be identical. Red Book audio CDs don't have the same data integrity as data CDs. So, to test this, you should use the same lossless file and not CD audio media.

Reply #16
The "randomness" of floating point is generally overemphasized.

There are just a few differences:

x87 float (keeps intermediate results in 80-bit registers and is quite slow)
SSE float (computes in the operand's own precision, 32-bit for float and 64-bit for double, and is fast)

(This matters only when doing several operations on data held in CPU registers before storing it back to memory.)
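
The effect of wider intermediates can be imitated in Python. This is only an emulation under stated assumptions: 32-bit versus 64-bit stands in for the x87/SSE difference, and `to_float32` is a helper written for this sketch, not a library function.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

tenth = to_float32(0.1)

# Narrow intermediates: round to 32 bits after every single addition.
narrow = 0.0
for _ in range(10):
    narrow = to_float32(narrow + tenth)

# Wide intermediates: keep 64 bits throughout, round to 32 bits once at the end.
wide = 0.0
for _ in range(10):
    wide += tenth
wide = to_float32(wide)

print(narrow)          # 1.0000001192092896
print(wide)            # 1.0
print(narrow == wide)  # False
```

Same inputs, same operations, but the point at which results get rounded changes the final stored value, which is the kind of difference a compiler switch can introduce.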


Then, there is the rounding method:

truncate toward zero (-0.6 → 0, 0.6 → 0) (faster)
round to nearest (-0.6 → -1, 0.6 → 1) (slower)

(A rounding error can accumulate when a variable is updated repeatedly. That's when the developer has to decide whether using double is a better idea.)
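
The two rounding methods above, shown with Python's built-ins as a small illustration (the sample values are arbitrary):

```python
import math

values = [-0.6, 0.6, 1.7, -2.4]

# Truncation simply drops the fractional part (rounds toward zero).
truncated = [math.trunc(v) for v in values]

# round() picks the nearest integer (ties go to the even neighbour in Python).
nearest = [round(v) for v in values]

print(truncated)  # [0, 0, 1, -2]
print(nearest)    # [-1, 1, 2, -2]

# The per-value error is at most 0.5 for nearest, but can approach 1 for truncation.
print(max(abs(v - t) for v, t in zip(values, truncated)))
print(max(abs(v - n) for v, n in zip(values, nearest)))
```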


Finally, there are denormals ( http://en.wikipedia.org/wiki/Denormal_number ) and the action taken on them:

keep them (really slow on Pentium 4s, slow everywhere else)
flush them to zero
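
To show what the two denormal policies mean in practice, here is a small sketch. `flush_to_zero` is a hypothetical helper that imitates in software what a CPU's flush-to-zero mode does in hardware; Python itself keeps denormals.

```python
import math
import sys

# Smallest positive *normal* double; anything smaller in magnitude is a denormal.
smallest_normal = sys.float_info.min
denormal = smallest_normal / 2

def flush_to_zero(x):
    """Imitate flush-to-zero: replace denormals with a zero of the same sign."""
    if x != 0.0 and abs(x) < smallest_normal:
        return math.copysign(0.0, x)
    return x

print(denormal > 0.0)                  # True: the denormal is kept by default
print(flush_to_zero(denormal) == 0.0)  # True: flushed away
print(flush_to_zero(1.0) == 1.0)       # True: normal values are untouched
```

Two builds that differ only in this setting can diverge whenever a computation passes through such tiny values, for example in a filter tail decaying toward silence.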



Generally, all these options are defined either by the language or by the options set when compiling. In C and C++, converting a float to an integer truncates toward zero, while arithmetic results are rounded according to the current rounding mode, which defaults to round-to-nearest on IEEE-754 hardware. A compiler may use SSE floats if compiling for a supported processor and the operation can be done that way.



There are hardware differences between implementations, but they have to conform to the standard (IEEE-754), so they should not output a different result than the one expected.


Reply #18
Depends on the format.  WMA for instance does have a random number generator involved in the decoder, however most decoders seed it with the same value every time, so you'll get deterministic output (but of course you're free to not do this if you don't mind having random output).