32 bit linear to 32 bit float

2004-01-06 01:40:32

Is it possible to make a lossless transcode of a 32 bit linear .aiff file to 32 bit float? I understand that 32 bit float is a superior format to 32 bit linear, so a lossless transcode might be possible.

Reply #1 – 2004-01-06 01:50:46

Nope, a 32-bit float can't represent all the values a 32-bit int can.

Reply #2 – 2004-01-06 04:58:22

While floats allow for enormous ranges, they're just approximations (e.g. 1.999999 instead of 2 or something like that), so any time you convert a number from an int to a float it's generally not lossless (unless it happens to land on specific values, in which case it might be).

Reply #3 – 2004-01-06 05:21:34

If the integer itself is small, the conversion should be lossless.
For example, an int of +/-5 can be converted losslessly into float32. Only very big integer values would be lossy.

Reply #4 – 2004-01-06 05:56:22

More specifically: IEEE 32-bit floating point uses a 23-bit mantissa, but thanks to some clever math you actually have 24 bits of precision. So any unsigned number less than ~16 million (or ~8 million signed) should be represented losslessly. That's -48db from full scale though... virtually anything above that will truncate. You really should stick with 32-bit integer representation unless it's getting in the way of something.

Quote

While floats allow for enormous ranges, they're just approximations (e.g. 1.999999 instead of 2 or something like that), so any time you convert a number from an int to a float it's generally not lossless (unless it happens to land on specific values, in which case it might be).

Approximations like that come into play when a calculation gets truncated by the mantissa length (ie 1/3 = 0.33333333 and is truncated, and 3/3 = 1, but 3*(1/3)=0.9999999), they don't matter with floating point representations of integers. Unless of course they do have more precision than the mantissa allows, in which case they're rounded. The number 2, requiring exactly two bits of mantissa precision in IEEE 754 (IIRC), is not one of those numbers.
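
The distinction can be seen in a few lines. This is a minimal sketch in Python, assuming that a round trip through the standard `struct` module's 4-byte `f` format is a fair stand-in for storing a value as an IEEE 754 single; `to_float32` is a helper name made up for this illustration.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

two = to_float32(2.0)
third = to_float32(1.0 / 3.0)

print(two == 2.0)              # a small integer is stored exactly
print(third == 1.0 / 3.0)      # 1/3 has no finite binary expansion, so it is rounded
print(abs(third - 1.0 / 3.0))  # size of that rounding error
```

The integer 2 survives untouched, while 1/3 picks up a small rounding error because its binary expansion does not fit in the mantissa.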

Reply #5 – 2004-01-06 06:16:51

So would 32 bit linear to 24 bit linear be similar to 32 bit linear to 32 bit floating point?

The software I need to use the files with doesn't support 32 bit linear.

32 bit float or 24 bit integer?

Reply #6 – 2004-01-06 09:58:58

Quote

So would 32 bit linear to 24 bit linear be similar to 32 bit linear to 32 bit floating point?

Only if the values used stay under 16777216 (unsigned) or between -8388608 and +8388608 (signed).

Reply #7 – 2004-01-06 11:05:47

32-bit float has more than enough precision for almost anything I can imagine.

tacitus10, yes - 32-bit float (as used in, say, Cool Edit / Audition) can maintain all the accuracy of a 24-bit int, so converting your 32-bit int to 32-bit float will give the same results (in terms of loss - you will lose the last 8 bits) as converting to 24-bit int.

What's more, for sample values where the top bits are zero (i.e. quieter moments, smaller amplitude samples) the 32-bit float will store greater precision than 24-bit int, so converting 32-bit int to 32-bit float is better than converting 32-bit int to 24-bit int.
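
A rough way to compare the two routes is sketched below in Python. It assumes the 24-bit conversion simply drops the 8 least significant bits (real software may round or dither instead) and emulates float32 storage with the standard `struct` module; `via_int24`, `via_float32` and the two sample values are made up for illustration.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def via_int24(sample: int) -> int:
    """Drop the 8 least significant bits, then scale back up for comparison."""
    return (sample >> 8) << 8

def via_float32(sample: int) -> int:
    """Store the 32-bit integer sample in a float32 and read it back."""
    return int(to_float32(float(sample)))

quiet = 0x00012345   # top bits zero: a low-level sample
loud = 0x76543211    # close to full scale

for s in (quiet, loud):
    print(hex(s), abs(s - via_int24(s)), abs(s - via_float32(s)))
```

For the quiet sample the float32 route is exact while the 24-bit route loses the low byte; for the loud sample both routes lose low-order bits.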

If you can make the conversion, I wouldn't worry too much about the loss. Depending on the application.

Cheers,
David.

Reply #8 – 2004-01-06 11:28:17

Excuse me? Let's back up to the first few responses. Wasn't the question about going from linear to float? The answers following the original question seem to be reversed.

I imagine linear will convert to float easily (i.e. a value of 2 will convert to 2.000).

Reply #9 – 2004-01-06 12:51:02

Yes, for very small values this is true. As stated, if the integer value is outside the range -8388608 to +8388608 then the conversion will be truncated.

This means that 8388609 WILL NOT be converted to 8388609.000

Reply #10 – 2004-01-06 13:55:21

Quote

Yes, for very small values this is true. As stated, if the integer value is outside the range -8388608 to +8388608 then the conversion will be truncated.

This means that 8388609 WILL NOT be converted to 8388609.000

Which is true, but you wouldn't do that - you'd clip everything! So no software designed for the purpose is going to do this.

For this reason, you'd choose to drop the least significant bits, not the most significant ones, and all audio software works this way.

(you probably know this full well - but the thread has gone in a circle so it's not clear!)

Cheers,
David.

Reply #11 – 2004-01-07 01:43:31

How does the scale factor work in 32 bit float files? I have read that 32 bit floating point has been estimated to have a 1500 db dynamic range!

Reply #12 – 2004-01-07 06:57:22

Might be true. Take the maximum value of a 32-bit float, take its log10, and then multiply by 20.0.
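
As a rough check of that arithmetic, here is a small Python sketch. The constants are the standard IEEE 754 single-precision limits written out by hand; the names `FLT_MAX`, `FLT_MIN_NORMAL` and `db` are chosen for this example only.

```python
import math

FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest finite float32
FLT_MIN_NORMAL = 2.0 ** -126                # smallest normalised float32

def db(ratio: float) -> float:
    """Express an amplitude ratio in decibels."""
    return 20.0 * math.log10(ratio)

headroom_above_one = db(FLT_MAX)               # from 1.0 up to the largest value
full_span = db(FLT_MAX / FLT_MIN_NORMAL)       # smallest normal to largest value

print(round(headroom_above_one, 1))   # roughly 770 dB
print(round(full_span, 1))            # roughly 1530 dB
```

Taking log10 of the maximum value alone gives only the part of the range above 1.0; the figure of about 1500 dB comes from the ratio of the largest to the smallest normalised value.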

Reply #13 – 2004-01-07 07:20:01

Quote

Yes, for very small values this is true. As stated, if the integer value is outside the range -8388608 to +8388608 then the conversion will be truncated.

This means that 8388609 WILL NOT be converted to 8388609.000

I think it will be.

An IEEE 754 float has a 23-bit mantissa + 1 hidden mantissa bit + 1 sign bit.
So the lossless conversion could be done in the range -2^24+1 <= x <= 2^24-1
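
This is easy to test. The sketch below, in Python, assumes that packing with the standard `struct` module's 4-byte `f` format behaves like a cast to a C `float`; `to_float32` and `survives` are hypothetical helper names.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def survives(n: int) -> bool:
    """True if the integer n comes back unchanged from a float32 round trip."""
    return int(to_float32(float(n))) == n

for n in (8388609, 2**24 - 1, 2**24, 2**24 + 1, 2**24 + 2, -(2**24) - 1):
    print(n, survives(n))
```

8388609 comes back intact, as does everything up to a magnitude of 2^24. The first integer that fails is 2^24 + 1, because above 2^24 only even integers are representable.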

Reply #14 – 2004-01-07 19:02:59

I can't comment on AIFF specifically, but in every other audio use of 32-bit linear codings I'm familiar with the int represents a fixed point number in the range -1 to +(1 - 2^-31). Converting to float is a simple cast, which at worst loses 7 bits of mantissa precision. If you were particularly conscious of quality such a truncation would be preceded by the addition of triangular dither at the lsb of the truncation, i.e. your 32-bit fixed point number becomes 24-bit dithered before casting it to float.

From an audio processing quality point of view it would be better to stick with 32-bit fixed point, use double precision fixed point (64-bit, ideally with some additional bits at the upper end to accommodate temporary overflows during multiply-accumulate ops) for all intermediate calculations, and dither at the lsb of the 32-bit values when truncating the 64-bit intermediate values back to 32-bit. Whether there is any point to that depends entirely on the quality of the original data and how much processing you are going to carry out on it. If the source and destination for the data streams are typical PC sound cards, you could truncate to 16 bits and not notice it get appreciably worse.

32-bit float is most definitely not superior (from a quality perspective) to 32-bit fixed for audio processing, but it is easier to handle.

Reply #15 – 2004-01-07 23:43:34

Quote

32-bit float is most definitely not superior (from a quality perspective) to 32-bit fixed for audio processing, but it is easier to handle.

I did some tests on a 32 bit floating point wav. I was able to get values of -288db at the lowest gain and +48 db at the highest. From a layman's perspective I cannot see how 32 bit integer could win in quality as the float has an almost limitless dynamic range (1500 db). Would not the scale factor be able to more than make up for the missing extra bits?

Thanks for the reply. Quality is very important as I am using the files for convolution purposes.

Reply #16 – 2004-01-08 01:20:43

The way I understand it, you could say in 32bit float the noise level is always 23-24bit lower than the level (sample value) of the actual signal, while in 32bit fixed point the distance between noise floor and signal depends on the signal's volume. I have no idea what's better for convolution.

Reply #17 – 2004-01-08 08:38:14

Quote

I did some tests on a 32 bit floating point wav. I was able to get values of -288db at the lowest gain and +48 db at the highest. From a layman's perspective I cannot see how 32 bit integer could win in quality as the float has an almost limitless dynamic range (1500 db). Would not the scale factor be able to more than make up for the missing extra bits?

Dynamic range doesn't mean crap. Anything above 0db is going to clip and anything below ~-100db is below the noise floor - even if you want to convert back to fixed point space (which you ultimately have to) you will at most be converting back to 32bit's -192dB. The price of all this is that compared to 32bit fixed point, your signal will also lose precision in the 0db to -48db range. Within the 32 bit FP space there are a whole lot of redundant values.
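
For reference, the dB figures quoted in this thread (-48, -144, -192) follow from the usual rule of about 6.02 dB per bit. A small Python sketch, with `bits_to_db` as a name invented for this example:

```python
import math

def bits_to_db(bits: int) -> float:
    """The ratio 2**bits expressed in dB (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2.0 ** bits)

for bits in (8, 16, 24, 32):
    print(bits, round(bits_to_db(bits), 1))
```

So 8 bits correspond to roughly 48 dB, 24 bits to roughly 144 dB, and 32 bits to roughly 192 dB.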

Reply #18 – 2004-01-08 10:14:09

Quote

Quote
I did some tests on a 32 bit floating point wav. I was able to get values of -288db at the lowest gain and +48 db at the highest. From a layman's perspective I cannot see how 32 bit integer could win in quality as the float has an almost limitless dynamic range (1500 db). Would not the scale factor be able to more than make up for the missing extra bits?

Dynamic range doesn't mean crap. Anything above 0db is going to clip and anything below ~-100db is below the noise floor - even if you want to convert back to fixed point space (which you ultimately have to) you will at most be converting back to 32bit's -192dB. The price of all this is that compared to 32bit fixed point, your signal will also lose precision in the 0db to -48db range.

With 32-bit float (not taking advantage of the number space above 0dB FS) the noise is (at most) 144dB down from digital full scale, and usually even lower. I think we are (were?) talking simply about storing the data, not processing - but either way, how is there any problem 0 to -48dB?

Cheers,
David.

Reply #19 – 2004-01-08 11:28:46

Quote

Quote
32-bit float is most definitely not superior (from a quality perspective) to 32-bit fixed for audio processing, but it is easier to handle.

I did some tests on a 32 bit floating point wav. I was able to get values of -288db at the lowest gain and +48 db at the highest. From a layman's perspective I cannot see how 32 bit integer could win in quality as the float has an almost limitless dynamic range (1500 db). Would not the scale factor be able to more than make up for the missing extra bits?

Thanks for the reply. Quality is very important as I am using the files for convolution purposes.

What you need to think about is what happens when you do additions. Floating point is all very well whilst things are being multiplied, but as soon as two floating point numbers get added they have to be scaled to the exponent of the larger number. The upshot of this is that the real precision of your floating point representation is down to mantissa precision, meaning 24 bits for a 32-bit float. The theoretical dynamic range allowed by a floating point representation is, as noted so tactfully by tangent, irrelevant.
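
The effect of that exponent alignment can be shown with a tiny Python sketch. It emulates float32 arithmetic by rounding through the standard `struct` module; `to_float32` and `add32` are helper names made up here, and the two operands are arbitrary.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def add32(a: float, b: float) -> float:
    """Add two values and round the sum to float32, as a float32 adder would."""
    return to_float32(to_float32(a) + to_float32(b))

big = 1.0
small = 2.0 ** -25   # about 150 dB below big

print(to_float32(small) == small)   # representable on its own
print(add32(big, small) == big)     # but it vanishes in the sum
```

The small value is stored exactly when it stands alone, yet it disappears completely once it is added to a value at full scale, because it falls below the last mantissa bit of the larger operand.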

In audio processing systems that have 24-bit input and output data streams, it is not sufficient to carry out processing in a format that has 24-bit precision, such as 32-bit floats. Certain types of processing can have very high noise gains that could leave you with less than 16 bits of artefact-free signal in a 24-bit resolution system. 32-bit fixed point provides an additional 8 bits of precision, on top of which most DSPs used in Audio Processing have double-width accumulators with additional overflow bits to maximise their precision in large sum-of-products calculations like convolution - typically the accumulator result would be 80 bits, which gets dithered back to 32 bits at the end of the calculation.

However, if your original source data is only 16 bits anyway (e.g. if it comes from CD) you can get away with a lot of processing even using 32-bit floats as long as you dither back to 16 bits at the output.

If you have genuine 24-bit input data and are not working with an Audio DSP or equivalent fixed-point library, cast your 32-bit fixed point source data to doubles rather than floats and do the convolution in double precision, then dither the results back to whatever precision your ultimate destination uses - e.g. 24 bits if feeding a 24-bit DAC - for your final output stream.
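
A minimal sketch of that workflow in Python, where the built-in float is already an IEEE 754 double. It assumes the 32-bit samples are fractions of 2^31, uses a plain direct-form convolution, and applies triangular (TPDF) dither of +/-1 LSB before rounding to 24 bits; the function names, the sample values and the two-tap kernel are all invented for illustration.

```python
import random

def convolve(signal, kernel):
    """Direct-form convolution; every product and sum is done in double precision."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def dither_to_int(x, bits=24, rng=random):
    """Quantise x in [-1, 1) to a signed integer using TPDF dither of +/-1 LSB."""
    scale = 2 ** (bits - 1)
    tpdf = rng.random() - rng.random()       # triangular distribution on (-1, 1)
    q = round(x * scale + tpdf)
    return max(-scale, min(scale - 1, q))    # clamp to the legal integer range

# 32-bit fixed-point samples, read as fractions of full scale (exact in a double).
samples_int32 = [1 << 30, -(1 << 29), 123456789]
signal = [s / 2 ** 31 for s in samples_int32]
kernel = [0.5, 0.25]                         # hypothetical impulse response

result = [dither_to_int(y) for y in convolve(signal, kernel)]
print(result)
```

A real convolver would use an FFT-based method for long impulse responses; the point here is only the order of operations: widen to double, process, then dither once at the very end.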

Reply #20 – 2004-01-08 14:27:15

Quote

With 32-bit float (not taking advantage of the number space above 0dB FS) the noise is (at most) 144dB down from digital full scale, and usually even lower. I think we are (were?) talking simply about storing the data, not processing - but either way, how is there any problem 0 to -48dB?

Quantisation noise depends on the exponent in use, which depends on the signal level. Between 0 and -48db, you have 24 bits of data plus an exponent telling you to left-shift 8 bits, so you lose the 8 LSBs when converting from 32 bit fixed to 32 bit float.

Reply #21 – 2004-01-08 17:07:46

Quote

Quote
With 32-bit float (not taking advantage of the number space above 0dB FS) the noise is (at most) 144dB down from digital full scale, and usually even lower. I think we are (were?) talking simply about storing the data, not processing - but either way, how is there any problem 0 to -48dB?

Quantisation noise depends on the exponent in use, which depends on the signal level. Between 0 and -48db, you have 24 bits of data plus an exponent telling you to left-shift 8 bits, so you lose the 8 LSBs when converting from 32 bit fixed to 32 bit float.

Well, yes, we've been through that. So before any processing, that leaves 0 to -144dB perfectly intact. So by 0-48dB being wrecked do you actually mean -144 to -192dB?

btw JohnM, I see the point about processing, though I'd assumed 32-bit float for intermediate storage, and whatever was appropriate for actual processing. It makes me wonder what CEP does though - a nasty IIR filter at 32-bit float processing could be less than ideal, but most things should be more than good enough. (whatever that means!)

Cheers,
David.

Reply #22 – 2004-01-08 17:52:03

On the original data conversion point, IEEE754 floats are implicitly normalised (i.e. the hidden msb of the mantissa is always 1 - that's how it gives 24 bit precision using 23 bits). Converting from 32-bit fixed (31 bits + sign bit) to 32-bit float (24-bit mantissa, one hidden, plus mantissa sign bit, plus 8 bit exponent which is the actual exponent value offset by 127) loses the bottom 7 bits of the input data for sample values of magnitudes between 0.5 and 1, 6 bits for values with magnitudes between 0.25 and 0.5, 5 bits for magnitudes between 0.125 and 0.25 etc. It is not something anyone needs to worry about as a one-off conversion hit, but such repeated truncation would cause problems if that 24-bit precision were carried through into the processing (and as David mentioned in passing, an IIR filter is about the last thing you would like to see implemented using single-precision floats if you cared about the quality of your data, some architectures being orders of magnitude worse than others in that respect).
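
The bit counts above can be reproduced with a short Python sketch. It assumes int32 samples are fractions of 2^31 and emulates float32 storage with the standard `struct` module; `lost_bits`, `roundtrip` and the chosen magnitudes are illustrative only.

```python
import struct

FULL_SCALE = 2 ** 31   # assumed: int32 samples represent fractions of 2**31

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def lost_bits(sample: int) -> int:
    """Low-order bits of a 32-bit sample that do not fit in a 24-bit mantissa."""
    return max(0, abs(sample).bit_length() - 24)

def roundtrip(sample: int) -> int:
    """Convert a fixed-point sample to float32 in [-1, 1) and back to an integer."""
    return int(to_float32(sample / FULL_SCALE) * FULL_SCALE)

for magnitude in (0.75, 0.3, 0.2, 0.001):
    sample = int(magnitude * FULL_SCALE) | 1   # set the lowest bit so there is something to lose
    print(magnitude, lost_bits(sample), abs(sample - roundtrip(sample)))
```

Samples with magnitudes between 0.5 and 1 lose 7 bits, those between 0.25 and 0.5 lose 6, and anything below 2^-7 of full scale passes through unchanged.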

Reply #23 – 2004-01-16 14:39:49

Quote

Yes, for very small values this is true. As stated, if the integer value is outside the range -8388608 to +8388608 then the conversion will be truncated.

This means that 8388609 WILL NOT be converted to 8388609.000

Oh, sorry. Actually the integer range is from -16777216 to +16777215.
The 32-bit float type can represent the full range of a 25-bit signed integer.

Reply #24 – 2004-01-17 12:11:37

I was wondering:-

1/ How does the 8 bit exponent work in 32 bit float to give a dynamic range of 1500 dB? Could it be likened to stacking several 24 bit (25 bit?) files on top of each other to create a continuous resolution which reaches 1500 dB?

2/ What is the level where 32 bit floats typically clip? I have had files at +64 dB that normalize to 0 dB with no problems.
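
On the second question, a float32 sample above 1.0 (0 dB full scale) is just an ordinary number to the format, so boosting and then normalising need not clip. A small Python sketch, emulating float32 storage with the standard `struct` module; `to_float32`, `db_to_gain` and the sample value are made up for this example.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

sample = 0.5
boosted = to_float32(sample * db_to_gain(64.0))      # stored well above 1.0
restored = to_float32(boosted / db_to_gain(64.0))    # normalised back down

print(boosted > 1.0)
print(abs(restored - sample) / sample)   # relative error after the round trip
```

The boosted value is stored without clipping and comes back to within float32 rounding error of the original. Clipping would only occur when such a value is converted to a fixed-point format or sent to a converter without being scaled down first.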
