Quote Originally Posted by datenwolf View Post
Well, technically it is, and the RAID checksumming code actually makes use of the FPU registers, which are what the fuss is all about.

An often-found misconception is that floating point equals precision. What floating point gives you is a fixed number of significant (binary) digits, the mantissa, and a value scale defined by the exponent. A floating point number never gives you more precision than the mantissa can hold. The mantissa is usually normalized, which means its value is interpreted as lying in the range 0…1, with the most significant bit shifted to the upper boundary. In addition you have a sign bit, which gives you an effective mantissa value range of -1…1, quantized by the integer range given by the mantissa bits. As long as the effective value of a floating point number stays within the range -1…1, the scaling exponent is <=0 and what you actually have is the equivalent of an integer with the same number of bits as the mantissa (divided by the integer value range).
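To make the layout concrete, here is a small sketch of my own (not from the original post) that pulls an IEEE754 single apart into its sign, exponent and mantissa fields:

Code:
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float f = 0.15625f;              /* exactly representable: 1.25 * 2^-3 */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* reinterpret the float's bit pattern */
    uint32_t sign     = bits >> 31;
    uint32_t biasexp  = (bits >> 23) & 0xFF;   /* exponent, biased by 127 */
    uint32_t mantissa = bits & 0x7FFFFF;       /* 23 stored bits + 1 implicit */
    printf("sign=%u exponent=%d mantissa=0x%06X\n",
           sign, (int)biasexp - 127, mantissa);
    return 0;
}

No matter how the exponent scales the value, those 24 mantissa bits are all the significant digits you ever get.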

Of course with small absolute values, and hence a negative exponent, the number resolution increases. But you don't benefit from this, because you're dealing with a quantized signal and eventually the signal goes into a DAC, which by principle cannot resolve less than one bit. DACs are inherently fixed point; that's the nature of their circuitry.

So why use floating point at all then? Because, say, the absolute value is larger than what would fit into the integer range. An integer would then either wrap or saturate (depending on the operations used). A floating point number, however, switches to an exponent >0. But this means that you must now drop bits of your mantissa: if the exponent is 1, your mantissa resolution has been halved; if the exponent is 2, it's only 1/4 of the mantissa resolution, and so on.
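A quick way to see those mantissa bits being dropped (my own example, assuming plain IEEE754 singles): once the value reaches 2^24, a 32 bit float can no longer even represent the next integer.

Code:
#include <stdio.h>

int main(void)
{
    float big = 16777216.0f;   /* 2^24: the last integer a 24-bit mantissa can step through in ones */
    float sum = big + 1.0f;    /* the +1 falls below one ULP and is rounded away */
    printf("big=%.1f  big+1=%.1f  equal=%d\n", big, sum, big == sum);
    return 0;
}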

In audio applications float is used because it allows you to mix several signals without fear of saturating or wrapping them. However, this comes at a loss of precision. It also allows you to apply a wide range of gain without thinking too much about it.

But let's look at this from an engineering standpoint. An IEEE754 32 bit floating point number has 23+1 bits of mantissa. That's the precision you get. Add to this 8 bits of exponent, allowing the value range to expand up to 2^127, which equates to some 20*log10(2^127) ≈ 765 dB of additional dynamic range (only usable if each signal were a square wave). You'll never find this situation in audio.
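For the record, that figure comes out at roughly 765 dB; a one-liner to check the arithmetic (my own check, not from the post):

Code:
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* dynamic range gained by the 8 bit exponent, maximum scale 2^127 */
    printf("%.1f dB\n", 20.0 * log10(pow(2.0, 127.0)));   /* ~764.6 dB */
    return 0;
}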

Remember why you need that additional headroom? Because when you mix together several channels they may saturate. By how much? Well, let's assume every channel makes use of its full scale; then the total value range required is that of n channels. But every bit we add doubles the value range.

Let's say we use a 32 bit integer value, but our audio signal is normalized to 24 bits. Then we have 2^8 = 256 times the value range to work with. Which means we can mix up to 256 channels of 24 bit audio without saturating and without loss of precision, because we don't subquantize our original value range. Eventually we'll have to attenuate this, but until then we don't lose information. And now think about it: you'll hardly find 256 audio sources playing at the same time. So in practice we can mix even more stuff without problems.
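As a sketch (my own hypothetical helper, not KLANG code): summing 24 bit samples into a 32 bit accumulator, where 256 * (2^23 - 1) = 2^31 - 256 still fits without overflow.

Code:
#include <stdint.h>
#include <stddef.h>

/* Hypothetical mixer: sum up to 256 channels of 24 bit samples in a
 * 32 bit accumulator. 256 * (2^23 - 1) = 2^31 - 256 still fits, so there
 * is no saturation and no loss of precision before the final attenuation. */
static int32_t mix_24bit(const int32_t *samples, size_t nchannels)
{
    int32_t acc = 0;
    for (size_t i = 0; i < nchannels; i++)
        acc += samples[i];     /* each sample is in [-2^23, 2^23 - 1] */
    return acc;                /* attenuate / dither later, when needed */
}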

And something similar applies to actually doing calculations on the stuff. What kind of calculations are these? Well, actually only one, namely resampling, which is implemented as a fast finite impulse response (FIR) filter. Wait, what, how can a filter resample, you ask? Well, I leave that as an exercise for the reader to think about. Just a hint: if you reduce the sampling frequency, what kind of cutoff must you apply beforehand to avoid high-frequency aliasing?
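For illustration only (my own sketch, not KLANG's resampler), here is a single fixed-point FIR step; a real resampler evaluates such a low-pass as a polyphase filter bank, picking a different coefficient phase for each output sample.

Code:
#include <stdint.h>
#include <stddef.h>

/* One FIR output sample in fixed point: x holds 24 bit samples in int32_t,
 * coeff holds Q1.15 filter coefficients. The 64 bit accumulator keeps the
 * full product precision until the Q15 scaling is removed at the end. */
static int32_t fir_step(const int32_t *x, const int16_t *coeff, size_t ntaps)
{
    int64_t acc = 0;
    for (size_t i = 0; i < ntaps; i++)
        acc += (int64_t)x[i] * coeff[i];
    return (int32_t)(acc >> 15);
}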

This is probably the number one key lesson taught in every numerics lecture (if not, it's a bad lecture): using floating point only makes sense if your numbers vary over a large range and are not of similar scale. In particular, never subtract floating point numbers of similar absolute value. If the numbers are all of similar value, or in a narrow range, and precision is paramount, never use floating point; use fixed point instead. If you are a pure computer science student you might actually not have learnt this, because computer science doesn't deal with such practical problems. But if you're in engineering or physics, then you must know how to work with digital numbers.
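The classic demonstration of why you never subtract floats of similar value (again my own example): only the few mantissa bits in which the two numbers differ survive.

Code:
#include <stdio.h>

int main(void)
{
    float a = 1.0000001f;   /* rounds to the nearest representable single */
    float b = 1.0000000f;
    float d = a - b;        /* exact answer would be 1e-7 */
    printf("a - b = %.9g\n", d);   /* prints ~1.19e-07: off by almost 20 percent */
    return 0;
}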

So what does this mean for KLANG? KLANG can be compiled with either 40 bits or 48 bits per sample processing, with a footroom of 8 bits (this allows for reversible attenuation of up to 48 dB, which is plenty). For the typical desktop audio user the 40 bit configuration is already total overkill: with his el-cheapo 16 bit audio hardware this gives a headroom of 16 bits and, thanks to the footroom of 8 bits, calculations can be done with a precision of 24 bits. Even with 24 bit audio samples the precision is sufficient for most tasks. If desired, 48 bit processing can be enabled, but this is more snake oil than actually audible. But: since KLANG is also meant to pass audio around between applications, that additional precision might be required.
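My own reading of those numbers as a layout sketch (assumed field placement, not KLANG source): a 16 bit sample carried in a wider word with 16 bits of headroom above it and 8 bits of footroom below it.

Code:
#include <stdint.h>

/* Assumed 40 bit processing layout, held in an int64_t:
 *   [ 16 bit headroom ][ 16 bit sample ][ 8 bit footroom ]
 */
static int64_t sample_to_processing(int16_t s)
{
    return (int64_t)s << 8;    /* footroom: up to 48 dB of attenuation stays reversible */
}

static int16_t processing_to_sample(int64_t p)
{
    return (int16_t)(p >> 8);  /* drop the footroom again on the way out */
}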
The theory seems OK for mixing sound, but can it be applied to resampling? If you upsample from 44 kHz to 48 kHz, is it feasible in fixed point?