On Mon, 3 Apr 2000, Rob Leslie wrote:
> I don't have a good grasp of why this is yet. I have some ideas but not much more than empirical evidence at the moment.
> Anyone else have an opinion?
If we suppose that each instruction takes one cycle, a fixed-point multiply needs a multiplication and a shift, whereas a floating-point multiply needs no shift. Also, CPUs with floating-point support usually have well-optimized, fast FP instructions.
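To make the extra shift concrete, here is a minimal sketch of a fixed-point multiply next to the floating-point one. The 28 fractional bits and the helper names (`to_q28`, `from_q28`, `q28_mul`) are assumptions chosen for illustration, not code taken from any decoder, and Python integers do not model a 32-bit register or overflow.

```python
FRAC_BITS = 28  # assumed number of fractional bits, for illustration only


def to_q28(x):
    """Convert a real value to a fixed-point integer (hypothetical helper)."""
    return int(round(x * (1 << FRAC_BITS)))


def from_q28(f):
    """Convert a fixed-point integer back to a float (hypothetical helper)."""
    return f / (1 << FRAC_BITS)


def q28_mul(a, b):
    """Fixed-point multiply: one integer multiplication plus one shift.

    The raw product carries 2 * FRAC_BITS fractional bits, so it has to be
    shifted right by FRAC_BITS to get back to the working format.
    """
    return (a * b) >> FRAC_BITS


def float_mul(a, b):
    """Floating-point multiply: a single operation, no rescaling shift."""
    return a * b


print(from_q28(q28_mul(to_q28(1.5), to_q28(-0.75))))
print(float_mul(1.5, -0.75))
```

On real hardware the fixed-point product also needs a double-width intermediate (or a high-word multiply), which this sketch hides behind Python's unbounded integers.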
> I have had some thoughts about using a 16-bit integer representation for the fixed-point operations (rather than 32) which risks losing some audio quality but might run a lot faster under many CPUs, particularly those with 2x16 or 4x16-bit SIMD instructions. It seems this should be possible without losing quality (for 16-bit PCM output anyway) but the hard part is dealing with a lot of the scaling that goes on with numbers outside the (-1.0, 1.0) range.
I tried it with splay, but you lose quality anyway. It's awfully audible: the cumulative rounding error becomes too large.
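As a rough illustration of how the rounding error accumulates (this is a toy model, not splay's code), the sketch below sums 512 products with each product truncated to a given number of fractional bits and compares the result against a floating-point reference. The input data, the helper names (`quantize`, `dot_fixed`), and the choice of 15 versus 28 fractional bits are all assumptions; only fractional precision is modeled, not the 16-bit register width or overflow.

```python
import math


def quantize(x, frac_bits):
    """Round a real value to an integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))


def dot_fixed(xs, ys, frac_bits):
    """Sum of products, truncating every product back to frac_bits."""
    acc = 0
    for x, y in zip(xs, ys):
        product = quantize(x, frac_bits) * quantize(y, frac_bits)
        acc += product >> frac_bits  # the shift discards the low bits each time
    return acc / (1 << frac_bits)


# Made-up data kept inside (-1.0, 1.0); not taken from any real decoder.
samples = [0.9 * math.sin(0.37 * k) for k in range(512)]
coeffs = [0.9 * math.cos(0.11 * k) for k in range(512)]

reference = sum(x * y for x, y in zip(samples, coeffs))
err_15 = abs(dot_fixed(samples, coeffs, 15) - reference)
err_28 = abs(dot_fixed(samples, coeffs, 28) - reference)

print("error with 15 fractional bits:", err_15)
print("error with 28 fractional bits:", err_28)
```

Each truncation loses up to one least-significant bit, and those losses all point the same way, so the total error grows with the number of terms. With fewer fractional bits each lost bit is worth far more.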
Nicolas