Rob Leslie wrote:
MAD 0.12.4b is now available.
This version incorporates a number of performance improvements for all platforms.
It also includes (untested) native fixed-point math support for the PowerPC platform, contributed by David Blythe. Since I've made some significant changes to the way the fixed-point math routines are maintained, I'd appreciate feedback from anyone using MAD on a PPC to make sure I haven't broken anything.
I did a sample-by-sample comparison against the Intel implementation and (oops) found a bug that I introduced. The add-with-carry sequence in the MLA code for the PPC was being scheduled too aggressively, and the carry bit was being lost. The attached patch fixes it by issuing addc and adde from a single asm statement.
Thanks,
David
--- ../../mad-0.12.4b/libmad/fixed.h Thu Feb 8 18:12:24 2001
+++ libmad/fixed.h Fri Feb 9 20:52:37 2001
@@ -290,12 +290,10 @@
     ({ mad_fixed64hi_t __hi;  \
        mad_fixed64lo_t __lo;  \
        MAD_F_MLX(__hi, __lo, (x), (y));  \
-       asm ("addc %0, %1, %2"  \
-            : "=r" (lo)  \
-            : "%r" (__lo), "0" (lo));  \
-       asm ("adde %0, %1, %2"  \
-            : "=r" (hi)  \
-            : "%r" (__hi), "0" (hi));  \
+       asm ("addc %0, %2, %3\t\n"  \
+            "adde %1, %4, %5"  \
+            : "=r" (lo), "=r" (hi)  \
+            : "%r" (__lo), "0" (lo), "%r" (__hi), "1" (hi));  \
     })
 
 # if defined(OPT_ACCURACY)
@@ -306,7 +304,7 @@
  * tracking the magnitude of (1 << (MAD_F_SCALEBITS - 1)) is too
  * complicated.
  *
- * The __volatile__ improve the generated code by another 5% (fewer spills
+ * The __volatile__ improves the generated code by another 5% (fewer spills
  * to memory); eventually they should be removed.
  */
 # define mad_f_scale64(hi, lo)  \
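
For anyone who wants to see the effect of the lost carry without a PPC at hand, here is a rough portable C sketch of the multiply-accumulate. It is only a model under my own assumptions: the acc64 type and both function names are hypothetical and are not part of libmad, and the "lost carry" variant simply drops the carry rather than reproducing what the compiler actually scheduled between the two instructions.

```c
#include <stdint.h>

/* Hypothetical 64-bit accumulator split into two 32-bit words,
 * standing in for the (hi, lo) pair used by the MLA macro. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} acc64;

/* Models the fixed sequence: addc on the low words, then adde on the
 * high words using the carry produced by addc. */
static void mla_with_carry(acc64 *acc, int32_t x, int32_t y)
{
    uint64_t prod = (uint64_t) ((int64_t) x * y);  /* 64-bit product */
    uint32_t plo = (uint32_t) prod;
    uint32_t phi = (uint32_t) (prod >> 32);

    uint32_t sum_lo = acc->lo + plo;      /* addc: wraps modulo 2^32 */
    uint32_t carry = sum_lo < plo;        /* 1 if the low add overflowed */

    acc->lo = sum_lo;
    acc->hi = acc->hi + phi + carry;      /* adde: consumes the carry */
}

/* Models the symptom of the bug: the carry from the low add never
 * reaches the high add (assumption: it is simply treated as zero). */
static void mla_carry_lost(acc64 *acc, int32_t x, int32_t y)
{
    uint64_t prod = (uint64_t) ((int64_t) x * y);

    acc->lo = acc->lo + (uint32_t) prod;
    acc->hi = acc->hi + (uint32_t) (prod >> 32);
}
```

Starting from hi = 0, lo = 0xFFFFFFFF and accumulating 1 * 1, the first routine carries into the high word while the second leaves the high word untouched. That kind of off-by-2^32 error is what shows up in a sample-by-sample compare.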