Rob Leslie wrote:
> FWIW, here's a patch to use the multiply/accumulate macros in the Layer III IMDCT.
Cool! I would like to request one change. The lowest-level MLA inline functions take pointers to memory locations, which isn't the most efficient form for the PowerPC implementation. I can shave roughly another 5 percent off the decoding time if I let the compiler treat the accumulator halves as registers rather than memory locations.
What I would like to do is have the code in layer3.c and synth.c call something like:

    mad_f_mla(hi, lo, x, y)

which I can use directly, then add:

    #define mad_f_mla(hi, lo, x, y)  mad_f_amla(&(hi), &(lo), x, y)

and rename the processor-accelerated inlines from "mad_f_mla" to "mad_f_amla".
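To make the proposed layering concrete, here is a rough, portable sketch. The type definitions and the body of mad_f_amla are assumptions for illustration only (a plain 64-bit accumulate split into hi/lo halves); the real processor-accelerated inlines would differ per architecture. Only the names mad_f_mla and mad_f_amla and the shape of the #define come from the proposal above.

```c
#include <stdint.h>

/* Assumed types for illustration; the real definitions may differ. */
typedef int32_t  mad_fixed_t;      /* fixed-point sample */
typedef int32_t  mad_fixed64hi_t;  /* high word of the 64-bit accumulator */
typedef uint32_t mad_fixed64lo_t;  /* low word of the 64-bit accumulator */

/*
 * Pointer-based ("address") form: stands in for the existing
 * processor-accelerated inline, renamed to mad_f_amla.
 * Hypothetical portable body: (hi:lo) += x * y as a 64-bit accumulate.
 */
static inline void mad_f_amla(mad_fixed64hi_t *hi, mad_fixed64lo_t *lo,
                              mad_fixed_t x, mad_fixed_t y)
{
    /* Reassemble the accumulator from its two halves. */
    uint64_t acc = ((uint64_t)(uint32_t)*hi << 32) | *lo;

    /* Full 64-bit product, added with wrap-around unsigned arithmetic. */
    acc += (uint64_t)((int64_t)x * (int64_t)y);

    /* Split the result back into high and low words. */
    *hi = (mad_fixed64hi_t)(uint32_t)(acc >> 32);
    *lo = (mad_fixed64lo_t)acc;
}

/*
 * Value-style form used by the callers. A port that wants hi and lo
 * kept in registers can supply its own mad_f_mla and skip this default.
 */
#ifndef mad_f_mla
#define mad_f_mla(hi, lo, x, y)  mad_f_amla(&(hi), &(lo), (x), (y))
#endif
```

With this arrangement the decoder code only ever names the accumulator variables, for example mad_f_mla(hi, lo, x, y) with hi and lo declared as locals, so a platform-specific mad_f_mla never has to take their addresses.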
Does this make sense? Sorry to be a pain about this.
-- Dan