David Blythe wrote:
Perhaps it depends on what compiler you are using, (I'm using gcc 2.95.2). The FPM_64BIT option.........
I created an FPM_PPC that just used C code macros that generated code equivalent to the in-line assembler you created. I sent the patch to Rob, but I don't know where it went. I also worked on the first version of the layer 3 MLA that Rob then made more generic. I avoid writing in-line assembler when the compiler can do the same job with a couple of hints.
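To illustrate the "couple of hints" idea, here is a minimal sketch of a fixed-point multiply and multiply-accumulate written in plain C. It is not the patch I sent to Rob. The names (fx_t, fx_mul, FX_MLA, fx_dot) and the choice of 28 fractional bits are assumptions made up for this example. The only hint is the cast to a 64-bit type before the multiply, which leaves the compiler free to pick the target's own widening multiply instructions.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; 28 fractional bits is an assumption. */
#define FX_FRACBITS 28

typedef int32_t fx_t;

/* Widening multiply in plain C. Casting one operand to 64 bits tells the
   compiler that the full 64-bit product is wanted.
   Note: right-shifting a negative value is implementation-defined in C;
   gcc performs an arithmetic shift, which is what this relies on. */
static inline fx_t fx_mul(fx_t x, fx_t y)
{
    return (fx_t)(((int64_t)x * y) >> FX_FRACBITS);
}

/* Multiply-accumulate into a 64-bit accumulator without rescaling. */
#define FX_MLA(acc, x, y)  ((acc) += (int64_t)(x) * (y))

/* Dot product: accumulate at full precision, rescale once at the end. */
static inline fx_t fx_dot(const fx_t *a, const fx_t *b, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        FX_MLA(acc, a[i], b[i]);
    return (fx_t)(acc >> FX_FRACBITS);
}
```

Accumulating at 64 bits and shifting once at the end saves one shift per term and avoids rounding each product separately.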
...... I didn't see any suitable 16-bit MAC instruction sequence to do what the MAD_F_MLA macro does on the 405gp,
The 405gp has about a dozen MAC instructions, though they may not map exactly onto what the MLA needs. At some point I will probably work on those. My only remark was about your comment: you didn't do anything unique for the 4xx, you just used regular PowerPC instructions. Anything written specifically for the 4xx should use the MAC instructions.
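As a rough illustration of why 16-bit MACs do not drop straight into a 32x32 multiply-accumulate, the sketch below rebuilds a signed 32x32 -> 64-bit product from four 16-bit partial products in portable C. The function name mul32_from_halfwords is made up for this example, and it says nothing about which 405gp instructions (if any) would cover each partial product; that mapping is exactly the part I have not worked out.

```c
#include <stdint.h>

/* Rebuild a signed 32x32 -> 64-bit product from 16-bit halfword products.
   Hypothetical helper for illustration; not taken from any library.
   Note: right-shifting a negative value is implementation-defined in C;
   gcc performs an arithmetic shift, which is what this relies on. */
static int64_t mul32_from_halfwords(int32_t x, int32_t y)
{
    int64_t xh = x >> 16;                   /* signed high halfword   */
    int64_t xl = (uint32_t)x & 0xFFFFu;     /* unsigned low halfword  */
    int64_t yh = y >> 16;
    int64_t yl = (uint32_t)y & 0xFFFFu;

    int64_t hh = xh * yh;                   /* signed   x signed      */
    int64_t hl = xh * yl;                   /* signed   x unsigned    */
    int64_t lh = xl * yh;                   /* unsigned x signed      */
    int64_t ll = xl * yl;                   /* unsigned x unsigned    */

    /* x*y = hh*2^32 + (hl + lh)*2^16 + ll. Multiplication is used for the
       scaling because left-shifting a negative value is undefined in C. */
    return hh * INT64_C(4294967296) + (hl + lh) * INT64_C(65536) + ll;
}
```

Four partial products with mixed signedness, plus the scaling and carries between them, are needed for every term, so a 16-bit MAC sequence only pays off if the instruction variants line up with those cases.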
You will also notice that the superscalar PowerPCs (pretty much anything but the IBM 4xx and MPC8xx) all run at about the same speed regardless of whether FPM_64BIT, FPM_PPC, or another option is used. My 400 MHz iMac (G3/750) runs between 10 and 13 percent depending upon options; my 500 MHz G4/7400 runs between 7 and 10 percent. On the other hand, the MPC8xx, where I started the modifications, will run between 60 and 100+ percent on a 66 MHz processor with EDO memory, depending upon the options chosen.
-- Dan