Rob Leslie wrote:
David Blythe wrote:
Measuring on a 200MHz 405 with gcc 2.95.2, -O3, I get the following CPU utilization:
FPM_DEFAULT          20% (low accuracy)
FPM_PPC              24%
FPM_64BIT            35%
FPM_64BIT + OPT_SSO  26%
David, I'm curious: do you get better performance with -O3 than with the default set of optimization flags?
Seems like a fair question. I got in the habit of using -O3 with the 0.11.xx release before you added the extra options and never broke the habit. I gave the default options a try today and it broke the cross compiler we are using :(
powerpc-linux-gcc -DHAVE_CONFIG_H -I. -I. -I. -DFPM_PPC -Wall -g -O -fforce-mem -fforce-addr -finline-functions -fstrength-reduce -fthread-jumps -fcse-follow-jumps -fcse-skip-blocks -fexpensive-optimizations -fregmove -fschedule-insns2 -c frame.c -o frame.o
frame.c: In function `mad_frame_mute':
frame.c:496: Internal compiler error in `make_edges', at flow.c:967
Please submit a full bug report.
See URL:http://www.gnu.org/software/gcc/faq.html#bugreport for instructions.
It's the -fstrength-reduce option that seems to provoke the problem. Removing just that option and plowing ahead, I get the following CPU utilization (I wouldn't call these particularly exact, they are easily +/-2%):
FPM_DEFAULT         19%
FPM_PPC             24%
FPM_PPC+SSO         21%
FPM_PPC+ACCURACY    25%
FPM_64BIT           27%
FPM_64BIT+SSO       23%
FPM_64BIT+ACCURACY  35%
There are definite improvements with the config default options. I also took a fairly unscientific look at the static instruction counts for synth.c. For FPM_PPC, -O3 versus "config defaults" came to about 2400 versus 2000 instructions (although there wasn't really a noticeable improvement in the CPU utilization). For FPM_64BIT it was about 3800 versus 2300.

I think the FPM_64BIT performance and accuracy could also be further improved by creating MLA macros to work for FPM_64BIT (just redefine mad_fixed64hi_t to be long long and accumulate in hi, ...)
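To make the MLA idea concrete, here is a rough sketch of what I have in mind. It is not taken from the libmad source: fx_t, fx_acc_t, FX_FRACBITS and the FX_* macros are hypothetical stand-ins, and the 28 fractional bits are an assumption. The point is only that the products are summed at full 64-bit width in a single accumulator and scaled back once at the end, instead of shifting after every multiply.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not libmad's actual definitions. */
typedef int32_t fx_t;      /* fixed-point value with FX_FRACBITS fractional bits */
typedef int64_t fx_acc_t;  /* single 64-bit accumulator (the "hi" word) */

#define FX_FRACBITS 28     /* assumed number of fractional bits */

/* Start a sum: acc = x * y, kept at full 64-bit precision (no shift yet). */
#define FX_MLX(acc, x, y)  ((acc)  = (fx_acc_t)(x) * (y))
/* Accumulate: acc += x * y, still unshifted. */
#define FX_MLA(acc, x, y)  ((acc) += (fx_acc_t)(x) * (y))
/* Finish: scale the whole sum back to fx_t with one shift.
 * Assumes >> on a negative signed value is an arithmetic shift. */
#define FX_MLZ(acc)        ((fx_t)((acc) >> FX_FRACBITS))

/* Per-term multiply for comparison: every product is shifted (and
 * truncated) before it is added. */
#define FX_MUL(x, y)       ((fx_t)(((fx_acc_t)(x) * (y)) >> FX_FRACBITS))

/* Dot product using the accumulate-then-shift macros. Requires n >= 1. */
fx_t dot_mla(const fx_t *a, const fx_t *b, int n)
{
    fx_acc_t acc;
    int i;

    FX_MLX(acc, a[0], b[0]);
    for (i = 1; i < n; ++i)
        FX_MLA(acc, a[i], b[i]);
    return FX_MLZ(acc);
}

/* Same dot product with one shift per term. */
fx_t dot_mul(const fx_t *a, const fx_t *b, int n)
{
    fx_t sum = 0;
    int i;

    for (i = 0; i < n; ++i)
        sum += FX_MUL(a[i], b[i]);
    return sum;
}
```

With inputs whose individual products fall below one unit in the last place, dot_mul loses each term to truncation while dot_mla keeps their sum, which is where the accuracy gain would come from. The speed gain would come from doing one shift per sum rather than one per product. The accumulator can still overflow if enough large products are summed, so the number of terms per sum has to stay bounded.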
So is anybody else having problems with -fstrength-reduce?

david