I made another significant improvement to MAD under ARM: by using 64-bit multiply/accumulate instead of scaling after each multiply during synthesis, both performance and accuracy are increased.
Performance has increased so much, in fact, that it is now faster than using SSO.
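To illustrate the difference, here is a minimal sketch in portable C. It is not MAD's actual synthesis code: the names fixed_t, FRACBITS, dot_scale_each and dot_mla64 are hypothetical, and the 28 fractional bits are an assumption made for the example. The first function scales every product back to fixed-point before adding it; the second keeps the full 64-bit products in an accumulator and scales once at the end.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 32-bit fixed-point with 28 fractional bits. */
#define FRACBITS 28

typedef int32_t fixed_t;   /* hypothetical name, for illustration only */

/* Scale after each multiply: every term is shifted (and truncated)
 * before it is added, so the truncation error grows with n. */
static fixed_t dot_scale_each(const fixed_t *a, const fixed_t *b, int n)
{
    fixed_t sum = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        sum += (fixed_t)(((int64_t)a[i] * b[i]) >> FRACBITS);
    return sum;
}

/* 64-bit multiply/accumulate: the full products are summed in a 64-bit
 * accumulator and the result is shifted (truncated) only once.
 * Note: >> on a negative value is implementation-defined in C; this
 * sketch assumes the usual arithmetic shift. */
static fixed_t dot_mla64(const fixed_t *a, const fixed_t *b, int n)
{
    int64_t acc = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        acc += (int64_t)a[i] * b[i];
    return (fixed_t)(acc >> FRACBITS);
}
```

With four terms of 1 * (1 << 27), the per-term version truncates each product to zero, while the accumulating version keeps the low bits until the single final shift. It also needs only one shift for the whole sum instead of one per term, which is presumably where the speed gain comes from.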
I decided it was best to change the way MAD is configured, since individual options can now have unpredictable effects on performance and accuracy... instead of specifying SSO, ASO etc. explicitly, there are now three general options:
  (the default)       produce optimum accuracy and speed
  --enable-speed      optimize for additional speed over accuracy
  --enable-accuracy   optimize for additional accuracy over speed
The idea is to specify how to compromise between the two goals. The default is not to make any compromises, and deliver good accuracy with reasonable performance. Configuring with --enable-speed now automatically enables SSO, if that will make things faster. Likewise --enable-accuracy gets better accuracy wherever possible, at the expense of performance.
It turns out this doesn't make any difference now for FPM_ARM, but it does for FPM_64BIT and FPM_INTEL.
Andre's ARM code is now always enabled for ARM hosts, since it always improves both speed and accuracy; it must be explicitly disabled if not wanted.
Here are some updated statistics I measured with the new 0.11.3b:
(StrongARM 1100 220 MHz, accuracy and %CPU speed)
                          Layer I          Layer II         Layer III
FPM_64BIT default         6.198e-8/F 20%   6.198e-8/F 20%   6.931e-8/F 29%
FPM_64BIT OPT_SPEED       8.482e-6/F 17%   1.144e-5/L 18%   1.008e-5/L 26%
FPM_64BIT OPT_ACCURACY    4.715e-8/F 25%   4.941e-8/F 24%   5.375e-8/F 33%
FPM_APPROX *              7.880e-5/L 17%   6.627e-5/L 17%   6.431e-5/L 25%
FPM_ARM *                 4.667e-8/F 16%   4.906e-8/F 16%   5.338e-8/F 24%
/F means full compliance, /L means limited accuracy. Smaller numbers are better in all cases.
Using FPM_ARM is obviously a big win. :-)
By these numbers, MAD is now tied with Xaudio for Layer III decoding speed! Layer I and Layer II have been faster than Xaudio for some time, and in any case, MAD has always been much more accurate.
Get the new release here:
ftp://ftp.mars.org/pub/mpeg/
Cheers,
-rob