I've been cleaning up and optimising the imdct36() function in layer3.c (where most of the execution time was being spent), with the result that it's now over 30 percent faster when compiled for x86.
The attached .tgz file includes the changed layer3.c, plus hacked versions of madplay.c and the Makefile, which profile the imdct36() function using the Pentium timestamp counter.
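For readers unfamiliar with the technique, here is a minimal sketch of timestamp-counter profiling. It is not the code from the attached .tgz. It assumes GCC or Clang (for `__rdtsc()` from `<x86intrin.h>` and the inline-asm barrier), and `stub_transform()`, `read_ticks()` and `min_ticks()` are hypothetical names, with `stub_transform()` standing in for a routine such as imdct36().

```c
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>          /* GCC/Clang header that provides __rdtsc() */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Fallback for non-x86 targets: coarse CPU time, not cycle counts. */
static uint64_t read_ticks(void) { return (uint64_t)clock(); }
#endif

/* Hypothetical stand-in for the routine being profiled. */
static long stub_transform(const int *in, int n)
{
    long acc = 0;
    for (int i = 0; i < n; ++i)
        acc += (long)in[i] * (i + 1);
    return acc;
}

/* Returns the smallest tick count seen over `runs` calls.
 * Taking the minimum discards runs disturbed by interrupts or cache misses. */
static uint64_t min_ticks(const int *in, int n, int runs, long *result)
{
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < runs; ++r) {
        /* Compiler barrier (GNU extension) so the call is not hoisted
         * out of the timed region. */
        __asm__ volatile("" ::: "memory");
        uint64_t start = read_ticks();
        *result = stub_transform(in, n);
        __asm__ volatile("" ::: "memory");
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling `min_ticks(buf, 36, 1000, &out)` before and after an optimisation gives two figures that can be compared directly. The counter is not serialised here, so out-of-order execution can skew very short measurements; treat the numbers as approximate.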
Thanks, Andre.
Have you compared your optimized version of 0.10.0b to what's now in 0.11.0b? I'm curious what your benchmarking will reveal. When I get a chance I'm definitely going to take a closer look at it.
On a different subject, does anyone have access to an ARM platform for testing? I would be interested to know how the MAD code (with and without my changes) compares to ARM's own MP3 decoding library (which claims to need only 29 MHz of CPU clock for real-time decoding on an ARM7 core).
I have access to a StrongARM 1100 and a 110, but not to anything else. I don't think I have access to ARM's MP3 decoding library, so I can't do any comparisons against it. However, I can evaluate your improvements to MAD alone on the StrongARM.
-rob