The Layer III decoding should get a bit faster once the IMDCT is rewritten, and the memory footprint may even get smaller. :-)
Less memory is good (-:
I have doubts that the speed will eventually reach that of the floating-point decoders, but I'd love to be proven wrong. My primary goal was to have it run well on machines without an FPU, and so far so good.
Why do the floating-point libraries have the advantage? Pardon me, but I'm new to MP3 decoding and the math involved.
Once I get some feedback and become fairly confident in the API, I plan to write up some documentation. Until then, feel free to ask if anything is unclear.
MAD's API is a little different from others I've seen in that the synthesis step is decoupled from the decoding step. This means there is an opportunity to write some interesting filters on the decoded subband samples before they are synthesized into PCM samples. For example, this is how madplay implements stereo->mono conversion given the -m switch, although for Layer III there is also a more efficient way to do this.
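To make the decoupling a bit more concrete, here is a minimal sketch of a filter that could sit between the decoding step and the synthesis step. It is not MAD's actual code: `fixed_t`, `struct sketch_frame`, and `mix_to_mono()` are hypothetical stand-ins invented for illustration, and the 2 x 36 x 32 layout of the subband samples is an assumption. The idea is only that, since the decoded subband samples are available before synthesis, the two channels can be averaged there and a single channel synthesized into PCM.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not MAD's real types. */
typedef int32_t fixed_t;                 /* fixed-point subband sample */

enum { CHANNELS = 2, GRANULES = 36, SUBBANDS = 32 };  /* assumed layout */

struct sketch_frame {
    int channels;                                     /* 1 or 2 */
    fixed_t sbsample[CHANNELS][GRANULES][SUBBANDS];   /* decoded subband samples */
};

/*
 * Filter applied after decoding and before synthesis: average the left
 * and right subband samples into channel 0, then mark the frame as mono
 * so that a later synthesis step only has one channel to process.
 */
static void mix_to_mono(struct sketch_frame *frame)
{
    if (frame->channels != 2)
        return;                          /* already mono; nothing to do */

    for (int s = 0; s < GRANULES; ++s) {
        for (int sb = 0; sb < SUBBANDS; ++sb) {
            /* Sum in 64 bits so the addition cannot overflow. */
            int64_t sum = (int64_t)frame->sbsample[0][s][sb]
                        + frame->sbsample[1][s][sb];
            frame->sbsample[0][s][sb] = (fixed_t)(sum / 2);
        }
    }
    frame->channels = 1;
}
```

In a decode loop, a filter like this would be called once per frame, after the frame has been decoded and before it is handed to synthesis. Only integer arithmetic is involved, so it would also suit machines without an FPU.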
Could you give me a rough data flow from input file to sound output? I see it currently as:
  1. get next chunk from file
  2. decode chunk into useful data
  3. send data to sound processing
and I am sure I am missing the whole picture.