Rob,
Before I forget, there is something else I looked at a while ago but haven't had time to explore more fully...
The distribution of values fetched from rq_table[] seems to be far from even (certainly for the few files I tested it with): rq_table[0] gets by far the most hits, rq_table[1] gets a few percent, with values after that dropping off fairly quickly.
Based on this, maybe putting 'if (value == 0) return (0);' at the top of III_requantize() might give a (very) small speed-up?
Also, by removing the first few entries from the table and recognising them directly it would be possible to fit the range of exponents in 4 bits rather than 5 and increase accuracy slightly.
Any thoughts?
Andre
Andre McCurdy armccurdy@yahoo.co.uk wrote:
> The distribution of values fetched from rq_table[] seems to be far from even (certainly for the few files I tested it with): rq_table[0] gets by far the most hits, rq_table[1] gets a few percent, with values after that dropping off fairly quickly.
> Based on this, maybe putting 'if (value == 0) return (0);' at the top of III_requantize() might give a (very) small speed-up?
An interesting idea... currently the routine is never called with a 0 value since the result is always 0, but there might be an opportunity to shortcut some of the other small values.
On the other hand, wouldn't it be very likely for the oft-referenced values to be in the CPU's memory cache?
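To make the trade-off concrete, here is a minimal sketch of recognising the smallest values directly instead of reading the table. It is not the real III_requantize(): the names requantize_sketch, pow43_small and table_reads are made up for illustration, the table is a tiny floating-point stand-in, and any scaling the real routine applies is left out.

```c
/* Hypothetical stand-in for the first entries of a value^(4/3) table.
 * Only values 0..8 are covered; the constants are rounded. */
static const double pow43_small[9] = {
  0.0, 1.0, 2.5198421, 4.3267487, 6.3496042,
  8.5498797, 10.9027236, 13.3905183, 16.0
};

/* Counts table reads so the effect of the shortcut can be observed. */
static unsigned long table_reads;

/* Sketch only: returns value^(4/3) for value <= 8.
 * 0 and 1 are recognised directly because 0^(4/3) == 0 and 1^(4/3) == 1. */
static double requantize_sketch(unsigned int value)
{
  switch (value) {
  case 0:
    return 0.0;
  case 1:
    return 1.0;
  default:
    ++table_reads;              /* everything else still goes to the table */
    return pow43_small[value];  /* caller must keep value <= 8 here */
  }
}
```

Whether the extra branch beats a table read that is already sitting in cache is exactly the open question, so this would need to be measured rather than assumed.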
> Also, by removing the first few entries from the table and recognising them directly it would be possible to fit the range of exponents in 4 bits rather than 5 and increase accuracy slightly.
I would definitely like this. I was sad when I discovered I would have to waste an exponent bit just for the values 16-19... ;-)
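One way the 4-bit idea could look is sketched below. Everything in it is an assumption made for illustration: the 32-bit layout, the names pack_entry, entry_mantissa and entry_exponent, and in particular the bias of 4, which presumes that once the first few entries are gone the remaining exponents span at most 16 consecutive values (here 4 to 19). The bit saved on the exponent goes to the mantissa, which is where the small accuracy gain would come from.

```c
#include <stdint.h>

#define EXP_BITS   4u   /* exponent field width in this sketch */
#define MANT_BITS  28u  /* mantissa gains the bit freed from the exponent */
#define EXP_BIAS   4u   /* hypothetical: smallest exponent left in the table */

/* Pack a mantissa (low 28 bits used) and an exponent in EXP_BIAS..EXP_BIAS+15
 * into one 32-bit word: mantissa in the high bits, biased exponent in the low 4. */
static uint32_t pack_entry(uint32_t mantissa, unsigned int exponent)
{
  uint32_t m = mantissa & ((1u << MANT_BITS) - 1u);
  uint32_t e = (exponent - EXP_BIAS) & ((1u << EXP_BITS) - 1u);
  return (m << EXP_BITS) | e;
}

/* Recover the 28-bit mantissa. */
static uint32_t entry_mantissa(uint32_t entry)
{
  return entry >> EXP_BITS;
}

/* Recover the original (unbiased) exponent. */
static unsigned int entry_exponent(uint32_t entry)
{
  return (unsigned int)(entry & ((1u << EXP_BITS) - 1u)) + EXP_BIAS;
}
```

For example, pack_entry(0x0ABCDEF1, 19) stores the exponent as 15, the largest value the 4-bit field can hold.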
BTW, I think the next best speed increase for Layer III will probably come from performing requantization during Huffman decoding. This will do two things: eliminate a lot of extra calculation, since some values stay the same in each scalefactor band, and eliminate a lot of is[576]->xr[][576] data copying.
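A rough sketch of what that fusion might look like, under heavy assumptions: the Huffman decoder is replaced by a hypothetical callback (decode_fn, with next_from_array as a demo source), the output is plain double rather than fixed point, the per-band exponent is non-negative and small, and the value^(4/3) table only covers magnitudes 0..8. None of these names or layouts come from the actual decoder. The point is only the loop shape: the band scale is computed once per band, and each decoded value is written straight into xr[] with no intermediate is[] array.

```c
#include <stddef.h>

/* Hypothetical small tables: |v|^(4/3) for |v| <= 8, and 2^(k/4) for k = 0..3. */
static const double pow43[9] = {
  0.0, 1.0, 2.5198421, 4.3267487, 6.3496042,
  8.5498797, 10.9027236, 13.3905183, 16.0
};
static const double root2_4[4] = { 1.0, 1.189207115, 1.414213562, 1.681792831 };

/* Stand-in for a Huffman decoder: returns the next signed quantized value. */
typedef int (*decode_fn)(void *state);

/* Demo source that replays values from an array instead of a bitstream. */
struct demo_source { const int *values; size_t pos; };
static int next_from_array(void *state)
{
  struct demo_source *s = state;
  return s->values[s->pos++];
}

/* Decode and requantize in one pass. band_exp[b] is a non-negative exponent
 * in quarter steps, so the band scale is 2^(band_exp[b] / 4). */
static void decode_and_requantize(decode_fn next, void *state,
                                  const size_t *band_width,
                                  const unsigned int *band_exp,
                                  size_t nbands, double *xr)
{
  size_t i = 0;
  for (size_t b = 0; b < nbands; ++b) {
    /* Computed once per scalefactor band, not once per sample. */
    double scale = root2_4[band_exp[b] & 3u]
                 * (double)(1u << (band_exp[b] >> 2));
    for (size_t n = 0; n < band_width[b]; ++n, ++i) {
      int v = next(state);
      unsigned int mag = (unsigned int)(v < 0 ? -v : v);  /* must be <= 8 here */
      double r = pow43[mag] * scale;
      xr[i] = (v < 0) ? -r : r;                           /* restore the sign */
    }
  }
}
```

With two bands of width 2, exponents {0, 4} and decoded values {0, 1, -8, 2}, the second band is scaled by 2, so -8 comes out as -32 and 2 as roughly 5.04.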
-rob