So, I was running the profiler to see where the slow points were. To make the output easier to read, I split a chunk of III_decode() out into III_decode_a(). This chunk is the sizable portion from the first for loop. The profiler output is now broken down better:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 29.81      5.72     5.72     7394     0.77     1.05  mad_synth_frame
 20.43      9.64     3.92   768704     0.01     0.01  imdct36
 10.84     11.72     2.08   532368     0.00     0.00  dct32
  8.49     13.35     1.63    29576     0.06     0.12  III_decode_a
  7.76     14.84     1.49    29576     0.05     0.06  III_huffdecode
  7.14     16.21     1.37     7394     0.19     1.53  III_decode
  6.88     17.53     1.32   177728     0.01     0.01  III_imdct_s
  3.49     18.20     0.67   768704     0.00     0.01  III_imdct_l
  2.50     18.68     0.48    14788     0.03     0.03  III_stereo
  1.62     18.99     0.31  2919640     0.00     0.00  mad_bit_read
Where III_decode() used to weigh in at number 2 in the profile, it now shows up as 3 separate chunks.
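For anyone wanting to try the same trick, here is a rough sketch of what the split looks like. A function-level profiler attributes time per function, so moving a loop into its own function gives it its own line in the flat profile. The names decode_all() and decode_loop_a() are hypothetical stand-ins, not the actual libmad code, and the noinline attribute is a GCC/Clang extension that I am assuming is needed so the optimizer does not fold the helper back into its caller.

```c
#include <stddef.h>

/* Hypothetical hot loop, split out of decode_all() so that the profiler
 * reports its time on a separate line. noinline is a GCC/Clang extension;
 * without it the optimizer may merge the helper back into the caller. */
#if defined(__GNUC__)
__attribute__((noinline))
#endif
static long decode_loop_a(const int *samples, size_t n)
{
    long acc = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        acc += (long)samples[i] * 3;   /* placeholder for the real work */
    return acc;
}

/* Hypothetical caller: returns the same result as before the split. */
long decode_all(const int *samples, size_t n)
{
    long acc = decode_loop_a(samples, n);

    return acc + 1;                    /* placeholder for the remaining work */
}
```

The split changes only how the profile is reported; the result of decode_all() is the same as it would be with the loop written inline.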
I doubt this patch is of any worth, but I just wanted to share.
In dct32() there is a cos table declared on every single call. Why not move it to just before the function?
In III_huffdecode() you make a new block for most of the function call. Why use this temporary block?
> In dct32() there is a cos table declared on every single call. Why not move it to just before the function?
It's not an array; it's a list of enum constants. It is functionally equivalent to a set of #defines, except that it is conveniently limited in scope to the function in which it appears.
The location of this has absolutely no effect on the overhead of each call to the function.
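A minimal sketch of the idea, assuming a made-up function scale_sample() and made-up constant names (this is not the dct32() source): enum constants declared inside a function are compile-time constants, so no storage is set up and nothing is initialized when the function is called.

```c
/* Hypothetical example, not the dct32() source. The enum constants below
 * behave like #defines, but their names are visible only inside
 * scale_sample(). */
int scale_sample(int x)
{
    enum {
        SCALE_NUM = 3,   /* compile-time constants: no storage is      */
        SCALE_DEN = 4    /* allocated and nothing is set up per call   */
    };

    return x * SCALE_NUM / SCALE_DEN;
}
```

Moving the enum to file scope would only widen the visibility of the names; the function itself would be unchanged.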
> In III_huffdecode() you make a new block for most of the function call. Why use this temporary block?
I'm not sure I understand what you're really asking. There is indeed a separate statement block in III_huffdecode() for decoding the big_values portion of the spectrum, and similarly another block for decoding the count1 region. This is mostly just to keep the scope of the local variables clear, but it can also help the compiler make efficient use of the stack, since variables in the latter block can replace variables in the former.
This doesn't add any overhead at run-time... it's not like calling a function. Just pretend the first block is preceded with "if (1)". :-)
-rob
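To illustrate the point about statement blocks, here is a small sketch with a hypothetical function sum_two_regions() (not the III_huffdecode() source). Each block declares its own locals; entering a block is not a call, and the compiler is free to reuse the stack slots or registers of the first block's variables for the second block's.

```c
/* Hypothetical example of two sibling statement blocks. */
int sum_two_regions(const int *data, int split, int n)
{
    int total = 0;

    /* first region: i and big exist only inside this block */
    {
        int i, big = 0;

        for (i = 0; i < split; ++i)
            big += data[i];
        total += big;
    }

    /* second region: the compiler may reuse the storage that held
     * i and big above for these variables */
    {
        int i, small = 0;

        for (i = split; i < n; ++i)
            small += data[i];
        total += small;
    }

    return total;
}
```

Whether the storage is actually shared is up to the compiler; the blocks only make the variable lifetimes explicit.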
I was not aware the compiler treated it in this manner. Thanks for explaining. All of my compiler knowledge comes from the books in school, which did not really go into the modern arts much.