Hi!
I tested my implementation of some of the math functions against the MAD 0.15 library, and I found this in
mad_fixed_t III_requantize(unsigned int value, signed int exp) in the layer3.c file:
input value=8 exp=-23;
That means 8^(4/3)*2^(-5.75), i.e. value^(4/3)*2^(exp/4), correct?
The result obtained by mad_f_todouble(requantized) is
0.297241
but a calculator says
8^(4/3)*2^(-5.75)
ans = 0.29730177875068
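
The calculator value can be reproduced in double precision. This is only a minimal sketch: it assumes the formula really is value^(4/3)*2^(exp/4) as read above, and reference_requantize is a hypothetical helper name, not part of MAD.

```python
def reference_requantize(value: int, exp: int) -> float:
    """Double-precision reference, assuming result = value^(4/3) * 2^(exp/4).

    Hypothetical helper for comparison only; it is not MAD's III_requantize.
    """
    return (value ** (4.0 / 3.0)) * (2.0 ** (exp / 4.0))


if __name__ == "__main__":
    # Same input as in the post: value=8, exp=-23
    print(f"{reference_requantize(8, -23):.14f}")
```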
In Q15 format that means a difference of about 2 LSBs:
(0.29730177875068-0.29724100000000)*32768
ans = 1.99159810228230
My Q15 arithmetic gives 9743/32768=0.29733276367188:
(0.29730177875068-0.29733276367188)*32768
ans = -1.01531389788215
That is a difference of -1 LSB.
So the total difference between the MAD math and mine is 3 LSBs!
Maybe I should compare my math not with MAD but with some floating-point implementation?
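
The same error arithmetic can be scripted so that both results are measured against one floating-point reference instead of against each other. This is a rough sketch under the assumption that the reference is value^(4/3)*2^(exp/4); lsb_error and the variable names are made up for illustration, and 0.297241 is the printed (six-digit) MAD value from above, not the full-precision one.

```python
Q15_ONE = 1 << 15  # 32768, the Q15 scale factor


def lsb_error(reference: float, measured: float) -> float:
    """Signed error (reference - measured) expressed in Q15 LSBs."""
    return (reference - measured) * Q15_ONE


# Floating-point reference for value=8, exp=-23 (assumed formula)
ref = (8 ** (4.0 / 3.0)) * (2.0 ** (-23 / 4.0))

mad_value = 0.297241          # printed output of mad_f_todouble(requantized)
my_value = 9743 / Q15_ONE     # result of the Q15 implementation in the post

err_mad = lsb_error(ref, mad_value)
err_mine = lsb_error(ref, my_value)

# Nearest Q15 code to the reference, i.e. what ideal rounding would give
ideal_code = round(ref * Q15_ONE)

if __name__ == "__main__":
    print(f"MAD error:  {err_mad:+.4f} LSB")
    print(f"my error:   {err_mine:+.4f} LSB")
    print(f"MAD vs me:  {err_mad - err_mine:+.4f} LSB")
    print(f"ideal Q15 code: {ideal_code}")
```

Measured this way, each implementation gets its own error figure, and the nearest Q15 code shows how far 9743 is from ideally rounded output.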