Rob,
Before I forget, there is something else I looked at a while ago
which I haven't had time to explore more fully....
The distribution of values fetched from rq_table[] seems to be far
from even (certainly for the few files I tested it with): rq_table[0]
gets by far the most hits, rq_table[1] gets a few percent, with
values after that dropping off fairly quickly.
Based on this, maybe putting 'if (value == 0) return (0);' at the top
of III_requantize() would give a (very) small speed-up?
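A minimal sketch of the early-out idea, for illustration only. The table
contents, the requantize_sketch() name and the shift parameter are
hypothetical stand-ins; the real rq_table[] and III_requantize() in layer3.c
use a different entry format and signature.

```c
#include <stdint.h>

/* Hypothetical stand-in for rq_table[]: entry i is i^(4/3) rounded to the
 * nearest integer. The real table stores a packed mantissa and exponent. */
static const int32_t rq_sketch_table[8] = { 0, 1, 3, 4, 6, 9, 11, 13 };

/* Hypothetical requantizer: looks up value^(4/3) and scales it by 2^shift.
 * Assumes 0 <= shift <= 8 so the result cannot overflow. */
static int32_t requantize_sketch(unsigned int value, int shift)
{
    /* Early-out for the most frequent input, skipping the table fetch
     * and the scaling entirely. */
    if (value == 0)
        return 0;

    /* Shift as unsigned to avoid any signed-shift pitfalls. */
    return (int32_t)((uint32_t)rq_sketch_table[value & 7u] << shift);
}
```

Whether the extra branch pays for itself depends on how skewed the input
really is and on the cost of a mispredicted branch on the target CPU, so it
would need measuring.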
Also, by removing the first few entries from the table and
recognising them directly it would be possible to fit the range of
exponents in 4 bits rather than 5 and increase accuracy slightly.
Any thoughts ??
Andre
--
____________________________________________________________
Do You Yahoo!?
Get your free @yahoo.co.uk address at http://mail.yahoo.co.uk
or your free @yahoo.ie address at http://mail.yahoo.ie
Rob,
Thanks for catching my stupid mistake. Maybe it will teach me to read
my own comments while fixing code rather than just using a regular
expression search and replace... !
Your patch looks fine.
Andre
Nicolas,
Well done for spotting the ARM 'Round while you shift' optimisation !
Originally in imdct_l_arm.S, I made a fairly arbitrary choice to
round in some places and just shift in others to try and balance
code-size/speed against accuracy. With your optimisation it's possible
to round everywhere with no penalty, so I guess it makes sense to do
so.
I've attached a new version to this email which hopefully should be
the most accurate version so far - it now rounds everywhere like
FPM_ARM and FPM_64BIT, but has the advantage over them of using 64bit
accumulators for the imdct part.
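For readers without the attachment, the difference between "just shift" and
"round while you shift" can be sketched in portable C. This is not the ARM
assembly from imdct_l_arm.S; the function names are hypothetical, and
FRACBITS = 28 is assumed to match libmad's fixed-point format.

```c
#include <stdint.h>

#define FRACBITS 28  /* assumption: number of fractional bits */

/* Multiply two fixed-point values and simply drop the low bits.
 * Note: >> on a negative value is implementation-defined in C, but is an
 * arithmetic shift on the usual compilers. */
static int32_t mul_truncate(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;
    return (int32_t)(p >> FRACBITS);
}

/* Same multiply, but add half an LSB before shifting so the result is
 * rounded to nearest. On ARM the last bit shifted out lands in the carry
 * flag, so an add-with-carry can do this rounding at no extra cost. */
static int32_t mul_round(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;
    return (int32_t)((p + ((int64_t)1 << (FRACBITS - 1))) >> FRACBITS);
}
```

With a = 3 and b = 1 << 27 (0.5 in this format) the exact product is 1.5
LSBs: mul_truncate() gives 1 and mul_round() gives 2.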
Rob,
Just a small tweak: if ASO_IMDCT is defined, the window_l[] table in
layer3.c doesn't need to be included (at a saving of 144 bytes....)
as imdct_l_arm.S already contains its own copy.
Andre
I put a new release of MAD on the FTP site. It's a day later than promised,
but I wanted to incorporate some patches Nicolas sent me.
This version has many changes. The highlights are:
libmad changes:
* Incorporated Nicolas Pitre's ARM assembly and parameterized scaling
changes.
* Incorporated Andre McCurdy's ARM assembly optimization (used only if
--enable-aso is given to `configure' to enable architecture-specific
optimizations.)
* Reduced FPM_INTEL assembly to two instructions.
* Fixed accuracy problems with certain FPM modes in synth.c.
* Improved the accuracy of FPM_APPROX.
* Improved the accuracy of SSO.
* Added experimental rules for generating a libmad.so shared library.
madplay changes:
* PCM output is now dithered for better audio quality.
* Added a resampling feature for unsupported output sampling frequencies.
* Improved the OSS output module by falling back on 8-bit format if 16-bit
is not available, and by using native 16-bit endianness.
* Added a dual channel output selection option.
The ARM changes produced a favorable effect on the accuracy of the output from
libmad on ARM processors; ARM output is no longer the same as Intel output and
instead now matches the 64bit output, but with better performance:
             default      with SSO     with ASO     with SSO+ASO
FPM_APPROX   6.800e-05/L  6.775e-05/L  6.431e-05/L  6.431e-05/L
FPM_64BIT    5.580e-08/F  1.007e-05/L  5.652e-08/F  1.008e-05/L
FPM_ARM      5.580e-08/F  1.007e-05/L  5.652e-08/F  1.008e-05/L
FPM_INTEL    9.000e-08/F  1.008e-05/L
/F means full compliance, /L means limited accuracy, /N means not compliant.
Perhaps the most significant change to `madplay' is the addition of PCM output
dithering. This is an alternative to ordinary rounding that improves the audio
quality by reducing the negative effects of quantization noise. Dithering is
commonly used in professional mastering when reducing studio-quality audio to
16 bits for pressing onto a CD. Since MAD produces PCM samples with >24 bits,
dithering is a good idea.
The dithered output sounds less "grainy" than non-dithered, although this is
easier to perceive with 8-bit output than with 16. Best of all, however,
dithering effectively increases the precision of the output because it allows
bits below the LSB to be heard.
There are many possible dithering strategies, but the chosen algorithm is
fairly simple: it merely involves keeping the cumulative quantization error
less than the significance of the LSB. The effect of this is for the LSB to
modulate in proportional agreement with the bits below it.
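The idea of carrying the quantization error forward can be sketched as
follows. This is only an illustration of the principle described above, not
the code used in `madplay'; the function name and parameters are
hypothetical.

```c
#include <stdint.h>

/* Reduce a sample by drop_bits bits while feeding the discarded low bits
 * into the next sample. *error holds the running remainder and must start
 * at 0. Assumes 0 < drop_bits < 31, two's complement integers, an
 * arithmetic right shift for negative values, and samples small enough
 * that sample + *error cannot overflow. */
static int32_t quantize_feedback(int32_t sample, int32_t *error, int drop_bits)
{
    int32_t mask = (int32_t)((1u << drop_bits) - 1u);
    int32_t acc  = sample + *error;   /* add back what was lost earlier */
    int32_t out  = acc >> drop_bits;  /* keep only the high bits */

    /* The low bits just dropped become the new remainder, so the
     * cumulative error always stays below one output LSB. */
    *error = acc & mask;
    return out;
}
```

For example, feeding the constant input 3 with drop_bits = 2 (0.75 of an
output LSB) yields the outputs 0, 1, 1, 1, which average to 0.75: the LSB
toggles in proportion to the bits below it.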
The only significant drawback with dithering is that it hinders an analytic
examination of the output, such as compliance testing. Therefore, it can be
turned off with a -d option to `madplay'.
As always, the release can be found here:
ftp://ftp.mars.org/pub/mpeg/
Cheers,
-rob
Rob,
Interesting.
If you have them, I would also be interested in figures for MAD which
compare multiplies using FPM_64BIT, a version which truncates the
LSBit (eg FPM_ARM), and FPM_APPROX.
Thanks
Andre