> Have you compared your optimised version of 0.10.0b to what's
> now in 0.11.0b ?
> I'm curious what your benchmarking will reveal.. when I get a
> chance I'm definitely going to take a closer look at it.
I just compared the 0.11.0b version of imdct36() with mine:
gcc -O1 : 1329 clocks (asm mul_f), 1529 (C mul_f)
gcc -O2 : 1607 clocks (asm mul_f), 2176 (C mul_f)
gcc -O3 : 1608 clocks (asm mul_f), 2186 (C mul_f)
All of which beat my best of 2215 clocks. It could be closer on
an embedded processor with a smaller cache though.
The raw output isn't identical to the older version, but I guess
this is just different rounding errors (?).
My x86 assembler knowledge isn't too good, so I haven't really looked
at why gcc seems to be so much worse with optimisations above -O1.
As well as being slower, code size almost doubles, e.g. 10448 bytes
(-O3) against 5391 (-O1) for the latest imdct36() using a C mul_f,
so maybe it's a problem with optimisation in my version of gcc (the
default egcs-2.91.66 installed with RH6.2).
Does arm-elf-gcc behave in the same way ??
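
For anyone unsure what the "C mul_f" figures above are measuring: a fixed-point multiply in plain C is usually a widening 32x32 -> 64 bit product followed by a shift back down. The following is only a rough sketch of that idea. The 28-bit fraction width and the names fixed_t, FRACBITS, FIXED_ONE and mul_f_sketch are assumptions made for illustration; they are not taken from the MAD source.

```c
#include <stdint.h>

/* Hypothetical fixed-point format: signed 32 bits, FRACBITS fractional bits. */
typedef int32_t fixed_t;
#define FRACBITS  28
#define FIXED_ONE ((fixed_t)1 << FRACBITS)

/* Plain C multiply: widen to 64 bits, multiply, then shift back down.
 * The shift simply discards the low bits, so a version that rounds
 * instead can differ from this one in the last bit of the result.
 * Note: right-shifting a negative value is implementation-defined in C;
 * gcc performs an arithmetic shift. */
static fixed_t mul_f_sketch(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) >> FRACBITS);
}
```

With these assumed definitions, mul_f_sketch(FIXED_ONE / 2, FIXED_ONE / 2) gives FIXED_ONE / 4. An asm version would typically use the processor's widening multiply instruction directly rather than relying on the compiler to find it.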
> > On a different subject, does anyone have access to an ARM platform
> > for testing ?? I would be interested to know how the MAD code
> > (with and without my changes) compares to ARM's own mp3 decode
> > library (which claims to use only 29 MHz of cpu bandwidth for real
> > time decode on an ARM7 core).
>
> I have access to a StrongARM 1100 and a 110, but not to anything
> else. I don't think I have access to ARM's MP3 decoding library, so
> I can't do any comparisons against it. However, I can evaluate your
> improvements to MAD alone on the StrongARM.
ARM's MP3 decode library (see http://www.arm.com/SoftSys/as022.html )
isn't available for free so it might be difficult to get hold of a
copy, but if the claims are true (i.e. 20 MHz cpu bandwidth for
realtime decode on a StrongARM using 27k of code) it looks like being
quite a good benchmark to compare MAD against... Is it anywhere near
yet ??
If not (to answer lwong's questions...) it's probably due to:
1) ARM's library being written with all critical sections in assembler
   by ARM engineers who know the architecture inside out.
2) ARM's library may use more approximate calculations in some places.
3) Any code compiled from C will have used ARM's own C compiler, which
   from what I've heard seems to be at least 20 percent better than gcc.
____________________________________________________________
Do You Yahoo!?
Get your free @yahoo.co.uk address at http://mail.yahoo.co.uk
or your free @yahoo.ie address at http://mail.yahoo.ie
This list seems a little quiet, but maybe this will be of interest to
someone.
I've been cleaning up and optimising the imdct36() function in layer3.c
(where most of the execution time was being spent), with the result
that it's now over 30 percent faster when compiled for x86.
The attached .tgz file includes the changed layer3.c, plus hacked
madplay.c and Makefile which profile the imdct36() function using the
pentium timestamp counter.
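
For those without the attachment, the general shape of timestamp-counter profiling is roughly as follows. This is a minimal sketch and not the code in the .tgz: read_cycles, time_call, best_of and dummy_work are hypothetical names, dummy_work merely stands in for the function being profiled, and the non-x86 fallback counts clock() ticks rather than CPU cycles.

```c
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* Read the CPU timestamp counter (x86 only). */
static uint64_t read_cycles(void) { return __rdtsc(); }
#else
/* Fallback for other targets: clock() ticks, not CPU cycles. */
static uint64_t read_cycles(void) { return (uint64_t)clock(); }
#endif

/* Time a single call of fn(arg) and return the elapsed counter ticks. */
static uint64_t time_call(void (*fn)(void *), void *arg)
{
    uint64_t start = read_cycles();
    fn(arg);
    return read_cycles() - start;
}

/* Return the smallest tick count seen over `runs` calls. Taking the
 * minimum reduces the influence of interrupts and cold-cache runs. */
static uint64_t best_of(void (*fn)(void *), void *arg, int runs)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t t = time_call(fn, arg);
        if (t < best)
            best = t;
    }
    return best;
}

/* Hypothetical workload standing in for the function under test:
 * counts how many times it was called and burns a few cycles. */
static void dummy_work(void *arg)
{
    volatile int sink = 0;
    for (int i = 0; i < 100; i++)
        sink += i;
    *(int *)arg += 1;
}
```

Calling best_of(dummy_work, &calls, 5) with an int calls = 0 runs the workload five times and returns the lowest count observed. The absolute numbers depend heavily on cache state and on whatever else the machine is doing.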
On a different subject, does anyone have access to an ARM platform
for testing ?? I would be interested to know how the MAD code (with
and without my changes) compares to ARM's own mp3 decode library
(which claims to use only 29 MHz of cpu bandwidth for real time
decode on an ARM7 core).
Andre
--
Hello,
I have tried compiling the latest version of this program and running it on an ARM7 with a CPU speed of 74 MHz; the OS is ArmLinux. With the ARM asm code enabled, the program takes 25 secs to decode a 10-sec MP3 bitstream at 192 kbps, 44.1 kHz. Using the approx 32-bit code, the program takes 18 secs to decode the same 10-sec bitstream. MAD cannot reach real-time decoding on this CPU. I have already turned on all the optimization options.
I have used the same ARM7 CPU to run the circuit logic mp3 decoder demo on some MP3 bitstreams at 128 kbps, 44.1 kHz. This MP3 program does not need ArmLinux; it runs standalone. The audio playback works very well. I do not know whether that MP3 program could decode in real time at 29 MHz.
Does anyone know why? Does ArmLinux eat some resources? Or are there some compiler tricks I need to add to speed up the program? As far as I know, the ARM7 has a 3-stage instruction pipeline. Does a normal compile not take full advantage of this CPU's structure?
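
To put the timings above in terms of clock rate: if decode time is assumed to scale inversely with CPU clock (a simplification that ignores memory speed and OS overhead), 25 secs for 10 secs of audio at 74 MHz corresponds to roughly 74 * 25 / 10 = 185 MHz for real time, and 18 secs corresponds to roughly 133 MHz. A small sketch of that arithmetic, with required_mhz being a made-up helper name:

```c
/* Estimate the clock rate needed for real-time decoding, assuming decode
 * time scales inversely with clock rate. This is an assumption: memory
 * speed and OS overhead do not scale that way. */
static double required_mhz(double cpu_mhz, double decode_secs,
                           double audio_secs)
{
    return cpu_mhz * decode_secs / audio_secs;
}
```

Note that these measurements are for a 192 kbps stream, so they are not directly comparable with tests on 128 kbps streams.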
Regards,
lwong.
---------------------------------------------------
Get free personalized email at http://www.iname.com
Hello
I'd like to know how conformance of decoders has to be tested. I think that
there are some reference tracks, but where can I get them?
Regards,
--
Gabriel Bouvigne - France
bouvigne(a)mp3-tech.org
icq: 12138873
MP3' Tech: www.mp3-tech.org
Good news!
MAD now supports the MPEG 2 extension to Lower Sampling Frequencies (24,
22.05, and 16 kHz.)
In addition, the Layer III performance has improved yet again due to a new
optimization that avoids the IMDCT calculation when all the inputs would be
zero.
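
The idea behind this kind of shortcut can be sketched as follows. Since the IMDCT is a linear transform, all-zero input gives all-zero output, so the transform itself can be skipped. This is only an illustration and not the code in the release: all_zero, imdct36_or_skip and placeholder_transform are hypothetical names, and placeholder_transform is a dummy stand-in rather than a real IMDCT.

```c
#include <stdint.h>
#include <string.h>

#define IMDCT_IN  18   /* inputs per long block */
#define IMDCT_OUT 36   /* outputs per long block */

/* Return 1 if every input sample is zero, 0 otherwise. */
static int all_zero(const int32_t x[IMDCT_IN])
{
    for (int i = 0; i < IMDCT_IN; i++)
        if (x[i] != 0)
            return 0;
    return 1;
}

/* Run `transform` unless the input is all zero, in which case the
 * output is cleared directly. Returns 1 if the transform ran, 0 if it
 * was skipped. */
static int imdct36_or_skip(const int32_t x[IMDCT_IN], int32_t y[IMDCT_OUT],
                           void (*transform)(const int32_t *, int32_t *))
{
    if (all_zero(x)) {
        memset(y, 0, IMDCT_OUT * sizeof y[0]);
        return 0;
    }
    transform(x, y);
    return 1;
}

/* Dummy stand-in for the real transform (NOT an IMDCT): it just copies
 * the inputs around so the wrapper can be exercised. */
static void placeholder_transform(const int32_t *x, int32_t *y)
{
    for (int i = 0; i < IMDCT_OUT; i++)
        y[i] = x[i % IMDCT_IN];
}
```

The check costs at most 18 comparisons per block, which is small next to a full transform, so it should pay off whenever silent or band-limited blocks are reasonably common.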
I'd appreciate any testing anyone is able to do with respect to MPEG 2 LSF; I
have several Layer III streams I could verify but I don't have any Layer I or
II streams handy.
The release:
ftp://ftp.mars.org/pub/mpeg/
Cheers,
-rob
I talked to a few Debian developers (all PPC and SPARC), and they seem to
believe that letting the OSS lib handle the LE -> BE conversion is fine. Seems
there are special commands for that and everything that the architecture code
knows about.
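
For comparison, this is roughly the work the player would have to do itself if the conversion were not left to the sound layer: swapping the two bytes of every 16-bit sample. It is a minimal sketch only; swap16 and swap_samples16 are hypothetical names, and it is not taken from any existing player code.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of one 16-bit value (LE <-> BE). */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Byte-swap a buffer of n 16-bit samples in place. */
static void swap_samples16(uint16_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = swap16(buf[i]);
}
```

Applying the swap twice restores the original buffer, which makes for an easy sanity check.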