mad-dev

mad-dev@lists.mars.org

682 discussions

Re: [mad-dev] MAD 0.11.2b available
by Andre McCurdy 16 Sep '00

Nicolas,

Well done for spotting the ARM 'round while you shift' optimisation! Originally, in imdct_l_arm.S, I made a fairly arbitrary choice to round in some places and just shift in others, to try and balance code size and speed against accuracy. With your optimisation it's possible to round everywhere with no penalty, so I guess it makes sense to do so.

I've attached a new version to this email which should hopefully be the most accurate version so far. It now rounds everywhere, like FPM_ARM and FPM_64BIT, but has the advantage over them of using 64-bit accumulators for the IMDCT part.

Rob,

Just a small tweak: if ASO_IMDCT is defined, the window_l[] table in layer3.c doesn't need to be included (a saving of 144 bytes), as imdct_l_arm.S already contains its own copy.

Andre


MAD 0.11.2b available
by Rob Leslie 15 Sep '00

I put a new release of MAD on the FTP site. It's a day later than promised, but I wanted to incorporate some patches Nicolas sent me.

This version has many changes. The highlights are:

libmad changes:

* Incorporated Nicolas Pitre's ARM assembly and parameterized scaling changes.
* Incorporated Andre McCurdy's ARM assembly optimization (used only if --enable-aso is given to `configure' to enable architecture-specific optimizations.)
* Reduced FPM_INTEL assembly to two instructions.
* Fixed accuracy problems with certain FPM modes in synth.c.
* Improved the accuracy of FPM_APPROX.
* Improved the accuracy of SSO.
* Added experimental rules for generating a libmad.so shared library.

madplay changes:

* PCM output is now dithered for better audio quality.
* Added a resampling feature for unsupported output sampling frequencies.
* Improved the OSS output module by falling back on 8-bit format if 16-bit is not available, and by using native 16-bit endianness.
* Added a dual channel output selection option.

The ARM changes produced a favorable effect on the accuracy of the output from libmad on ARM processors; ARM output is no longer the same as Intel output and instead now matches the 64bit output, but with better performance:

              default        with SSO       with ASO       with SSO+ASO
  FPM_APPROX  6.800e-05/L    6.775e-05/L    6.431e-05/L    6.431e-05/L
  FPM_64BIT   5.580e-08/F    1.007e-05/L    5.652e-08/F    1.008e-05/L
  FPM_ARM     5.580e-08/F    1.007e-05/L    5.652e-08/F    1.008e-05/L
  FPM_INTEL   9.000e-08/F    1.008e-05/L

/F means full compliance, /L means limited accuracy, /N means not compliant.

Perhaps the most significant change to `madplay' is the addition of PCM output dithering. This is an alternative to ordinary rounding that improves the audio quality by reducing the negative effects of quantization noise. Dithering is commonly used in professional mastering when reducing studio-quality audio to 16 bits for pressing onto a CD. Since MAD produces PCM samples with >24 bits, dithering is a good idea.

The dithered output sounds less "grainy" than non-dithered, although this is easier to perceive with 8-bit output than with 16. Best of all, however, dithering effectively increases the precision of the output because it allows bits below the LSB to be heard.

There are many possible dithering strategies, but the chosen algorithm is fairly simple: it merely involves keeping the cumulative quantization error less than the significance of the LSB. The effect of this is for the LSB to modulate in proportional agreement with the bits below it.

The only significant drawback with dithering is that it hinders an analytic examination of the output, such as compliance testing. Therefore, it can be turned off with a -d option to `madplay'.

As always, the release can be found here:

  ftp://ftp.mars.org/pub/mpeg/

Cheers,
-rob
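The strategy described above (carry the quantization error forward so that the cumulative error stays below one LSB) can be sketched in C roughly as follows. This is an illustrative sketch based only on the description in this message, not madplay's actual code; the names `dither_state`, `floor_div` and `quantize_feedback`, and the choice of plain round-to-nearest, are assumptions. It also assumes the input samples leave enough headroom in 32 bits for the carried error to be added without overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-channel state: error carried over from earlier samples. */
struct dither_state {
    int32_t error;
};

/* Divide by a positive value, rounding toward minus infinity. */
static int32_t floor_div(int32_t a, int32_t b)
{
    int32_t q = a / b;          /* C division truncates toward zero */
    if (a % b < 0)
        q--;                    /* adjust for negative, inexact results */
    return q;
}

/*
 * Drop the low `drop_bits` bits of `sample`, feeding the quantization
 * error of earlier samples back in so that the cumulative error never
 * exceeds half of one output LSB.
 */
static int32_t quantize_feedback(int32_t sample, unsigned drop_bits,
                                 struct dither_state *st)
{
    assert(drop_bits > 0 && drop_bits < 31);

    const int32_t lsb = (int32_t)1 << drop_bits;  /* weight of one output LSB */
    int32_t biased = sample + st->error;          /* add back what was lost */
    int32_t out = floor_div(biased + lsb / 2, lsb); /* round to nearest */

    st->error = biased - out * lsb;               /* residual for next sample */
    return out;
}
```

Feeding the constant value 3 with `drop_bits` set to 2 (an output LSB worth 4) yields the outputs 1, 1, 0, 1. Their average is 0.75, which is exactly 3/4, so a level below the LSB survives in the output instead of being rounded away.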


Re: [mad-dev] MPEG audio decoder compliance
by Andre McCurdy 12 Sep '00

Rob,

Interesting. If you have them, I would also be interested in figures for MAD which compare multiplies using FPM_64BIT, a version which truncates the LSBit (e.g. FPM_ARM), and FPM_APPROX.

Thanks

Andre
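The three kinds of fixed-point multiply being compared here can be sketched in C as below. This is only an illustration of the general techniques (full product with rounding, full product with truncation, and a 32-bit-only approximation); it is not a copy of libmad's FPM_* macros. The 28-bit fraction, the 12/16 split of the pre-shifts, and the function names are all assumptions. The code also relies on right shifts of negative values being arithmetic, as gcc provides.

```c
#include <stdint.h>

#define FRACBITS 28  /* assumed fixed-point format: 28 fractional bits */

/* Full 64-bit product, rounded to nearest before scaling back. */
static int32_t mul_round(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * y;
    return (int32_t)((p + ((int64_t)1 << (FRACBITS - 1))) >> FRACBITS);
}

/* Full 64-bit product, low bits simply discarded (truncation). */
static int32_t mul_trunc(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * y;
    return (int32_t)(p >> FRACBITS);
}

/*
 * Approximation that never needs a 64-bit product: precision is thrown
 * away from both operands first (12 + 16 = 28 bits in total). The caller
 * must keep the operands small enough that the product fits in 32 bits.
 */
static int32_t mul_approx(int32_t x, int32_t y)
{
    return (x >> 12) * (y >> 16);
}
```

For example, multiplying the raw value 3 by 0.5 (`1 << 27`) gives 2 with rounding, 1 with truncation, and 0 with the approximation, while 0.5 times 0.5 gives the same result in all three.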


MPEG audio decoder compliance
by Rob Leslie 18 Aug '00

I thought people might be interested in some results I'm putting together of tests for MPEG audio decoder accuracy.

These are the same tests I wrote about some time ago, but now I have results for a much wider range of decoders, and across all three layers. MAD is still showing well, and is in fact the only decoder I am aware of that will produce 24-bit output.

The results are interesting:

  http://www.mars.org/home/rob/proj/mpeg/compliance/

Worth noting is that MAD is the only decoder in the class of integer decoders that can produce fully compliant output.

Does anyone have any suggestions for decoders to test that are not listed? I would like to test the ARM decoder somehow, but I don't know if I can get access to it.

Cheers,
-rob


Layer3 III_imdct_l() in ARM Assembler.
by Andre McCurdy 16 Aug '00

Rob + others,

Please find attached a version of the layer3 III_imdct_l() function I've written in ARM assembler. I've been messing around with it for a while, mainly as an exercise to learn ARM assembler, but hopefully the end result is worth sharing.

Performance-wise, it should be quite a bit faster than the current C version (and slightly more accurate as well, since the multiply-accumulate steps accumulate into 64 bits, then round back to 32 bits only when finished).

Unfortunately, I don't actually have any ARM-based hardware that will play audio, so it's only been tested standalone with a small range of test cases on the 'armulator' ARM simulator. Any feedback (especially overall performance) or bug reports from anyone actually able to test it for real would be appreciated.

It assembles for me (using gcc v2.95.2) with just:

  arm-elf-gcc -c arm_III_imdct_l.S

(making sure that the extension is .S rather than .s, to cause gcc to run it through the C pre-processor).

I'd appreciate some feedback, even if the performance increase isn't big enough to bother including it in future releases.

Andre
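The accuracy point (accumulate into 64 bits, round back to 32 bits only once at the end) can be illustrated with a plain C sketch. This is not the attached assembler and not libmad's C code; the 28-bit fraction and the names `dot_scale_each` and `dot_acc64` are assumptions made for the example. Right shifts of negative values are assumed to be arithmetic, as with gcc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRACBITS 28  /* assumed fixed-point format: 28 fractional bits */

/* Scale every product back to 32 bits before adding: error accumulates. */
static int32_t dot_scale_each(const int32_t *x, const int32_t *w, size_t n)
{
    assert(x != NULL && w != NULL);

    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)(((int64_t)x[i] * w[i]) >> FRACBITS);
    return sum;
}

/* Keep the full 64-bit products, then round and scale once at the end. */
static int32_t dot_acc64(const int32_t *x, const int32_t *w, size_t n)
{
    assert(x != NULL && w != NULL);

    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)x[i] * w[i];
    return (int32_t)((acc + ((int64_t)1 << (FRACBITS - 1))) >> FRACBITS);
}
```

With four inputs of raw value 1 and four coefficients of 0.25 (`1 << 26`), each individual product scales down to 0, so `dot_scale_each` returns 0, while `dot_acc64` keeps the fractions and returns the correct total of 1.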


MAD 0.11.1b available
by Rob Leslie 04 Aug '00

I made a new release of MAD (0.11.1b) with the following changes:

- the libmad code is now in a separate directory
- the robustness of the Win32 audio output module is much improved
- the SSO is now disabled by default, as output accuracy is deemed to be more important than speed in the general case
- a bug in the Layer III sanity checking was fixed that could have caused a crash on certain random data input
- the Layer III requantization table was extended from 8191 to 8206 values as some encoders are known to use these values, even though ISO/IEC 11172-3 suggests the maximum should be 8191 (and I couldn't convince anyone on the LAME mailing list that this could be a bug in LAME)
- a short man page for madplay was added
- a new `madtime' program (not yet built or installed by default) accurately calculates average bitrate and playing time for any file, including VBR
- a new experimental multi-stream mixer `madmix' was added (--enable-experimental during configure to add -x option support for this to madplay)

The experimental mixer code is designed to minimize CPU involvement in decoding multiple bitstreams; subband synthesis is performed only once after all the mixing has taken place on the intermediate decoded data. Here's an example usage:

  madmix <(madplay -Qx one.mp3) <(madplay -Qx two.mp3)

Any number of input streams can be given on the command line to be mixed. You can also use the same -o option as for `madplay'.

If your shell doesn't support process substitution with named pipes, you'll have to mess around and make them yourself.

Currently the mix is fixed at 100% for all streams, but this can be adjusted on line 330 of madmix.c. I think there's potential for command-line or file-based configuration to further make this useful, or possibly even a GUI.

As always, the release can be found here:

  ftp://ftp.mars.org/pub/mpeg/

Cheers,
-rob
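Mixing before synthesis works because subband synthesis is a linear operation (up to rounding): synthesizing the sum of two streams gives the same result as summing two synthesized streams, at the cost of one synthesis pass instead of two. The toy C sketch below shows the idea only. `toy_synth` is a hypothetical stand-in (a small fixed integer matrix), not MAD's synthesis filterbank, and the 4-band size and all names are assumptions.

```c
#include <stdint.h>

#define NBANDS 4  /* toy size; chosen only to keep the example short */

/* Hypothetical stand-in for subband synthesis: any fixed linear map. */
static const int32_t toy_coef[NBANDS][NBANDS] = {
    { 1,  1,  1,  1 },
    { 1, -1,  1, -1 },
    { 1,  1, -1, -1 },
    { 1, -1, -1,  1 },
};

static void toy_synth(const int32_t in[NBANDS], int32_t out[NBANDS])
{
    for (int i = 0; i < NBANDS; i++) {
        out[i] = 0;
        for (int j = 0; j < NBANDS; j++)
            out[i] += toy_coef[i][j] * in[j];
    }
}

/* Mix two streams at 100% each while still in the subband domain. */
static void mix_bands(const int32_t a[NBANDS], const int32_t b[NBANDS],
                      int32_t mixed[NBANDS])
{
    for (int i = 0; i < NBANDS; i++)
        mixed[i] = a[i] + b[i];
}
```

Calling `mix_bands` followed by a single `toy_synth` produces the same samples as calling `toy_synth` on each stream and adding the results, which is why the expensive step only has to run once no matter how many streams are mixed.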


Re: [mad-dev] Optimisations in layer3.c
by Andre McCurdy 29 Jun '00

> Have you compared your optimised version of 0.10.0b to what's
> now in 0.11.0b ?
> I'm curious what your benchmarking will reveal.. when I get a
> chance I'm definitely going to take a closer look at it.

I just compared the 0.11.0b version of imdct36() with mine:

  gcc -O1 : 1329 clocks (asm mul_f), 1529 (C mul_f)
  gcc -O2 : 1607 clocks (asm mul_f), 2176 (C mul_f)
  gcc -O3 : 1608 clocks (asm mul_f), 2186 (C mul_f)

All of which beat my best of 2215 clocks. It could be closer on an embedded processor with a smaller cache though. The raw output isn't identical to the older version, but I guess this is just different rounding errors (?).

My x86 assembler knowledge isn't too good, so I haven't really looked at why gcc seems to be so much worse with optimisations above -O1. As well as being slower, code size almost doubles, e.g. 10448 bytes (-O3) against 5391 (-O1) for the latest imdct36() using a C mul_f, so maybe it's a problem with optimisation in my version of gcc (the default egcs-2.91.66 installed with RH6.2). Does arm-elf-gcc behave in the same way ??

> > On a different subject, does anyone have access to an ARM platform
> > for testing ?? I would be interested to know how the MAD code
> > (with and without my changes) compares to ARM's own mp3 decode
> > library (which claims to use only 29 MHz of cpu bandwidth for real
> > time decode on an ARM7 core).
>
> I have access to a StrongARM 1100 and a 110, but not to anything
> else. I don't think I have access to ARM's MP3 decoding library, so
> I can't do any comparisons against it. However, I can evaluate your
> improvements to MAD alone on the StrongARM.

ARM's MP3 decode library (see http://www.arm.com/SoftSys/as022.html ) isn't available for free, so it might be difficult to get hold of a copy, but if the claims are true (i.e. 20 MHz cpu bandwidth for realtime decode on a StrongARM using 27k of code) it looks like being quite a good benchmark to compare MAD against... Is it anywhere near yet ??

If not (to answer lwong's questions...) it's probably due to:

1) ARM's library being written with all critical sections in assembler by ARM engineers who know the architecture inside out.
2) ARM's library may use more approximate calculations in some places.
3) Any code compiled from C will have used ARM's own C compiler, which from what I've heard seems to be at least 20 percent better than gcc.


Optimisations in layer3.c
by Andre McCurdy 28 Jun '00

This list seems a little quiet, but maybe this will be of interest to someone.

I've been cleaning up and optimising the imdct36() function in layer3.c (where most of the execution time was being spent), with the result that it's now over 30 percent faster when compiled for x86. The attached .tgz file includes the changed layer3.c, plus hacked madplay.c and Makefile which profile the imdct36() function using the Pentium timestamp counter.

On a different subject, does anyone have access to an ARM platform for testing ?? I would be interested to know how the MAD code (with and without my changes) compares to ARM's own mp3 decode library (which claims to use only 29 MHz of cpu bandwidth for real time decode on an ARM7 core).

Andre


Re: Running on Arm7
by lwong＠cheerful.com 28 Jun '00

Hello,

I have tried to compile the latest version of this program and run it on an ARM7 with a CPU speed of 74 MHz; the OS is ArmLinux. With the ARM asm code in use, the program takes 25 seconds to decode a 10-second MP3 bitstream at 192 kbps, 44.1 kHz. Using the approximate 32-bit code, the program takes 18 seconds to decode the same 10-second bitstream. MAD cannot reach real-time decoding on this CPU, even though I have already turned on all the optimisation options.

I have used the same ARM7 CPU to run the circuit logic mp3 decoder demo on some MP3 bitstreams at 128 kbps, 44.1 kHz. This MP3 program does not need ArmLinux; it runs standalone. The audio playback works very well. I do not know whether that MP3 program could decode in real time at 29 MHz.

Does anyone know why? Does ArmLinux eat some resources? Or are there some compiler tricks I need to add to speed up the program? As far as I know, the ARM7 has a 3-stage pipeline for processing instructions. Does a normal compile not take full advantage of this CPU's structure?

Regards,
lwong.
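To put the timings above in perspective, the clock speed that would be needed for real-time decoding can be estimated with a little arithmetic. The sketch below assumes decode time scales inversely with clock speed (ignoring memory and OS effects), which is a simplification; the function name `realtime_mhz` is made up for the example.

```c
#include <assert.h>

/*
 * Estimate the clock (in MHz) at which `decode_secs` of work for
 * `audio_secs` of audio would just keep up with playback, assuming
 * decode time is inversely proportional to clock speed.
 */
static double realtime_mhz(double clock_mhz, double decode_secs,
                           double audio_secs)
{
    assert(clock_mhz > 0.0 && decode_secs > 0.0 && audio_secs > 0.0);
    return clock_mhz * decode_secs / audio_secs;
}
```

With the figures reported here, `realtime_mhz(74, 25, 10)` gives 185 MHz for the ARM asm build and `realtime_mhz(74, 18, 10)` gives 133.2 MHz for the approximate 32-bit build, so under this assumption both are well short of real time on a 74 MHz part.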


Testing conformance of decoders
by Gabriel Bouvigne 16 Jun '00

Hello,

I'd like to know how conformance of decoders has to be tested. I think that there are some reference tracks, but where can I get them?

Regards,

--
Gabriel Bouvigne - France
bouvigne(a)mp3-tech.org
icq: 12138873
MP3' Tech: www.mp3-tech.org

