There are only a very few places where I make assumptions about type length, and this is largely due to haste more than anything. These places are marked so I (or you :-) can fix them.
Would a s/long long/int64_t/g patch be accepted then?
Sure.
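To make the substitution concrete, here is a minimal sketch of a fixed-point multiply written against int64_t instead of long long. The names (example_fixed_t, example_fixed_mul, EXAMPLE_FRACBITS) and the 28-bit fraction are assumptions for illustration only, not the actual libmad definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point format for illustration: 28 fractional bits. */
#define EXAMPLE_FRACBITS 28

typedef int32_t example_fixed_t;

/* Multiply two fixed-point values through a 64-bit intermediate.
 * int64_t is exactly 64 bits wherever it exists, whereas "long long"
 * is only guaranteed to be at least 64 bits. */
static example_fixed_t example_fixed_mul(example_fixed_t x, example_fixed_t y)
{
    int64_t product = (int64_t)x * (int64_t)y;

    /* Divide rather than shift so the result does not depend on how the
     * compiler right-shifts negative values; this truncates toward zero. */
    int64_t result = product / ((int64_t)1 << EXAMPLE_FRACBITS);

    /* The caller is expected to keep the result within 32 bits. */
    assert(result >= INT32_MIN && result <= INT32_MAX);
    return (example_fixed_t)result;
}
```

With this format, 1 << 28 represents 1.0, so multiplying 0.5 (1 << 27) by 0.5 gives 0.25 (1 << 26).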
A bigger problem in my view with respect to portability are the places where I rely on sign-extending right-shifts, and the GCC extension I used to initialize members of the Huffman table unions. In time I'll fix these too.
Well, gcc is one of the most portable compilers around, so depending on gcc is not that horrible. But yeah, it would probably be better to fix this as well.
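For the right-shift issue, one possible portable replacement is sketched below. C leaves the result of >> on a negative signed operand implementation-defined, so the helper only ever shifts non-negative values. The name example_asr32 is hypothetical and not part of libmad.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic (sign-extending) right shift that does not rely on the
 * compiler's behavior for ">>" on a negative operand.
 * Hypothetical helper, not a libmad function. */
static int32_t example_asr32(int32_t value, int shift)
{
    assert(shift >= 0 && shift < 32);

    if (value >= 0)
        return value >> shift;

    /* ~value is non-negative here, so shifting it is well defined;
     * complementing again restores the sign bits. The result rounds
     * toward negative infinity, like a hardware arithmetic shift. */
    return ~(~value >> shift);
}
```

For example, example_asr32(-7, 1) yields -4, which is what a sign-extending shift gives, whereas integer division by 2 would give -3.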
I'm hoping I can at least get the code to compile with (gasp) Microsoft's VC++ compiler. There are some things I'd like to do that apparently a lot of Windows users would appreciate...
Other than building a generic MP3 player, an idea I have is to write something that will modify the global_gain field of a Layer III stream, effectively changing the overall loudness when the audio is decoded. There has been some interest expressed in this idea, as it would permit a way to "normalize" an MP3 file without converting to WAV first, normalizing, then encoding back into MP3, losing quality in the process. Modifying the global_gain field is trivial as it is located at fixed positions within each frame, and it has a direct and predictable effect on the Layer III requantization and scaling.
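As a rough sketch of the arithmetic involved: in the Layer III requantization, global_gain enters as a factor of 2^(global_gain/4), so each step changes the amplitude by 2^(1/4), or about 1.505 dB. The helpers below convert between a global_gain offset and decibels. All names are hypothetical, and clamping to the 8-bit field range is my own assumption about how out-of-range values might be handled.

```c
#include <assert.h>

/* 20 * log10(2) / 4: decibels per single global_gain step. */
#define EXAMPLE_DB_PER_STEP 1.50514997831990597607

/* Amplitude change in decibels produced by adding `offset` to global_gain. */
static double example_gain_offset_to_db(int offset)
{
    return EXAMPLE_DB_PER_STEP * offset;
}

/* Nearest global_gain offset for a desired change in decibels. */
static int example_db_to_gain_offset(double db)
{
    double steps = db / EXAMPLE_DB_PER_STEP;

    /* Round half away from zero without needing the math library. */
    return (int)(steps >= 0.0 ? steps + 0.5 : steps - 0.5);
}

/* Apply an offset to an 8-bit global_gain value, clamping to 0..255.
 * Clamping is an assumption; a real tool might refuse the change instead. */
static unsigned int example_adjust_global_gain(unsigned int global_gain, int offset)
{
    int adjusted;

    assert(global_gain <= 255);
    adjusted = (int)global_gain + offset;
    if (adjusted < 0)
        adjusted = 0;
    if (adjusted > 255)
        adjusted = 255;
    return (unsigned int)adjusted;
}
```

So a request for roughly +6 dB maps to an offset of 4, and the finest adjustment available this way is about 1.5 dB.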
I'm not a Windows programmer, so I've promised only to write the back-end stuff to do this, including some analysis to determine what might be an appropriate global_gain offset. A command-line tool would be easy for me, but someone else will have to write the Windows GUI for the masses. :-) I'm guessing whoever does this *may* want to use a compiler other than gcc...
Be careful not to confuse libmad with madplay; all (well, most) exported libmad symbols are indeed prefixed with mad_. The only exception I think is the fixed-point abstraction which uses f_ and the fixed_t type. (Perhaps these should also be prefixed with mad_?)
If it is exported in libmad.h (or seen in 3rd party code) it should have the mad_ prefix or some other library-specific prefix.
I'll see about fixing this.
The other audio_* calls you see in madplay are not part of libmad; they're part of madplay's audio abstraction to support multiple output modules. A look through audio.h, audio.c, and audio_*.c should be instructive. Admittedly, madplay probably should not have defined its own audio_init() and audio_finish(). Sorry about that. :-)
perhaps the audio functions could get wrapped into their own lib as well? If I were to write a player based on this lib, I would likely use the audio routines as well. No sense recreating the wheel.
Maybe not a bad idea. The ID3 stuff should also probably be in a library.
Moving to subdirs would be a good idea.
/mad
    /decoder
    /audio
    /id3 (maybe)
    /player
    /docs
I'll probably do this later on, particularly once I get into cleaning up and writing some documentation.
As I see it, if a DSP exists, it does all the math. So it should simply be possible to do: stream -> DSP -> synth / sound (depending on the DSP). In that case the library should simply choose not to do the hard work, bypassing most of the lib.
That takes the fun, and maybe the whole point, away from the lib. ;-)
I was recently doing some performance tests to compare MAD against other fixed-point decoders I'm aware of that also run under ARM. Here's what I found; this is the amount of CPU time required to decode a stereo MPEG stream in each of the audio layers as a percentage of audio real-time:
What do you use to create these numbers? I have simply been using top.
    % time madplay -o raw:- $file >/dev/null
    % time xaudio -output=raw:- $file >/dev/null
    % time splay -d - $file >/dev/null
    % time mpg123 -s $file >/dev/null
I then divided the CPU time of each by the actual playing time of the file.
(Some versions of "time" also show the CPU use percentage directly, if you want to play the actual output through /dev/audio.)
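The calculation described above amounts to the following sketch. The function name is hypothetical, and counting user plus system time as "CPU time" is my assumption about how the output of time would be summed.

```c
#include <assert.h>

/* CPU time as a percentage of audio real time. A value below 100 means
 * the decoder keeps up with real-time playback.
 * Assumption: CPU time is taken as user time plus system time. */
static double example_cpu_percent(double user_seconds,
                                  double system_seconds,
                                  double playing_seconds)
{
    assert(playing_seconds > 0.0);
    return 100.0 * (user_seconds + system_seconds) / playing_seconds;
}
```

With made-up numbers, a file that plays for 240 seconds and takes 12 seconds of CPU time to decode comes out at 5% of real time.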
How does mad compare in quality of output? My sound system here is rather weak, so I can not really hear the difference between say mpg123 and mad (if there is one).
Although I've not delved seriously into it yet, there is a formal ISO/IEC compliance test suite described by part 4 of the 11172 standard.
I have compared the output of MAD to *some* of the supplied compliance data, and the largest error I found was an off-by-one in the least significant of 24 bits. In other words, 16 bit output should largely be error-free...
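To illustrate why a one-unit error at 24 bits mostly disappears at 16 bits, here is a sketch of rounding a 24-bit sample down to 16 bits. This is a generic rounding routine with a hypothetical name, not necessarily what madplay does on output.

```c
#include <assert.h>
#include <stdint.h>

/* Round a signed 24-bit sample to 16 bits (round half up), with clipping.
 * Illustrative only; not taken from madplay. */
static int16_t example_round_24_to_16(int32_t sample24)
{
    int32_t rounded;

    assert(sample24 >= -8388608 && sample24 <= 8388607);

    /* Add half of one 16-bit step (256 / 2), then floor-divide by 256
     * without relying on right shifts of negative values. */
    rounded = sample24 + 128;
    if (rounded >= 0)
        rounded = rounded / 256;
    else
        rounded = -((-rounded + 255) / 256);

    /* Rounding up near full scale can overflow by one; clip it. */
    if (rounded > 32767)
        rounded = 32767;
    return (int16_t)rounded;
}
```

One 16-bit step covers 256 consecutive 24-bit values, so an off-by-one at 24 bits changes the 16-bit result only when it happens to cross a rounding boundary, for example between 127 and 128.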
This assumes you're not using the FPM_APPROX option; with this there is a noticeable degradation of quality. Another caveat is the 'is' kluge...
Your docs mention the 'is' kluge. What files use is? None of my collection seems to sound different whether I have the kluge enabled or not.
The meaning of 'is' is... intensity stereo. This is a joint stereo encoding option for all three layers, although it's not used very often in Layer III.
Briefly, Layer III can use any of the following stereo encodings:
  regular stereo                  L and R channels are encoded independently
  middle/side (M/S) joint stereo  L+R and L-R are encoded in place of L and R
  intensity joint stereo          high frequencies are encoded mono with some stereo imaging info
  M/S + intensity joint stereo    a combination of M/S for low freqs and intensity for high freqs
Not many encoders support this last option, so it's not very common, but it's the one I've found that gives different results under different decoders.
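For reference, the M/S half of these encodings is simple to undo. The sketch below uses floating point for clarity, with the usual 1/sqrt(2) normalization, and converts the two channels in place. The function name is hypothetical, and it deliberately leaves out the intensity part, which is where the decoders disagree.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_INV_SQRT2 0.70710678118654752440

/* Convert middle/side samples to left/right in place.
 * On entry the arrays hold M and S; on return they hold L and R.
 * Hypothetical helper for illustration, not libmad code. */
static void example_ms_decode(double *mid_to_left, double *side_to_right,
                              size_t count)
{
    size_t i;

    assert(mid_to_left != NULL && side_to_right != NULL);

    for (i = 0; i < count; ++i) {
        double m = mid_to_left[i];
        double s = side_to_right[i];

        mid_to_left[i]   = (m + s) * EXAMPLE_INV_SQRT2;  /* L */
        side_to_right[i] = (m - s) * EXAMPLE_INV_SQRT2;  /* R */
    }
}
```

A signal present only in M comes out equally in both channels, while equal M and S values decode to a left-only signal.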
Without the kluge, MAD produces a weak signal in the right channel for all the bitstreams I was able to find that use M/S + intensity stereo. With the kluge, MAD sounds most like the output from Xaudio, but there are still annoying artifacts in the right channel. The best output I've found (with a clear signal in both channels) comes from mpg123.
I've already started discussing this in other places, but I'd like to get into more of the details here, perhaps in another thread...
-rob