Thanks, Glenn.
I'll have a look at your code.
Erik
----- Original Message -----
From: "Glenn Maynard" g_mad@zewt.org
To: mad-dev@lists.mars.org
Sent: Thursday, April 01, 2004 9:58 PM
Subject: Re: [mad-dev] Heuristic for determining valid MP3 file
On Thu, Apr 01, 2004 at 02:00:44PM +0100, Erik Jälevik wrote:
> What's a good heuristic for determining whether a file is a valid MP3 or not?
> My first thought was to grab a chunk at the beginning of the file and repeatedly call mad_header_decode until it comes back with a header and take that as an indication that the file is valid.
> However, on trying this out on the first 10k of a plain text file, mad_header_decode still returns a header found at 8k but with wrong data in it. Maybe this is due to it just looking for the sync bits and this file happened to have them within its first 10k?
> Is there a better way of doing this? Maybe start trying to decode and if more errors than a certain threshold occur, assume the file is not valid?
I read up to 25000 (arbitrary) bytes of data. Fatal errors are fatal; the only errors I handle are MAD_ERROR_LOSTSYNC, MAD_ERROR_BADCRC (which I don't have a test case for), and MAD_ERROR_BUFLEN/MAD_ERROR_BUFPTR.
More specifically, I count the number of bytes read in a given pass; if more than the threshold is read without getting a good packet, I bail. (I also explicitly subtract things like Xing and ID3 headers from this count; ID3v2 headers can easily be larger than that.)
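
An illustrative sketch of that byte-budget idea follows. It is not the StepMania code referenced below, and it does not call libmad: plausible_header() is a hypothetical stand-in for a real decode attempt that only rejects reserved header fields, and find_first_frame() and JUNK_BUDGET are names made up for this example. Only a leading ID3v2 tag is excluded from the count here; Xing headers are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JUNK_BUDGET 25000  /* arbitrary threshold, as in the text above */

/* Size in bytes of a leading ID3v2 tag (header included), or 0 if absent. */
static size_t id3v2_tag_size(const uint8_t *p, size_t len)
{
    if (len < 10 || memcmp(p, "ID3", 3) != 0)
        return 0;
    /* The four size bytes are "syncsafe": 7 bits each, top bit clear. */
    if ((p[6] | p[7] | p[8] | p[9]) & 0x80)
        return 0;
    size_t body = ((size_t)p[6] << 21) | ((size_t)p[7] << 14) |
                  ((size_t)p[8] << 7)  |  (size_t)p[9];
    size_t footer = (p[5] & 0x10) ? 10 : 0;  /* ID3v2.4 footer flag */
    return 10 + body + footer;
}

/* Hypothetical stand-in for a decoder: nonzero if the bytes at p look like
 * an MPEG audio frame header with no reserved field values. */
static int plausible_header(const uint8_t *p)
{
    if (p[0] != 0xFF || (p[1] & 0xE0) != 0xE0) return 0;  /* 11 sync bits */
    if (((p[1] >> 3) & 3) == 1) return 0;  /* reserved MPEG version */
    if (((p[1] >> 1) & 3) == 0) return 0;  /* reserved layer */
    if ((p[2] >> 4) == 15)      return 0;  /* invalid bitrate index */
    if (((p[2] >> 2) & 3) == 3) return 0;  /* reserved sample rate */
    return 1;
}

/* Offset of the first plausible frame header, or -1 if more than
 * JUNK_BUDGET bytes (not counting a leading ID3v2 tag) pass without one. */
static long find_first_frame(const uint8_t *buf, size_t len)
{
    size_t start = id3v2_tag_size(buf, len);  /* tag is not charged to the budget */
    for (size_t skipped = 0;
         skipped <= JUNK_BUDGET && start + skipped + 4 <= len;
         skipped++) {
        if (plausible_header(buf + start + skipped))
            return (long)(start + skipped);
    }
    return -1;
}
```

A caller would load the start of the file into a buffer and treat a return value of -1 as "probably not an MP3". A header check this shallow can still match random data, so a real detector would go on to decode the frame (and ideally the one after it) before accepting the file.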
The primary case where this matters is WAVs with MP3 data in them, and unknown headers. This handles those fine (though why people are putting MP3 data in WAVs is well beyond my comprehension).
My actual code is at
http://cvs.sourceforge.net/viewcvs.py/*checkout*/stepmania/stepmania/src/Rag...
See RageSoundReader_MP3::do_mad_frame_decode.
(Any comments on the way I'm doing this are appreciated. I haven't received any bug reports due to mis-detected files in quite a while.)
-- Glenn Maynard