On Saturday, May 25, 2002, at 12:15 AM, Russell O'Connor wrote:
Hmmm, basically I'm looking for the length in bytes. Since the bytes per sample per channel is 2, then we get 64*NBSAMPLES*NCHANNELS. Or so I was thinking.
I see. The number of bytes per sample is of course arbitrary, which is why I counted samples instead of bytes.
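To make the two ways of counting concrete, here is a rough Python sketch (not libmad code). It assumes that the NBSAMPLES above refers to libmad's MAD_NSBSAMPLES value, that this is 12 for Layer I, 18 for MPEG-2 (LSF) Layer III and 36 otherwise, and that each subband sample yields 32 PCM samples per channel. The function names are made up for illustration.

```python
def nsbsamples(layer, lsf=False):
    """Subband samples per frame (assumed to mirror libmad's MAD_NSBSAMPLES)."""
    if layer == 1:
        return 12
    if layer == 3 and lsf:
        return 18
    return 36


def pcm_samples_per_frame(layer, lsf=False):
    """PCM samples per channel in one frame: 32 per subband sample."""
    return 32 * nsbsamples(layer, lsf)


def pcm_bytes_per_frame(layer, channels, bytes_per_sample=2, lsf=False):
    """Decoded size of one frame in bytes; bytes_per_sample is the caller's choice."""
    return pcm_samples_per_frame(layer, lsf) * channels * bytes_per_sample
```

With 2 bytes per sample this reduces to 64 * NSBSAMPLES * NCHANNELS, for example 4608 bytes for a stereo MPEG-1 Layer III frame, while the sample count (1152 per channel) stays the same whatever output format is chosen.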
So since I haven't decoded the last couple of frames, getting an error on the first frame I decode in the middle of the stream is to be expected?
Yes, this is not unusual.
Do you know how many frames back I would at most have to decode before I can correctly decode my target frame? Do I need to synth those frames too?
I think the worst case would be the largest possible bit reservoir against the smallest possible frame size.
The largest possible bit reservoir is 511 bytes (MPEG-1 only), and in this case the smallest Layer III frame size is 96 bytes (32 kbps, 48000 Hz). The bit reservoir reaches back over previous frames, but does not include space used by frame headers or Layer III side information. In MPEG-1, the largest possible frame header + side info size is 38 bytes, leaving at least 58 bytes per frame that the bit reservoir can occupy. Rounding up, 511 / 58 is 9 frames.
The smallest possible frame size for any Layer III stream is 24 bytes (8 kbps, 24000 Hz), and the largest bit reservoir in this case is 255 bytes (MPEG-2). The largest possible frame header + side info size for MPEG-2 is 23 bytes, which is probably impractical in this case since it would leave only 1 byte per frame for audio data. More likely the stream would be single-channel, and in this case the header + side info size will be at most 15 bytes, leaving 9 bytes per frame. Rounding up, 255 / 9 is 29 frames.
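The arithmetic for both cases can be checked with a small Python sketch. The frame size formula (144 * bitrate / samplerate for MPEG-1 Layer III, 72 * bitrate / samplerate for MPEG-2, padding ignored) and the overhead breakdown (4 header bytes, 2 CRC bytes, plus 32, 17 or 9 bytes of side info) are my reading of the format rather than anything taken from libmad, and the function names are hypothetical.

```python
import math


def layer3_frame_bytes(bitrate_bps, samplerate_hz, lsf=False):
    """Unpadded Layer III frame size in bytes (assumed formula, see lead-in)."""
    return (72 if lsf else 144) * bitrate_bps // samplerate_hz


def worst_case_lookback(reservoir_bytes, frame_bytes, overhead_bytes):
    """Frames needed to cover the reservoir when each frame contributes
    only its bytes beyond header + CRC + side info."""
    usable = frame_bytes - overhead_bytes
    return math.ceil(reservoir_bytes / usable)


# MPEG-1, stereo: 4 header + 2 CRC + 32 side info = 38 bytes of overhead
mpeg1_frames = worst_case_lookback(511, layer3_frame_bytes(32000, 48000), 4 + 2 + 32)

# MPEG-2, mono: 4 header + 2 CRC + 9 side info = 15 bytes of overhead
mpeg2_frames = worst_case_lookback(
    255, layer3_frame_bytes(8000, 24000, lsf=True), 4 + 2 + 9)
```

Under these assumptions mpeg1_frames is 9 and mpeg2_frames is 29, matching the figures above.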
These are extreme cases. I think it's actually rare for the bit reservoir to extend beyond one or two previous frames in the most common bitrate/samplerate combinations.
In any case, you only need to synth the frame immediately preceding the target frame.
So basically to do perfect seeking in MP3, I would have to skip through the headers, and keep a pointer back n frames, until I find where I want to be. Reinitialize mad_stream, etc. Seek the file to where my back pointer is indicating. Decode n frames silently, and then start playing at the appropriate place in the nth frame. Whew.
Since frames always have the same playing time for any given layer and sampling frequency, you don't have to read headers all the way to your seek point. You can calculate n * the duration of one frame, subtract this from the time of your seek point, and stop scanning when you reach the frame containing this time point.
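As a sketch of that calculation in Python (hypothetical names, and assuming a constant sampling frequency and a known number of PCM samples per frame, e.g. 1152 for MPEG-1 Layer III):

```python
def seek_plan(target_seconds, samples_per_frame, samplerate_hz, lookback_frames):
    """Return (start_frame, target_frame, offset_samples).

    start_frame:    frame to begin decoding at (n frames before the target,
                    clamped to the start of the stream)
    target_frame:   frame containing the seek point
    offset_samples: position of the seek point within target_frame
    """
    target_sample = int(target_seconds * samplerate_hz)
    target_frame = target_sample // samples_per_frame
    offset_samples = target_sample % samples_per_frame
    start_frame = max(0, target_frame - lookback_frames)
    return start_frame, target_frame, offset_samples


# Seek to 10 s in a 44100 Hz MPEG-1 Layer III stream, allowing for 9 frames
# of bit reservoir history.
start, target, offset = seek_plan(10, 1152, 44100, 9)
```

Here the scan over headers can stop at frame 373; frames 373 through 381 are decoded silently, and playback begins 936 samples into frame 382. Working in frame indices is equivalent to subtracting n frame durations from the seek time, since every frame has the same duration.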
-- Rob Leslie rob@mars.org