[gst-devel] seeking VBR
Joshua N Pritikin
vishnu at pobox.com
Wed Nov 27 02:10:07 CET 2002
On Tue, Nov 19, 2002 at 07:24:44PM +0100, Wim Taymans wrote:
> For the plain indexing round, I would just count the number of bytes
> handed to mpeg2dec, guestimate the start code of the picture and store
> that time<->offset pair in the cache.
What I am finding is that my external time->byte offset index is OK
for VCD but stops working when the bitrate is reduced to 130kbps
(1/4-size video, 64kbps audio). The stream is too compact to locate
frames accurately using only byte offsets: instead of 5-10 frame
accuracy, I'm getting accuracy worse than 30 frames. I suspect I have
no alternative other than using the GstTimeCache (with 2-level
indexing). At least I'm a lot more familiar with the GStreamer mpeg
internals now.
> The idea is to hand a GstTimeCache object (object is already included in
> the core) to the plugin.
Looking at current CVS, it seems strange that the entries are
stored in a GList (GstTimeCacheGroup). For a per-I-frame index,
this is going to kill performance. I would expect a bsearch'able
and mmap'able array with numbers in big-endian byte order. Mmap
is great if we get a chance to map it read-only.
For updates, I expect that we will mostly be appending records to
the end of the array. (I don't think we need to optimize the data
structures for indexing during reverse playback.)
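To make the array idea concrete, here is a minimal sketch of a
bsearch-style lookup over a sorted time<->offset array. The
IndexEntry struct and index_lookup name are illustrative, not
GstTimeCache API; an on-disk mmap'ed index would store the fields
big-endian, but the logic is the same on the decoded values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: one time<->byte-offset pair. */
typedef struct {
  uint64_t time;    /* timestamp, e.g. in nanoseconds */
  uint64_t offset;  /* byte offset into the stream */
} IndexEntry;

/* Return the entry with the largest time <= target, i.e. the nearest
 * indexed point at or before the requested seek time. Entries must be
 * sorted by time; runs in O(log n), unlike walking a GList. */
static const IndexEntry *
index_lookup (const IndexEntry *entries, size_t n, uint64_t target)
{
  size_t lo = 0, hi = n;

  if (n == 0 || entries[0].time > target)
    return NULL;
  while (hi - lo > 1) {
    size_t mid = lo + (hi - lo) / 2;
    if (entries[mid].time <= target)
      lo = mid;
    else
      hi = mid;
  }
  return &entries[lo];
}
```

Appending a record is then just a write of one fixed-size entry past
the current end, which fits the mostly-append update pattern.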
> The seeking event on the mpeg video decoder would then first figure out
> what the nearest I frame is, it would convert the timestamp (or frame number)
> to the PTS. It would then do a timeseek on its sinkpad with the PTS
> value. Mpegdemux would get the seek on the PTS, it would map this to the
> byteoffset of the videopacket with that PTS and would then forward the
> byteseek to filesrc.
OK, but there are still some steps remaining to achieve frame-accurate
seeking with only an I-frame index:
* Seek to the nearest preceding I-frame
* Decode the I-frame, but don't display it
* Decode, but don't display, the intermediate P & B frames
* Once the desired frame is reached, resume normal playback
Correct? Is it true that P frames depend on the previous P frame, or
only on the previous I frame?
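The steps above can be sketched as a decode-and-discard loop. This is
only a control-flow illustration under assumed names: the toy
decode_next_frame() stands in for the real decoder, and a frame is
represented by its display number so the loop can be exercised:

```c
#include <assert.h>

/* Toy decoder state: pretend each call decodes the next frame and
 * returns its display number. */
static int next_frame;

static int
decode_next_frame (void)
{
  return next_frame++;
}

/* Frame-accurate seek with only an I-frame index: jump to the nearest
 * preceding I-frame, then decode but discard frames until the target,
 * which becomes the first frame actually displayed. The intermediate
 * frames must still be decoded, since P/B frames reference them. */
static int
seek_accurate (int iframe_no, int target_frame)
{
  int frame;

  next_frame = iframe_no;         /* the byte-seek to the I-frame */
  do {
    frame = decode_next_frame (); /* decode, don't display */
  } while (frame < target_frame);
  return frame;                   /* first frame to display */
}
```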
> - API to do mapping from byte<->timeoffset
> - API to get individual index records.
> - API to get info about a record (is it a keyframe (I frame,... ), ..)
> - API to load/save timecache (XML?, customizable?)
This looks pretty easy except for load/save. What I suggest is to save
each timecache as a directory instead of as a single file. This way
we can put big-endian binary data in individual files alongside
an XML file with all the metadata. If this sounds strange then we
can also support a tar.gz format transparently.
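For the binary files in such a directory, the big-endian encoding can
be done portably without caring about host byte order. A minimal
sketch (put_be64/get_be64 are illustrative helper names, not existing
API):

```c
#include <assert.h>
#include <stdint.h>

/* Write a 64-bit value into buf in big-endian (network) byte order.
 * Shifting explicitly keeps the file format independent of the
 * host's endianness. */
static void
put_be64 (uint8_t *buf, uint64_t v)
{
  for (int i = 0; i < 8; i++)
    buf[i] = (uint8_t) (v >> (8 * (7 - i)));
}

/* Read a big-endian 64-bit value back out of buf. */
static uint64_t
get_be64 (const uint8_t *buf)
{
  uint64_t v = 0;

  for (int i = 0; i < 8; i++)
    v = (v << 8) | buf[i];
  return v;
}
```

A reader on any platform can then mmap the file read-only and decode
entries with get_be64, or compare the raw bytes directly, since
big-endian order sorts the same as the numeric values.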
> - certainty of index (for indexes created after a seek, ...)
> - merging of index entries if certainty is known (playback reaches
> previously seeked and indexed position from region with higher
This sounds like a mess. Do we really need so many different
levels of uncertainty? How about two certainty levels: anchored
and relative?
For example, if you seek to the middle of an mpeg and start
indexing, you can create an I-frame -> PTS index except
that you don't really know the exact I-frame number. So
this would be a relative index. As soon as another index
overlaps which is anchored (has an exact I-frame counter),
the indices can be matched on the PTS and merged.
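The merge amounts to finding one PTS shared by both indexes and using
it to compute the frame-number correction for every relative entry. A
minimal sketch, with illustrative names (Entry, find_anchor_delta) and
not GstTimeCache API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One index entry: a PTS and a frame counter. In a relative index the
 * frame counter starts from the seek point instead of the stream
 * start. */
typedef struct {
  uint64_t pts;
  int64_t frame;
} Entry;

/* If any PTS appears in both the anchored and the relative index,
 * the difference of the two frame counters anchors every relative
 * entry. Returns 1 and sets *out_delta on a match, 0 otherwise. */
static int
find_anchor_delta (const Entry *anchored, size_t na,
                   const Entry *relative, size_t nr,
                   int64_t *out_delta)
{
  for (size_t i = 0; i < na; i++)
    for (size_t j = 0; j < nr; j++)
      if (anchored[i].pts == relative[j].pts) {
        *out_delta = anchored[i].frame - relative[j].frame;
        return 1;
      }
  return 0;
}
```

After the delta is known, adding it to each relative entry's frame
counter promotes the whole relative index to anchored.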
> I'm also thinking that the indexing should actually happen both in the
> mpeg demuxer and the mpeg video decoder. The video decoder would map
> frames (I frames) to PTS timestamps, the mpeg demuxer would index
> byte offsets to the PTS values of the different streams, it would
> probably also index SCR timestamps to offsets
Yah, that sounds great.
How can I help? The load/save code seems like a fairly independent
project, but the timecache data structures need to be finalized first.
Victory to the Divine Mother!!