[gst-devel] Ogg and GStreamer
in7y118 at public.uni-hamburg.de
Wed Jul 30 06:24:08 CEST 2003
Hi guys,
I have an issue with ogg (no, not vorbis, but ogg) and how to demux/mux ogg
streams. I ran into this while writing a comment plugin for vorbis (which would
probably work for theora and speex, too, with minor changes) and not wanting to
have all the ogg handling inside that plugin.
For all who don't know exactly how ogg works,
http://www.xiph.org/ogg/vorbis/doc/framing.html is the bitstream description;
I don't know much more about it than that either.
Let me try to summarize what ogg does (in quotation marks the official names
from the ogg spec): Ogg takes "packets", which are data chunks of any length,
and splits them into "segments", which are data chunks of at most 255 bytes.
It then packs segments into "pages", which are at most about 64k including some
seek and CRC information. These pages are then concatenated, and that makes up
an ogg stream.
So, what is the problem? The problem is the kind of additional information that
goes into the page headers. Ogg requires an "absolute granule position", which
corresponds to a format that GStreamer calls "frame". It is media specific, and
it is up to the encoded packets to define it. For each packet, ogg stores the
number of frames up to and _including_ that packet; ogg uses this instead of
timestamps. Now I am trying to put/get this stuff into/from GstBuffers.
The obvious idea is to require that all streams that get muxed into ogg be
framed so that there is exactly one packet per buffer. There is one problem
though: when you get a GstBuffer, you have no idea what the frame offset
_including_ the buffer is. buffer->offset is the frame offset _excluding_ the
buffer.
So there are two options now:
- include a frames field in GstBuffer, so that the frame offset can be
computed. (That's what I vote for)
- wait for the next buffer, and use the offset field of that buffer as
the "absolute granule position" in the ogg stream.
This brings up another question:
If, for some format (be it time, be it frames, be it whatever), you only know
the end offset of a buffer, you can specify neither length nor offset, because
we use (start offset, length) tuples and compute the end. If we used
(start offset, end offset) tuples instead, we could store the end offset but
not the length. Which way do you think would be better?
Some other things wrt ogg:
1) I decided to do typefinding in oggdemux, because the ogg spec requires
that the first buffer identifies the stream inside the ogg unambiguously. So
it's easy to do and allows adding more formats easily later. As an interim
solution, oggdemux will do the typefinding itself; later on - when the
autoplugger is able to do this - it'll just use NULL caps and let the app or
the autoplugger do their job.
2) Ogg allows streams to be multiplexed "grouped" or "chained" (see
http://www.xiph.org/ogg/vorbis/doc/oggstream.html at the bottom). Oggdemux will
handle grouped streams by using multiple pads and chained streams by removing
all pads and giving out new ones. This will not work with spider either.
Oggmux will apparently work just the other way around. It'll take any number of
input streams (not caring about caps) and pack the data into an ogg file.
NEW_MEDIA events will make it use chained streams.
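The grouped/chained distinction boils down to a page-level rule: grouped streams open all their BOS (beginning-of-stream) pages together at the start, while a BOS page showing up after data pages starts a new chain link. A rough sketch of that rule with made-up types (this is not oggdemux code):

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for a parsed ogg page: just the bits this rule
 * needs (hypothetical type, not a real oggdemux structure). */
typedef struct { int serial; bool bos; bool eos; } Page;

/* Count chain links: the first BOS opens link 1; further BOS pages
 * before any data page belong to the same group, while a BOS page
 * seen after data pages starts a new link in the chain. */
static int count_chain_links(const Page *pages, int n)
{
  int links = 0;
  bool seen_data = false;
  for (int i = 0; i < n; i++) {
    if (pages[i].bos && (links == 0 || seen_data)) {
      links++;            /* a new link begins */
      seen_data = false;  /* its group may add more BOS pages */
    } else if (!pages[i].bos) {
      seen_data = true;   /* data page: the group is fully open */
    }
  }
  return links;
}
```

For oggdemux this means: within one link, each serial number becomes a pad (grouped streams); at a link boundary all pads go away and new ones are created (chained streams).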
Benjamin