[gst-devel] RTP work

Tue Apr 5 05:41:24 CEST 2005

Hi all,

This post describes the ideas and features we want to implement
regarding RTP in GStreamer. Both RTP and GStreamer are essential to
Farsight, so we need to take this step.

As you know, we currently have a setup based around jRTP that separates
the rtp sink/source from a jrtpenc/dec payload handler.
Going forward we plan to keep basically this structure, but tighten it
up and make it more generically usable. To this end, we propose
splitting it like so:

 - rtp sink: has a sink pad of type application/x-rtp,
   has an RTP session manager
 - rtp source: has a source pad of type application/x-rtp,
   has an RTP session manager
 - payload encoder: specific to a given media type/codec,
   emits on a source pad of type application/x-rtp
 - payload decoder: specific to a given media type/codec,
   receives on a sink pad of type application/x-rtp

Our current thinking is that the rtp src/sink should fully respect
RFC 3550, and not limit payload possibilities by making assumptions or
preemptively optimising.

Now, obviously many payload decoders will share similar features, e.g.:

 - Buffering
 - Delay calculation and management
 - Packet reordering
 - Packet dropping
 - RTCP awareness for syncing streams with the wallclock and assigning
   GStreamer timestamps.

Hence we plan to have a base class that contains a default
implementation of this functionality. If any payloads have specific
requirements, they can override these implementations with their own.
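
To make the base class idea a bit more concrete, here is a rough Python
sketch of the reordering and dropping part only. It is not the planned
GStreamer API: the names BaseDepayloader, UpperCaseDepayloader, push,
flush, process and max_queue are all made up for illustration, and it
assumes the caller has already extracted the 16-bit sequence number.
Duplicates that are still queued are not filtered out.

```python
import heapq


class BaseDepayloader:
    """Hypothetical base class: reorders packets and drops late ones."""

    def __init__(self, max_queue=4):
        self.max_queue = max_queue  # packets held back for reordering
        self._heap = []             # (extended sequence number, payload)
        self._highest = None        # highest extended sequence number seen
        self._last_out = None       # extended number of last packet output

    def _extend(self, seq):
        """Unwrap a 16-bit sequence number so that wraparound sorts correctly."""
        if self._highest is None:
            self._highest = seq
            return seq
        # Signed 16-bit distance from the highest number seen so far.
        delta = (seq - self._highest + 0x8000) % 0x10000 - 0x8000
        ext = self._highest + delta
        if ext > self._highest:
            self._highest = ext
        return ext

    def push(self, seq, payload):
        """Queue one packet; return whatever is ready, in sequence order."""
        ext = self._extend(seq)
        if self._last_out is not None and ext <= self._last_out:
            return []  # arrived after its slot was output: drop it
        heapq.heappush(self._heap, (ext, payload))
        out = []
        while len(self._heap) > self.max_queue:
            out.append(self._pop())
        return out

    def flush(self):
        """Output everything still queued, in sequence order."""
        out = []
        while self._heap:
            out.append(self._pop())
        return out

    def _pop(self):
        ext, payload = heapq.heappop(self._heap)
        self._last_out = ext
        return self.process(payload)

    def process(self, payload):
        """Default payload handling; a specific payload decoder overrides this."""
        return payload


class UpperCaseDepayloader(BaseDepayloader):
    """Toy subclass standing in for a codec-specific payload decoder."""

    def process(self, payload):
        return payload.upper()
```

A subclass only replaces the pieces it cares about (here process), and
inherits the buffering and reordering behaviour unchanged.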

As a payload encoding can specify the use of RTP header bits in almost
any manner it wishes, the rtp source/sink should emit/receive full RTP
packets, with intact headers. Using jRTP via the base class
functionality, the payload decoders can read the timestamps, sequence
numbers, and anything else they need. The same applies to RTCP
compound packets: these will be sent from the src to the payload
decoder, which can again use jRTP to read all the information contained
in them.
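
As an illustration of what a payload decoder can read when the headers
are left intact, here is a small Python sketch that parses the fixed
RTP header as laid out in RFC 3550. It does not use jRTP; the names
RtpHeader and parse_rtp_header are hypothetical. The returned remainder
still contains the header extension, if the extension bit is set.

```python
import struct
from collections import namedtuple

RtpHeader = namedtuple(
    "RtpHeader",
    "version padding extension marker payload_type sequence timestamp ssrc csrcs",
)


def parse_rtp_header(packet):
    """Split a raw RTP packet into (RtpHeader, remaining bytes)."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6
    if version != 2:
        raise ValueError("unsupported RTP version %d" % version)
    csrc_count = b0 & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % csrc_count, packet[12:end])
    header = RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )
    return header, packet[end:]
```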

Any other specific information can be sent from the src to the payload
decoder synchronously using GstEvents, such as errors, people leaving
sessions, unknown packets etc.

This brings us to slightly harder functionality:

 - RTCP awareness for on-the-fly codec/bitrate changes
 - RTCP awareness for multiuser sessions (later)

On-the-fly bitrate/codec changes can be managed with a demux-style
element that reads the incoming mime-type and sends the stream to the
corresponding payload decoder; a simple switching muxer can then
combine the decoder outputs back into a single stream.
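
A rough Python sketch of that routing step follows. It is an
assumption-laden illustration, not a GStreamer element: PayloadTypeDemux,
register and push are hypothetical names, routing is keyed on the RTP
payload type number (which is what would map to a mime-type), decoders
are plain callables, and header extensions are ignored. Returning every
decoder's output from the same push call stands in for the switching
muxer.

```python
class PayloadTypeDemux:
    """Hypothetical demuxer: picks a decoder from the RTP payload type."""

    def __init__(self):
        self._decoders = {}  # payload type -> callable(payload bytes)

    def register(self, payload_type, decoder):
        self._decoders[payload_type] = decoder

    def push(self, packet):
        """Route one raw RTP packet; return the decoder's output or None."""
        if len(packet) < 12:
            raise ValueError("shorter than the 12-byte fixed RTP header")
        payload_type = packet[1] & 0x7F
        decoder = self._decoders.get(payload_type)
        if decoder is None:
            return None  # no decoder registered for this payload type
        # Skip the fixed header and the CSRC list (4 bytes per entry).
        offset = 12 + 4 * (packet[0] & 0x0F)
        return decoder(packet[offset:])
```

With this shape, a codec change in mid-stream simply shows up as packets
with a different payload type, and the next push call lands in a
different decoder.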

For the management of multiuser/multicast sessions, another element
would be needed.
The src/sinks are already multicast aware, therefore the sink will send
to all registered members, and the src will receive from all members
of the corresponding session. After this, some sort of dynamic demuxing
element would detect if incoming streams are coming from multiple users
(this can be done by inspecting the RTP headers, or by having the src
tell us about it). This element would then separate the streams, and
send each one to a different payload decoder.
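
Here is a rough Python sketch of the "inspect the RTP headers" variant,
keyed on the SSRC field of the fixed header. The names SsrcDemux,
decoder_factory, branches and remove are hypothetical, and each branch
is just a callable standing in for a payload decoder. Calling remove
would be triggered by whatever signals that a member left, for example
an RTCP BYE.

```python
import struct


class SsrcDemux:
    """Hypothetical demuxer: one decoder branch per RTP source (SSRC)."""

    def __init__(self, decoder_factory):
        # decoder_factory(ssrc) must return a callable that takes a packet.
        self._factory = decoder_factory
        self.branches = {}  # ssrc -> decoder callable

    def push(self, packet):
        """Send one raw RTP packet to its source's branch; return the SSRC."""
        if len(packet) < 12:
            raise ValueError("shorter than the 12-byte fixed RTP header")
        (ssrc,) = struct.unpack("!I", packet[8:12])
        branch = self.branches.get(ssrc)
        if branch is None:
            # First packet from this source: create its branch on the fly.
            branch = self.branches[ssrc] = self._factory(ssrc)
        branch(packet)
        return ssrc

    def remove(self, ssrc):
        """Drop the branch for a source that has left the session."""
        return self.branches.pop(ssrc, None)
```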

Therefore the payload decoders/encoders can stay multicast UNAWARE, and
act as if they only receive streams from one source. However, members
can join and leave the session on the fly, which means that new
decoder branches will have to be added and removed while the pipeline
is running. We don't know whether GStreamer allows this. How does this
work? Is it doable?

That sums it up for now. We're sure we've missed some important
things ;) Questions still lie open about emitting RTSP information from
the rtpsink to a payload encoder and whether this is useful/necessary.
We haven't taken any final decisions yet; we're sure this mailing list
will help us do so.

In terms of work done and to do, the current implementation in Farsight
CVS now uses GstEvents rather than buffer flags, but quite a bit of
refactoring will be needed to get to this structure.

Thanks,

Farsight Team