[gst-devel] RFC: gstreamer sdpbin for audio/video telephony
kvehmanen at eca.cx
Fri Oct 14 03:47:49 CEST 2005
This is a bit of a long request for comments, but please bear with me. I've
been working with the Farsight project for a while to add SIP to the list
of supported protocols (using the Sofia-SIP stack, which I'm also working with).
The idea would be to make gstreamer the preferred media subsystem for
VoIP, video-conferencing, and other similar applications. This has the
nice side-effect that gstreamer would also become the preferred framework
to implement/port new audio/video codecs, packetizers and network
transports. Now I guess we'd all be happy to see this happen! :)
We are currently facing two big challenges: 1) adding missing RTP features
like RTCP, etc; and 2) mapping SIP signaling/state-changes, expressed with
SDP, to gst elements. Now as many of you know, Philippe Khalaf has been
working on (1) for a while already, and some of the stuff has already
found its way into gstreamer. I'm now working on the second issue.
I'm mainly interested in getting architectural comments to this proposal -
IOW, is what I'm proposing here a sane way to use gstreamer, and are there
any clear mistakes/misunderstandings? Comments about the design details
are of course also welcome, but perhaps off-topic for this list.
And a small disclaimer/warning: I'm still a newbie in the world of
gstreamer (although the Farsight people have tried hard to educate
me on the topic ;)), so beware of possible stupid mistakes...
The idea: gstsdpbin
To create a new custom gst bin, in the spirit of
gst-plugins-base-cvs/gst/playback/gstdecodebin.c, which would create and
maintain the required gst elements, based on the SDP inputs received from
the application (.. and more precisely from SIP (or perhaps also RTSP,
Jabber, etc) signaling).
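As a rough sketch of what such a bin might do internally, here is a hedged Python fragment. The element names (udpsrc, rtppcmudepay, mulawdec) are taken from gst-plugins-good; the elements_for_media helper and its lookup table are my own illustration, not an existing gstreamer API, and the actual chains gstsdpbin would build could differ:

```python
# Hypothetical sketch: pick a receive-side element chain for one SDP
# media description. The table is illustrative only; the real mapping
# would be derived from gst-plugins-good-cvs/gst/rtp/README.
RECV_CHAINS = {
    ("audio", "PCMU"): ["udpsrc", "rtppcmudepay", "mulawdec", "audiosink"],
    ("video", "H261"): ["udpsrc", "rtph261depay", "h261dec", "videosink"],
}

def elements_for_media(media, encoding):
    """Return the element names gstsdpbin would create for one m= line."""
    try:
        return RECV_CHAINS[(media, encoding)]
    except KeyError:
        raise ValueError("no element chain known for %s/%s" % (media, encoding))

print(elements_for_media("audio", "PCMU"))
```

The point is only that the bin, not the application, owns the SDP-to-element decision.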
1. Is having multiple independent pipelines (a send/receive pair for
each media) in the same bin an ok design?
Philippe's rtpbin already does this (rtp receiving and sending are linked,
to be able to support RTCP), so I guess there are no fundamental problems.
Having multiple media (let's say audio and video) managed by the same bin
is also something I'm not completely sure about. But there are arguments
speaking for it: to realize lip-sync, the audio and video streams must be
managed by a common entity. The gst bin would be the natural place to do
this as it has the necessary timing information, and access to the jitter
buffers which are used to fine-tune the sync.
2. Is gstsdpbin a useful concept?
... IOW, should we keep this in Farsight, or could this be of
interest to a wider audience (possibly integrated to gstreamer)?
The main selling points are to concentrate the following functionality
into one place:
- ability to map between SDP and gst elements (using
  gst-plugins-good-cvs/gst/rtp/README ) => describe available
  gst elements as SDP, and build a set of gst elements based on
  a given SDP description
- ability to handle on-the-fly updates (see below)
- handling "intra-pipeline" dependencies: RTCP, lip-sync, etc
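To make the first point concrete, here is a small hedged Python sketch of parsing the SDP "a=rtpmap:" attribute (syntax per RFC 2327) into the payload-type/codec information the bin would need. The function name parse_rtpmap is my own, not an existing gstreamer API:

```python
def parse_rtpmap(line):
    """Parse an SDP attribute like 'a=rtpmap:0 PCMU/8000' into
    (payload_type, encoding_name, clock_rate)."""
    if not line.startswith("a=rtpmap:"):
        raise ValueError("not an rtpmap attribute: %r" % line)
    value = line[len("a=rtpmap:"):]
    pt, codec = value.split(" ", 1)
    # codec is "encoding/clock-rate" with an optional "/channels" part
    fields = codec.split("/")
    return int(pt), fields[0], int(fields[1])

print(parse_rtpmap("a=rtpmap:0 PCMU/8000"))
```

Given such tuples, the bin can map each payload type to caps and elements, and rebuild the mapping whenever the application pushes an updated SDP.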
The architecture via an example
If you are unfamiliar with how SIP works, please check out
"SIP Basic Call Flow Examples" first, and especially section 3 on
session establishment.
I'll use the audio+video call case as an example here. The application
will provide two properties, the local and the remote SDP, to the gstsdpbin.
The local SDP describes which ports you are listening on, which codecs
you support, codec parameters, and how rtp payload types are mapped to
specific codecs. The remote SDP describes the same for the remote party.
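For reference, a minimal SDP of this kind (adapted from the standard RFC 2327-style examples; the addresses, ports and session id are placeholders) could look like:

```
v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
```

The two m= lines would correspond to the two send/receive pipeline pairs (audio and video) inside the bin.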