[pulseaudio-discuss] Summary of the PulseAudio/GStreamer meeting at FOSDEM

Wed Feb 11 11:53:14 PST 2009

Heya!

At FOSDEM in Brussels this weekend the GStreamer folks and I sat down to
discuss how to improve interfacing between GStreamer and
PulseAudio. Here's a short, rough summary of that meeting:

First topic: zero-copy/latency/...

(This is a bit vague, I know)

Currently, writing a sink for GStreamer more or less forces you to use
a GstRingBuffer for communicating between the pipeline and the IO
thread. For the PulseAudio sink we'd like to get rid of that, because
it basically doubles latency and costs one unnecessary context switch
plus some amount of needless memcpy()s.

Given that GStreamer elements can provide allocation functions that
let the elements connected to them place data directly, we should
implement this for PA to allow Gst to write directly into PA's shared
memory segment. This needs some changes to the PA client libraries,
however, since this kind of "zero-copy" writing is not exposed in the API.

Hence: on the Gst side I'd like to see a more "fine-grained" API that
allows the pulse sink to bypass the GstRingBuffer entirely if the
playback speed is 1, and only have it in line for the other
cases. Also, the buffer should generally only be filled on request, not
preemptively. PA provides a function pa_stream_writable_size()
that returns how much data PA needs, and only that much data should
be generated.
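
A rough sketch of what "fill on request" means, assuming a toy
stand-in for a PA stream. Everything named demo_* is made up for
illustration and is neither PA nor GStreamer API; only
pa_stream_writable_size() is the real function that
demo_writable_size() imitates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a PA stream: it only models how much
 * data is queued and how much the server wants queued. */
typedef struct {
    unsigned char data[4096]; /* playback buffer */
    size_t filled;            /* bytes currently queued */
    size_t target;            /* bytes the server wants queued (<= 4096) */
} demo_stream;

/* Imitates pa_stream_writable_size(): bytes requested right now. */
static size_t demo_writable_size(const demo_stream *s) {
    return s->target > s->filled ? s->target - s->filled : 0;
}

/* Hypothetical producer: renders exactly n bytes into dst, returns n. */
typedef size_t (*demo_render_fn)(unsigned char *dst, size_t n, void *userdata);

/* Fill on request: generate only what is asked for, never more.
 * Returns the number of bytes that were produced. */
static size_t demo_fill_on_request(demo_stream *s, demo_render_fn render,
                                   void *userdata) {
    size_t want = demo_writable_size(s);
    if (want == 0)
        return 0; /* nothing requested, so nothing is generated */
    assert(s->filled + want <= sizeof s->data);
    size_t got = render(s->data + s->filled, want, userdata);
    s->filled += got;
    return got;
}

/* Example producer that renders silence. */
static size_t demo_render_silence(unsigned char *dst, size_t n, void *userdata) {
    (void) userdata;
    memset(dst, 0, n);
    return n;
}
```

The point is that the producer never runs ahead: a second call right
after a fill produces nothing until the server has consumed something.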

To formalize this a bit: The PA API shall provide five functions
(besides initialization/teardown): 

1. a function that allows GstPulseSink to allocate arbitrary memory
   from the client's shared memory area.

2. a function to free/unref memory allocated like this

3. a write function that takes memory like this plus a seek index where to
   write it to.

4. a function for querying how much data PA needs
   (pa_stream_writable_size() -- already available)

5. a way of notifying GstPulseSink when more data is needed,
   i.e. when pa_stream_writable_size() changes. (already available)
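
To make the list above a bit more concrete, here is a self-contained
mock of the five entry points. This is only a sketch of the shape such
an API might take: all demo_* names, the bump allocator and the buffer
sizes are assumptions for illustration, not existing PA API. Unlike a
real zero-copy path, the mock copies on write so the result can be
inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_POOL_SIZE 1024  /* stand-in for the client's shared memory area */
#define DEMO_STREAM_SIZE 256 /* stand-in for the server-side playback buffer */

typedef void (*demo_request_cb)(size_t nbytes, void *userdata);

typedef struct {
    uint8_t pool[DEMO_POOL_SIZE];
    size_t pool_used;
    int blocks_live;
    uint8_t stream[DEMO_STREAM_SIZE];
    size_t write_index, read_index; /* absolute byte positions */
    demo_request_cb request_cb;
    void *request_userdata;
} demo_client;

/* 1. Allocate from the pool (trivial bump allocator, alignment ignored). */
static void *demo_mem_alloc(demo_client *c, size_t n) {
    if (n == 0 || n > DEMO_POOL_SIZE - c->pool_used)
        return NULL;
    void *p = c->pool + c->pool_used;
    c->pool_used += n;
    c->blocks_live++;
    return p;
}

/* 2. Release a block; the pool is recycled once no block is live. */
static void demo_mem_free(demo_client *c, void *p) {
    assert(p != NULL && c->blocks_live > 0);
    (void) p;
    if (--c->blocks_live == 0)
        c->pool_used = 0;
}

/* 4. How much data the stream can take right now. */
static size_t demo_writable_size(const demo_client *c) {
    return DEMO_STREAM_SIZE - (c->write_index - c->read_index);
}

/* 3. Write a block at an absolute seek index. Returns 0, or -1 if the
 * range lies before the read position or does not fit. A real
 * implementation would pass a reference instead of copying. */
static int demo_write(demo_client *c, const void *block, size_t n, size_t index) {
    const uint8_t *src = block;
    if (index < c->read_index || index + n - c->read_index > DEMO_STREAM_SIZE)
        return -1;
    for (size_t k = 0; k < n; k++)
        c->stream[(index + k) % DEMO_STREAM_SIZE] = src[k];
    if (index + n > c->write_index)
        c->write_index = index + n;
    return 0;
}

/* 5. Register the "more data needed" notification. */
static void demo_set_request_cb(demo_client *c, demo_request_cb cb, void *userdata) {
    c->request_cb = cb;
    c->request_userdata = userdata;
}

/* Simulates the server playing n bytes, then notifies the client. */
static void demo_consume(demo_client *c, size_t n) {
    size_t queued = c->write_index - c->read_index;
    if (n > queued)
        n = queued; /* underruns are not modelled */
    c->read_index += n;
    if (n > 0 && c->request_cb)
        c->request_cb(demo_writable_size(c), c->request_userdata);
}

/* Example callback: stores the requested size in *userdata. */
static void demo_record_request(size_t nbytes, void *userdata) {
    *(size_t *) userdata = nbytes;
}
```

In the sink, the callback registered in step 5 would be the place to
pull exactly that many bytes from upstream, allocate a block for them
(1), write it (3) and drop the reference (2).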

Second topic: Volume interface

We need a nice way to allow per-stream volume adjustment in
GstPulseSink. Right now we have a 'volume' property for that which
some folks implement. This should become part of an official interface
that includes both a mute and a volume field plus some kind of
notification mechanism for changes of these fields triggered from the
server.
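
As a sketch only, such an interface could boil down to two fields plus
a change callback. All names here are hypothetical, and the linear
volume scale is an assumption; nothing of this was agreed on at the
meeting.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_stream_volume demo_stream_volume;

/* Called when the server changed volume or mute behind our back. */
typedef void (*demo_volume_changed_cb)(const demo_stream_volume *v,
                                       void *userdata);

struct demo_stream_volume {
    double volume; /* linear factor, 1.0 = unattenuated (assumed scale) */
    int mute;      /* 0 or 1 */
    demo_volume_changed_cb changed;
    void *userdata;
};

/* Application-side setter: a local change, so no notification. */
static void demo_volume_set(demo_stream_volume *v, double volume, int mute) {
    assert(volume >= 0.0);
    v->volume = volume;
    v->mute = mute ? 1 : 0;
}

/* Server-triggered update: store the new values and notify, but only
 * if something actually changed. Returns 1 if a notification was due. */
static int demo_volume_server_update(demo_stream_volume *v, double volume,
                                     int mute) {
    mute = mute ? 1 : 0;
    if (v->volume == volume && v->mute == mute)
        return 0;
    v->volume = volume;
    v->mute = mute;
    if (v->changed)
        v->changed(v, v->userdata);
    return 1;
}

/* Example listener that just counts notifications. */
static void demo_count_changes(const demo_stream_volume *v, void *userdata) {
    (void) v;
    ++*(int *) userdata;
}
```

Keeping the application setter and the server update separate avoids
feedback loops where an application reacts to its own change.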

Third Topic: Pause/Resume request mechanism

PulseAudio would like to be able to ask applications to pause/resume
their pipelines in certain situations. Background: audio players should
pause when a VoIP call comes in. All that is missing is defining two
new standard messages on GstBus.

Fourth Topic: Tagging application streams

To present streams nicely to the user in volume control UIs and for
policy decisions PA relies on 'properties' set for client
streams. Some of these properties can be deduced automatically, some
cannot and need to be set explicitly by an application. The most
important property here is the "role", i.e. a property that classifies
a stream as "music", "movie", "phone call" and so on. We probably need
an interface that allows setting this for a sink. How to do this in
detail has not been discussed.

The properties PA knows right now are listed here:

http://0pointer.de/lennart/projects/pulseaudio/doxygen/proplist_8h.html

The properties starting with "application." should probably not be
configured explicitly for GStreamer but deduced from the process
environment or additional GLib facilities. (The properties starting
with "event." and "device." are irrelevant in this context). That leaves
"media." and "window.". name/title/artist can be deduced automatically
from the id3-style tag data. However, as mentioned, media.role needs to
be explicitly set by the application. It would be very valuable to
attach information about the window that a stream belongs to to the
stream. Big question is how to do that cleanly. Given that Gtk doesn't
want to depend on GStreamer and GStreamer not on Gtk this is not easy
to solve... The window issue hasn't been discussed.
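
The split described above can be written down as a small lookup. This
is just an illustration: the enum and function names are made up, and
keys the discussion did not cover (including everything under
"window.") are reported as undecided.

```c
#include <assert.h>
#include <string.h>

/* Where the value of a stream property would come from. */
typedef enum {
    DEMO_PROP_FROM_ENVIRONMENT, /* deduced from process environment/GLib */
    DEMO_PROP_FROM_TAGS,        /* deduced from id3-style tag data */
    DEMO_PROP_FROM_APPLICATION, /* must be set explicitly by the app */
    DEMO_PROP_IRRELEVANT,       /* not relevant for GStreamer streams */
    DEMO_PROP_UNDECIDED         /* not settled in the discussion */
} demo_prop_source;

static int demo_has_prefix(const char *key, const char *prefix) {
    return strncmp(key, prefix, strlen(prefix)) == 0;
}

/* Classify a PA property key according to the split sketched above. */
static demo_prop_source demo_classify_property(const char *key) {
    assert(key != NULL);
    if (demo_has_prefix(key, "application."))
        return DEMO_PROP_FROM_ENVIRONMENT;
    if (demo_has_prefix(key, "event.") || demo_has_prefix(key, "device."))
        return DEMO_PROP_IRRELEVANT;
    if (strcmp(key, "media.role") == 0)
        return DEMO_PROP_FROM_APPLICATION;
    if (strcmp(key, "media.name") == 0 ||
        strcmp(key, "media.title") == 0 ||
        strcmp(key, "media.artist") == 0)
        return DEMO_PROP_FROM_TAGS;
    return DEMO_PROP_UNDECIDED; /* e.g. "window.*" */
}
```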

Fifth Topic: Better format negotiation

Right now GstPulseSink/GstPulseSrc have a static set of
capabilities. This is suboptimal. We need a more elaborate way of
doing negotiation since sometimes (e.g. AC3) the codec knows better
how to upmix/downmix than PA. We discussed that PA should allow
dynamic reconfiguration of the sample spec of a stream. However, I am
not really convinced anymore that this is a good idea. Main reason is
that the sample spec might be relevant for policy decisions
(e.g. choosing a different sound card when 5.1 is requested than when
stereo is requested). I need to think about this quite a bit more.

Sixth Topic: GstPulseSrc Timing issues

Apparently GstPulseSrc's timing is not that reliable. Needs to be
debugged in more detail.

So, that's mostly what we discussed. Now all that's missing is that
someone actually starts coding on this ;-)

Hope I didn't forget anything.

Lennart

-- 
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4