[gst-devel] 0.9 proposals

Fri Dec 3 06:29:01 CET 2004

On Wed, 1 Dec 2004, Wim Taymans wrote:

> > From looking at your code, which is certainly interesting, I made the
> following observations:
>
For anyone that doesn't know this yet:
My code is in an arch repository at
http://www.freedesktop.org/~company/arch/2004 - gstreamer and gst-plugins
Don't expect it to work too well (though it passes make check), it's
mostly an expression of ideas in code.

I've tried to design the scheduling model top-down. As such I have
identified the requirements of an application first, and I came to the
following conclusions. Please tell me if you as an application developer
see this differently. Note that "application" should be interpreted as
"GStreamer element user" in this context; I consider autopluggers
applications, too.
- Applications create elements that they want to work with.
- Working with an element works in three different ways:
  1) initiating operations on the elements (like gst_element_set_state or
     gst_element_query)
  2) getting indications about what happens (like connecting to the
     "found-tag" signal or updating a volume level widget)
  3) waiting for events that require interaction and reacting to those
     events (autopluggers do this when they connect to the "new-pad"
     signal to continue plugging the pipeline, multifilesink's
     new-file signal is an example, too)
- no application expects to get into threading problems while doing any of
  the above operations. They only introduce thread safety because we tell
  them to.
- applications view the pipeline they are using as a single object, and
  no one expects threading problems from a single object.
By looking at this it became obvious to me that the interaction with the
application has to be in the application's thread and fully reentrant. Now
the question is how the core should achieve that. There are two
possibilities:
a) Don't use threads at all. This makes all of GStreamer work in the main
thread by default.
b) Provide a way to easily marshal operations between different threads.
Option b) is certainly the desirable thing here, but it is very
complicated to get the marshalling right, especially once you have more
than 2 threads interacting. Think { filesrc ! demuxer ! { queue !
videostuff } { demuxer. ! queue ! audiostuff } } interacting with the main
thread. Just a little example:
- videostuff finds a tag, emits found-tag (works on: audiothread)
- application writes tag into database (application thread)
- application sets state of pipeline back to NULL to stop it (pipeline
thread)
- pipeline sets its children back to NULL (audio and video thread)
- elements signal their successful state changes (pipeline thread)
By now we've crossed 4 thread boundaries while synchronizing the threads,
reentering into two thread contexts multiple times (audio and pipeline). I
don't want to debug this when it fails. I probably wouldn't understand
what went wrong anyway.
That's why I'm proposing to drop threads for 0.10 to get interactivity
between applications and elements right. Just so I don't need to wrap
every interaction with the pipeline in a g_idle_add in the future.
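
To make option b) concrete, here is a minimal sketch of marshalling a
callback from a streaming thread into the application's thread. It is a toy
model in Python, not GStreamer or GLib code: main_queue, on_found_tag and
streaming_thread are hypothetical names, and the queue only stands in for
what g_idle_add does with a real main loop.

```python
import queue
import threading

# Hypothetical stand-in for a main loop's idle queue (not a GLib API).
main_queue = queue.Queue()
loop_thread = threading.current_thread()  # the thread that runs the toy main loop
handled = []  # (tag, did the handler run in the loop thread?)

def on_found_tag(tag):
    # Application callback. It only ever runs in the loop thread,
    # so the application needs no locking of its own.
    handled.append((tag, threading.current_thread() is loop_thread))

def streaming_thread():
    # The element does not call the application directly. It posts the
    # call to the loop thread's queue instead.
    main_queue.put((on_found_tag, ("title=Example",)))
    main_queue.put(None)  # sentinel: tell the toy main loop to stop

worker = threading.Thread(target=streaming_thread, name="video")
worker.start()

# Toy main loop: dispatch queued calls in the application's thread.
while True:
    item = main_queue.get()
    if item is None:
        break
    func, args = item
    func(*args)

worker.join()
```

This covers a single hop in one direction only. The scenario above needs
such a hop at every thread boundary, in both directions, which is where it
gets hard to reason about.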

> e) saving state as opposed to keeping it on the stack might be awkward
> and suboptimal for demuxer elements.
> f) lots of interaction with and activity in one central place might turn
> it into a bottleneck. Not sure if that is what you experience. Also
> consider the GUI mainloop that might block the gst mainloop, causing it
> to skip or hang. Due to b) the poll might only be checked after some
> delay, when the mainloop gets control again.
>
The problem with this approach is that GStreamer pipelines are dynamic.
This means the application is allowed to do all sorts of things with the
pipeline while the element is not running but has saved its state on the
stack. This includes (but is certainly not limited to)
- moving the element into another bin (and with that a different
scheduler)
- changing the state of the element (possibly multiple times: PLAYING =>
READY => PLAYING)
- unref the element and not care about it anymore.
This leads to a reentrancy problem that plugin writers are not aware of
when writing their plugins. At least I don't know of many plugins
that take into account that gst_pad_push might end up reentering into the
change_state function which cleans up the pad they just pushed on.
Add to this that most other functions require knowledge about the current
state anyway (event handlers, state changes). If that state is saved on a
stack, it is inaccessible to them.
(For anyone who wants an example: Create "fakesrc ! tee ! fakesink tee0.
! fakesink", connect to both fakesinks' handoff signals and, on handoff,
disconnect tee from all sinks and unref its request pads. You'll end up
with at least some invalid memory reads, probably a crash, just because tee
doesn't check after a gst_pad_push that its pads might already be gone. The same
was true for oggdemux, I'm not sure if it still is or if someone has added
special code there.)
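
The tee example can be modelled in a few lines. This is a toy sketch in
Python, not the real tee code: Pad, Tee, push_naive and push_checked are
hypothetical names, and the "alive" flag stands in for memory that C code
would already have freed.

```python
class Pad:
    def __init__(self, name):
        self.name = name
        self.alive = True       # False models "already unreffed"
        self.on_handoff = None  # hypothetical application callback

    def push(self, buf):
        if self.on_handoff is not None:
            self.on_handoff(self, buf)

class Tee:
    def __init__(self, names):
        self.pads = [Pad(n) for n in names]

    def release_all(self):
        # What the application does from inside its handoff callback.
        for pad in self.pads:
            pad.alive = False
        self.pads.clear()

    def push_naive(self, buf):
        # Walks a pad list captured before any callback ran.
        used_after_release = []
        for pad in list(self.pads):
            if not pad.alive:
                # In C this would be an invalid memory read.
                used_after_release.append(pad.name)
            pad.push(buf)
        return used_after_release

    def push_checked(self, buf):
        # Re-checks after every push that the pad is still ours.
        delivered = []
        for pad in list(self.pads):
            if pad not in self.pads:
                break
            pad.push(buf)
            delivered.append(pad.name)
        return delivered

def make_tee():
    tee = Tee(["src0", "src1"])
    for pad in tee.pads:
        # On handoff, the application tears down all of tee's pads.
        pad.on_handoff = lambda p, b, t=tee: t.release_all()
    return tee

naive_result = make_tee().push_naive("buffer")
checked_result = make_tee().push_checked("buffer")
```

The naive variant touches src1 after the callback has released it. The
checked variant stops after src0, but only because its author thought of
the reentrancy in the first place.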
The point I'm trying to make here is this:
People are not used to thinking about reentrancy when they do something that
looks like a normal function call.
They are, however, very used to programming event based callbacks. It's what
g_signal_connect is all about after all.
So I'm expecting significantly fewer bugs by switching to an event based
model as opposed to the current model.
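
As a rough illustration of what I mean by event based, here is a toy sketch
in Python. It is not the code in my repository: Element, on_data_needed and
the events queue are hypothetical names. The point is only that progress
lives on the object rather than on a suspended stack, and that each
callback runs to completion and checks the current state when it starts.

```python
from collections import deque

class Element:
    def __init__(self):
        self.state = "PLAYING"
        self.offset = 0  # progress is stored on the object, not on a stack
        self.log = []

    def on_data_needed(self):
        # Each callback starts by looking at the state as it is now.
        if self.state != "PLAYING":
            self.log.append("skipped")
            return
        self.offset += 1
        self.log.append(f"buffer {self.offset}")

    def set_state(self, new_state):
        # Safe at any point between callbacks: nothing is half-finished.
        self.state = new_state
        self.log.append(f"state {new_state}")

element = Element()
events = deque([
    element.on_data_needed,
    lambda: element.set_state("READY"),
    element.on_data_needed,
    lambda: element.set_state("PLAYING"),
    element.on_data_needed,
])

# Toy main loop: one callback at a time, each runs to completion.
while events:
    events.popleft()()
```

The PLAYING => READY => PLAYING sequence from above is harmless here,
because the state change never lands in the middle of a push.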

Benjamin