[gst-devel] Decoupled elements and schedulers

Tue Mar 30 06:04:02 CEST 2004

Hi Benjamin,

On Mon, 2004-03-29 at 15:26, Benjamin Otte wrote: 
> I tested opt and basic, but not fair (which is a cool scheduler and the
> reason I wrote entry btw).

By the way, should I commit it? It seems ready for more general testing,
now that it runs my DVD player app correctly, and will allow us to test
scheduling-related core changes (and we *will* need a few more changes,
that's for sure).

I think it is really good that you coded a separate scheduler. If we
really want to be able to have replaceable schedulers, we need to have a
few working schedulers to test against.

> I think relying on decoupled elements being registered with a scheduler is
> wrong.
> 
> Anyway, I'll start with the problem:
> The core uses gst_scheduler_pad_link and gst_scheduler_pad_unlink to
> (un)register links with a scheduler. My common sense tells me that each
> link should only be registered and unregistered once. However, due to
> gstbin.c:gst_bin_unset_element_sched a link may be unregistered twice,
> once from gst_bin_unset_element_sched on the decoupled element and once
> from gst_pad_unlink, because gst_pad_get_scheduler tells
> gst_pad_unlink they both still belong to the same scheduler.
> If that was confusing: it's a bit hard to explain, you can see it by
> yourself by removing my patch and using entrygthread and gdb on this
> pipeline:
> fakesrc ! identity ! { queue ! identity ! fakesink }

I know the problem. Actually, I was bitten by it the first time I tested
the fair scheduler with a pipeline containing a queue. I put a check in
the pad_unlink function just to work around that, but didn't really look
into the cause.

> Now the reason why I don't think it's correct to allow a scheduler to be
> added to a decoupled element:
> Imagine the following pipelines:
> { fakesrc ! identity } ! { queue ! identity ! fakesink }
> { fakesrc ! identity ! queue } ! { identity ! fakesink }

I tested both cases. As far as I could see, GstBin puts the queue in the
scheduler associated with the corresponding container, which seems
reasonable to me.

> Imagine too, that two different schedulers are used for the two threads,
> scheduler X for the first thread, your scheduler for the second thread.

Ugh. I was wondering if that was possible, but looking at the existing
interfaces (or lack thereof), I decided to code under the assumption that
such a thing was not allowed. Probably I'm wrong, but I'd swear that the
older schedulers would break miserably if you tried that.

But anyway, the point is whether we want to have that option, because,
if we do, we'll need to define exactly how schedulers are supposed to
interact.

> Now the only difference between those two pipelines from a scheduler's
> point of view is who controls the queue element. Since schedulers must
> also work when not controlling the decoupled element, it seemed like a
> valid idea to me to assume that schedulers may never control a decoupled
> element.

Well, no. I would say the pipeline must work regardless of which
scheduler controls the decoupled element. In practice, this requires
that we have a good definition of what it means to control an element, so
that the results are just the same regardless of which scheduler does
the actual job.

> The reason I removed adding decoupled elements to different schedulers was
> now a combination of both: Since it's not ok to rely on controlling
> decoupled elements anyway it shouldn't break anything and the bug above
> was most easily solved by this.

Although I understand your reasons (and I know I could work around the
problem in fair if necessary), I think that if we want a really generic
and abstract scheduler definition that we can implement effectively in
many different ways, we need to have a way to tell the schedulers of the
existence of decoupled elements in the pipeline. The tricky part, as I
say, is defining the expected scheduler behavior in such a way that
things happen like we want them to happen, no matter which scheduler(s)
is (are) used.

So, let me try my hand at it. This is sort of a primitive spec. Comments
are welcome, of course:

--
A scheduler is responsible for managing a number of elements and links,
where links are defined as a source pad and a sink pad that are linked
together. All links between elements managed by a scheduler must be
managed by the same scheduler. It is possible for a scheduler to manage
a link where only one of the end elements is managed by the scheduler.
In that case, the other end element must be decoupled.

When a scheduler manages a link, it owns the sched_private pointers in
both pads of that link.

One important clarification is that elements aren't actually scheduled. What
is scheduled are certain functions provided by an element, namely, the
loop, chain and get functions. Let's call them the "schedulable"
functions of an element [if you have a better word for this, please let
me know ;-)]

A scheduler has two main jobs. The first one is to make sure that data
pushed on one end of a managed link can be later pulled from the other
end. The second one is to run the code in the schedulable functions in
such a way that data keeps flowing in the pipeline. A scheduler can pick
whatever run order it sees fit, as long as it guarantees that any
function that is eligible for scheduling will be eventually scheduled.

Scheduling a function means transferring control to the code of the
function. A scheduler is free to suspend the execution of a schedulable
function at certain points (see below) before it has returned, and to
resume the execution at some later arbitrary point in time.

The rules to decide if a function is eligible for scheduling are as
follows:

R1: Control can be transferred only to functions belonging to elements
that are in the PLAYING state.
R2: Loop functions are eligible at any time, provided you do not break
R1.
R3: Chain functions can only be scheduled if the scheduler can provide
them with an input block.
R4: Get functions can only be scheduled if the scheduler can provide
at least temporary space to store the returned data.
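
As a rough illustration, R1 to R4 could be written as a single predicate
in C. This is only a sketch: the SchedFunc type, its fields and
sched_func_is_eligible are made-up names for this example, not existing
GStreamer API, and a real scheduler would derive the three flags from
its own bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not GStreamer API. */
typedef enum { FUNC_LOOP, FUNC_CHAIN, FUNC_GET } FuncKind;

typedef struct {
    FuncKind kind;
    bool element_playing;  /* owning element is in the PLAYING state */
    bool input_available;  /* scheduler can hand over an input block */
    bool output_space;     /* scheduler can store the returned data */
} SchedFunc;

/* Returns true if the function may be given control under R1-R4. */
static bool sched_func_is_eligible(const SchedFunc *f)
{
    assert(f != NULL);

    if (!f->element_playing)
        return false;                   /* R1 */

    switch (f->kind) {
    case FUNC_LOOP:
        return true;                    /* R2 */
    case FUNC_CHAIN:
        return f->input_available;      /* R3 */
    case FUNC_GET:
        return f->output_space;         /* R4 */
    }
    return false;                       /* unknown kind: never schedule */
}
```

Note that R1 is checked first, so a loop function in a non-PLAYING
element is never eligible.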

The execution of a function can be suspended [at least?] in the
following cases:

R5: If the function explicitly performs a yield operation, the scheduler
is free to suspend its execution and give control to other functions.
R6: If the function performs a push, pull, select or clock wait
operation, the scheduler is free to suspend its execution and give
control to other functions.

Suspending in push, pull and select operations is often necessary, since
the scheduler may need to schedule other functions in order to be able
to fulfill a function's request. Suspending in case of a wait allows
scheduling other code until the wait is completed.
--

This spec of course doesn't imply that you have to register decoupled
elements. But it seems to me that, in the general case, a scheduler may
need to know when a decoupled element changes states. Actually, I'd
rather say both involved schedulers must be informed, but I don't really
know how to do that. What I'm doing in the fair scheduler is managing
everything from the side where the decoupled element is registered. This
assumes, however, that both schedulers are of the same type. We need a
more general solution, anyway.

On the other hand, we can fix the link registering behavior by
registering (and unregistering) links based on the rules stated before.
That means, links between two elements in a scheduler are registered
with that scheduler. Links between a normal and a decoupled element are
registered with the non-decoupled element's scheduler. Links between two
decoupled elements should not be allowed.
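
To make the registration rule concrete, here is a minimal C sketch of
how the core could pick the scheduler for a link. Sched, Elem and
link_owner are hypothetical stand-ins invented for this example, not
the real GstScheduler or GstElement types. Returning NULL for two
normal elements in different schedulers is my reading of the spec above
(such a link needs a decoupled element on one end), not something the
current core does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real GStreamer types. */
typedef struct { const char *name; } Sched;

typedef struct {
    bool decoupled;  /* e.g. a queue */
    Sched *sched;    /* scheduler the element is placed in */
} Elem;

/* Returns the scheduler that should register (and later unregister)
 * the link src -> sink, or NULL if the link is not allowed. */
static Sched *link_owner(const Elem *src, const Elem *sink)
{
    assert(src != NULL && sink != NULL);

    if (src->decoupled && sink->decoupled)
        return NULL;          /* decoupled to decoupled: not allowed */
    if (src->decoupled)
        return sink->sched;   /* the non-decoupled end decides */
    if (sink->decoupled)
        return src->sched;

    /* Two normal elements: both must be in the same scheduler. */
    return (src->sched == sink->sched) ? src->sched : NULL;
}
```

Since the owner depends only on the two end elements, the same function
can be used at unlink time, which means each link is unregistered
exactly once and from the scheduler that registered it.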

The one large issue that still remains is clean up. If you abort the
execution of schedulable functions as the fair scheduler sometimes
does, you could end up leaking memory, namely, blocks of dynamically
allocated memory that are only pointed to from the aborted cothread's
stack. I don't have a general solution for that either. One option I've
thought of is letting the functions run until they finish, but under
special conditions. These would be something like all pull operations
return a "synthetic" EOS event, and all push operations discard the data
silently. As far as I know, this would bring most schedulable functions
to a clean termination, but I may be wrong.
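
A minimal C sketch of that "run until they finish" idea, assuming a
single-slot link with a draining flag. Everything here (Item, Link,
link_push, link_pull) is invented for illustration and is not how the
fair scheduler or the core currently work. Data already queued on the
link when draining starts is left alone in this sketch; a real
implementation would have to release it as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
typedef enum { ITEM_BUFFER, ITEM_EOS } ItemKind;
typedef struct { ItemKind kind; } Item;

typedef struct {
    bool draining;  /* set when functions should wind down */
    Item *pending;  /* single-slot link, NULL when empty */
} Link;

static Item *item_new(ItemKind kind)
{
    Item *it = malloc(sizeof *it);
    if (it != NULL)
        it->kind = kind;
    return it;
}

static void item_free(Item *it)
{
    free(it);
}

/* Push takes ownership of the item. While draining, the item is
 * discarded silently so the caller can run to completion. */
static void link_push(Link *l, Item *it)
{
    if (l->draining) {
        item_free(it);
        return;
    }
    /* A real scheduler would suspend the caller if the slot is full. */
    assert(l->pending == NULL);
    l->pending = it;
}

/* Pull hands ownership to the caller. While draining, every pull
 * returns a freshly allocated synthetic EOS. */
static Item *link_pull(Link *l)
{
    if (l->draining)
        return item_new(ITEM_EOS);

    /* A real scheduler would suspend the caller if the slot is empty. */
    Item *it = l->pending;
    l->pending = NULL;
    return it;
}
```

The point is that nothing is aborted: each function frees whatever it
holds on its own stack through its normal EOS path.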

Enjoy,

M. S.
-- 
Martin Soto <soto at informatik.uni-kl.de>
Universität Kaiserslautern