outside-dispatch-lock handlers

Thu Mar 2 08:24:18 PST 2006

Hi,

I think we're getting there ;-)

On Thu, 2006-03-02 at 10:57 +0100, Thiago Macieira wrote:
> IIUC, this means "process the handler queue without a dispatch lock". Is 
> that so? That sounds great for me. :-)
> 

It amounts to splitting the dispatch lock in half:
 - a non-recursive lock while we pop the message and decide 
   which handlers to run ("will I handle?" operation)
 - a recursive lock while we run the handlers
   ("actually handle" operation)

This is then combined with a detailed definition of expected recursive
semantics, so the recursive lock has well-understood behavior.
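
A rough sketch of that split, as a toy Python model rather than libdbus
code. The class, the lock names, and the (will_handle, actually_handle)
pairs are all made up for illustration; the only point is that the
first lock is released before any handler runs, so a handler can call
dispatch() again without deadlocking.

```python
import threading
from collections import deque


class Connection:
    """Toy model of a connection; none of these names are libdbus API."""

    def __init__(self):
        self.incoming = deque()       # messages read from the socket
        self.handler_queue = deque()  # (actually_handle, message) pairs
        self.handlers = {}            # message -> [(will_handle, actually_handle)]
        self._pop_lock = threading.Lock()   # non-recursive half
        self._run_lock = threading.RLock()  # recursive half

    def dispatch(self):
        # Half 1: pop one message and decide which handlers will run.
        with self._pop_lock:
            if self.incoming:
                msg = self.incoming.popleft()
                for will_handle, actually_handle in self.handlers.get(msg, []):
                    if will_handle(msg):
                        self.handler_queue.append((actually_handle, msg))
        # Half 2: run queued handlers; the same thread may re-enter here.
        with self._run_lock:
            while self.handler_queue:
                actually_handle, msg = self.handler_queue.popleft()
                actually_handle(self, msg)


log = []


def always(msg):
    return True


def make(n):
    def handler(conn, msg):
        log.append(n)
    return handler


def handler_2(conn, msg):
    log.append("2 enter")
    conn.dispatch()  # recursive dispatch from inside a handler
    log.append("2 leave")


conn = Connection()
conn.handlers["A"] = [(always, make(1)), (always, handler_2), (always, make(3))]
conn.handlers["B"] = [(always, make(4)), (always, make(5)), (always, make(6))]
conn.incoming.extend(["A", "B"])
conn.dispatch()
print(log)
```

In this model the recursive call first drains handler 3 and then
handlers 4, 5, 6 before handler 2 returns.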

> >OK so say we have signals A, B, C, with handlers:
> >
> >A: 1, 2, 3
> >B: 4, 5, 6
> >C: 7, 8, 9
> >
> >while dispatching A, inside handler 2 we dispatch again twice; we'd
> >first run handler 3, and then queue and run 4, 5, 6.
> 
> I'd rather 3 weren't run until 2 finished running.

That's what happens now, but it seems a bit worrisome to me because it
means 2 can "time warp" 3, instead of only itself.

So for example if we have A and B as two signals:
 A: FlagChanged new value = true
 B: FlagChanged new value = false

say that handlers 3 and 6 are separate invocations of the same function,
which stores the latest value. We'd like to run them in order so we've
stored the correct latest value.

Handler 2, as the one that recurses, could know to use the value
*before* it recurses and avoid doing anything order-sensitive after
recursing, but handler 3 can't know that it's been "time warped" - it
would have no way to know that it should keep the value saved by handler
6.
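
To make the FlagChanged case concrete, here is a small simulation, under
the assumption that handler 2 recurses and that handlers 3 and 6 are the
same store-the-value function. The policy names "defer" (3 waits until 2
returns, as the code does now) and "resume" (3 runs in its normal order
inside the recursive dispatch) are mine, and nothing here is real
libdbus code.

```python
from collections import deque


def run(policy):
    stored = {}
    incoming = deque([("A", True), ("B", False)])  # two FlagChanged signals
    pending = deque()  # shared handler queue, used only by "resume"

    def store_flag(value):  # plays the role of handlers 3 and 6
        stored["flag"] = value

    def recurse(value):  # plays the role of handler 2
        dispatch()

    handlers = {"A": [recurse, store_flag], "B": [store_flag]}

    def dispatch():
        if policy == "defer":
            # Each dispatch runs all handlers for its own message itself.
            if incoming:
                name, value = incoming.popleft()
                for handler in handlers[name]:
                    handler(value)
        else:
            # Handlers go on one queue; any dispatch call drains it in order.
            if incoming:
                name, value = incoming.popleft()
                pending.extend((h, value) for h in handlers[name])
            while pending:
                handler, value = pending.popleft()
                handler(value)

    dispatch()
    return stored["flag"]


print(run("defer"), run("resume"))
```

With "defer" the stored value ends up True, which is stale because B
(false) was the latest signal; with "resume" it ends up False.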

I can't think of a case where running handler 3 in its normal order
would break - remember we're talking "actually handle" here and we've
changed the "actually handle" operation to never return "stop
processing", so we don't need to wait for 2 to return to know whether to
run 3.

> >If we did that, we could run the "will I handle this?" half when we
> >build the handler queue, and the "actually handle this" half could be
> >run when we process the handler queue. This means that in the above
> >example, we would know whether 3 should be run or not, before 2 has
> >returned.
> 
> "we would know whether the 'actually handle 3' would be run 
> before 'actually handle 2' has returned" <-- is that what you meant?

Right, since whether to run "actually handle 3" is based on "will I
handle? 2", not on "actually handle 2".

> >One way to think of this is that we're keeping all the state of
> >dbus_connection_dispatch() in the DBusConnection instead of local
> >variables. Whenever you call dbus_connection_dispatch(), it just
> >"resumes" where it was last.
> 
> Hmm... not sure I agree. I've got to think a bit more about the 
> consequences, but I'd rather dispatching were "atomic". I.e., either 
> dispatch() does it fully or it does not. And recursively calling 
> dispatch() won't resume an earlier dispatch -- it can only start a new 
> one.

That's more how the code works today. Atomic has two angles I guess:
 - the OOM atomic; this is only possible if we add a "rollback" 
   operation to application handlers, which I think is too much 
   to ask. Right now the code is buggy and will re-invoke the app
   handlers on OOM; I think it should only re-invoke the handler
   that returned OOM, which means the dispatch fails "halfway",
   not atomically
 - whether dispatching each message has to stay on the same "level"
   as it started, i.e. the above discussion about whether the 
   recursive dispatch keeps going on the current message or not

> >When the 
> >recursive dispatch returns, then handler 2 would be running _after_
> >handler 9. However, a handler does not jump _other_ handlers out of
> >order - handler 3 still runs before 4.
> 
> I wouldn't say "after 9", but "handler 2 is running _around_ handlers 
> 4-9". I still think handler 3 should be run after 2 has finished, since 
> it's processing the same message.

Maybe this differs for signals and method calls?

> >Say you are crazy and now you start calling dispatch() from several
> >threads at once... first theory is "don't do that then" but say we
> >wanted to invent something 'sane' to happen, my best suggestion is:
> > - we hold a recursive lock when invoking a handler
> > - that handler can recurse normally, other threads wait
> >this preserves ordering of handlers and still allows recursive dispatch.
> 
> Agreed. Should this be done in libdbus or in the binding? If in libdbus, 
> we need to add a new lock and add recursive mutex behaviour.

I think it's better in libdbus; then bindings can interoperate if they
want, plus this is kind of tricky and binding authors shouldn't have to
figure it out every time.

> But going back to your A, B, C & 1-9 example, if thread α called 
> dispatch() to handle message A and, before it finished, thread β called 
> dispatch() too, why not let it handle message B?

I _think_ if this is allowed then we can't preserve ordering, because
there's no lock around invoking the handlers for A and invoking the
handlers for B. So even if there's a lock around popping A and B (so the
threads _get_ the messages in the right order), there's no lock to make
the threads run handlers in the right order.

  α gets A
  β gets B
    [ enter free-for-all zone ]
  α runs one handler for A
  β runs a couple handlers for B
  α runs some more handlers for A
  β runs a couple handlers too
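
The same interleaving as a deterministic toy, using Python generators to
stand in for the two threads. The hand-written schedule is just one
possible ordering the OS could pick, and all names are illustrative:

```python
from collections import deque

incoming = deque(["A", "B"])
handlers = {"A": [1, 2, 3], "B": [4, 5, 6]}
log = []


def dispatcher(thread_name):
    # The pop is serialized, so the first thread to run gets A.
    msg = incoming.popleft()
    for handler in handlers[msg]:
        log.append((thread_name, handler))
        yield  # nothing stops the other thread from running here


alpha = dispatcher("alpha")
beta = dispatcher("beta")

# One possible schedule once both threads are in the free-for-all zone.
for thread in [alpha, beta, beta, alpha, alpha, beta]:
    next(thread, None)

print([handler for _, handler in log])
```

The messages are popped in order, but handlers 4 and 5 run before 2 and
3, so per-message ordering is lost.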

I think the only way to make this sane would be to bind specific kinds
of message (say the PropertyChanged signal or something) to specific
threads, so the same kind of message would remain correctly serialized.
But that sort of high-level thread model would have to be in the
binding; the binding would just install a handler that forwarded
messages to other threads, but we'd still only have one dispatch thread
at the dbus level.
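
A minimal sketch of what such a binding-level model might look like,
assuming one worker thread per kind of message. KindRouter, forward()
and the "OtherSignal" kind are hypothetical names, not part of any
existing binding:

```python
import queue
import threading


class KindRouter:
    """Forwards each kind of message to its own worker thread."""

    def __init__(self, kinds):
        self.queues = {kind: queue.Queue() for kind in kinds}
        self.seen = {kind: [] for kind in kinds}
        self.threads = [
            threading.Thread(target=self._worker, args=(kind,))
            for kind in kinds
        ]
        for thread in self.threads:
            thread.start()

    def _worker(self, kind):
        q = self.queues[kind]
        while True:
            msg = q.get()
            if msg is None:  # sentinel: shut this worker down
                break
            self.seen[kind].append(msg)

    def forward(self, kind, msg):
        # What the handler installed by the binding would do; it is only
        # ever called from the single dispatch thread.
        self.queues[kind].put(msg)

    def close(self):
        for q in self.queues.values():
            q.put(None)
        for thread in self.threads:
            thread.join()


router = KindRouter(["PropertyChanged", "OtherSignal"])
for i in range(5):
    router.forward("PropertyChanged", i)
    router.forward("OtherSignal", i * 10)
router.close()
print(router.seen)
```

Each kind has one FIFO queue and one consumer, so messages of the same
kind stay in order even though different kinds run concurrently.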

dbus is inherently a serialized thing, since there's just one socket, so
keeping it serialized all the way up to the binding seems fine to me.

> Is there any possibility that a dispatch() running at the same time as 
> send_with_reply_and_block steals the message the block is waiting for?

At the top of dispatch(), if the next message is a reply we see if
anyone is blocking for it and if so we send it over to them. So
(assuming this code works!) this case should be fine.
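
Roughly the check I mean, as a toy model and not the actual libdbus
implementation. The "blocked" table, expect_reply() and the dict-shaped
messages are assumptions made up for the example:

```python
from collections import deque


class Connection:
    def __init__(self):
        self.incoming = deque()
        self.blocked = {}     # reply serial -> slot a blocking caller waits on
        self.dispatched = []  # messages that went through normal dispatch

    def expect_reply(self, serial):
        # Called by a (hypothetical) blocking send before it starts waiting.
        slot = []
        self.blocked[serial] = slot
        return slot

    def dispatch(self):
        if not self.incoming:
            return
        msg = self.incoming.popleft()
        serial = msg.get("reply_serial")
        if serial is not None and serial in self.blocked:
            # Someone is blocking for this reply: hand it over, don't dispatch.
            self.blocked.pop(serial).append(msg)
            return
        self.dispatched.append(msg)


conn = Connection()
slot = conn.expect_reply(7)
reply = {"reply_serial": 7, "body": "pong"}
signal = {"member": "FlagChanged"}
conn.incoming.extend([reply, signal])
conn.dispatch()
conn.dispatch()
print(slot, conn.dispatched)
```

The reply lands in the blocker's slot and only the signal goes through
the normal handlers.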

OK!

I think the only unresolved issue then is whether recursion "jumps" all
remaining handlers for a message ahead in the message queue, or only
jumps the single handler that does the recursing.

Havoc