outside-dispatch-lock handlers

Thiago Macieira thiago.macieira at trolltech.com
Wed Mar 1 02:14:10 PST 2006


Havoc Pennington wrote:
>Well, in both cases it means "stop processing the message, don't run
>more handlers" - signals and methods aren't treated differently in
>dbus_connection_dispatch() (except for the default handler for method
>calls that sends UNKNOWN_METHOD)

Ah, ok. That makes sense, in that light...

>Right now the object path handlers and the filters return the same
>DBusHandlerResult enum; I take your point that the enum's naming is more
>appropriate for the object path handlers, and for filters you might
>expect something more like:
> enum DBusFilterResult {
>   DBUS_FILTER_RESULT_REMOVE_MESSAGE,
>   DBUS_FILTER_RESULT_RUN_MORE_HANDLERS
> };

Please call them "stop processing" and "continue processing" :-)

>And in fact after typing that I went back in cvs history and we used to
>have:
>typedef enum
>{
>  DBUS_HANDLER_RESULT_REMOVE_MESSAGE, /**< Remove this message, no
>                                       further processing. */
>  DBUS_HANDLER_RESULT_ALLOW_MORE_HANDLERS /**< Run any additional
>                                            handlers that are
>                                        interested in this message. */
>} DBusHandlerResult;
>
>
>I think I probably changed it when adding the object tree stuff so
>things seemed more method-call-oriented than filter-oriented, but I
>dunno.

Hmm... that explains why the code from Harald never used the object tree 
stuff. It was probably written back when filters were the only option.

>Well, the dispatch lock protects processing the message at the head of
>the incoming message queue. The functions that try to touch this are:
>
> - dbus_connection_pop_message()
>   - takes the lock, removes the message for good, drops lock
> - dbus_connection_borrow_message()
>   - takes and holds the lock, returns message temporarily with
>     lock held
> - dbus_connection_return_message()
>   - puts back borrowed message and unlocks
> - dbus_connection_steal_borrowed_message()
>   - converts a borrow into a pop, and unlocks
> - dbus_connection_dispatch()
>   - effectively this does a borrow_message(), then
>     runs the various handlers. If any handler returns
>     NEED_MEMORY it returns the message and unlocks,
>     if the handlers succeed then it steals the borrowed
>     message permanently and unrefs it
>     (I was thinking, it should probably be implemented in
>     terms of the borrow/return code which would move all
>     the locking logic to one place...)
>
>I _think_ that if we didn't have out-of-memory handling,
>dbus_connection_dispatch() could simply pop_message() at the top and not
>hold the dispatch lock while running handlers. The bus daemon does need
>OOM handling though.

That would be my suggestion.

The way things are, pop_message is probably the best option for me right 
now, since we're in "uncooperative mode" with respect to other bindings. 
It would solve the deadlocking and reentrancy problems for me.

As for OOM, we've come to the conclusion that checking for out-of-memory 
conditions is only useful when large amounts of memory may be requested. 
If you're doing small allocations, running out of memory means you're 
already in very serious trouble -- not to mention that the system is 
probably swapping like mad and the user has long since pressed the Reset 
button.

I'm not sure whether that applies here, though. Aside from handlers 
returning out-of-memory conditions, how else can we run out of memory? 
Are we trying to duplicate the message in memory (which could be a large 
chunk of it)?

>There may be an ordering problem also... i.e. right now we're guaranteed
>to run all handlers for message 1, then all for message 2, etc. If we
>didn't keep a lock over the handlers, we would expect to have no
>guarantees about this. That would be pretty disastrous for a threaded
>app I suppose, since ordering of signals is normally significant.
>You don't want to get the "toggled, new value = true" signal and the
>"toggled, new value = false" signal in the wrong order.

Hmm... yes. Big problem. And you don't even need two threads to run into 
it.

Let's explore this further:

In a single-threaded app, if you don't lock, you run all the handlers for 
message 1. Any of them can call into user code, which can in turn recurse. 
This isn't a problem for method calls, which have one and only one 
handler: once it has called into user code, no further processing will be 
done.

However, a signal can be delivered to user code many times, through many 
handlers. If any of the user functions recurses, we may end up processing 
message 2 before all handlers have had the chance to see message 1. That 
in turn means the handlers could see the messages in this order: 1, 2, 2, 1.

If you have multithreading, only one thread can pop and process the 
message (though many can see it). If the application has the concept of 
"owner threads" (a call or a signal has to be delivered to a specific 
thread), the binding or the application must transfer the message to the 
correct thread after processing it, be it through a handler or by 
borrowing the message and analysing it.

(However, given that you can only peek at the first message in the queue, 
I would advise against any binding processing messages in multiple 
threads by waiting for the correct thread to process the next message. A 
handler should process the message and deliver it to the correct thread 
instead.)

So, if there are two messages in queue and thread A is processing message 
1, thread B can start processing message 2. So far, so good.

If the application has the concept of "owner threads", then signals have 
to be delivered to the right thread by the binding or the application, as 
I said above. This means thread A will always see message 1 first, then 
message 2 (which was delivered through other mechanisms). And thread B 
will always see message 2 first, then message 1.

If the application has no such concept, then any message can be delivered 
to any thread at any time, by definition.

If this is unacceptable, then I advise against processing messages in 
multiple threads: do it in one thread only, to guarantee proper ordering.

Which brings us back to the problem of same-thread recursion. If the user 
code recurses and wants to process more messages, there are only two 
possible outcomes: either we find the specific message that the user 
wants to process among the many messages we're receiving, or we process 
them all.

I chose solution #2 for the Qt bindings for the moment. DCOP's solution is 
#1.

> - instead of having to run or not run all handlers in one call to
>   dispatch, maybe we should convert each message into a queue of
>   handlers which is persistent across dispatch calls ... does it help?
>   maybe fixes the "multiple running of one handler on OOM" at least

Sorry, I didn't follow...

> - does it help to add some public API like the check_for_reply()
>   in dbus-connection.c
>   (essentially add a way to implement send_with_reply_and_block()
>    yourself, only blocking in the main loop instead of on the dbus
>    socket - right now you can't "look ahead" in the message queue
>    with the public API, so you're forced to either dispatch or block
>    in order to make a method call)

Not to me, sorry.

>I feel like the usage of dispatch() with threads is maybe too undefined
>- I doubt anyone would just call it from a bunch of threads at once with
>no locking scheme of their own, right?

Just as I explored above, I don't think this is a good idea at all! If 
your binding is the only one, it's OK to call it from multiple threads: 
the problems will be yours alone.

In an environment that makes extensive use of threads, like Java, I'd 
actually expect any external call or signal to be delivered to a thread 
of its own, separate from the GUI thread, event thread or main thread. I 
don't know how Matthew did it, though.

If you want to be cooperative towards other bindings, you shouldn't call 
dispatch yourself; you should let whoever owns the event loop do it.

>A more plausible scenario to me is that there's a main loop thread that
>always calls dispatch(), and then some other threads that do
>send_with_reply_and_block() so they need to watch the message queue for
>their reply, but not dispatch the whole queue.

That sounds like a plausible scenario. Does 
dbus_connection_send_with_reply_and_block() also block other threads 
while waiting for its reply? I'm not that familiar with the code, but I 
think it uses a wait condition and waits for a wake-all event.

>So maybe the right solution is to declare the right rules about how
>pop() and dispatch() are used with threads and what sort of locking you
>have to do yourself, and then we can simplify things on the libdbus
>level. If for example dispatch() were always called from the same thread
>we can probably use the handler queue concept to be robust against
>same-thread recursion and OOM, and just not have the dispatch lock.
>Modifying the handler queue would require the connection lock, but we'd
>drop it while actually invoking each handler.

To the other binding authors here: do you call dispatch from many threads? 
If so, under which conditions can a thread call dispatch?

Given my exploration above, I'd be surprised if any of you did, though of 
course I'm rather biased by my own concepts.

-- 
Thiago José Macieira - thiago.macieira AT trolltech.com
Trolltech AS - Sandakerveien 116, NO-0402 Oslo, Norway
