Starting the kdbus discussions

Lennart Poettering mzqohf at 0pointer.de
Mon Jan 20 04:56:55 PST 2014


On Mon, 20.01.14 12:38, Simon McVittie (simon.mcvittie at collabora.co.uk) wrote:

> 
> On 17/01/14 20:06, Lennart Poettering wrote:
> > A thread which wants to do a synchronous message call would simply take
> > the lock, write the message or as much as it can of that, then unlock,
> > and poll() on the socket, and when it is writable/readable again, write
> > the rest, of course only after retaking the lock. It does this as long
> > as the message is not fully written or the reply not fully received. As
> > many threads as you like can do this in parallel. This would mean that
> > a message one thread incompletely writes could be finished by another
> > thread and so on. But that's totally OK.
> 
> In kdbus, maybe that's OK, but in the D-Bus-over-SOCK_STREAM currently
> described in the D-Bus Specification, it certainly isn't. Once you have
> started writing a message into the Unix or TCP stream, you must finish
> that message before starting the next. There is no framing/packetization
> beyond "each message has a byte-count near the beginning".

Yes, of course. But that's what I meant by the bit about another
thread finishing the writing of your messages... To make this work it is
essential to store a reference to every message to be written in the bus
object, until it is completely written. And before a thread can start
writing its own messages it needs to complete messages queued earlier
that have only been written incompletely so far, or not at all.
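
To make the scheme concrete, here is a rough sketch of how I read it, in
Python rather than C for brevity. Everything in it is hypothetical and
only for illustration: the Bus class, its field names and the "ticket"
counter are assumptions of the sketch and not libsystemd-bus API. The bus
object keeps a reference to each unwritten message, a thread always
finishes older queued messages before touching its own, and the lock is
never held while polling.

```python
import select
import threading
from collections import deque


class Bus:
    """Hypothetical bus object; names are illustrative only."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.lock = threading.Lock()
        self.wqueue = deque()   # references to messages not yet fully written
        self.windex = 0         # bytes of wqueue[0] already on the wire
        self.enqueued = 0       # total number of messages ever queued
        self.written = 0        # total number of messages fully written

    def _enqueue(self, msg):
        with self.lock:
            self.wqueue.append(msg)
            self.enqueued += 1
            return self.enqueued    # ticket: position in the write order

    def _flush_some(self, ticket):
        """Write queued bytes in order until the socket would block.

        Returns True once the message holding this ticket is fully written.
        """
        with self.lock:
            while self.wqueue and self.written < ticket:
                head = self.wqueue[0]
                try:
                    n = self.sock.send(memoryview(head)[self.windex:])
                except BlockingIOError:
                    break           # socket is full: unlock, then poll
                self.windex += n
                if self.windex == len(head):
                    # Head message complete, possibly one that another
                    # thread queued and only partially wrote.
                    self.wqueue.popleft()
                    self.windex = 0
                    self.written += 1
            return self.written >= ticket

    def send_blocking(self, msg):
        ticket = self._enqueue(msg)
        while not self._flush_some(ticket):
            # Wait for writability without holding the lock.
            select.select([], [self.sock], [])
```

Because only the head of the queue is ever written, a partially written
message is always completed before the next one starts, no matter which
thread happens to hold the lock at that point.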

> > Note that with libsystemd-bus we explicitly are not thread-safe, though
> > threads-aware. In contrast to gdbus and libdbus1 we don't want to play
> > locking games, so we shifted the focus from
> > one-shared-connection-per-process to
> > one-shared-connection-per-thread. We believe this suits the global
> > ordering model of dbus better, and makes our code a lot simpler.
> 
> I'm sure it makes your code a lot simpler, and I often wish libdbus
> could do the same.
> 
> However, please note that the global ordering model does not guarantee
> anything about ordering between connections that happen to share a
> process. When there was a bug in the well-known (session, starter,
> system) connection sharing in single-threaded dbus-glib, we saw it as
> real (and hard-to-diagnose!) bugs in Telepathy - message orderings that
> "clearly can't happen", resulting in, for instance, more than one
> call-UI window appearing for the same VoIP call. That doesn't directly
> apply to "one connection per thread", because that environment doesn't
> have ordering guarantees without external synchronization anyway, but
> encouraging multiple connections to the same well-known bus is still
> something I'm wary of.

Well, but I'd claim that as soon as you start working with threads, you
must be aware of asynchronicity issues. This applies to dispatching
messages just as it does to accessing any kind of shared data
structure. If you start processing messages in parallel in multiple
threads, then you have to think about ordering constraints.

I'd argue having distinct connections for each thread actually improves
the logic here, since the dispatch ordering requirements you need to
deal with are then the same for each individual thread as they would
otherwise be for the entire process, thus making things much simpler.

> I'm not saying that libsystemd-bus doesn't have a valid choice of
> trade-offs, but it is not a choice that is compatible with GDBus' and
> libdbus' existing API guarantees. You've chosen one of the possible
> routes, GDBus chose another.

Well, I doubt this is true, really. I mean, does gdbus actually stall
execution of the gdbus dispatch loop globally as long as any message is
in the process of being dispatched on the same connection? If it
doesn't, and messages get processed in parallel, then the ordering
thing is lost anyway...

Note that I am actually a big believer in the global ordering feature of
dbus. Actually, we are emphasizing this even further with kdbus, since
as part of the timestamp metadata of each message there's now a global
monotonic sequence number that is assigned by the kernel, and maintained
for the entire system (or actually: per-OS-container), which even allows
reconstructing the global ordering between messages sent over different
buses! [Background: we added this because we thought it was the other
side of the coin when adding prioq support to kdbus, i.e. a scheme
which breaks with the strict chronological ordering in some cases,
depending on a msg priority value. We can only allow prioq logic if
there's always a way to reconstruct the original ordering of things. The
prioq stuff is now available in kdbus too btw, we added that in the last
few days]
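
As a toy illustration of that idea (a sketch in Python, not kdbus code):
the counter below merely stands in for the kernel-assigned sequence
number, and the send(), receive_all() and original_order() helpers, the
tuple layout, and the convention that a lower priority value is dequeued
first are all assumptions made for the example.

```python
import heapq
import itertools

# Stands in for the kernel-assigned, system-wide monotonic sequence number.
_seq = itertools.count(1)


def send(queue, bus, priority, payload):
    """Stamp the message with the next sequence number and queue it."""
    seqnum = next(_seq)
    # Assumption of this sketch: lower priority value is delivered first.
    heapq.heappush(queue, (priority, seqnum, bus, payload))


def receive_all(queue):
    """Dequeue in priority order, which may differ from send order."""
    out = []
    while queue:
        _priority, seqnum, bus, payload = heapq.heappop(queue)
        out.append((seqnum, bus, payload))
    return out


def original_order(messages):
    """Reconstruct the global send order, even across different buses."""
    return sorted(messages, key=lambda m: m[0])
```

If "a" is sent on one bus with priority 5, then "b" on another bus with
priority 0, then "c" on the first with priority 5, receive_all() yields
b, a, c, while original_order() on the same list gives back a, b, c.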

Lennart

-- 
Lennart Poettering, Red Hat

