DBus Extensions

Sat Aug 11 08:55:25 PDT 2007

Hi

Havoc raised a few big questions in his last email.  I'm not really sure
how I feel yet about the union vs. intersection questions.  These are my
answers to the rest of them.  Probably not everyone will agree with me,
but this is what I think:

..

I believe that there are three things we need to consider when making
changes:

 * wire format changes

 * "application"-visibility of changes (ie: normal enduser apps)

 * sanity

..

Sanity

From the first two points I will make a list of what I think must be
satisfied for a change to be considered "OK".  This doesn't mean that
any change that satisfies the criteria is something that we should do --
it merely means that it doesn't violate those rules.  We should only
actually do it if it is the sane and reasonable thing to do :)

..

Application-visibility

Having a type obtain a new possible value of NULL is merely a specific
case of something that we should never do.  That thing is to change the
meaning of the format for an existing client.

Any application-visible changes need to be done in a backwards
compatible way.  I believe that a useful way to think about this is with
interfaces.

Imagine we add a floating point type "f" to DBus.  As long as an
application doesn't consume any signals or methods that have "f" as part
of their interface then the application does not have a possibility of
becoming confused.  This is a safe change in terms of
application-visible changes.

The only problem is that the application may consume "too many" signals
by having match rules that are a little bit too loose.  One way to solve
this might be to implicitly limit match rules to only return matches for
messages that the client will understand.  I'm not sure what the
negotiation mechanism between the client and the library might look like
-- particularly in the case where the "client" is actually several
possibly-differently-abled clients (ie: different libraries in one
process).

Alternatively, we could come up with some stricter way of matching that,
by its very nature, ensures that clients will only receive messages that
they can handle.  Ideas include forcing clients to match on interface or
on incoming signal signature.
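
To make the signature-matching idea concrete, here is a minimal sketch of a
filter that only lets through signals whose signature uses type codes the
client knows.  The function names and the sample signals are invented for
illustration, and the set of known codes is just the usual D-Bus type
characters; nothing here is an existing libdbus API.

```python
# Hypothetical sketch: none of these names come from libdbus.
# Type codes an "old" client is assumed to understand.
KNOWN_TYPE_CODES = set("ybnqiuxtdsogav(){}")


def understands(signature, known_codes=KNOWN_TYPE_CODES):
    """Return True if every type code in the signature is known."""
    return all(code in known_codes for code in signature)


def deliverable(signals, known_codes=KNOWN_TYPE_CODES):
    """Keep only the (name, signature) pairs the client can decode."""
    return [(name, sig) for name, sig in signals
            if understands(sig, known_codes)]


signals = [
    ("VolumeChanged", "u"),
    ("Renamed", "ss"),
    ("Temperature", "f"),    # uses the imagined new type "f"
    ("Samples", "a(sf)"),    # "f" nested inside a container
]
old_client_view = deliverable(signals)
```

With this filter, an old client with a loose match rule would still only see
VolumeChanged and Renamed; the two signals carrying "f" never reach it.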

Once we figure out the small details, we're OK to make changes as long
as we follow this simple rule:

*** We should only add new application-visible changes if they do not
*** break applications that mind their own business by sticking to an
*** old interface.

..

Wire format changes

This is more difficult -- but it doesn't have to be too bad.

I think that the most reasonable way to proceed here (excuse my bias) is
for DBus and DValue to use the same serialisation format (( and I really
like DValue's format :) )).

The bus daemon (because it is currently the only implementation) is the
easiest thing to update.  As such, it should support any serialisation
format that we believe to be sane.  The idea here is to make things
easier for the client libraries -- we want to keep the client library
implementations simple.

*** Make the server smarter in order to keep the client libraries
*** simple.  It's easier to fix bugs in one piece of code.

Each client (ie: connection) to the bus should have the following:

  * a list of understood serialisation formats (with parameters, see
    below)
  * a preferred format (also parameterised)

The bus should be prepared to accept messages from the client in any
format that the bus understands.

Upon forwarding a message to the client (from some other client) the bus
should consult the destination client's list of understood serialisation
formats.  If the message is already in a format that is on the list then
the message should be sent.  If not, then it should be converted to the
preferred format.  This mechanism also allows us to deal with
endianness.
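
The forwarding rule above can be sketched roughly as follows.  This is only
an illustration under my own assumptions: the Format and Client classes, the
format labels and the function name are all made up, and endianness is
modelled as just another part of the format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    name: str      # e.g. "dbus1" or "dvalue" (labels invented here)
    endian: str    # "little" or "big"


@dataclass
class Client:
    understood: frozenset   # formats the client can parse
    preferred: Format       # format used when conversion is needed


def format_for_delivery(message_format, destination):
    """Return (format to send in, whether the bus must convert)."""
    if message_format in destination.understood:
        return message_format, False    # forward the bytes untouched
    return destination.preferred, True  # reserialise into preferred format


little = Format("dbus1", "little")
big = Format("dbus1", "big")
big_endian_client = Client(understood=frozenset({big}), preferred=big)

converted = format_for_delivery(little, big_endian_client)
untouched = format_for_delivery(big, big_endian_client)
```

Here a little-endian message bound for the big-endian client is converted,
while a message that is already in an understood format passes straight
through.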

I believe that the normal case for this setup will be that a client
library lists a single understood format -- its preferred format.  The
client will only receive and send messages in this format.  This normal
case might even be so common that it makes sense to disallow the other
cases.

*** When adding new serialisation formats, we need not wait for everyone
*** to update client libraries.

A few words on how "worth it" it is:

What's the point in having a "more efficient" format if the bus
constantly has to do conversions?  There's still a point here, I think:
the bus currently looks through the entire message anyway (for
validation purposes) so it could just as easily do conversion at this
point.  In addition, the overhead of the bus doing this will always be
O(n).  Depending on the access patterns made by the client after
receiving the message, things could be much worse there.

The extra work on the server makes my life a bit nicer, and probably Rob
Taylor's and Ori's too.  Having the messages come over a socket to us in
DValue format is a gigantic win.  The fast lookups and other nice
features of the DValue serialisation format will also make other
people's lives easier.

Before we proceed, though, it really makes sense that we try to get a
consensus on this format.  The normal case of bus operation should not
be that every message gets reserialised by the daemon.  Our goal should
be, at any given time, to have a certain message format that we actively
encourage implementers to use.  Ideally, the conversion facilities would
be needed only for the unusual cases of old client libraries and
mismatched endianness.

*** We should only add new serialisation formats if we believe that they
*** have hope of being widely adopted.

..

Of course, new extensions to any serialisation format could be made very
easily, assuming that they satisfy the requirements above (ie: adding
new types is fine so long as old applications are not broken).

Implicit in this is that any additions only serve to increase the number
of byte sequences that represent valid messages.  All formerly-valid
byte sequences continue to correspond to exactly the values that they
did before.

The negotiation of which "new types" are supported forms the
parameterisation that I mentioned above.  You tell the bus:

"I support original dbus serialisation." (actually, this would be the
implicit case if you do no special negotiation at all)

or

"I support DValue serialisation, plus chocolate-sending capabilities."

If you merely say "I support DValue serialisation" and then somebody
sends you chocolate then you lose.

The semantics of "you lose" go something like this (but basically mean
that the client will never-ever be sent chocolate):

1) If someone attempts to send you chocolate as part of a method call
then you hear nothing and the sender gets a "server is pwned" error.

2) If someone attempts to send you chocolate as part of a method call
reply then you receive a "server is pwned" error.

3) If someone attempts to send you a chocolate-carrying signal then you
just don't get it at all.

The reason for this is that if you send chocolate to someone who is
incapable of digesting it (libdbus-dog?) then you'll have quite a mess
indeed.
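
As a rough sketch of those three cases, the bus-side decision might look
something like this.  The Kind enum, the route() function and the error
placeholder are hypothetical names of my own; the real error name and the
negotiation mechanism are still undecided.

```python
from enum import Enum


class Kind(Enum):
    METHOD_CALL = 1
    METHOD_REPLY = 2
    SIGNAL = 3


# Placeholder error name; the real name is not decided anywhere.
UNSUPPORTED = "unsupported-type error"


def route(kind, needed, negotiated):
    """Return (what the destination receives, what the sender receives).

    needed: extension names used by the message body, e.g. {"chocolate"}
    negotiated: extension names the destination explicitly asked for
    Each element of the result is "message", UNSUPPORTED, or None.
    """
    if needed <= negotiated:
        return ("message", None)      # normal delivery
    if kind is Kind.METHOD_CALL:
        return (None, UNSUPPORTED)    # case 1: callee hears nothing
    if kind is Kind.METHOD_REPLY:
        return (UNSUPPORTED, None)    # case 2: caller gets an error instead
    return (None, None)               # case 3: signal silently dropped
```

For example, route(Kind.SIGNAL, {"chocolate"}, set()) drops the signal, while
the same call with {"chocolate"} negotiated delivers it normally.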

In the case that we have a chocolate-capable client library that somehow
knows that it is acting on behalf of a single application that doesn't
want to know about chocolate, it might even perform a simplification by
telling the bus that it doesn't support chocolate at all.

*** New types should only be added with explicit feature negotiation and
*** the bus should only send them to clients that have explicitly
*** requested them.

..

Now I follow with a question that has been bothering me for a little
while.  It's about validation.

Where should validation of message contents be done?

Is it appropriate for an application to overflow some buffer based on
"evil bytes" sent to it by a trusted bus daemon?

Is it appropriate for it to fail an assert()?  Throw an exception?

Must the error be caught at the time the message is received or might it
be caught later on while deconstructing the data object?

If the standard is that clients should always validate bytes sent to them
by the daemon then why is the daemon also performing validation while
forwarding?

I'm really not sure what the correct thing to do is.

..

Also, something that I forgot to mention: a neat new idea for a type we
could have is an arbitrary precision signed integer.  I know -- shoot me
now :)

Cheers

On Fri, 2007-10-08 at 14:14 -0400, Havoc Pennington wrote:
> Hi,
> 
> Here are a bunch of questions related to extending the type system, just 
> throwing them out there so we can get some common assumptions and 
> baseline as we consider such extensions.
> 
> - one thing I'd like to see is that current application code remains 
> unchanged, that is, for existing dbus apis that declare a 
> currently-valid signature, NULL is still an impossible value.
>
>     (if we allowed type "s" to suddenly be NULL for example, then all
>      apps would suddenly become crashable)
> 
>     (and even for new APIs I think it's best to avoid nullable when
>      it's not needed, since it adds an extra check the app has to do,
>      which is why dbus doesn't support null right now)
> 
> - the macro "type system intersection" idea vs. "type system union" 
> question gives me some pause about allowing null integers (few languages 
> support nicely), vs. null strings (most languages support nicely)
> 
>     (I would think this intersection vs. union question is also the
>      guiding principle for evaluating the other type system extensions
>      people have proposed. I suspect intersection is ideal for public IPC
>      that lots of apps might use and union is ideal for more private
>      usage, so tough to make everyone happy.)
> 
>     (If it is not clear to people what I mean by intersection of type
>      systems different languages/bindings are using, vs. union of them,
>      I can explain further, but if it's clear I could use thoughts on
>      how to evaluate this tradeoff)
> 
> - what is the overall cost of elaborating the type system?
> 
>    (iow, if we did 4 or 5 type system extension proposals, how much
>     more code are we talking about once all the bindings are looked at,
>     how much harder do tools like dbus-monitor and dbus-inspector
>     become, etc.)
> 
> - is it valuable to fully sync the DValue format
>     http://live.gnome.org/DValue/Serialisation with the DBus wire
>    format, or does it not really matter?
> 
>    (one practical importance being, as we evolve the dbus wire format,
>     do we need to worry about the issues that primarily matter in
>     a file-on-disk context and don't matter much in an IPC context.
>     another practical question is whether we should have one spec
>     or spec chapter that covers both.)
> 
>   - how much do we care about compatibility?
> 
>     (libdbus/dbus-daemon have to keep all the current protocol support,
>      which means some of the proposed changes could make a pretty good
>      mess out of the code by adding multiple codepaths.)
> 
>     (if there are several reimplementations that don't mess with back
>      compat with the current protocol, then effectively as a community
>      we are breaking compat...
>      or if everyone does do back compat, it makes the same mess
>      I'd expect in libdbus)
> 
>     (if we don't care about compat much now, when do we start?)
> 
>   - how much effort will we spend on churn here, vs. how much code
>     and clunkiness will we save in apps?
> 
>     (I think some kind of nullable values might be a big win for
>      apps, but some other possible type system extensions might be
>      more pain than gain, so this could be a case-by-case question)
> 
> Sorry for some of the big-picture questions, but we've never really 
> extended dbus to date, so we have to pioneer this area in some respects.
> 
> Havoc
>