recursive types work

Havoc Pennington hp@redhat.com
Fri Jan 14 22:53:17 PST 2005


Hi,

Referring back to my previous mail:
http://lists.freedesktop.org/archives/dbus/2004-December/001836.html

I want to commit now so people can start to port and adapt. I tried to
port the GLib and Qt bindings myself but haven't tested them beyond
their "make check"

I'm sure doxygen is spewing a thousand warnings but the doc comments in
the source code of dbus-message.c should reflect the new reality. dbus-
specification.xml now has almost nothing to do with reality though.

If someone would like to review this patch, I'd appreciate it, but it's
a 733K patch so I don't really have expectations.

I finally made the unit tests pass again. That was ballpark 60 hours to
write the code and 40 hours to fix I'd guess a hundred or so bugs in my
new code. I'm not sure if this is an argument for or against unit
tests. ;-) Now I know exactly how much damage I do every time I type in
a couple thousand lines of code *without* tests. Life is best lived in
ignorance.

Even worse, I still have some of the tests disabled (the comprehensive
"break the message loader" tests). I'm fairly confident that things are
working for valid messages and libdbus talking to itself, but also
fairly confident that with this patch we're probably vulnerable to
untrusted data until those tests are put back and bugs fixed. So
DO NOT SHIP a snapshot of D-BUS with these latest changes, until we get
back to a stabilization point.

Another problem after I commit is that all the profiling and
optimization I did a while back has been undone; the new code is plenty
slow. So I have to fix that also. I want to commit a version that's
correct first though, and then look at speed. Hopefully there are a
couple of easy "doh!" in there.

Things are also broken right now if you talk from one endianness to a
different endianness. I know how to fix it, I just haven't written the
code yet.

On the plus side, libdbus is now 4K smaller than it was. Yay!

I'll comment below on what some of this ended up looking like:

On Wed, 2004-12-29 at 02:47 -0500, Havoc Pennington wrote:
>  - DICT remains for now

I took DICT out for the moment, because I'm sure of one thing: it will
just be an alias for ARRAY of STRUCT of { whatever, whatever } hinting
that the binding should use a map<> type object instead of an array type
object (and implying that the "keys" should be unique).

It would be more convenient in some cases to have this look like:
 recurse (DICT)
   recurse (DICT_ENTRY)
     read (whatever)
     read (whatever)

while right now it is:
  recurse (ARRAY)
    recurse (STRUCT)
     read (whatever)
     read (whatever)

However, it's really an annoying amount of extra code unless I think of
a clever trick.

If we had DICT on the wire it would probably be an extra code in the
signature indicating that the following array of struct should be
treated as a dict. However, this sort of hint should maybe be in
introspection data along with e.g. struct names like "this struct is
really a Point" or whatever.

One possibility is dbus_message_iter_array_is_dict() so you'd still
write the array-of-struct recursion code, but you'd just get a hint to
map it into a map type.

For now if you want to pretend that any array of struct of
string,variant is a dict I'm sure nobody will complain.

>      if (array is not empty)
>        recurse (ARRAY of ARRAY of BOOL)

I fixed the "if array is not empty", you can now recurse and there will
be no elements. You can't recurse into an element of an empty array
though, so at the moment if the elements of an empty array are arrays,
there's no way to find out what they are arrays of. (There's no API that
is, it's simple enough to implement, just hasn't been done.) Similarly
if the elements are structs you can't see what the struct fields would
be unless you have an element to look at.

> This implies that some sort of separate type signature iterator may be
> needed in the API contrasting with the value iterator.

We could fairly easily add dbus_message_iter_init_types() that iterates
over types, but doesn't let you read values. But I haven't done it. I'm
not sure whether people need this or not, if you need it for what you're
doing please post a description of what you're doing.

Normally if you're just checking the signature you can do that with
has_signature() and as always get_args() will do it for you.

>  - they will always return const data, byteswapping on the fly if 
>    required.

This is the main thing you have to fix when porting. (Byteswapping on
the fly not implemented yet, but just ignore that and we'll fix it.)

The other main thing to fix is that get_args() and append_args() are now
totally uniform in what the varargs arguments look like:

 get_args (TYPE_STRING, &v_STRING);
 append_args (TYPE_STRING, &v_STRING);

If you get memory corruption crashes, look for this first.

append_args() doesn't support string arrays anymore; this should be
fixed (get_args() does support it still). 

>  - the typesafe accessors will be dropped at least everywhere internal
>    in favor of:
>        uint32 value;
>        read (&iter, DBUS_TYPE_UINT32, &value)
>    the type DBUS_TYPE_UINT32 is specified only for error checking, 
>    since &iter knows what type it's looking at.
> 
>    Open question is whether to keep get_uint32() type of functions
>    for dbus-message.h, even while dropping them all internally.
>    They are strictly speaking convenience functions and thus should 
>    live in the bindings.

I dropped these convenience functions, in part because I found they were
not very convenient. It's much nicer in the bindings to treat types
generically.

dbus_message_iter_get_basic() will always return something that fits in
8 bytes and does not need to be freed, so you can do:
 dbus_uint64_t value;
 get_basic (type, &value);
 append_basic (type, &value);

without knowing what type "value" contains at compile time. This is why
for append things are passed in as &value instead of just "value" now.

Hopefully you can kill some annoying switch() statements. I think making
dbus_type_is_basic()/dbus_type_is_fixed()/dbus_type_is_container()
public might help with that too, but I didn't do it yet.

Further breakage plans:
 the main ones I think are to change bool to dbus_bool_t instead of 
 unsigned char, rename services to something better, and maybe take
 a pass through the naming of the API in general.

Havoc




More information about the dbus mailing list