DBus API problems & UTF-8

Havoc Pennington hp at redhat.com
Mon Jun 12 07:46:44 PDT 2006


Kimmo Hämäläinen wrote:
> For example, if dbus_message_iter_append_basic() returns FALSE, the
> caller cannot know whether 1) an invalid argument was provided, or 2)
> out-of-memory happened. However, the caller might want to handle
> situation 1 differently from 2.

The boolean error code means out of memory _only_. If you provide an 
invalid argument, that is _not_ a runtime error, it's a bug in your 
program. The dbus API contract is that behavior is _undefined_ if you 
provide an invalid argument. (If compiled with checks enabled, dbus will 
be nice to you and print a warning and return instead of crashing, but 
if compiled with checks disabled it will just get confused and crash. 
In any case, once the warning is printed, dbus makes no special attempt 
to keep things sane and may crash later or get confused.)

The boolean return after a warning is printed is selected arbitrarily; 
it could be true or false. Behavior is undefined at this point.

It makes no sense to provide a runtime error when an error should be 
avoided rather than handled.

If there's an API function where there are two _runtime errors_ (errors 
that have to be handled, not avoided) then there should be a DBusError 
provided to distinguish those. Let us know if there are cases like this, 
since it would be a bug.

However, invalid UTF-8, NULL arguments, etc. are not considered errors; 
they are simply not allowed. Don't pass them in.
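
If your string comes from somewhere you don't control (a file, the 
network, a legacy-encoded filename), check it yourself before it reaches 
dbus. Here is a minimal sketch of such a check. is_valid_utf8() is a 
hypothetical helper of my own naming, not a libdbus function, and it 
assumes a NUL-terminated C string:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of libdbus.
 * Returns true if str is well-formed UTF-8: no stray or missing
 * continuation bytes, no overlong forms, no surrogates, nothing
 * above U+10FFFF. */
bool is_valid_utf8(const char *str)
{
    /* NULL is a bug in the caller, not a runtime error, so assert. */
    assert(str != NULL);

    const unsigned char *s = (const unsigned char *)str;
    size_t len = strlen(str);
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;           /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal at this length */

        if (c < 0x80) {     /* plain ASCII */
            i++;
            continue;
        }
        if ((c & 0xE0) == 0xC0)      { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return false;           /* stray continuation or bad lead byte */

        if (len - i <= n)
            return false;            /* sequence truncated by end of string */

        for (size_t k = 1; k <= n; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;        /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)                      return false;  /* overlong form */
        if (cp >= 0xD800 && cp <= 0xDFFF)  return false;  /* surrogate */
        if (cp > 0x10FFFF)                 return false;  /* out of range */

        i += n + 1;
    }
    return true;
}
```

If the check fails, that is a runtime error in _your_ code, and you 
handle it there (reject the input, convert it, or send it as a byte 
array). Only strings that pass go to dbus_message_iter_append_basic(), 
so a FALSE from that call still means out of memory and nothing else.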

> Btw. why on earth DBus has to limit valid string data to UTF-8? I see no
> reason why the string data should be even validated in the server (as it
> now does). Seems like another unnecessary limitation -- or perhaps a
> some kind of political statement (think of some very widely used Asian
> encodings).

There are three reasonable kinds of string:
  - Unicode strings
  - encoding-tagged strings (a string plus an encoding name transmitted
    separately)
  - binary data (not human readable)

Anything else is broken.

DBUS_TYPE_STRING is a Unicode string, as is the String type in Java, Qt, 
GTK, Python, and, well, just about everything else.

DBUS_TYPE_ARRAY of DBUS_TYPE_BYTE is binary data. (Or you can use arrays 
of int or whatever you like.)

If you want an encoding-tagged string, then just pass an encoding name as 
one argument and a byte array with the string data as another. You can 
get a list of encoding names from "iconv -l".
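
On the application side, an encoding-tagged string could look roughly 
like the sketch below. The tagged_string type and its two functions are 
hypothetical names for illustration, not libdbus API. The point is that 
the payload is opaque bytes with an explicit length, so it is never 
treated as a DBUS_TYPE_STRING:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, for illustration only (not part of libdbus). */
typedef struct {
    char          *encoding;  /* encoding name, e.g. one from "iconv -l" */
    unsigned char *data;      /* raw bytes in that encoding, no NUL added */
    size_t         len;       /* number of bytes in data */
} tagged_string;

/* Copies the encoding name and the bytes into ts.
 * Returns 0 on success, -1 on out of memory (the only runtime error).
 * NULL arguments are a bug in the caller, so they are asserted. */
int tagged_string_init(tagged_string *ts, const char *encoding,
                       const void *data, size_t len)
{
    assert(ts != NULL);
    assert(encoding != NULL);
    assert(data != NULL || len == 0);

    size_t enc_size = strlen(encoding) + 1;

    ts->encoding = malloc(enc_size);
    ts->data = malloc(len > 0 ? len : 1);  /* avoid a zero-size malloc */
    if (ts->encoding == NULL || ts->data == NULL) {
        free(ts->encoding);
        free(ts->data);
        ts->encoding = NULL;
        ts->data = NULL;
        ts->len = 0;
        return -1;
    }

    memcpy(ts->encoding, encoding, enc_size);
    if (len > 0)
        memcpy(ts->data, data, len);
    ts->len = len;
    return 0;
}

/* Releases the copies made by tagged_string_init(). */
void tagged_string_clear(tagged_string *ts)
{
    assert(ts != NULL);
    free(ts->encoding);
    free(ts->data);
    ts->encoding = NULL;
    ts->data = NULL;
    ts->len = 0;
}
```

When you put this in a message, the encoding field goes in as a 
DBUS_TYPE_STRING argument (encoding names are plain ASCII) and data/len 
go in as a DBUS_TYPE_ARRAY of DBUS_TYPE_BYTE. The receiver reads the name 
first and then decides how to decode the bytes.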

Havoc

