[RFC] dbus-python API (re)definition

Wed Sep 6 09:57:17 PDT 2006

On Wed, 06 Sep 2006 at 14:38:09 +0100, Simon McVittie wrote:
> git (branched from upstream):
>       http://people.freedesktop.org/~smcv/git/dbus-python/.git/

I forgot to mention that that branch also includes a simple
implementation of Matthew Johnson's D-Bus binding tests (in
test/cross-test-client.py and test/cross-test-server.py). I've only run them
using my Python client and server so far; I'm going to try Python <-> Java,
if I can get Sun Java and the Java D-Bus bindings to work on my i386.

A couple of the tests (the ones using Bytes) currently have failing
cases, because the dbus-python API appears to be unsure whether a Byte is
more similar to a single-character 8-bit string (mostly used in dbus-python
method/signal arguments, and arguably more Pythonic) or an int with restricted
range (used in dbus-python method returns and the _dbus_bindings.Byte type).

I believe we should probably support both on input, and settle on one or
the other to be used in method returns.
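
As a rough sketch of what "support both on input" could look like: the
helper name coerce_byte() is hypothetical and not part of dbus-python, and
the code is written for Python 3, where bytes plays the role of the 8-bit
str discussed here.

```python
def coerce_byte(value):
    """Return an int in range(256) for either accepted input form.

    Hypothetical helper for illustration; not dbus-python API.
    """
    # Form 1: a single-character 8-bit string.
    if isinstance(value, bytes):
        if len(value) != 1:
            raise ValueError('expected a string of length 1, got %d' % len(value))
        return ord(value)
    # Form 2: an int with restricted range.
    if isinstance(value, int):
        if not 0 <= value <= 255:
            raise OverflowError('byte out of range: %r' % (value,))
        return int(value)
    raise TypeError('cannot treat %r as a Byte' % (value,))
```

With this, coerce_byte(b'A') and coerce_byte(65) both give 65, so either
spelling could be appended to a message in the same way.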

The generic Identity and IdentityArray tests (which use variants) make
no assertions about the type of the objects, which is probably wrong,
but at the moment the binding is slightly lossy (there's no way to tell
whether you got an Int16, a UInt16 or an Int32 in your variant
parameter, other than using the explicitly_pass_message decorator to get
the _dbus_bindings.Message object, which is API I want to throw away).

Here are the type mappings, which ought to be documented somewhere. My
apologies if I'm going over old ground here:

D-Bus to Python
===============

Currently implemented in MessageIter.get*

(e.g. in method arguments on a Python service and signal handler arguments
on a Python client, including when translating Variants to native types)

Byte -> int (should this be str of length 1?)
Boolean -> bool
Signature -> dbus.Signature (str subclass)
IntNN, UIntNN -> int or long, as decided by Pyrex
Double -> float
String -> unicode
ObjectPath -> dbus.ObjectPath (str subclass)
Dict -> dict
Array -> list of items whose type is given by these rules
Variant -> type given by these rules for contained item (recursively, if
           it contains a Variant!)
Struct -> tuple

It could be argued that all the method arguments and signal handler
arguments should be:

* the dbus-python types (e.g. dbus.Int16, which is a subclass of int)
  for maximum obviousness (at least from a D-Bus perspective) - this
  also means methods can know exactly what type was passed in a Variant

* native Python types where possible, as they are now (for maximum
  obviousness from the perspective of a Python programmer who doesn't know
  D-Bus at all, although such a programmer is likely to get confused by
  these bindings) - this means you can't know exactly what was in a Variant.

* either, as determined by a keyword argument to the method decorator
  or signal-binding function (dbus_classes=True? Which should be the
  default?)
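
To make the trade-off concrete, here is a minimal sketch of the third
option. Int16 is a stand-in class rather than the real dbus.Int16, and
deliver_int16() is a hypothetical function; only the dbus_classes keyword
name comes from the suggestion above.

```python
class Int16(int):
    """Stand-in for dbus.Int16 (an int subclass); not the real class."""


def deliver_int16(raw, dbus_classes=False):
    """Convert a decoded 16-bit value for delivery to a handler.

    Hypothetical function. With dbus_classes=True the exact D-Bus type
    survives; otherwise the handler sees a plain int.
    """
    return Int16(raw) if dbus_classes else int(raw)
```

Either result can be used in arithmetic as an ordinary int, but only the
Int16 instance lets the handler tell which D-Bus type arrived.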

Programmers may also want to arrange for arrays of bytes to be delivered
as a dbus.ByteArray (a str subclass) rather than having an array of 100
bytes arrive as 101 distinct Python objects; a keyword argument would
probably be the way out here, too.

Python to D-Bus
===============

There are two cases here.

1) Coercing a Python object to fit in a Variant argument, and possibly
   also guessing the signature of a non-introspectable method (I must
   admit I haven't yet checked whether dbus-python can call such methods)

   (MessageIter.python_value_to_dbus_sig and MessageIter.append)

   I've listed these below in the form bool -> Boolean.

2) Coercing a Python object to fit in an argument whose type is
   specified

   (MessageIter.append_*)

   I've listed these below as "also accepted".

There is a general issue that dbus-python doesn't treat subclasses of
built-in types the same as the corresponding built-in type. In my
opinion it should check the special cases (isinstance() of its own
types) first, then accept arbitrary subclasses of the built-in types as
being equivalent to the built-in type itself.
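
The proposed checking order could be sketched like this. Int16 and UInt32
are stand-ins for the dbus-python classes, signature_for() is a
hypothetical name, and the code is written for Python 3; the one-letter
codes are the usual D-Bus signature characters.

```python
class Int16(int):
    """Stand-in for dbus.Int16; not the real class."""


class UInt32(int):
    """Stand-in for dbus.UInt32; not the real class."""


# The binding's own types are checked first...
_SPECIAL_CASES = [(Int16, 'n'), (UInt32, 'u')]
# ...then built-in types. bool comes before int because bool subclasses int.
_BUILTINS = [(bool, 'b'), (int, 'i'), (float, 'd'), (str, 's')]


def signature_for(value):
    """Hypothetical helper: pick a D-Bus type code for a Python object."""
    for cls, code in _SPECIAL_CASES + _BUILTINS:
        if isinstance(value, cls):   # isinstance() accepts subclasses too
            return code
    raise TypeError('no D-Bus type for %r' % (value,))


class Port(Int16):
    """A user subclass of a binding type."""


class Temperature(float):
    """A user subclass of a built-in type."""
```

Here signature_for(Port(80)) is 'n' and signature_for(Temperature(1.5))
is 'd', so both kinds of subclass are treated like their base type.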

Boolean
-------
bool -> Boolean
dbus.Boolean or subclass -> Boolean
Also accepted: anything Pyrex will assign to a dbus_bool_t.

We should probably accept any Python object, following Python's own
rules for conversion to bool (to make sense this would require that
dbus.Variant is changed so that it has the truth-value of its contents,
which is easy - just give it a __nonzero__ method).
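
A minimal sketch of that change, using a stand-in Variant class that only
stores the wrapped object (the value attribute name is an assumption, not
necessarily what dbus.Variant uses):

```python
class Variant(object):
    """Stand-in for dbus.Variant that only holds the wrapped value."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):           # Python 3 spelling
        return bool(self.value)

    __nonzero__ = __bool__        # Python 2 spelling, as named above
```

Without such a method, every instance would be true regardless of its
contents, because that is the default for Python objects.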

Byte
----
dbus.Byte (or subclass, theoretically) -> Byte
Also accepted: str of length 1.

Subclasses actually won't work due to a specific test in append_byte()
for "type(value) == Byte". This is not usually what you want.
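
The difference between the two kinds of test, shown with a stand-in Byte
class (assumed here to be an int subclass) and a hypothetical user
subclass called Flags:

```python
class Byte(int):
    """Stand-in for dbus.Byte; not the real class."""


class Flags(Byte):
    """Hypothetical user-defined subclass of Byte."""


def is_byte_exact(value):
    # The style of test described above: rejects subclasses.
    return type(value) == Byte


def is_byte(value):
    # Accepts Byte and anything derived from it.
    return isinstance(value, Byte)
```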

IntNN, UIntNN
-------------
int -> Int32
       (this is wrong on 64-bit platforms - a Python int is
       implemented as a C long)
long -> Int64
        (this isn't necessarily right either, suppose your number
        fits in a UInt64 but not an Int64? A Python long is
        arbitrary-length, although obviously D-Bus has no way to
        transport very large numbers except as strings)
dbus.(U)IntNN or subclass -> (U)IntNN
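
To illustrate the UInt64 problem mentioned for long, this sketch lists
which D-Bus integer types can hold a given Python integer. The function
name fitting_types() is hypothetical; the ranges are simply the limits of
the fixed-width types.

```python
_RANGES = [
    ('Int16', -2**15, 2**15 - 1),
    ('UInt16', 0, 2**16 - 1),
    ('Int32', -2**31, 2**31 - 1),
    ('UInt32', 0, 2**32 - 1),
    ('Int64', -2**63, 2**63 - 1),
    ('UInt64', 0, 2**64 - 1),
]


def fitting_types(value):
    """Hypothetical helper: names of the integer types that can hold value."""
    return [name for name, lo, hi in _RANGES if lo <= value <= hi]
```

For example, fitting_types(2**63) gives ['UInt64'] only, so guessing Int64
for every long would fail for that value, and fitting_types(2**64) gives
an empty list.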

Also accepted: anything Pyrex will coerce to the appropriate dbus_foo_t.
I'll have to investigate Pyrex to find out exactly what this means.

If this doesn't invoke int(), I'm not quite sure whether we should. On
the one hand this would work for arbitrary user-defined types if they
define __int__, but on the other hand we don't really want binding users to
be able to pass the string '100' as a numeric parameter and have it
work without any warning. Perhaps explicitly reject strings, *then*
accept anything for which __int__ works?
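
That order of checks might look like the following sketch. coerce_int()
and Celsius are hypothetical names, and the code is written for Python 3,
so "strings" here means str, bytes and bytearray.

```python
def coerce_int(value):
    """Hypothetical helper: reject strings first, then defer to int()."""
    if isinstance(value, (str, bytes, bytearray)):
        raise TypeError('refusing to convert string %r to an integer' % (value,))
    # int() uses the object's own integer conversion (e.g. __int__).
    return int(value)


class Celsius(object):
    """Hypothetical user-defined type that defines __int__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):
        return self.degrees
```

Note that this would still accept floats, silently truncating them, which
may or may not be desirable.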

Stringy types
-------------
str or unicode -> String
dbus.Signature or subclass -> Signature
dbus.ObjectPath or subclass -> ObjectPath
Also accepted for String: anything whose .encode('utf8') Pyrex will coerce to
(char *).

In the case of str objects, the latter leads to confusing error
messages, as Python will try to decode them with the default codec
(usually ASCII) before re-encoding them as UTF-8. We should probably at
least accept any valid UTF-8 str as-is, since by design strings in some
other character set that happen to be valid UTF-8 are rare.
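
A sketch of accepting UTF-8 as-is, with hypothetical names and written for
Python 3, where bytes corresponds to the 8-bit str above and str
corresponds to unicode:

```python
def encode_for_string(value):
    """Hypothetical helper: return UTF-8 bytes for a String argument."""
    if isinstance(value, bytes):
        # Validate only; raises UnicodeDecodeError if not valid UTF-8.
        value.decode('utf-8')
        return value
    if isinstance(value, str):
        return value.encode('utf-8')
    raise TypeError('cannot use %r as a String' % (value,))
```

An 8-bit string in another encoding, such as Latin-1 b'caf\xe9', fails
the validation step with an error that names UTF-8 rather than ASCII.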

In this case I think it's *not* a good idea to go calling str() on
arbitrary objects in order to coerce them into a String parameter - the
Unicode side of things will cause headaches (__str__() is commonly
implemented but is not guaranteed to return any particular encoding,
__unicode__() isn't commonly implemented, and unicode() basically falls
back to doing str(foo).decode('ascii'), which will, in general, fail).

Double
------
float -> Double
Also accepted: anything Pyrex will coerce to double.

The same advantages and disadvantages as for calling int() apply here,
with float() and __float__() in place of int() and __int__().

Dict
----
dict -> Dictionary with key, value types given by an arbitrary item
        of the dict using this same algorithm
dbus.Dictionary -> Dictionary with key, value types given by either
                   the Dictionary constructor or an arbitrary item's
                   type

Struct
------
tuple -> Struct with item types given by the items' types
dbus.Struct or subclass -> Struct with item types given by the items' types

Array
-----
list -> Array of an item type given by the first item's type
dbus.Array or subclass -> Array of an item type given by either the
                          Array constructor, or the first item's type
dbus.ByteArray or subclass -> Array of Byte

Same comments about the ByteArray and Arrays/lists of Byte as given
above for Byte. Arrays of integers and (any) 8-bit strings should both be
acceptable alternate values for an Array-of-Byte parameter, IMO.

Variant
-------
dbus.Variant -> Variant with the same contents, guessed as above if
necessary
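
The container rules above (Dict, Struct, Array) amount to a recursive
guess, roughly as sketched here. guess_signature() is a hypothetical name,
the code is written for Python 3, and raising an error for empty
containers is an assumption rather than a description of what dbus-python
does.

```python
def guess_signature(value):
    """Hypothetical helper: guess a D-Bus signature for a Python object."""
    if isinstance(value, bool):        # before int: bool subclasses int
        return 'b'
    if isinstance(value, int):
        return 'i'                     # the int -> Int32 rule
    if isinstance(value, float):
        return 'd'
    if isinstance(value, str):
        return 's'
    if isinstance(value, tuple):       # Struct: every item's type counts
        if not value:
            raise ValueError('cannot guess the type of an empty tuple')
        return '(' + ''.join(guess_signature(v) for v in value) + ')'
    if isinstance(value, list):        # Array: first item decides
        if not value:
            raise ValueError('cannot guess the item type of an empty list')
        return 'a' + guess_signature(value[0])
    if isinstance(value, dict):        # Dict: an arbitrary item decides
        if not value:
            raise ValueError('cannot guess the types of an empty dict')
        key, val = next(iter(value.items()))
        return 'a{' + guess_signature(key) + guess_signature(val) + '}'
    raise TypeError('no D-Bus type for %r' % (value,))
```

Because only one item is inspected, a list such as [1, 'two'] would be
guessed as an array of Int32 and fail later, when the second item is
appended.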

------------------------

Thoughts?

Thanks,
         Simon