recursive types, struct, custom, dict, etc.

Wed Jun 2 17:18:01 PDT 2004

On Wed, 2004-06-02 at 00:04, Olivier Andrieu wrote:
>  > If we introduced a variant type (pretend its code is "v") we could
>  > replace DICT with something like:
>  >  "('StringVariantMap'asav)"
> 
> rather something like "a('NamedPair'sv)", no ? (and 'v' is already
> taken by NIL, stands for void I guess).

Right, v was supposed to be void for nil. But if we had a variant I'd
use it for variant. ;-) As noted I'd sort of like to lose the nil type.
Or I guess "any" is also a candidate name for variant.

a(sv) and (asav) would both work as conventional representations of a
hash/map, agreed a(sv) is nicer.

>  > i.e. struct StringVariantMap { array<string>; array<variant>; }
> 
> array<struct {string, variant}> ?

Right, that also works and avoids having to verify the arrays are the
same length.
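
To make the difference concrete, here is a minimal Python sketch of the two layouts. The function names are made up for illustration and are not part of any D-BUS API; plain Python values stand in for variants.

```python
def dict_from_pairs(pairs):
    """a(sv): an array of (string, variant) structs.

    Each key travels with its value, so there is nothing to cross-check.
    """
    return dict(pairs)


def dict_from_parallel_arrays(keys, values):
    """(asav): one struct holding an array of strings and an array of variants.

    The receiver has to verify that both arrays have the same length.
    """
    if len(keys) != len(values):
        raise ValueError("key and value arrays differ in length")
    return dict(zip(keys, values))
```

Both produce the same map from well-formed input; only the second has an extra failure mode.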

> The problem with this approach is that it goes against the idea of
> having a separate type signature. If you send a message of type "isi"
> for instance, the receiver should be able to understand messages of
> signature "vvv", "ivv", "vsv", "vvi", etc ... so much for the quick
> dispatching using the signature :)

A good point, but I think not a fatal flaw. It's just slightly more
complex than a strcmp to deal with it.

For the incoming signature you resolve it to an actual type by peeking
into the variants:

  ivv -> isi

For the signature you're matching against, "v" is effectively a wildcard
so "isi" can match a method expecting "vvv" or "ivv" or whatever.

You also have to handle that "i(si)" can match "iv" but "isi" can't
match "iv", since a "v" stands in for exactly one complete type.
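
Roughly, the two steps above (resolve the incoming signature, then match with "v" as a wildcard) could look like the Python sketch below. Everything in it is hypothetical: the Variant class, the function names, and the restriction to well-formed signatures built from single-letter codes, "a" prefixes and parenthesized structs. Only top-level variants are resolved.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """A value carried together with its own signature (hypothetical model)."""
    signature: str
    value: object


def split_types(sig):
    """Split a well-formed signature into complete types,
    e.g. "i(si)as" into ["i", "(si)", "as"]."""
    types, i = [], 0
    while i < len(sig):
        start = i
        while sig[i] == "a":      # array prefix binds to the next complete type
            i += 1
        if sig[i] == "(":         # struct: scan to the matching ")"
            depth = 0
            while True:
                depth += {"(": 1, ")": -1}.get(sig[i], 0)
                i += 1
                if depth == 0:
                    break
        else:
            i += 1
        types.append(sig[start:i])
    return types


def resolve(sig, args):
    """Replace each top-level "v" with the signature of the variant sent."""
    return "".join(arg.signature if t == "v" else t
                   for t, arg in zip(split_types(sig), args))


def match_type(actual, expected):
    """Match one complete type against one expected complete type."""
    if expected == "v":           # wildcard for exactly one complete type
        return True
    if actual[0] != expected[0]:
        return False
    if actual[0] == "a":
        return match_type(actual[1:], expected[1:])
    if actual[0] == "(":
        return compatible(actual[1:-1], expected[1:-1])
    return True                   # same basic type code


def compatible(actual, expected):
    """More lenient than strcmp: compare type by type, honoring "v"."""
    a, e = split_types(actual), split_types(expected)
    return len(a) == len(e) and all(match_type(x, y) for x, y in zip(a, e))
```

With these definitions, resolve("ivv", [1, Variant("s", "x"), Variant("i", 2)]) gives "isi", compatible("i(si)", "iv") is true, and compatible("isi", "iv") is false because the argument counts differ.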

>  By the way, the API
> dbus_message_has_signature(message, signature) would not be very
> useful in this case, something like
> dbus_message_signature_compatible() with a more lenient check than
> strcmp would probably be necessary.

Right.

> That's the crux of the problem I think : having both non-parametrized
> union type (ANY) and a parametrized product type (STRUCT). ANY is a
> bit of pain to handle in a statically typed language (you have to
> dispatch on it in unmarshaling functions) and having it practically
> eliminates the benefits of knowing the types appearing in the struct
> since any of them could be a variant. On the other hand, not having
> ANY may be a pain for dynamically typed language (in marshaling
> functions).

Right, both are useful. Note that e.g. Java does have both, in effect
(the base Object type functions as a variant type).

>  >  - maps more naturally to statically typed languages
> 
> not really (because of the variants)

I don't think variants break anything. If multiple possible types are
expected as arguments, you _want_ to expose variants to the statically
typed language via something like QVariant or GValue. In this case the
"IDL" specifies variant:

  void SetProperty (in STRING propName, in VARIANT propValue);

However, in other cases variants are used on the wire because a dynamic
language binding doesn't know what's in the "IDL" - but static bindings
still see a fixed type. So the actual method might be:

  INT32 GetCount ();

However, when the Python binding for this chains to a Python method, the
Python bindings know the Python method returned a Python int, but don't
know whether the "IDL" is:
  INT64 GetCount ();
  INT32 GetCount ();
  VARIANT GetCount ();
or even
  DOUBLE GetCount ();
  UINT32 GetCount ();
As a result the Python bindings could simply throw a VARIANT containing
an int on the wire, and then the caller of GetCount() could perform the
"cast".

Similarly when Python passes an argument to SetCount(), it could pass a
variant on the wire.

Despite this variant on the wire, C/C++ would see a method that returns
a fixed integer type, and unpacking of the variant happens
automatically.
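
As a rough illustration of that automatic unpacking, here is a Python sketch of what the calling side might do with a variant reply once it knows the declared return type. The function name, the INT_RANGES table and the choice of exceptions are assumptions for the example, not an existing binding API; a plain Python number stands in for the variant's payload.

```python
# Value ranges for the fixed-width integer types named in the "IDL".
INT_RANGES = {
    "INT32": (-2**31, 2**31 - 1),
    "UINT32": (0, 2**32 - 1),
    "INT64": (-2**63, 2**63 - 1),
}


def unpack_reply(wire_value, declared):
    """Perform the "cast" from a variant payload to the declared type.

    wire_value: the value found inside the variant on the wire.
    declared:   the return type the caller expects, e.g. "INT32".
    """
    if declared == "VARIANT":
        return wire_value                  # caller genuinely wants a variant
    if isinstance(wire_value, bool) or not isinstance(wire_value, (int, float)):
        raise TypeError("cannot convert %r to %s" % (wire_value, declared))
    if declared == "DOUBLE":
        return float(wire_value)
    if declared in INT_RANGES and isinstance(wire_value, int):
        lo, hi = INT_RANGES[declared]
        if lo <= wire_value <= hi:
            return wire_value
        raise OverflowError("%d does not fit in %s" % (wire_value, declared))
    raise TypeError("cannot convert %r to %s" % (wire_value, declared))
```

A mismatch shows up as a runtime error on the calling side, which is where the type check has to happen anyway.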

So we have two separate occasions where variants come up; in the first
we expect to genuinely have different types at runtime, and in the
second we're just making dynamically typed languages work smoothly.

>  >  - structs are probably sort of annoying to deal with in language
>  >    bindings
> 
> why ? I rather had the impression they were not a problem. Either
> handle them generically using introspection data and a big unmarshaling
> function, or handle them statically using some sort of code generation
> (preprocessor, IDL compiler, etc.).

Right, I'm defining both of those as "sort of annoying" ;-)

Yes it's workable, but it's more work than dealing with primitive types.

> I'd say: add a non-parametrized STRUCT type,

A non-parameterized STRUCT is the same thing as ARRAY of ANY, right?

One reason I would parameterize struct is efficiency for arrays of
structs. e.g. for struct { int, int } making that struct { any, any }
doubles the size of the value, and keeps you from demarshaling it in C
via a simple memcpy(). The typecheck also becomes a full array scan,
etc.
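
A quick way to see the size and typecheck cost is the Python sketch below, using the standard struct module. The tagged encoding is invented for the example (a 4-byte type tag in front of each 4-byte value, to keep things aligned); the real overhead would depend on whatever encoding a variant ends up with.

```python
import struct

points = [(1, 2), (3, 4), (5, 6)]

# Parameterized struct { int, int }: the types live in the signature,
# so the body is nothing but the raw 32-bit values.
typed = b"".join(struct.pack("<ii", x, y) for x, y in points)

# struct { any, any }: every value carries its own type tag
# (hypothetical encoding: 4-byte tag, then the 4-byte value).
TAG_INT32 = ord("i")
tagged = b"".join(
    struct.pack("<IiIi", TAG_INT32, x, TAG_INT32, y) for x, y in points
)

# 3 structs * 8 bytes versus 3 structs * 16 bytes.
assert len(typed) == 24 and len(tagged) == 48

# The typed body decodes in one bulk operation, the analogue of a memcpy().
flat = struct.unpack("<%di" % (2 * len(points)), typed)
decoded_typed = list(zip(flat[0::2], flat[1::2]))


def decode_tagged(buf):
    """Decoding the tagged body means checking every single element."""
    out = []
    for tag_x, x, tag_y, y in struct.iter_unpack("<IiIi", buf):
        if tag_x != TAG_INT32 or tag_y != TAG_INT32:
            raise TypeError("expected int32 in every field")
        out.append((x, y))
    return out
```

Both decoders return the original list of pairs, but the tagged one has to walk the whole array to find out whether the types are what the receiver expected.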

>  a non-parametrized ANY
> type

Agree we should just do ANY rather than union { int, bool } type of
stuff. i.e. we shouldn't do ANY-which-is-one-of-some-set, just keep it
simple. IPC already allows runtime failures, so static type _checking_
doesn't make sense; type checking is inherently dynamic.

> , and forget about separating type signatures.

I'm not sold here. Also, Matthias really wanted to see this.

>  I think we need ANY
> because of DICTs and since having an ANY type constitutes an escape
> hatch that can make type signatures almost useless in some cases, I'm
> not sure separating type signatures is worth the effort.

The signatures still give us a nice speedup and allow method overloading
in the static-language-to-static-language case at least. On some level
we might consider the signature purely an optimization... well we've
already established that a system with only variant types will work,
that's what Lisp or Python has. It's just an efficiency issue to split
the type from the data.

I think a worthwhile optimization, though.

>  > The NIL type:
>  > 
>  >   NIL doesn't make a hell of a lot of sense as a *type*, really it's a 
>  >   value that's allowed in *some* languages to replace a value of any
>  >   type. I think we need to get rid of DBUS_TYPE_NIL since I can't make 
>  >   any sense out of it.
> 
> Well, I think it can be useful (see above, about optional arguments).

I guess the null value effectively has no type e.g. in C# so making it a
type is logical on some level. It's a type with no meaningful value
data.

I'm a bit concerned about people throwing around variants all the time
just to allow string-or-nil. This could be limited if we effectively
mandate an ANY in the IDL for this case, rather than allowing "string"
in the IDL. That would make string-or-nil all sucky in C/C++, but would
still leave string-or-nil pretty convenient in Python. I don't know.

Further or additional thoughts on all this welcome.

Havoc