[Xesam] Practical applicability issues

Thu Jun 19 14:08:41 PDT 2008

2008/6/19 Arun Raghavan <arunisgod at gmail.com>:
> Hello,
>
> On Thu, Jun 19, 2008 at 2:53 AM, Mikkel Kamstrup Erlandsen
> <mikkel.kamstrup at gmail.com> wrote:
>> 2008/6/18 Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> [...]
>> This will add a new data type to the ontology, which will of course
>> break backwards compat. It is not a big thing for clients since they
>> will now just receive a dbus struct and then have to access a given
>> member of that to get at the data they did before. Server side will
>> have to do a little more work, but not necessarily a big deal of it,
>> depending on how they choose to implement structs.
>
> This could be a big deal. Would servers now be forced to either
> support all fields in the struct or none at all, or would
> vendor.ontology.fields need to be extended to subfields? In the latter
> case, we would need to define "default" values for unsupported fields
> (which is a little fugly to have to deal with in the server).

Whether to drop field support altogether or simply fill in
default values would be up to the implementation. The session property
vendor.ontology.fields would not change; struct members would not be
regarded as "real" fields here. This can of course leave applications a
bit in the dark about whether they can expect the struct fields to
really work. Structs are hard to fit in with the current
introspection props (vendor.*), and I am strongly against changing the
API to be able to include structs. Tricky indeed.

> This is definitely going to make the Beagle Xesam implementation
> messy. Since our ontology is not nearly as complex as what Xesam
> defines, we just store a simple mapping from supported Xesam (leaf)
> fields to Beagle fields. If struct support is mandatory, we will need
> to add additional information to remember what fields are actually
> structs or part of some struct, aggregate the fields we support, fill
> in the rest with default values, and then return them. Clearly
> non-trivial and messy.

I agree that it is a bit messy. It will not require rocket science to
write this, but it is still non-trivial, as you point out. It also
incurs additional processing overhead when returning field values.
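
To make the extra server-side work concrete, here is a rough Python
sketch of the aggregation Arun describes. It is only an illustration
under my own assumptions: the struct layout, the "city" and "country"
members, the "backend:*" names and the "field.member" keys are all made
up for the example, and nothing here is taken from the spec or from
Beagle.

```python
# Illustrative sketch only. All names below are hypothetical and are not
# taken from the Xesam spec or from Beagle.

# Struct field -> ordered list of (member name, default value).
STRUCTS = {
    "xesam:workAddress": [
        ("postalAddress", ""),
        ("city", ""),
        ("country", ""),
    ],
}

# The simple mapping servers keep today: supported leaf field -> backend
# field. Struct members are keyed as "field.member" (an assumption).
LEAF_TO_BACKEND = {
    "xesam:title": "backend:title",
    "xesam:workAddress.postalAddress": "backend:work_street",
}


def build_hit_value(field, backend_doc):
    """Return the value for one field named in hit.fields."""
    members = STRUCTS.get(field)
    if members is None:
        # Plain leaf field: a single lookup, as before.
        backend_field = LEAF_TO_BACKEND.get(field)
        return backend_doc.get(backend_field, "") if backend_field else ""

    # Struct field: aggregate supported members, default the rest.
    values = []
    for name, default in members:
        backend_field = LEAF_TO_BACKEND.get(field + "." + name)
        if backend_field is None:
            values.append(default)
        else:
            values.append(backend_doc.get(backend_field, default))
    return tuple(values)


doc = {"backend:title": "Report", "backend:work_street": "1 Main St"}
print(build_hit_value("xesam:title", doc))
print(build_hit_value("xesam:workAddress", doc))
```

A client naming xesam:workAddress in hit.fields would get the whole
tuple back and pick out the member it wants by position. The extra
table and the per-member loop are the bookkeeping and overhead under
discussion.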

>> <contains>
>>  <field name="workAddress" member="postalAddress"/>
>> </contains>
>>
>> Leaving out the 'member' attribute would mean all struct members. This
>> would not work on nested structs though (should we decide to support
>> those too).
>
> How would this be done in the User Query Language? "field.member", I presume?

Either that, or not at all.

> [...]
>> Next problem is our hierarchical query matching (ie that a query in
>> xesam:author should also match in child fields of xesam:author), it
>> could be that structs would somehow mess this up. This is also a
>> non-issue as it turns out. If we require that struct members are also
>> fully fledged fields (e.g., continuing our example, the postalAddress
>> member of the Address struct would be a child of
>> xesam:physicalAddress), we would have fully normal query expansion
>> given a query into a struct field (like the query fragment above).
>
> I was given to understand that "abstract" fields (fields from which
> other fields are derived) would not be part of hit.fields. Has
> this changed recently or was I wrong?

This has not changed. There was a discussion about this back in
February though:
http://lists.freedesktop.org/archives/xesam/2008-February/000098.html
(please post comments on this issue in that thread). It was, however,
inconclusive. It swayed me to believe that we should scrap the concept
of abstract fields (for reasons mentioned in that thread), but there
has not been any formal decision on this.
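
Coming back to the query expansion point quoted above, here is a
minimal sketch of what I mean by "fully normal" expansion once struct
members are ordinary fields in the hierarchy. The parent/child table is
invented for illustration (only xesam:physicalAddress and the
postalAddress member come from this thread), and the "field.member"
naming is an assumption.

```python
# Illustrative sketch only. The hierarchy below is hypothetical;
# "xesam:homeAddress" in particular is made up for the example.
CHILDREN = {
    "xesam:physicalAddress": [
        "xesam:workAddress.postalAddress",
        "xesam:homeAddress.postalAddress",
    ],
}


def expand_field(field, children=CHILDREN):
    """Return the field followed by all of its descendants, depth first."""
    result = [field]
    for child in children.get(field, []):
        result.extend(expand_field(child, children))
    return result


# A query on the parent field is matched against every expanded field.
for name in expand_field("xesam:physicalAddress"):
    print(name)
```

Since struct members appear in the table like any other child field,
the expansion code does not need to know that structs exist at all.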

>> Last issue I've thought about - hit data retrieval. To keep it short I
>> don't think we should change the API or session props. This means that
>> naming a struct field in hit.fields will always retrieve the entire
>> struct. Retrieval of individual struct members is hence not possible.
>
> I think that this could be limiting, unless the concerns I've raised
> above are resolved.
>
> [...]
>> MY OPINION
>> As stated when I started I am greatly concerned about breaking
>> anything at this point. OTOH there is a reason why it is a RC and not
>> 1.0. For me to give +1 it would require:
>>
>>  * We can add it without causing too much work
>>  * All server maintainers give +1
>>  * It has absolute minimal impact on the API
>>  * We document meticulously what changed and how consumers should react
>>  * It addresses (almost) all issues Evgeny has raised
>>
>> I think that what I've outlined above meets these, but I can very well
>> have missed something. What I am most unsure about is whether this
>> model actually solves our issues and if people are going to have
>> troubles implementing lists of structs.
>
> On a more general note, I (and I think I represent the sentiment in
> the Beagle camp here) feel that the spec is starting to deviate from
> its original goal, which AFAIK, was to have a standard way to talk to
> desktop search tools. I am not against the idea of being able to do
> more advanced stuff with the search spec, but the spec has to define
> some common minimum feature-set for servers to be able to meaningfully
> support Xesam and this bar seems to be getting raised significantly in
> this proposal.
>
> IMO, one of the coolest things about the spec, and particularly the
> API and query languages, is the elegant simplicity. Somehow, this
> change seems to be undoing that simplicity to some extent.

You are right. This is a classic example of feature creep; I fully admit that.

I agree that one of the appealing things about the spec is its apparent
simplicity, and we should strive to keep that intact. As I mentioned
earlier, it has been said a few times that the ontology
is a bit complex (and I think you are of that opinion too?), and
structs might be able to make it simpler in some sense. This would
sacrifice some of the API implementation simplicity, as you point
out.

> Again, I do not want to imply that the spec should be constrained, but
> there should be some way in which the simplicity paradigm can be
> maintained in parallel with more advanced features/usage.

This has been the way hitherto (witness session props and query
language extensions). Perhaps there is another way to accomplish the
same goals, namely:

 * Easier standards compliance of the ontology
 * Simplification of the ontology
 * Simplified relation traversal (querying of struct members as I've described)

The last two points are heavily interconnected. It is the ability to
traverse a single relation that will allow us to for example split out
an Address object and have links in the Person object to these.

In conclusion, let me point out that I am also very keen on keeping it
simple, but the complaints about the ontology concern me. I don't
think it would be catastrophic to keep the ontology more
or less as is (apart from dropping abstract fields).

Cheers,
Mikkel