[Xesam] Practical applicability issues

Thu Jun 19 13:27:34 PDT 2008

Hello,

On Thu, Jun 19, 2008 at 2:53 AM, Mikkel Kamstrup Erlandsen
<mikkel.kamstrup at gmail.com> wrote:
> 2008/6/18 Evgeny Egorochkin <phreedom.stdin at gmail.com>:
[...]
> This will add a new data type to the ontology, which will of course
> break backwards compat. It is not a big thing for clients since they
> will now just receive a dbus struct and then have to access a given
> member of that to get at the data they did before. Server side will
> have to do a little more work, but not necessarily a big deal of it,
> depending on how they choose to implement structs.

This could be a big deal. Would servers now be forced to either
support all fields in the struct or none at all, or would
vendor.ontology.fields need to be extended to subfields? In the latter
case, we would need to define "default" values for unsupported fields
(which is a little fugly to have to deal with in the server).

This is definitely going to make the Beagle Xesam implementation
messy. Since our ontology is not nearly as complex as what Xesam
defines, we just store a simple mapping from supported Xesam (leaf)
fields to Beagle fields. If struct support is mandatory, we will need
to add additional information to remember what fields are actually
structs or part of some struct, aggregate the fields we support, fill
in the rest with default values, and then return them. Clearly
non-trivial and messy.
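
To make the extra bookkeeping concrete, here is a rough Python sketch of
what I mean. It is only an illustration, not Beagle code: the Beagle field
names, xesam:title, the "city" and "country" members, and the empty-string
default are all made up for the example.

```python
# Hypothetical sketch: these tables and names are illustrative only and do
# not come from Beagle or from the Xesam spec.

# Today: a flat mapping from supported Xesam leaf fields to Beagle fields.
LEAF_MAP = {
    "xesam:title": "dc:title",
    "xesam:postalAddress": "beagle:address",
}

# With mandatory structs: extra tables recording which members make up each
# struct and which leaf field (if any) backs each member.
STRUCT_MEMBERS = {
    "xesam:workAddress": ["postalAddress", "city", "country"],
}
MEMBER_TO_LEAF = {
    ("xesam:workAddress", "postalAddress"): "xesam:postalAddress",
}
DEFAULT = ""  # assumed default for members the server does not support


def build_struct(struct_field, beagle_hit):
    """Aggregate the supported members and pad the rest with defaults."""
    values = []
    for member in STRUCT_MEMBERS[struct_field]:
        leaf = MEMBER_TO_LEAF.get((struct_field, member))
        beagle_field = LEAF_MAP.get(leaf)
        if beagle_field is None:
            # Member is not backed by anything we index.
            values.append(DEFAULT)
        else:
            values.append(beagle_hit.get(beagle_field, DEFAULT))
    return tuple(values)


work_address = build_struct("xesam:workAddress", {"beagle:address": "1 Main St"})
```

Even in this toy form, the server needs two more tables and a notion of
defaults on top of the single mapping it keeps today.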

> <contains>
>  <field name="workAddress" member="postalAddress"/>
> </contains>
>
> Leaving out the 'member' attribute would mean all struct members. This
> would not work on nested structs though (should we decide to support
> those too).

How would this be done in the User Query Language? "field.member", I presume?

[...]
> Next problem is our hierarchical query matching (ie that a query in
> xesam:author should also match in child fields of xesam:author), it
> could be that structs would somehow mess this up. This is also a
> non-issue as it turns out. If we require that struct members are also
> full-fledged fields, e.g. continuing our example, the postalAddress
> member of the Address struct would be a child of
> xesam:physicalAddress, we would have fully normal query expansion
> given a query into a struct field (like the query fragment above).

I was given to understand that "abstract" fields (fields from which
some other fields are derived) would not be part of hit.fields. Has
this changed recently or was I wrong?
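
For reference, this is roughly how I picture the expansion and the
abstract-field rule interacting. It is a Python sketch under my own
assumptions: the parent/child table is hypothetical, and treating
xesam:physicalAddress as abstract is a guess made for illustration, not
something the spec states.

```python
# Hypothetical parent -> children table. Only the physicalAddress /
# postalAddress relationship is taken from the example in this thread.
CHILDREN = {
    "xesam:physicalAddress": ["xesam:postalAddress"],
    "xesam:postalAddress": [],
}

# Assumption for illustration: the parent field is "abstract".
ABSTRACT = {"xesam:physicalAddress"}


def expand(field):
    """Return the field followed by all of its descendants, depth first."""
    result = [field]
    for child in CHILDREN.get(field, []):
        result.extend(expand(child))
    return result


def retrievable(fields):
    """Drop abstract fields, as I understood the hit.fields rule."""
    return [f for f in fields if f not in ABSTRACT]


query_fields = expand("xesam:physicalAddress")
hit_fields = retrievable(query_fields)
```

If that understanding is right, a query on the parent expands to the
struct member, but only the member itself could be requested in hit.fields.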

> Last issue I've thought about - hit data retrieval. To keep it short I
> don't think we should change the API or session props. This means that
> naming a struct field in hit.fields will always retrieve the entire
> struct. Retrieval of individual struct members is hence not possible.

I think that this could be limiting, unless the concerns I've raised
above are resolved.

[...]
> MY OPINION
> As stated when I started I am greatly concerned about breaking
> anything at this point. OTOH there is a reason why it is a RC and not
> 1.0. For me to give +1 it would require:
>
>  * We can add it without causing too much work
>  * All server maintainers give +1
>  * It has absolute minimal impact on the API
>  * We document meticulously what changed and how consumers should react
>  * It addresses (almost) all issues Evgeny has raised
>
> I think that what I've outlined above meets these, but I can very well
> have missed something. What I am most unsure about is whether this
> model actually solves our issues and if people are going to have
> troubles implementing lists of structs.

On a more general note, I (and I think I represent the sentiment in
the Beagle camp here) feel that the spec is starting to deviate from
its original goal, which AFAIK, was to have a standard way to talk to
desktop search tools. I am not against the idea of being able to do
more advanced stuff with the search spec, but the spec has to define
some common minimum feature-set for servers to be able to meaningfully
support Xesam and this bar seems to be getting raised significantly in
this proposal.

IMO, one of the coolest things about the spec, and particularly the
API and query languages, is the elegant simplicity. Somehow, this
change seems to be undoing that simplicity to some extent.

Again, I do not want to imply that the spec should be constrained, but
there should be some way in which the simplicity paradigm can be
maintained in parallel with more advanced features/usage.

Cheers,
-- 
Arun Raghavan
(http://nemesis.accosted.net)
v2sw5Chw4+5ln4pr6$OFck2ma4+9u8w3+1!m?l7+9GSCKi056
e6+9i4b8/9HTAen4+5g4/8APa2Xs8r1/2p5-8 hackerkey.com