[Xesam] Practical applicability issues

Jamie McCracken jamie.mccrack at googlemail.com
Wed Jun 18 14:37:23 PDT 2008


On Wed, 2008-06-18 at 23:23 +0200, Mikkel Kamstrup Erlandsen wrote:
> 2008/6/18 Evgeny Egorochkin <phreedom.stdin at gmail.com>:
> > Hi all.
> >
> > I'd like to raise again a long-standing issue of structure support.
> >
> > Some of the missing features that require structure support are actually being
> > _requested by potential users_ (who are encouraged to chime in ;-) :
> >
> > * Geo tagging and geo information for addresses;
> >
> > * Status and presence information for contacts;
> >
> > * Adequate support of iCalendar spec, which is currently more or less
> > nonexistent.
> >
> > Needless to say, support means not only storage and retrieval, but also
> > filters/query conditions.
> >
> > This can't be resolved merely by introducing some string format for a
> > field that represents structured data, since useful features such as geo
> > proximity search won't work with it.
> >
> > I consider this to be a blocker issue that significantly limits applicability
> > of xesam and interoperability with platforms that don't have structure
> > support limitations.
> 
> Hi all,
> 
> A longish mail. Let me try to give you my understanding of the
> problem, its implications, some technical notes, and with all these
> things cleared up I will give you my detailed opinion on this matter.
> Let me first state that I am a bit torn on this issue. On one side I
> can see that we have some impractical limitations, and on the other
> side we already have implementations of Xesam and a stability promise
> to keep. This is an important issue, so please hang on.
> 
> No matter if we include this or not, it is important for all to
> understand the implications. So I urge you to discuss this properly.
> 
> EXPLANATION:
> I think some more explanation of the problem would be useful.
> Say we want to add geo tagging to the address(es) of the Person class.
> Currently Person has the address-related fields:
> 
> workPostalAddress (list of strings)
> homePostalAddress (list of strings)
> 
> Adding geo info gives workGeoN, workGeoE, homeGeoN, homeGeoE. You can see
> where this is going: from 2 to 6 fields, and Person addresses are not
> the only place where this scheme plays out. Assume we instead define a
> struct (which Xesam currently does not support) called Address with
> the following fields:
> 
> postalAddress
> geoN
> geoE
> 
> Then person would be much simpler and structured (pun unavoidable):
> 
> workAddress (struct Address)
> homeAddress (struct Address)
> 
> COMPATIBILITY
> This will add a new data type to the ontology, which will of course
> break backwards compat. It is not a big thing for clients since they
> will now just receive a dbus struct and then have to access a given
> member of that to get at the data they did before. Server side will
> have to do a little more work, but not necessarily a great deal of it,
> depending on how they choose to implement structs.
> 
> We need to be able to query struct members also. This can luckily be
> done in a backwards compatible way (unless someone employs a very
> strict parser). The simplest solution is to add a 'member' attribute
> to the 'field' element. That way you can specify:
> 
> <contains>
>   <field name="workAddress" member="postalAddress"/>
> </contains>
> 
> Leaving out the 'member' attribute would mean all struct members. This
> would not work on nested structs though (should we decide to support
> those too).
> 
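A minimal sketch, in Python, of how a server might interpret the proposed
'member' attribute. Only the 'field' element and its 'name' and 'member'
attributes come from the proposal above; the STRUCT_MEMBERS table and the
members_to_search function are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical struct layout, taken from the Address example above.
STRUCT_MEMBERS = {
    "workAddress": ["postalAddress", "geoN", "geoE"],
    "homeAddress": ["postalAddress", "geoN", "geoE"],
}

def members_to_search(field_xml):
    """Return (field name, struct members to search) for a <field> element."""
    elem = ET.fromstring(field_xml)
    name = elem.get("name")
    member = elem.get("member")
    all_members = STRUCT_MEMBERS.get(name, [])
    if member is None:
        # No 'member' attribute: search every member of the struct.
        return name, list(all_members)
    if member not in all_members:
        raise ValueError(f"{name} has no member {member!r}")
    return name, [member]

print(members_to_search('<field name="workAddress" member="postalAddress"/>'))
print(members_to_search('<field name="homeAddress"/>'))
```

A parser that ignores unknown attributes would treat the first query as a
plain query on workAddress, which is why the change is mostly backwards
compatible.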
> TECHNICAL
> I had several technical concerns when I first started thinking about
> this. But as you will see I think all of them are solvable.
> 
> The first was how to implement structs in a flat field store (e.g.
> Lucene). This is not that tricky. For each struct member in each
> category the following field name will be unique:
> 
>   <Cat.name>_<Structname>_<fieldname>
> 
> For example:
> 
>   Person_workAddress_postalAddress
> 
> Then if a query should look at a struct field simply do query
> expansion like you would on our current hierarchy of fields. When a
> struct field should be retrieved the server would have to collect the
> relevant fields and roll them into a real dbus struct.
> 
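A rough sketch of the flattening scheme described above, in Python. The
helper names (flat_name, flatten, unflatten) and the sample values are made
up for illustration, and a plain tuple stands in for the real dbus struct.

```python
def flat_name(category, struct_field, member):
    """Build the unique flat field name <Cat.name>_<Structname>_<fieldname>."""
    return f"{category}_{struct_field}_{member}"

def flatten(category, struct_field, struct):
    """Turn one struct value into flat fields for the index."""
    return {flat_name(category, struct_field, m): v for m, v in struct.items()}

def unflatten(category, struct_field, members, flat_doc):
    """Collect the flat fields again and roll them into one struct (a tuple here)."""
    return tuple(flat_doc.get(flat_name(category, struct_field, m)) for m in members)

# Illustrative values only.
address = {"postalAddress": "1 Example Street", "geoN": 55.0, "geoE": 12.0}
doc = flatten("Person", "workAddress", address)
print(sorted(doc))
print(unflatten("Person", "workAddress", ["postalAddress", "geoN", "geoE"], doc))
```

The names are only ever built from their parts, never parsed back, so a '_'
inside a category or member name does not cause ambiguity in this sketch.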
> Next problem is our hierarchical query matching (i.e. that a query in
> xesam:author should also match in child fields of xesam:author); it
> could be that structs would somehow mess this up. This is also a
> non-issue as it turns out. If we require that struct members are also
> full-fledged fields (continuing our example, the postalAddress
> member of the Address struct would be a child of
> xesam:physicalAddress), we would have fully normal query expansion
> given a query into a struct field (like the query fragment above).
> 
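A small sketch of the query expansion argued for above, in Python, assuming
struct members are registered as ordinary child fields. The PARENT table and
the expand function are hypothetical; only xesam:physicalAddress and the
flat member names come from the text.

```python
# Hypothetical parent links: child field -> parent field.
PARENT = {
    "Person_workAddress_postalAddress": "xesam:physicalAddress",
    "Person_homeAddress_postalAddress": "xesam:physicalAddress",
}

def expand(field):
    """Return the field plus every field that descends from it."""
    result = [field]
    for child, parent in PARENT.items():
        if parent == field:
            result.extend(expand(child))
    return result

print(expand("xesam:physicalAddress"))
print(expand("Person_workAddress_postalAddress"))
```

A query on the parent field reaches both struct members, while a query on
one member stays confined to that member, which is the same behaviour as
for non-struct fields.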
> Next technical problem - nested structs and lists of structs. While
> storing nested structs should work fine as I outline above, I see no
> clear way to express them in the query language. So I would probably
> not recommend them. Then about lists of structs - some implementations,
> like Tracker I believe, store lists in a different way than normal
> fields. This might make it tricky to support lists of structs (which we
> would like if we want to allow multiple home addresses per person).
> 
> Last issue I've thought about - hit data retrieval. To keep it short I
> don't think we should change the API or session props. This means that
> naming a struct field in hit.fields will always retrieve the entire
> struct. Retrieval of individual struct members is hence not possible.
> 
> PROS
>  * Some people close to the project have voiced concern about the
> complexity and size of the ontology. Structs might induce some more
> order and coherence.
>  * Able to be data-compatible with other de facto standards, namely vCard
>  * Can be added without too much work (we are not talking about
> relation-traversing queries!)
> 
> CONS
>  * Break ontology (a fair bit) and query language (very little)
>  * Jeopardize project credibility by changing low level stuff like this in an RC
>  * More work
> 
> MY OPINION
> As I stated at the start, I am greatly concerned about breaking
> anything at this point. OTOH there is a reason why it is an RC and not
> 1.0. For me to give +1 it would require:
> 
>  * We can add it without causing too much work
>  * All server maintainers give +1
>  * It has absolute minimal impact on the API
>  * We document meticulously what changed and how consumers should react
>  * It addresses (almost) all issues Evgeny has raised
> 
> I think that what I've outlined above meets these, but I may very well
> have missed something. What I am most unsure about is whether this
> model actually solves our issues and whether people are going to have
> trouble implementing lists of structs.


I (and Tracker) don't have a problem with structs per se, but nested
structs (structs within structs) would cause complications all round in
Tracker. If we can limit it to no nested structs, then I think it
would be OK.

It should be post Xesam 1.0, obviously.

jamie


