simple search api (was Re: mimetype standardisation by testsets)

Mikkel Kamstrup Erlandsen mikkel.kamstrup at
Sat Dec 30 22:16:18 EET 2006

2006/12/24, Jean-Francois Dockes <jean-francois.dockes at>:
> "James \"Doc\" Livingston" writes:
> > On Thu, 2006-12-21 at 19:07 +0100, Jean-Francois Dockes wrote:
> > > About 2), URI *is* an appropriate handle, and probably the best as long as
> > > we can't guarantee the stability of the result set (that is: *if* we need
> > > separate Query() and GetSnippets() calls, *then* the URI is probably the
> > > best identifier to ensure consistency).
> >
> > What if the URI of a given item can change? I'm thinking in particular of
> > the backends which handle "file got moved" in ways other than the simple
> > "remove old and add new" method.
> >
> > If you had to retrieve the URI as a property rather than it being a
> > unique identifier there would still obviously be some issues for things
> > that are using the simple non-live interface, but potentially having
> > stale data is what you get from using a non-live query.
> >
> > Using an old URI as the unique identifier to retrieve other data (like
> > snippets or something) could lead to some odd situations.

Which odd situations? If we are just talking about apps trying to get
metadata for non-existent files, I think this is something that the app
should cope with.

> Oops, yes, you're right, the URI can change too.

Whether a URI can "change" or not is merely a matter of definition. If an
object changes URI, it might as well be regarded as another object
altogether. The end user will see it as a rename/move, but in the API I think
we should go for the delete-create metaphor, meaning that there is a
one-to-one correspondence between objects and URIs.
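
A minimal sketch of the delete-create metaphor, assuming a hypothetical
in-memory index keyed by URI (the index layout, the move() helper and the
event names are illustrative, not part of any proposed API):

```python
# Hypothetical index: URI -> {property: [values...]}
index = {"file:///home/me/a.txt": {"title": ["Notes"]}}

def move(index, old_uri, new_uri):
    """Model a rename as 'delete the old object, create a new one'."""
    props = index.pop(old_uri)   # the object under the old URI is gone
    index[new_uri] = props       # a new object appears under the new URI
    # Clients see two events instead of one "URI changed" event.
    return [("deleted", old_uri), ("created", new_uri)]

events = move(index, "file:///home/me/a.txt", "file:///home/me/b.txt")
```

With this model a URI never refers to two different objects over its
lifetime in the index, which is what allows it to serve as an identifier.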

> Lacking stable item identifiers, the only solution I can see is to have the
> initial query create and return a unique query identifier.

I'm not sure I understand what you mean here...

> We would then rely on the database/index manager to ensure that all
> activity related to this query identifier is either consistent or resulting
> in errors.
> The query string can't be used as a query identifier, except if we strictly
> renounce relating data from different calls. Well, at least we now know why
> it didn't look right :)

I really think it is important that we use the same data structures in the
simple and the live APIs (in this case the query responses). More than that,
I think reusing the same data structures in as many places as possible will
pay off.

The live api needs to be able to add and remove hits dynamically. The bare
minimum needed for this is unique object identifiers with respect to each
Query object. Better would be unique identifiers relative to the Session,
and the best would be globally unique identifiers (that would be uris).

I think the most intuitive query response structure in the live api is a map

  object1_id : { prop1 : [values...], prop2 : [values...] }
  object2_id : { prop1 : [values...], prop2 : [values...] }
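
As a rough illustration of why the map form suits dynamic updates, here is a
sketch of a client-side view of a live query. The LiveResults class and its
method names are assumptions made up for this example; only the
id -> {property: [values]} shape comes from the structure above.

```python
from typing import Dict, List

# object id -> property name -> list of string values
HitMap = Dict[str, Dict[str, List[str]]]

class LiveResults:
    """Hypothetical client-side mirror of a live query's hits."""

    def __init__(self) -> None:
        self.hits: HitMap = {}

    def hits_added(self, new_hits: HitMap) -> None:
        # Adding hits is a plain map merge keyed by object id.
        self.hits.update(new_hits)

    def hits_removed(self, ids: List[str]) -> None:
        # Removal only needs the ids; unknown ids are ignored.
        for object_id in ids:
            self.hits.pop(object_id, None)

live = LiveResults()
live.hits_added({"file:///a": {"title": ["A"]}, "file:///b": {"title": ["B"]}})
live.hits_removed(["file:///a"])
```

Both operations are keyed lookups, so no positional bookkeeping is needed
when the result set changes.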

The simple API is more straightforward, with the return structure being a
_sorted_ list like:

  {"uri": [object1_uri], prop1: [values...], prop2: [values]}
  {"uri": [object2_uri], prop1: [values...], prop2: [values]}

There are a few cases to consider, for example the GetMetadata(in as uris,
in as properties) method in both interfaces. If we use the list-type query
response, the response must be ordered identically to the originally
requested URI list. This introduces some complexity for async requests, where
the app has to keep track of the list of requested URIs until it gets a
response and then pair the results up.
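
The bookkeeping described here might look roughly like the following. This
is only a sketch: the request ids, the pending table and both function names
are assumptions, and the actual asynchronous call is left out.

```python
pending = {}  # request id -> requested URIs, kept until the reply arrives

def send_get_metadata(request_id, uris, properties):
    """Remember the request order; the real async call would be issued here."""
    pending[request_id] = list(uris)

def on_reply(request_id, rows):
    """Pair a positional reply with the URIs that were asked for."""
    uris = pending.pop(request_id)
    if len(rows) != len(uris):
        raise ValueError("reply length does not match the request")
    return dict(zip(uris, rows))

send_get_metadata("r1", ["file:///a", "file:///b"], ["title"])
result = on_reply("r1", [{"title": ["A"]}, {"title": ["B"]}])
```

Note that the client ends up rebuilding a URI-keyed map by hand, and that a
reply whose order differs from the request cannot be detected at all.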

To get around this bookkeeping we could make the uri field mandatory in the
GetMetadata response, but then we'd actually be using the uris as doc ids,
and might as well have used a map in the first place.

I still lean heavily toward the a{sa{sas}} structure (map-like) since it
seems the more flexible of the two. It is also easily reusable for different
purposes (e.g. metadata retrieval), which should make coding easier for the
client apps or bindings.
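
For reference, the D-Bus signature a{sa{sas}} reads as a dictionary from
strings to dictionaries from strings to arrays of strings. In Python terms
that is a dict of dicts of string lists, and one shape check can serve both
query responses and metadata replies. The function name and sample data
below are made up for illustration:

```python
def matches_a_sa_sas(value):
    """Return True if value has the shape {str: {str: [str, ...]}}."""
    return isinstance(value, dict) and all(
        isinstance(object_id, str)
        and isinstance(props, dict)
        and all(
            isinstance(name, str)
            and isinstance(values, list)
            and all(isinstance(v, str) for v in values)
            for name, values in props.items()
        )
        for object_id, props in value.items()
    )

# The same structure works for a query response and a GetMetadata reply.
query_response = {"file:///a": {"title": ["A"], "mime": ["text/plain"]}}
metadata_reply = {"file:///a": {"title": ["A"]}, "file:///b": {"title": ["B"]}}
```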


PS: Happy new year everybody.