[XESAM] API simplification?

Thu Jul 19 22:50:22 PDT 2007

2007/7/19, Jos van den Oever <jvdoever at gmail.com>:
>
> 2007/7/16, Mikkel Kamstrup Erlandsen <mikkel.kamstrup at gmail.com>:
> > I have a few suggestions for updates to the xesam search spec.
> >
> > * API:
> > Remove the session properties search.blocking and search.live. These seemed
> > to cause more confusion than I anticipated. These can be emulated in the
> > client side lib as far as my scribblings can tell. Another solution might
> > just be better documentation of course...
> >
> > Some of you now have actual experience with these, what is your feel?
> >
> > The reason for having these properties in the first place was to allow
> > easier usage of the dbus interface directly - ie not via a client lib.
> >
> > What this would mean for the api methods:
> >  * GetHits should always block until the requested number of hits has been
> > found or the entire index has been searched (in which case the SearchDone
> > signal will be emitted too).
> >  * CountHits should always block until the entire index has been searched
> >  * No other methods should block
> >
> > * Query Language:
> > I suggest we remove the "type" attribute on the query element. You can just
> > specify the Category- or StoredAs fields in your selectors.
>
> I completely agree on all suggestions.
> One more suggestion: the minimal interval between result signals
> should be sane or settable.

Valid point. To avoid signal spamming, I take it. How about a session
property hit.batch.size that is an integer determining how many hits the
server should collect before emitting HitsAdded? In case the entire index
has been searched but fewer than hit.batch.size hits have been found,
HitsAdded(num_hits) should be emitted right before SearchDone.
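
A rough sketch of that batching rule in Python, just to make the proposal concrete. HitBatcher, add_hit and finish are made-up names for illustration, not part of the xesam spec; only hit.batch.size, HitsAdded and SearchDone come from the discussion above.

```python
class HitBatcher:
    """Collects hits and reports them in batches (hypothetical helper)."""

    def __init__(self, batch_size, on_hits_added, on_search_done):
        if batch_size < 1:
            raise ValueError("hit.batch.size must be a positive integer")
        self._batch_size = batch_size          # stands in for hit.batch.size
        self._on_hits_added = on_hits_added    # stands in for the HitsAdded signal
        self._on_search_done = on_search_done  # stands in for the SearchDone signal
        self._pending = 0

    def add_hit(self):
        """Called by the (imaginary) indexer each time a hit is found."""
        self._pending += 1
        if self._pending >= self._batch_size:
            self._flush()

    def finish(self):
        """Called once the entire index has been searched."""
        if self._pending > 0:
            # Fewer than batch_size hits are left: report them right before SearchDone.
            self._flush()
        self._on_search_done()

    def _flush(self):
        count = self._pending
        self._pending = 0
        self._on_hits_added(count)


events = []
batcher = HitBatcher(
    3,
    lambda n: events.append(("HitsAdded", n)),
    lambda: events.append(("SearchDone",)),
)
for _ in range(7):
    batcher.add_hit()
batcher.finish()
```

With a batch size of 3 and 7 hits, events ends up holding two full batches, one partial batch of 1, and then SearchDone.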

> On the topic of remembering the hits.
> In ideal world, the server could be clever and get the right file from
> the hit number. In reality, this is quite hard. Atm the server should
> keep a vector with uris internally. I think we should allow the server
> to have a sane maximum of hits that are retrievable. E.g. CountHits
> might return 1 million, but you would only be able to retrieve the
> first 100k.

This makes sense given that the scoring algorithms on servers are good
enough. But judging by the extraordinary amount of talent we have in the
server-side dev camp this is no problem of course :-)

How about a read-only session property search.maxhits? We could specify that
in order to be xesam compliant this value must be > 1000 or something, just
so that apps won't have to do sanity checks galore.
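
To illustrate how a client would see this, here is a small Python sketch. SearchSession, get_hit_ids and MIN_COMPLIANT_MAXHITS are hypothetical names, and the threshold of 1000 is only the placeholder floated above, not an agreed value. Hits are represented by their integer positions to keep the example short.

```python
MIN_COMPLIANT_MAXHITS = 1000  # placeholder from the "> 1000 or something" suggestion


class SearchSession:
    """Toy session with a read-only cap on retrievable hits (hypothetical)."""

    def __init__(self, total_hits, max_hits):
        if max_hits <= MIN_COMPLIANT_MAXHITS:
            raise ValueError("search.maxhits too small to be compliant")
        self._total_hits = total_hits
        self._max_hits = max_hits

    @property
    def max_hits(self):
        # Read-only: there is deliberately no setter, mirroring search.maxhits.
        return self._max_hits

    def count_hits(self):
        # The count may exceed what can actually be retrieved.
        return self._total_hits

    def get_hit_ids(self, offset, count):
        # Never hand out hits beyond the cap or beyond the real total.
        end = min(offset + count, self._total_hits, self._max_hits)
        return list(range(offset, end))


session = SearchSession(total_hits=1_000_000, max_hits=100_000)
```

Here count_hits() reports 1 million, while get_hit_ids() stops handing out hits at position 100000, so an app can check max_hits once up front instead of guarding every call.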

> This is actually a scalability issue. We should allow the search to
> modify the vector when the hit has not yet been retrieved and only
> guarantee reproducibility for hits that were retrieved already. In
> combination with a maximum history size this would handle most
> performance problems.
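
One way to read the rule in the quoted paragraph, sketched in Python: hits already handed to the client are frozen, everything after them may still be reordered. HitVector, insert_ranked and retrieve are invented names, and clamping a late insert to the frozen boundary is an assumption about how a server might behave, not something the spec says.

```python
class HitVector:
    """Server-side list of hit URIs with a frozen, already-retrieved prefix."""

    def __init__(self):
        self._uris = []
        self._frozen = 0  # number of leading hits already handed to the client

    def insert_ranked(self, position, uri):
        # Hits the client has seen must keep their index, so a better-ranked
        # late arrival is pushed back to the first unfrozen slot.
        position = max(position, self._frozen)
        position = min(position, len(self._uris))
        self._uris.insert(position, uri)

    def retrieve(self, count):
        # Hand out the next `count` hits and freeze them.
        start = self._frozen
        end = min(start + count, len(self._uris))
        self._frozen = end
        return self._uris[start:end]

    def get(self, index):
        # Reproducible lookup, only guaranteed for hits retrieved earlier.
        if index < 0 or index >= self._frozen:
            raise IndexError("hit has not been retrieved yet")
        return self._uris[index]


hits = HitVector()
hits.insert_ranked(0, "file:///a")
hits.insert_ranked(1, "file:///b")
first = hits.retrieve(1)             # freezes "file:///a" at index 0
hits.insert_ranked(0, "file:///c")   # would rank first, but index 0 is frozen
rest = hits.retrieve(2)
```

A maximum history size would then simply bound how large the frozen prefix is allowed to grow.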

Yeah, we are handling the exact same problems at work :-) I think we have
solved it here (at least up to 100M or so), but it is not exactly client side
software...

Cheers,
Mikkel