[XESAM] API simplification?
Mikkel Kamstrup Erlandsen
mikkel.kamstrup at gmail.com
Thu Jul 19 22:50:22 PDT 2007
2007/7/19, Jos van den Oever <jvdoever at gmail.com>:
> 2007/7/16, Mikkel Kamstrup Erlandsen <mikkel.kamstrup at gmail.com>:
> > I have a few suggestions for updates to the xesam search spec.
> > * API:
> > Remove the session properties search.blocking and search.live. These seem
> > to cause more confusion than I anticipated. These can be emulated in the
> > client side lib as far as my scribblings can tell. Another solution might
> > just be better documentation of course...
> > Some of you now have actual experience with these, what is your feel?
> > The reason for having these properties in the first place was to allow
> > easier usage of the D-Bus interface directly - i.e. not via a client lib.
> > What this would mean for the api methods:
> > * GetHits should always block until the requested number of hits has been
> > found or the entire index has been searched (in which case the
> > signal will be emitted too).
> > * CountHits should always block until the entire index has been searched
> > * No other methods should block
> > * Query Language:
> > I suggest we remove the "type" attribute on the query element. You can
> > specify the Category- or StoredAs fields in your selectors.
> I completely agree on all suggestions.
> One more suggestion: the minimal interval between result signals
> should be sane or settable.
Valid point. To avoid signal spamming, I take it. How about a session
property hit.batch.size: an integer determining how many hits the
server should collect before emitting HitsAdded? In case the entire index
has been searched but < hit.batch.size hits have been found, HitsAdded(num_hits)
should be emitted right before SearchDone.
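
A rough sketch of the batching behaviour I have in mind, in Python for brevity. The HitBatcher class and the emit callback are made up for illustration (emit stands in for sending the D-Bus signal); only hit.batch.size, HitsAdded and SearchDone come from the proposal itself.

```python
from typing import Callable, List, Optional, Tuple


class HitBatcher:
    """Hypothetical server-side helper: emit HitsAdded once per full batch."""

    def __init__(self, batch_size: int,
                 emit: Callable[[str, Optional[int]], None]) -> None:
        if batch_size < 1:
            raise ValueError("hit.batch.size must be a positive integer")
        self.batch_size = batch_size
        self.emit = emit      # stand-in for sending a D-Bus signal
        self.pending = 0      # hits found since the last HitsAdded

    def add_hit(self) -> None:
        """Record one new hit and emit HitsAdded when the batch is full."""
        self.pending += 1
        if self.pending >= self.batch_size:
            self.emit("HitsAdded", self.pending)
            self.pending = 0

    def finish(self) -> None:
        """Entire index searched: flush a partial batch, then SearchDone."""
        if self.pending > 0:
            self.emit("HitsAdded", self.pending)
            self.pending = 0
        self.emit("SearchDone", None)


# Usage: 7 hits with hit.batch.size = 3
signals: List[Tuple[str, Optional[int]]] = []
batcher = HitBatcher(3, lambda name, count: signals.append((name, count)))
for _ in range(7):
    batcher.add_hit()
batcher.finish()
```

With a batch size of 3 and 7 hits, the client would see two full batches, one partial batch, and then SearchDone.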
> On the topic of remembering the hits.
> In an ideal world, the server could be clever and get the right file from
> the hit number. In reality, this is quite hard. Atm the server should
> keep a vector with uris internally. I think we should allow the server
> to have a sane maximum of hits that are retrievable. E.g. CountHits
> might return 1 million, but you would only be able to retrieve the
> first 100k.
This makes sense given that the scoring algorithms on servers are good
enough. But judging by the extraordinary amount of talent we have in the
server-side dev camp this is no problem of course :-)
How about a read-only session property search.maxhits? We could specify that
in order to be xesam compliant this value must be > 1000 or something - just
so that apps won't have to do sanity checks galore.
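
To illustrate how a retrieval cap could sit next to CountHits, here is a minimal Python sketch. The CappedSearch class and its method names are hypothetical, and the floor of 1000 is just the "or something" figure from above, not a decided value.

```python
from typing import List

# Assumed compliance floor, taken from the "> 1000 or something" suggestion.
MIN_COMPLIANT_MAXHITS = 1000


class CappedSearch:
    """Hypothetical search whose retrievable hits are capped by search.maxhits."""

    def __init__(self, total_matches: int, max_hits: int) -> None:
        if max_hits <= MIN_COMPLIANT_MAXHITS:
            raise ValueError("search.maxhits must be > %d" % MIN_COMPLIANT_MAXHITS)
        self.total_matches = total_matches
        self.max_hits = max_hits      # read-only from the client's point of view
        self.cursor = 0               # index of the next hit to hand out

    def count_hits(self) -> int:
        """CountHits may report more hits than can actually be retrieved."""
        return self.total_matches

    def retrievable_hits(self) -> int:
        return min(self.total_matches, self.max_hits)

    def get_hits(self, num: int) -> List[int]:
        """Return up to num hit numbers, never going past the cap."""
        end = min(self.cursor + num, self.retrievable_hits())
        hits = list(range(self.cursor, end))
        self.cursor = end
        return hits


# Usage: 1 million matches, but only the first 100k are retrievable.
search = CappedSearch(total_matches=1_000_000, max_hits=100_000)
first = search.get_hits(5)
```

A client can then read search.maxhits once and rely on the guaranteed floor instead of probing for the server's limit.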
> This is actually a scalability issue. We should allow the search to
> modify the vector when the hit has not yet been retrieved and only
> guarantee reproducibility for hits that were retrieved already. In
> combination with a maximum history size this would handle most
> performance problems.
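
To make the retrieved-versus-unretrieved distinction concrete, a small Python sketch of such a vector follows. HitVector, rerank_tail and max_history are names invented here for illustration; this is one possible reading of the idea, not how any existing server does it.

```python
from typing import Iterable, List


class HitVector:
    """Hypothetical uri vector: retrieved hits are frozen, the rest may change."""

    def __init__(self, uris: Iterable[str], max_history: int) -> None:
        self.max_history = max_history
        self.uris: List[str] = list(uris)[:max_history]
        self.retrieved = 0    # hits in [0, retrieved) are guaranteed stable

    def rerank_tail(self, new_order: Iterable[str]) -> None:
        """Replace the not-yet-retrieved hits; the retrieved prefix is untouched."""
        frozen = self.uris[:self.retrieved]
        seen = set(frozen)
        tail = [uri for uri in new_order if uri not in seen]
        self.uris = (frozen + tail)[:self.max_history]

    def get_hits(self, num: int) -> List[str]:
        """Hand out the next num hits and freeze them."""
        end = min(self.retrieved + num, len(self.uris))
        batch = self.uris[self.retrieved:end]
        self.retrieved = end
        return batch

    def get_hit(self, index: int) -> str:
        """Reproducible lookup, only valid for hits already retrieved."""
        if not 0 <= index < self.retrieved:
            raise IndexError("hit %d has not been retrieved yet" % index)
        return self.uris[index]


# Usage: history capped at 3 uris; "a" is retrieved, then the tail is re-ranked.
vector = HitVector(["a", "b", "c", "d"], max_history=3)
first_batch = vector.get_hits(1)
vector.rerank_tail(["d", "c", "a", "b"])
second_batch = vector.get_hits(5)
```

Here hit 0 stays "a" after the re-rank, while the unretrieved tail is free to change within the history limit.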
Yeah, we are handling the exact same problems at work :-) I think we have
solved it here (at least up to 100M or so), but it is not exactly client side