[Wasabi] FOSDEM conclusions - finalizing the search spec
Mikkel Kamstrup Erlandsen
mikkel.kamstrup at gmail.com
Wed Mar 14 10:36:44 EET 2007
2007/3/13, jamie <jamiemcc at blueyonder.co.uk>:
>
> On Tue, 2007-03-13 at 20:05 +0100, Mikkel Kamstrup Erlandsen wrote:
> > 2007/3/13, jamie <jamiemcc at blueyonder.co.uk>:
> > On Tue, 2007-03-13 at 21:56 +0800, Fabrice Colin wrote:
> > > On 3/13/07, Mikkel Kamstrup Erlandsen
> > <mikkel.kamstrup at gmail.com> wrote:
> > > > Please give http://freedesktop.org/wiki/WasabiSearchLive a
> > > > good look before we set this in stone. It is the last call
> > if you have any
> > > > objections - I really mean it this time. Anything from
> > criticizing the
> > > > fundamental structure down to nitpicking on the session
> > property names is
> > > > welcome.
> > > >
> > > There's a couple of things I am not clear about :
> > >
> > > - "search.blocking : Whether or not calls will block until
> > the
> > > requested items are available."
> > > Do you really mean this ? Should NewSearch block ad vitam
> > eternam if
> > > there are no
> > > results for the given query ? ;-)
> > >
> > > - "CountHits (in s search, out i count) Returns the current
> > number of
> > > found hits. If
> > > search.blocking==true this call blocks until the index has
> > been fully searched."
> > > Shouldn't this read "if search.live==false this call
> > blocks..." ?
> > >
> > > - "These signals are only used if the session property
> > search.blocking is true."
> > > Again, shouldn't it be "if search.live is true" ?
> > >
> > > - GetState
> > > if the first string is "FULL_INDEX", shouldn't the second
> > string
> > > always be "100" ?
> > >
> > > - signal HitsAdded
> > > is count the number of new hits, or the new number of hits ?
> > I assume the latter
> > > since the example at the bottom shows a call to
> > "GetHits(session, count)" after
> > > receiving "HitsAdded(count)".
> > >
> > > - signal StateChanged
> > > An example would be welcome here. For indexers that monitor
> > sources, eg monitor
> > > the filesystem with inotify, the state will switch between
> > UPDATING
> > > and IDLE and/or FULL_INDEX very often. Is the indexer
> > supposed to send
> > > a signal every time ?
> > >
> > > - properties and field names
> > > You may want to clarify what differences, if any, there are
> > between
> > > properties and
> > > field names.
> > >
> >
> > On top of all that if this API were to be usable in our
> > tracker GUI we
> > would need the following:
> >
> > 1) in tracker the service type being searched is mandatory - I
> > would
> > prefer it to be a session property or even better a param in
> > the
> > NewSearch method. If it remains part of the xml then that bit
> > should be
> > mandatory in the xml schema/dtd
> >
> > Having it in a session property seems really odd, since it seems a
> > natural part of the query (ie. the query also contains "what to
> > query"). Putting it in a param to NewSearch is also not my biggest desire
> > since the current approach where you only need a session and a query
> > to start a search is very clean. Currently a query is "self-contained"
> > - it doesn't require anything else to be runnable. If it required
> > additional info to be useful, then that is a drawback (in my head
> > at least).
> >
> > Making "type" a mandatory attribute on the query element could be fine
> > by me. I just fail to see the problem in defaulting to all. It would
> > not only be slow, but also undefined in which objects you search. But
> > why not allow it for convenience? It wouldn't require much
> > documentation to explain this.
> >
> >
> > 2) GetHits/GetHitData
> >
> > There are two use cases as far as tracker goes:
> >
> > a ) if i need metadata for all hits then it will always be
> > quicker to
> > have them in GetHits
> >
> > b) for things like our tile we need to fetch extra metadata
> > for a single
> > hit so GetHitData would only ever be used for a single hit not
> > multiple
> > ones - would be easier for us if that was changed to:
> >
> > GetHitData (in s ID, in as fields, out av values)
> >
> > (I can't think of a single case where we would want to get
> > metadata
> > *separately* for more than one hit at a time)
> >
> > Well, the trick is that GetHitData is also used when you receive a
> > HitsModified signal. Then you re-fetch metadata for all the hit-ids.
> > Consider the case where I move a directory and I have 50 files inside
> > it all giving me matches (this will fire a HitsModified since moving
> > files just amounts to changing the uri field of the hit).
> >
> >
> > 3) for separate snippets we would like to include a max length
> > of the
> > returned snippet so I'm not sure if a dedicated call for this
> > would be
> > better? Might not matter for a general purpose API like
> > Wasabi?
> >
> >
> > Well, generally Wasabi is designed around "sane defaults" (in many
> > places at least). Wouldn't it suffice to return a "sanely sized"
> > snippet and let the UI trim it to an appropriate size?
>
> would not be easy for an app though (think of the case when you have
> multiple search terms highlighted in the snippet)
Good point.
> I am only suggesting these because they are important in tracker -
> not sure if they matter in Wasabi but could do?
We could put the preferred snippet length in a session property. Would that
suffice? You would not be able to set it per-search, but I am not sure that
is necessary anyway..?
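
To illustrate why doing the trimming on the server side is easier, here is a rough Python sketch. The property name "snippet.length" and the helper make_snippet are my own placeholders and not part of the spec. The point is only that the searcher knows where the terms matched, so it can pick the window itself instead of leaving the UI to guess:

```python
# Hypothetical session property; the name "snippet.length" is an assumption.
session = {"snippet.length": 40}


def make_snippet(text, terms, max_length):
    """Return at most max_length characters of text around the first matching term."""
    lowered = text.lower()
    positions = [lowered.find(term.lower()) for term in terms]
    positions = [pos for pos in positions if pos >= 0]
    if not positions or len(text) <= max_length:
        # Nothing matched, or the text already fits: cut from the start.
        return text[:max_length]
    first = min(positions)
    # Centre the window on the first match, clamped to the text bounds.
    start = max(0, min(first - max_length // 2, len(text) - max_length))
    return text[start:start + max_length]


body = "Lots of unrelated text before the wasabi search spec is mentioned here."
print(make_snippet(body, ["wasabi", "spec"], session["snippet.length"]))
```

A real implementation would also want to cut on word boundaries and cover several matched terms, which is exactly the part that is awkward to do in the UI.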
> Another thing we do in T-S-T, is get hit count grouped by service (would
> be slower to get a hit count for each type individually)
I assume you use the Tracker method[1] GetHitCount(in s service, in s
search_text, out i count) for this.
If you want the same functionality in Wasabi you would probably have to use
a main session and a parallel "counter" session with hit.fields=[]. Then
each time a new hit type is found in the main session you fire off a query on
that type only in the counter session and use that to get the type-specific
hit count.
Note that this sort of counting is really just a simple version of more
general information clustering. And if you want to do a more complete
clustering you will probably not be able to get around firing off parallel
searches anyway.
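
As a rough sketch of the two-session counting idea, here it is in Python against a fake in-memory searcher. FakeSearcher, its method names and the hit_type argument are all made up for illustration. In Wasabi proper the type restriction would live in the query itself and the calls would go over D-Bus:

```python
class FakeSearcher:
    """Hypothetical in-memory stand-in for a Wasabi-style search service."""

    def __init__(self, documents):
        self._documents = documents  # list of {"type": ..., "text": ...} dicts
        self._searches = []

    def new_search(self, session, query, hit_type=None):
        # hit_type=None means "search all types"; passing it as an argument
        # is a simplification of putting the type into the query.
        hits = [doc for doc in self._documents
                if query in doc["text"] and hit_type in (None, doc["type"])]
        self._searches.append((session, hits))
        return len(self._searches) - 1

    def count_hits(self, search):
        return len(self._searches[search][1])

    def get_hits(self, search, count):
        session, hits = self._searches[search]
        # Only the fields named in the session's hit.fields are returned.
        return [[hit[field] for field in session["hit.fields"]]
                for hit in hits[:count]]


def count_hits_by_type(searcher, query):
    """Count hits per type using a main session and a "counter" session."""
    main_session = {"hit.fields": ["type"]}
    counter_session = {"hit.fields": []}  # no hit data needed, only counts
    main = searcher.new_search(main_session, query)
    counts = {}
    for row in searcher.get_hits(main, searcher.count_hits(main)):
        hit_type = row[0]
        if hit_type not in counts:
            # First time this type shows up: fire off a type-only query.
            counter = searcher.new_search(counter_session, query, hit_type)
            counts[hit_type] = searcher.count_hits(counter)
    return counts
```

This fires one extra search per distinct hit type, which is the cost jamie is worried about compared to a single grouped count.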
> I leave it up to you to decide whether these are important enough to
> warrant wasabi support :)
>
Eeek, I'm not sure I got the balls for that :-) I would like to hear what
others think before I make any decisions.
Cheers,
Mikkel
[1]: http://svn.gnome.org/viewcvs/tracker/trunk/data/tracker-introspect.xml?revision=530