2007/3/13, jamie <<a href="mailto:jamiemcc@blueyonder.co.uk">jamiemcc@blueyonder.co.uk</a>>:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Tue, 2007-03-13 at 20:05 +0100, Mikkel Kamstrup Erlandsen wrote:<br>> 2007/3/13, jamie <<a href="mailto:jamiemcc@blueyonder.co.uk">jamiemcc@blueyonder.co.uk</a>>:<br>> On Tue, 2007-03-13 at 21:56 +0800, Fabrice Colin wrote:
<br>> > On 3/13/07, Mikkel Kamstrup Erlandsen<br>> <<a href="mailto:mikkel.kamstrup@gmail.com">mikkel.kamstrup@gmail.com</a>> wrote:<br>> > > Please give <a href="http://freedesktop.org/wiki/WasabiSearchLive">
http://freedesktop.org/wiki/WasabiSearchLive</a> a<br>> > > good look before we set this in stone. It is the last call<br>> if you have any<br>> > > objections - I really mean it this time. Anything from
<br>> criticizing the<br>> > > fundamental structure down to nitpicking on the session<br>> property names is<br>> > > welcome.<br>> > ><br>> > There's a couple of things I am not clear about :
<br>> ><br>> > - "search.blocking : Whether or not calls will block until<br>> the<br>> > requested items are available."<br>> > Do you really mean this ? Should NewSearch block ad vitam
<br>> eternam if<br>> > there are no<br>> > results for the given query ? ;-)<br>> ><br>> > - "CountHits (in s search, out i count) Returns the current
<br>> number of<br>> > found hits. If<br>> > search.blocking==true this call blocks until the index has<br>> been fully searched."<br>> > Shouldn't this read "if
search.live==false this call<br>> blocks..." ?<br>> ><br>> > - "These signals are only used if the session property<br>> search.blocking is true."<br>> > Again, shouldn't it be "if
search.live is true" ?<br>> ><br>> > - GetState<br>> > if the first string is "FULL_INDEX", shouldn't the second<br>> string<br>> > always be "100" ?
<br>> ><br>> > - signal HitsAdded<br>> > is count the number of new hits, or the new number of hits ?<br>> I assume the latter<br>> > since the example at the bottom shows a call to
<br>> "GetHits(session, count)" after<br>> > receiving "HitsAdded(count)".<br>> ><br>> > - signal StateChanged<br>> > An example would be welcome here. For indexers that monitor
<br>> sources, eg monitor<br>> > the filesystem with inotify, the state will switch between<br>> UPDATING<br>> > and IDLE and/or FULL_INDEX very often. Is the indexer<br>> supposed to send
<br>> > a signal every time ?<br>> ><br>> > - properties and field names<br>> > You may want to clarify what differences, if any, there are<br>> between<br>
> > properties and<br>> > field names.<br>> ><br>><br>> On top of all that if this API were to be usable in our<br>> tracker GUI we<br>> would need the following:
<br>><br>> 1) in tracker the service type being searched is mandatory - I<br>> would<br>> prefer it to be a session property or even better a param in<br>> the<br>> NewSearch method. If it remains part of the xml then that bit
<br>> should be<br>> mandatory in the xml schema/dtd<br>><br>> Having it in a session property seems really odd, since it seems a<br>> natural part of the query (ie. the query also contains "what to
<br>> query"). Putting it in a param to NewSearch is also not my biggest desire<br>> since the current approach where you only need a session and a query<br>> to start a search is very clean. Currently a query is "self-contained"
<br>> - doesn't require anything else to be runnable, if it required<br>> additional info to be useful, then that is a drawback (in my head<br>> at least).<br>><br>> Making "type" a mandatory attribute on the query element could be fine
<br>> by me. I just fail to see the problem in defaulting to all. It would<br>> not only be slow, but also undefined in which objects you search. But<br>> why not allow it for convenience? It wouldn't require much
<br>> documentation to explain this.<br>><br>><br>> 2) GetHits/GetHitData<br>><br>> There are two use cases as far as tracker goes:<br>><br>> a ) if i need metadata for all hits then it will always be
<br>> quicker to<br>> have them in GetHits<br>><br>> b) for things like our tile we need to fetch extra metadata<br>> for a single<br>> hit so GetHitData would only ever be used for a single hit not
<br>> multiple<br>> ones - would be easier for us if that was changed to:<br>><br>> GetHitData (in s ID, in as fields out av values)<br>><br>> (I cant think of a single case where we would want to get
<br>> metadata<br>> *separately* for more than one hit at a time)<br>><br>> Well, the trick is that GetHitData is also used when you receive a<br>> HitsModified signal. Then you re-fetch metadata for all the hit-ids.
<br>> Consider the case where I move a directory and I have 50 files inside<br>> it all giving me matches (this will fire a HitsModified since moving<br>> files just amounts to changing the uri field of the hit).<br>
><br>><br>> 3) for separate snippets we would like to include a max length<br>> of the<br>> returned snippet so I'm not sure if a dedicated call for this<br>> would be
<br>> better? Might not matter for a general purpose API like<br>> Wasabi?<br>><br>><br>> Well, generally Wasabi is designed around "sane defaults" (in many<br>> places at least). Wouldn't it suffice to return a "sanely sized"
<br>> snippet and let the UI trim it to an appropriate size?<br><br>would not be easy for an app though (think of the case when you have<br>multiple search terms highlighted in the snippet)</blockquote><div><br>Good point.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I am only suggesting these because they are important in tracker -<br>not sure if they matter in Wasabi but could do?
</blockquote><div><br>We could put the preferred snippet length in a session property. Would that suffice? You would not be able to set it per-search, but I am not sure that is necessary anyway.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Another thing we do in T-S-T, is get hit count grouped by service (would<br>be slower to get a hit count for each type individually)</blockquote><div><br>I assume you use the Tracker method[1] GetHitCount(in s service, in s search_text, out i count) for this.
<br><br>If you want the same functionality in Wasabi you would probably have to use a main session and a parallel "counter" session with hit.fields=[]. Then each time a new hit type is found in the main session you fire off a query on that type only in the counter session and use that to get the type-specific hit count.
<br><br>Note that this sort of counting is really just a simple version of more general information clustering. And if you want to do a more complete clustering you will probably not be able to get around firing off parallel searches anyway.
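<br><br>To make the main/counter session idea concrete, here is a rough Python sketch. It does not talk to D-Bus at all: FakeSearchService is a hypothetical in-memory stand-in, and its method names, the type_ argument and the default hit.fields are assumptions of mine that only loosely echo the draft. Treat it as an illustration of the flow, not as the Wasabi API.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for a Wasabi-like service. None of this is
# the real D-Bus API; the names only loosely echo the draft under discussion.
Doc = namedtuple("Doc", "uri type text")


class FakeSearchService:
    def __init__(self, docs):
        self._docs = list(docs)
        self._sessions = {}   # maps session id to its property dict
        self._searches = {}   # maps search id to (session id, matching docs)

    def new_session(self):
        sid = "session-%d" % len(self._sessions)
        # Default fields are an assumption of this sketch.
        self._sessions[sid] = {"hit.fields": ["uri", "type"]}
        return sid

    def set_property(self, session, key, value):
        self._sessions[session][key] = value

    def new_search(self, session, text, type_=None):
        # type_=None stands for "search all types" (assumed default).
        hits = [d for d in self._docs
                if text in d.text and (type_ is None or d.type == type_)]
        search = "search-%d" % len(self._searches)
        self._searches[search] = (session, hits)
        return search

    def count_hits(self, search):
        return len(self._searches[search][1])

    def get_hits(self, search, count):
        session, hits = self._searches[search]
        fields = self._sessions[session]["hit.fields"]
        return [[getattr(d, f) for f in fields] for d in hits[:count]]


def count_hits_by_type(service, text):
    """Return a dict mapping each hit type to its hit count for `text`."""
    main = service.new_session()
    counter = service.new_session()
    # The counter session only counts, so it asks for no hit metadata.
    service.set_property(counter, "hit.fields", [])

    main_search = service.new_search(main, text)
    total = service.count_hits(main_search)
    counts = {}
    for uri, type_ in service.get_hits(main_search, total):
        if type_ not in counts:
            # First hit of this type: run a type-only query in the counter session.
            type_search = service.new_search(counter, text, type_)
            counts[type_] = service.count_hits(type_search)
    return counts


if __name__ == "__main__":
    demo = FakeSearchService([
        Doc("file:///a.txt", "Documents", "wasabi spec draft"),
        Doc("file:///b.txt", "Documents", "wasabi meeting notes"),
        Doc("file:///c.ogg", "Music", "wasabi song"),
        Doc("file:///d.txt", "Documents", "tracker notes"),
    ])
    print(count_hits_by_type(demo, "wasabi"))
```

Note that this issues one extra search per distinct type, which is exactly the cost of the parallel-search approach compared to a single grouped count.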
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I leave it up to you to decide whether these are important enough to<br>warrant wasabi support :)
<br></blockquote></div><br>Eeek, I'm not sure I got the balls for that :-) I would like to hear what others think before I make any decisions.<br><br>Cheers,<br>Mikkel<br><br>[1]: <a href="http://svn.gnome.org/viewcvs/tracker/trunk/data/tracker-introspect.xml?revision=530">
http://svn.gnome.org/viewcvs/tracker/trunk/data/tracker-introspect.xml?revision=530</a><br>