2007/3/13, jamie <<a href="mailto:jamiemcc@blueyonder.co.uk">jamiemcc@blueyonder.co.uk</a>>:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Tue, 2007-03-13 at 21:56 +0800, Fabrice Colin wrote:<br>> On 3/13/07, Mikkel Kamstrup Erlandsen <<a href="mailto:mikkel.kamstrup@gmail.com">mikkel.kamstrup@gmail.com</a>> wrote:<br>> > Please give <a href="http://freedesktop.org/wiki/WasabiSearchLive">
http://freedesktop.org/wiki/WasabiSearchLive</a> a<br>> > good look before we set this in stone. It is the last call if you have any<br>> > objections - I really mean it this time. Anything from criticizing the
<br>> > fundamental structure down to nitpicking on the session property names is<br>> > welcome.<br>> ><br>> There's a couple of things I am not clear about :<br>><br>> - "search.blocking
: Whether or not calls will block until the<br>> requested items are available."<br>> Do you really mean this ? Should NewSearch block ad vitam aeternam if<br>> there are no<br>> results for the given query ? ;-)
<br>><br>> - "CountHits (in s search, out i count) Returns the current number of<br>> found hits. If<br>> search.blocking==true this call blocks until the index has been fully searched."<br>> Shouldn't this read "if
search.live==false this call blocks..." ?<br>><br>> - "These signals are only used if the session property search.blocking is true."<br>> Again, shouldn't it be "if search.live is true" ?
<br>><br>> - GetState<br>> if the first string is "FULL_INDEX", shouldn't the second string<br>> always be "100" ?<br>><br>> - signal HitsAdded<br>> is count the number of new hits, or the new number of hits ? I assume the latter
<br>> since the example at the bottom shows a call to "GetHits(session, count)" after<br>> receiving "HitsAdded(count)".<br>><br>> - signal StateChanged<br>> An example would be welcome here. For indexers that monitor sources, eg monitor
<br>> the filesystem with inotify, the state will switch between UPDATING<br>> and IDLE and/or FULL_INDEX very often. Is the indexer supposed to send<br>> a signal every time ?<br>><br>> - properties and field names
<br>> You may want to clarify what differences, if any, there are between<br>> properties and<br>> field names.<br>><br><br>On top of all that if this API were to be usable in our tracker GUI we<br>would need the following:
<br><br>1) in tracker the service type being searched is mandatory - I would<br>prefer it to be a session property or even better a param in the<br>NewSearch method. If it remains part of the xml then that bit should be<br>
mandatory in the xml schema/dtd</blockquote><div><br>Having it in a session property seems really odd, since it is a natural part of the query (i.e. the query also contains "what to query"). Putting it in a param to NewSearch is not my preference either, since the current approach, where you only need a session and a query to start a search, is very clean. Currently a query is "self-contained": it doesn't require anything else to be runnable. If it required additional info to be useful, that would be a drawback (in my head at least).
<br><br>Making "type" a mandatory attribute on the query element could be fine by me. I just fail to see the problem in defaulting to all. Granted, it would not only be slow, it would also leave undefined which objects you search. But why not allow it for convenience? It wouldn't require much documentation to explain this.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">2) GetHits/GetHitData<br><br>There are two use cases as far as tracker goes:<br>
<br>a ) if i need metadata for all hits then it will always be quicker to<br>have them in GetHits<br><br>b) for things like our tile we need to fetch extra metadata for a single<br>hit so GetHitData would only ever be used for a single hit not multiple
<br>ones - would be easier for us if that was changed to:<br><br>GetHitData (in s ID, in as fields, out av values)<br><br>(I can't think of a single case where we would want to get metadata<br>*separately* for more than one hit at a time)
</blockquote><div><br>Well, the trick is that GetHitData is also used when you receive a HitsModified signal. Then you re-fetch metadata for all the hit-ids. Consider the case where I move a directory containing 50 files that all match (this will fire a HitsModified, since moving a file just amounts to changing the uri field of the hit).
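<br><br>A rough Python sketch of that case, to make the point concrete. Everything here is hypothetical: DummySearch is an in-memory stand-in rather than a real Wasabi service, and get_hit_data(hit_ids, fields) only assumes that GetHitData accepts several hit ids at once and returns one list of values per hit. The handler refreshes all 50 moved hits with a single call instead of 50 round trips.

```python
class DummySearch:
    """Hypothetical in-memory stand-in for a Wasabi search session."""

    def __init__(self, hits):
        # hits maps hit id to a dict of {field name: value}
        self._hits = hits
        self.calls = 0  # counts get_hit_data round trips

    def get_hit_data(self, hit_ids, fields):
        """Return one list of field values per requested hit id."""
        self.calls += 1
        return [[self._hits[h].get(f) for f in fields] for h in hit_ids]

    def move_directory(self, old_prefix, new_prefix):
        """Rewrite the uri of every hit under old_prefix; return the changed ids."""
        modified = []
        for hit_id, data in self._hits.items():
            if data["uri"].startswith(old_prefix):
                data["uri"] = new_prefix + data["uri"][len(old_prefix):]
                modified.append(hit_id)
        return modified


def on_hits_modified(search, cache, hit_ids, fields=("uri",)):
    """Handle a HitsModified-style signal: refresh all modified hits in one call."""
    rows = search.get_hit_data(hit_ids, list(fields))
    for hit_id, row in zip(hit_ids, rows):
        cache[hit_id] = dict(zip(fields, row))


hits = {i: {"uri": "file:///old/doc%d.txt" % i} for i in range(50)}
search = DummySearch(hits)
cache = {i: dict(d) for i, d in hits.items()}  # client-side copy of hit data

modified_ids = search.move_directory("file:///old/", "file:///new/")
on_hits_modified(search, cache, modified_ids)
```

With a single-hit GetHitData the same handler would need one call per id, which is the cost I would like to avoid.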
<br> </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">3) for separate snippets we would like to include a max length of the<br>returned snippet so I'm not sure if a dedicated call for this would be
<br>better? Might not matter for a general purpose API like Wasabi?</blockquote><div><br><br>Well, generally Wasabi is designed around "sane defaults" (in many places at least). Wouldn't it suffice to return a "sanely sized" snippet and let the UI trim it to an appropriate size?
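<br><br>Trimming on the UI side could be as small as the sketch below. This is only an illustration: trim_snippet and its parameters are made-up names, not part of Wasabi, and it assumes the snippet arrives as a plain string.

```python
def trim_snippet(snippet, max_len, ellipsis="..."):
    """Trim snippet to at most max_len characters, cutting at a word boundary when possible."""
    if len(snippet) <= max_len:
        return snippet
    budget = max_len - len(ellipsis)
    if budget <= 0:
        # No room for the ellipsis, so fall back to a hard cut.
        return snippet[:max_len]
    cut = snippet[:budget]
    # Drop a trailing partial word only if the cut landed in the middle of one.
    if not snippet[budget].isspace() and " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip() + ellipsis
```

A tile with room for 15 characters would call trim_snippet(text, 15) on whatever the service returned.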
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I don't think we can freeze the API until we have a working<br>implementation (which may uncover the need for more changes) - I plan on
<br>implementing it in tracker next month.</blockquote><div><br><br>I agree, and that's also why I have not pressed harder on this. I'm working on some Python gobject bindings+tools to help test Wasabi services. They will also include a dummy server implementation. Having a real service to search against would be really nice of course :-)
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">things still blocking implementation:<br><br>1) list of applicable metadata names - I would suggest a mandatory set
<br>(i.e. metadata supported by all) and an optional set (this would always<br>return NULL if not supported)<br><br>2) list of applicable service types (emails, files, conversations etc)</blockquote><div><br><br>I deliberately didn't push this debate much lately because I wanted to hear what the Nepomuk guys had to say about this. Now that I know that they are open to having the Wasabi metadata fields map to their fully semantic types, I think it should be safe to move on. But yes, we are really starting to need this.
<br></div><br><br><br>Cheers,<br>Mikkel<br></div>