2006/11/27, Kevin Krammer <<a href="mailto:kevin.krammer@gmx.at">kevin.krammer@gmx.at</a>>:<div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> On Monday 27 November 2006 12:08, Mikkel Kamstrup Erlandsen wrote: I am not a searching or indexing expert; I merely wanted to add some information regarding D-Bus sync/async calls :) > I think you raise a really good question, Kevin. Let me first introduce > some terminology to ease the communication. > > Page Query: All results for a given query are returned in one chunk. This > call is still *async* since it is over dbus. This is how it is suggested > on the WasabiDraft wiki page. > > Async Query: Query results trickle in as the search engine picks them up. > I.e. the query results are not all returned in one batch. I'd rather call it "Full" and "Partial" Query or Query with "Full" or "Partial" delivery.</blockquote><div> I was not trying to establish a convention, I just needed some words for it. For what it's worth, I think Full and Partial Delivery are the best terms. However, for method names I actually think my names make more sense. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> In the page query the client can simulate an async query by requesting > several blocking queries with the same query string, but different > page-ranges. This gives a small problem with page ordering, but nothing > that the client app could not work around. The big benefit of page queries > is that server-side sorting (score, relevance, date, whatever) is a > no-brainer for the client. Just append the "sort:&lt;sorttype&gt;" switch to the > query string. How long does a search service have to cache such a query-result combination?</blockquote><div> That's up to the implementation. 
</div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Or is searching so fast that the same query can be re-done on every call? </blockquote><div> Again, some backends will have native caching capabilities, others won't. I think we should focus on keeping the interface easy to use for application developers, and leave the headaches to the search engine devs... Sorry guys :-) </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> In the async query you have a sorting problem. The client cannot sort the > hits, unless each returned URI also has metadata associated with it (it > looks this stuff up with another dbus call). I see a huge benefit in > allowing the results to trickle in (which also allows for canceling queries, as > Kevin points out). The async query is also much more suitable for live > queries (in the sense of updating the query when the on-disk files change - > or are deleted/created). Would it be possible to associate a sorting key with each match? If so it could be part of the returned data, i.e. the result being an array of tuples of URI and key.</blockquote><div> I don't know if this would actually make sense... How would the backend know what the final sort order would be if it hasn't collected all hits? I'm not ruling it out, I just can't see how it would work... </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> So what do I think? I see 2 options: > > 1) Change the Query method name to PageQuery and add another AsyncQuery > with a signature and behavior we need to think a bit about. 
> > 2) Don't change the org.freedesktop.search.simple interface, but create > another interface generally aimed at live queries - or maybe include this > in the "advanced" search interface when we get to defining that. A more advanced interface could be based on query objects, i.e. the client requests a remote peer object for a specific query and the service creates a handler object and returns the object path.</blockquote><div> Yeah, that could be an idea. It would not be a good fit for apps spawning tons of searches though. And I actually think we should pay close attention to catering for massive search requests. I can easily picture a future where some client or other runs a bunch of searches in the background, showing information relevant to your current context... (just one example). </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">The client can then call this object's methods and listen to this object's signals, without needing to reference it with the query string at each call or on each signal. The object path will be the reference.</blockquote><div> Again, I like the idea, but I see some problems with it (as mentioned above). Maybe it should rather be a server-side client proxy or something (that sounds like an oxymoron :-)), where the remote object does not represent a query, but rather a dedicated connection. I know that this is possible with dbus, but I have never played around with it... </div>Cheers, Mikkel </div>
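<div>PS: To make the sort-key idea from the thread a bit more concrete, here is a rough sketch of how a client could keep hits ordered while (URI, key) tuples trickle in. It only works under one assumption: the key is an absolute value per hit (a score or a date, say) that does not depend on the hits still to come, so the backend never needs to know the final order. The class and method names are made up for illustration; nothing here is part of the WasabiDraft interface, and higher keys are assumed to rank first.</div>

```python
import bisect


class PartialResultList:
    """Hypothetical client-side list that stays sorted as batches arrive."""

    def __init__(self):
        # Stored as (negated key, uri) so ascending order puts the
        # highest key first.
        self._hits = []

    def add_batch(self, batch):
        """Merge one partial delivery: an iterable of (uri, key) tuples."""
        for uri, key in batch:
            bisect.insort(self._hits, (-key, uri))

    def top(self, n):
        """Return the URIs of the n best hits received so far."""
        return [uri for _, uri in self._hits[:n]]


results = PartialResultList()
results.add_batch([("file:///a.txt", 0.4), ("file:///b.txt", 0.9)])
results.add_batch([("file:///c.txt", 0.7)])
print(results.top(2))
```

<div>The client can redraw after every batch, and the order only ever gets refined, never invalidated. If the key does depend on the full hit set, this does not help and we are back to full delivery.</div>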