2006/11/27, Kevin Krammer <<a href="mailto:kevin.krammer@gmx.at">kevin.krammer@gmx.at</a>>:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Monday 27 November 2006 12:08, Mikkel Kamstrup Erlandsen wrote:<br><br>I am not a searching or indexing expert, merely wanted to input some<br>information regarding D-Bus sync/async calls :)<br><br>> I think you raise a really good question Kevin. Let me first introduce
<br>> some terminology to ease the communication.<br>><br>> Page Query: All results for a given query are returned in one chunk. This<br>> call is still *async* since it is over dbus. This is how it is suggested
<br>> on the WasabiDraft wiki page.<br>><br>> Async Query: Query results trickle in as the search engine picks them up.<br>> I.e. not all query results are returned in one batch.<br><br>I'd rather call it "Full" and "Partial" Query or Query with "Full"
<br>or "Partial" delivery.</blockquote><div><br><br>I was not trying to establish a convention; I just needed some words for it. For what it's worth, I think Full and Partial Delivery are the best terms. For method names, however, I actually think my names make more sense.
<br></div><br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> In the page query the client can simulate an async query by requesting
<br>> several blocking queries with the same query string, but different<br>> page-ranges. This gives a small problem with page ordering, but nothing<br>> that the client app could not work around. The big benefit for page queries
<br>> is that server side sorting (score, relevance, date, whatever) is a<br>> no-brainer for the client. Just append the "sort:&lt;sorttype&gt;" switch to the<br>> query string.<br><br>How long does a search service have to cache such a query - result
<br>combination?</blockquote><div><br>That's up to the implementation. </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Or is searching so fast, that the same query can be re-done on every call?
</blockquote><div><br>Again, some backends will have native caching capabilities, others won't. I think we should focus on keeping the interface easy to use for application developers, and leave the headaches to the search engine devs... Sorry guys :-)
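<br><br>To make the paging idea concrete, here is a rough sketch of how a client might walk through a page query range by range. Everything in it is an assumption for illustration: the (query, offset, count) signature, the name iter_hits and the toy backend are made up, since the actual Query signature is not settled in this thread. Note that the query string is re-sent for every page, which is exactly why the caching question matters.<br>

```python
from typing import Callable, Iterator, List

# Hypothetical stand-in for a blocking page-query call over D-Bus.
# The (query, offset, count) signature is an assumption, not the Wasabi draft.
PageQuery = Callable[[str, int, int], List[str]]


def iter_hits(page_query: PageQuery, query: str, page_size: int = 10) -> Iterator[str]:
    """Yield hits page by page until the backend returns a short page."""
    offset = 0
    while True:
        page = page_query(query, offset, page_size)
        yield from page
        if len(page) != page_size:  # a short or empty page means no more hits
            break
        offset += page_size


def fake_backend(query: str, offset: int, count: int) -> List[str]:
    """Toy backend: 25 pre-sorted hits, sliced by the requested page range."""
    hits = [f"file:///doc{i}.txt" for i in range(25)]
    return hits[offset:offset + count]


if __name__ == "__main__":
    # The sort switch travels inside the query string, as proposed above.
    for uri in iter_hits(fake_backend, "wasabi sort:date"):
        print(uri)
```

The client sees the hits already in server-side order and can stop iterating at any point, which is as close to cancelling as a page query gets.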
<br> <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> In the async query you have a sorting problem. The client cannot sort the
<br>> hits, unless each returned URI also has metadata associated with it (it<br>> looks this stuff up with another dbus call). I see a huge benefit in<br>> allowing the results to trickle in (and allows for canceling queries as
<br>> Kevin points out). The async query is also much more suitable for live<br>> queries (in the sense of updating the query when the on-disk files change -<br>> or are deleted/created).<br><br>Would it be possible to associate a sorting key with each match?
<br>If so it could be part of the returned data, i.e. the result being an array of<br>tuples of URI and key.</blockquote><div><br><br>I don't know if this would make sense actually... How would the backend know what the final sort order would be if it hasn't collected all hits? - I'm not ruling it out, I'm just not able to see how it would work out...
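<br><br>One possibility, purely as a sketch: if the key were an absolute per-hit value (a score or a timestamp, say) rather than a position in the final order, the backend would not need to know the final order at all, and the client could slot each hit into place as batches arrive. The class and method names below are hypothetical, and whether backends can cheaply produce such a key is an open question.<br>

```python
import bisect
from typing import List, Tuple

Hit = Tuple[float, str]  # (sort key, URI); a higher key sorts first in this sketch


class SortedResults:
    """Keeps hits ordered by key while batches trickle in."""

    def __init__(self) -> None:
        # Stored as (-key, uri) so that ascending order puts the best hit first.
        self._hits: List[Tuple[float, str]] = []

    def add_batch(self, batch: List[Hit]) -> None:
        """Insert each (key, uri) pair at its sorted position."""
        for key, uri in batch:
            bisect.insort(self._hits, (-key, uri))

    def uris(self) -> List[str]:
        """Return the URIs received so far, best key first."""
        return [uri for _, uri in self._hits]


if __name__ == "__main__":
    results = SortedResults()
    results.add_batch([(0.4, "file:///a"), (0.9, "file:///b")])
    results.add_batch([(0.7, "file:///c")])  # a later batch lands in the middle
    print(results.uris())
```

The displayed list can reshuffle as new hits arrive, so this trades a stable view for early results.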
<br><br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> So what do I think? I see 2 options:<br>><br>> 1) Change the Query method name to PageQuery and add another AsyncQuery
<br>> with a signature and behavior we need to think a bit about.<br>><br>> 2) Don't change the org.freedesktop.search.simple interface, but create<br>> another interface generally aimed at live queries - or maybe include this
<br>> in the "advanced" search interface when we get to defining that.<br><br>A more advanced interface could be based on query objects, i.e. the client<br>requests a remote peer object for a specific query and the service creates a
<br>handler object and returns the object path.</blockquote><div><br>Yeah, that could be an idea. It would not be a good fit for apps spawning tons of searches, though. And I actually think we should pay close attention to catering for large numbers of search requests. I can easily picture a future where some client or other runs a bunch of searches in the background to show information relevant to your current context... (just one example).
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">The client can then call this object's methods and listen to this object's<br>
signals, without needing to reference it with the query string at each call<br>or on each signal. The object path will be the reference.</blockquote><div><br>Again, I like the idea, but I see some problems with it (as mentioned above). Maybe it should rather be a server-side client proxy or something (that sounds like an oxymoron :-)), where the remote object does not represent a query but rather a dedicated connection. I know that this is possible with dbus, but I have never played around with it...
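<br><br>For reference, this is roughly how I understand the query-object idea, modelled in plain Python with no real D-Bus involved. The object path layout, the class names and the hits-added callback are all made up for the sketch. It also shows where my worry comes from: the service keeps one object alive per outstanding query until somebody releases it.<br>

```python
import itertools
from typing import Callable, Dict, List


class QueryObject:
    """Stands in for a remote per-query object; not a real D-Bus object."""

    def __init__(self, path: str, query: str) -> None:
        self.path = path
        self.query = query
        self.cancelled = False
        self._listeners: List[Callable[[List[str]], None]] = []

    def connect_hits_added(self, callback: Callable[[List[str]], None]) -> None:
        """Register a listener, playing the role of a signal subscription."""
        self._listeners.append(callback)

    def emit_hits_added(self, uris: List[str]) -> None:
        """Deliver a batch of hits unless the query has been cancelled."""
        if self.cancelled:
            return
        for callback in self._listeners:
            callback(uris)

    def cancel(self) -> None:
        self.cancelled = True


class SearchService:
    """Creates one handler object per query and hands back its path."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._objects: Dict[str, QueryObject] = {}

    def new_query(self, query: str) -> str:
        # Hypothetical path layout; the thread does not define one.
        path = f"/org/freedesktop/search/query/{next(self._ids)}"
        self._objects[path] = QueryObject(path, query)
        return path

    def get(self, path: str) -> QueryObject:
        return self._objects[path]

    def release(self, path: str) -> None:
        """Drop the handler object so it no longer costs the service anything."""
        self._objects.pop(path, None)

    def live_queries(self) -> int:
        return len(self._objects)
```

A client doing many background searches would have to call release diligently, or the service would need some expiry policy of its own.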
<br> </div>Cheers,<br>Mikkel<br></div>