I updated the live search proposal on <a href="http://wiki.freedesktop.org/wiki/WasabiSearchLive">http://wiki.freedesktop.org/wiki/WasabiSearchLive</a> with a unified one (of simple and live). 2007/1/24, Magnus Bergman < <a href="mailto:magnus.bergman@observer.net">magnus.bergman@observer.net</a>>:<div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> On Sat, 20 Jan 2007 21:27:38 +0100 "Mikkel Kamstrup Erlandsen" <<a href="mailto:mikkel.kamstrup@gmail.com">mikkel.kamstrup@gmail.com</a>> wrote: > 2007/1/19, Magnus Bergman <<a href="mailto:magnus.bergman@observer.net"> magnus.bergman@observer.net</a>>: > > > > First some comments on the current draft[1] > > """"""""""""""""""""""""""""""""""""""""""" > > > >   As with the WasabiSearchSimple API[2] the session *is* the D-BUS > >   connection. So there really doesn't need to be an explicit session > >   object. It might be adequate to have one for the language > > bindings, but then the same thing goes for the simple API. > > I actually think the session should be explicit. Both language > bindings and actual server implementations would have an easier life > if it was explicit. I don't object to that. But in that case I think the same goes for the simple API. I assume sessions will map 1:1 to the dbus connection (bindings might want to hide the dbus connection in the session object). </blockquote><div> Ok, good. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">>   If the method GetMetadata should exist I think it would make more > >   sense to make it belong to a document object, rename it > > GetProperty and include it in the metadata storage API instead. > > > Yes, it looks out of place in the search interface. There does > however need to be a way to obtain the "expensive" hit metadata as > discussed in the thread about the simple api. 
> > >  And as I said before, I think it makes sense to treat queries and > >   searches as different objects, which means renaming Query.Start to > >   something like NewSearch. It also means that a query doesn't need > > to belong to anything (like the session); it could exist > > independently (unlike a search). I have left out possible functions > > dealing with queries (like constructing an XML query from a simple > > query string) since functions like that rather belong in a library. > > > I follow you on the search/query separation. Having NewSearch() > actually start the search gives some problems with the > SearchSetProperty() since it doesn't make much sense to change > properties on a running search. Spotlight has some similar methods > and they restart the search if you invoke them. The reason I included > a Query.Start - in current context Search.Start, was exactly that it > should be possible to set properties on a Search/Query before it was > actually run. If it doesn't make sense to change properties on a running search, then the function could be removed. But I think there might be cases when it does. Every property set before the search starts is just included in the XML query, right? So any function that sets properties for the query can never do anything other than modify the query on the client side. And I think such functions belong in a library. </blockquote><div> I removed the method from the search object. Session properties are not included in the query xml, but are set on the server separately. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> >   Apart from ShowConfiguration(), all functions of the simple API > seem > >   to be in the live API as well. > > > I moved simple/live.ShowConfiguration to a dbus interface > org.freedesktop.search.ui.ShowConfiguration, together with a new > method ShowSearchTool. 
Please see > <a href="http://wiki.freedesktop.org/wiki/WasabiUI">http://wiki.freedesktop.org/wiki/WasabiUI</a> for the api spec proposal. > Sorry I did not find time to notify the list before now - spare my > life :-) > > ... So, would it be > >   possible and desirable to define the simple API as a subset of the > >   live API? > > > I have ambivalent feelings on this issue. Let me outline pros and > cons as I see them. I shall spare you my confusing thoughts and cut > to the cheese: > > Loose Idea for an Interface Merge: > Have a boolean session property called "block". If it is true, > GetHits() and CountHits() block until the desired info is available, > removing the need for signals. If there are fewer hits than requested > by GetHits when the entire index has been searched, just return > the found items. Yes. In addition to the block property it might make sense to have a "live" property as well (meaning the search will never finish). Just because you don't want the live feature doesn't necessarily mean you want it to block.</blockquote><div> Yes, that makes sense. I included it in the updated suggestion. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> > The simplest use case, retrieving uri and dc:title, would then look > something like this (in pseudocode): > > session = NewSession() > SetProperty (session, "block", "true") > SetProperty (session, "properties", "uri ; dc:title") > > search = NewSearch (query_xml, session)  <-- search obj inherits > requested props from the session > hits = GetHits (search, 1000) > <show hits> > > count = HitCount (search) > <print: showing 1000 of *count* hits> > Close(search) > Close(session) Yes, that's pretty close to what I imagined too. In addition I think "block" should be true by default (to make simple searching even simpler). But what does "search obj inherits" mean?</blockquote><div> Agree on the "block" thing. 
I meant it as a reference to the (now removed) Search.Set/GetProperty method. When you create a new search object all properties from the session are "inherited". </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> > > The actual proposal > > """"""""""""""""""" > > > > SetProperty ( in s property , in s value ) > > > >     Set a global (session) property. This method can be used for > >     several things. > >       o Setting default properties for Query objects. > >       o Authentication/encryption > >       o Generally be flexible for future needs > >     * property: Name of the property. > >     * value: New value for the property. > > > > GetProperty ( in s property , out s value) > > > >     Get the value of a global (session) property. > >     * property: Name of the property. > >     * value: Current value of the property. > > > As noted above I still think we need a session handle. By using > handles we could even have Get/SetProperty take either a session or a > search handle.  Like SetProperty(handle, prop, val). A common SetProperty function requires some magic, which might make it troublesome for some languages. It might be neat to have in some languages (using overloading) but I object to having it at this level.</blockquote><div> Agreed. Let's just have properties on the session only. Unless someone comes up with a really good example where something makes sense on the search only. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> > NewSearchFromXML ( in s query_xml , out s search ) > > > >     Start a new search from an XML query. > >     * query_xml: The query to execute. > >     * search: A handle that is used to uniquely identify this > > search. > > > If the searches/queries can have properties I think we need an > intermediate StartSearch() method. 
I can accept that if we decide to > only have session properties then to start the search right away. I don't really understand the need. This *is* the "StartSearch" method. Every property set before the search starts is included in the query (XML string). Or am I missing something?</blockquote><div> The updated proposal uses Search() to both create and start the search. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> > > SearchClose ( in s search) > > > Check. > > > > SearchSetProperty ( in s search , in s property , in s value) > > > > SearchGetProperty ( in s search , in s property , out s value) > > > I have a few remarks related to this above. > > > SearchCountHits ( in s search , out i count ) > > >  Check > > SearchGetHitProperties ( in s search, in i offset, in i limit, > >                          in as properties, out a{sa{sas}} response ) > > > I think it should be called GetHits. Why  list requested props here > if you also do it in the Set*Property()? Why do we need an offset? In > a live search I can't see any reason to re-request a given range of > hits. Didn't we agree that the return value should be without maps > and just arrays? My idea of listing the requested props in Set*Property() was more of limiting the set of properties that could be retrieved with this function (but defaults to every possible prop), including the expensive one(s). The typical case would be to call this function once to get the basic props, and then perhaps again to get other (expensive) ones. In order to be able to request expensive properties later, there has to be a function like this in one way or another, even if it has another name than this function. Instead of using an offset there could be a function for "seeking" in the search result, since you might want to go back and read some other properties. 
I don't have any strong feelings about this, but I think it's slightly easier (for the API user) to have an offset like this. I think it should be possible to re-request hits, since you actually get it for free. The server has to remember them anyway, otherwise it will be unable to tell when a document no longer matches the query, right?</blockquote><div> You can easily re-request hits with the updated proposal. Just call GetHitData() with hit ids and wanted props. </div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> The real reason why I left the maps instead of writing it as arrays is that I don't know the syntax, I'm perfectly happy with arrays. About the name, I don't think it matters with these requirements. But in one of the (commercial) search engine APIs I've used the hits were also objects (so you had to first get the hit from the search and then the property from the hit). The benefit of this approach is that the hit object can have a direct pointer to the query that caused it (because a search could be constructed from more than one query). And some quite complicated things related to highlighting. Imagine you extract and index the text from a word document, then you want to view it as a highlighted PDF-document. For this to work each hit needs some extra data (I won't go into detail). But these features will never be a part of this API so the naming doesn't matter as much, I guess. But that was my reason for choosing the name.</blockquote><div> A language binding could easily map the search handle to the "underlying" query xml. That way a language binding could provide a GetQuery() method on the Search object.  
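A rough sketch of that idea in Python, to show the mapping lives entirely in the binding. Everything here is hypothetical: FakeServer stands in for the D-Bus service, the class and method names are made up, and the only thing taken from the updated proposal is that Search() both creates and starts the search (its exact arguments and return value are an assumption).

```python
class FakeServer:
    """Hypothetical stand-in for the D-Bus search service."""

    def __init__(self):
        self._next_id = 0

    def Search(self, query_xml):
        # Assumed shape: takes the query XML, returns an opaque search handle.
        self._next_id += 1
        return "search-%d" % self._next_id


class Search:
    """Binding-side search object that remembers the query it came from."""

    def __init__(self, handle, query_xml):
        self.handle = handle
        self._query_xml = query_xml

    def get_query(self):
        # Answered on the client side; no call to the server is made.
        return self._query_xml


class Session:
    """Binding-side session hiding the server connection."""

    def __init__(self, server):
        self._server = server

    def search(self, query_xml):
        # Create and start the search, then pair the handle with its query.
        handle = self._server.Search(query_xml)
        return Search(handle, query_xml)


session = Session(FakeServer())
s = session.search("<query>hello</query>")
print(s.handle, s.get_query())
```

The server never needs to know about GetQuery(); the binding just keeps the XML next to the handle.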
</div> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> signal SearchHitsAdded ( s search , i count) > > > > > > signal SearchHitsRemoved ( s search , ai offsets ) > > > > signal HitsHitsModified ( s search , ai offsets ) > > > Is this why you want to be able to refetch pages in GetHitProperties? > If I recall correctly this signal is why I included the GetMetadata > method in the first place. Well, sort of. I think we need the functionality of what you called GetMetadata. The question is whether it all should be done by GetHitProperties, or if it's better to keep GetHitProperties simple and have an additional function as well. > How do you cater for snippets? If you again want to use the > GetHitProperties method I can see the solution, but I must say that > it appears inelegant to use GetHitProperties like this - for results, > updates, and snippets. Using GetHitProperties was what I intended, yes. To me it appears elegant, but that might very well just be me. I'm willing to consider other ideas. </blockquote></div> Well, I think the current proposal is more or less in the middle of our originally different ideas... Cheers, Mikkel