simple search api (was Re: mimetype standardisation by testsets)

Sun Nov 26 16:48:50 EET 2006

It's quite amazing how parallel thinking can bring people to the same point
over a few days. I am in almost complete agreement with Fabrice's message,
with most of Wasabi2, and with Mikkel's recent edits on Wasabi.

The initial stated goal for Wasabi is/was quite broad: "Unified dbus api
for desktop search" was the subject of the email I received. This is what
made me react quite energetically to the proposal for a query language.

After a few days of thinking, I now think that there are two distinct
needs:

 1- A need for trivial enabling of text search in any (non-search)
    application, with minimal fuss (better described by Fabrice in the
    quoted message).

 2- A more hypothetical need for an interface that would allow 
    *search tool* front-end writers to make use of different search 
    back-ends.

The need for (1) is immediate and obvious, and maybe it should be made
clearer that this is the initial goal for the Wasabi project. 

This would remove any major objection on my part about the simple query
language on the WasabiDraft page.

From this perspective, I would even be in favor of removing the
specification in Wasabi2 that the on-bus language is structured. Let us
just prefix the interface as Simple or Trivial or whatever, and let the
backend deal with query string interpretation.

As to (2), maybe we can put this on the back-burner for the time
being. This is a more difficult project, and, if needed, it will be easy
enough to extend Wasabi with an alternative query interface, while keeping
compatibility for the non-search apps using the trivial interface.
(There probably should be some mention of it somewhere on the project page
though, to make it clear that we differentiate the two perspectives).

Under the assumption that we are more or less in agreement, I have a few
more comments about the simple interface:

- About the query language, and just for the record, the syntax described
  on WasabiDraft is closer to Beagle's than to Lucene's (which, I think,
  defaults to ORing the terms rather than ANDing them). ANDing is probably
  more intuitive for end-users, and so the more appropriate choice.

- I think that there are too many mandatory things in the 'switch'
  list. For example, the 'group' concept, while interesting, is not
  necessarily common back-end functionality. The 'author' switch might not
  be so commonly supported either. I think that we should specify a set of
  standard switches, specify also that back-ends should just ignore
  switches they don't know or support, and let them do their best.

- Should we provide one or several sample parsers to turn the query string
  into a simple data structure? This might help back-end adapter writers,
  and also help remove any possible ambiguity from the language definition.
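
To make the last two points concrete, here is a rough sketch of what such
a sample parser could look like. It is only an illustration, not a
proposal for the actual grammar: the token syntax (switch:value, a leading
"-" for exclusion, double-quoted phrases) is my assumption and not copied
from WasabiDraft, and the names Clause, parse_query and drop_unsupported
are made up for the example. Terms are treated as implicitly ANDed.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """One element of a parsed query; clauses are implicitly ANDed."""
    value: str                    # a word, or the content of a quoted phrase
    switch: Optional[str] = None  # e.g. "author"; None for a plain term
    negated: bool = False         # True if the clause had a leading "-"


# Assumed token syntax (hypothetical, not taken from the draft):
# optional "-", optional "switch:", then a "quoted phrase" or a bare word.
_TOKEN = re.compile(r'(-?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')


def parse_query(text):
    """Turn a user query string into a flat list of Clause objects."""
    clauses = []
    for minus, switch, phrase, word in _TOKEN.findall(text):
        value = phrase or word
        if value:  # skip empty phrases such as ""
            clauses.append(Clause(value, switch or None, minus == "-"))
    return clauses


def drop_unsupported(clauses, supported_switches):
    """Keep plain terms, and only the switches this back-end understands."""
    return [c for c in clauses
            if c.switch is None or c.switch in supported_switches]


if __name__ == "__main__":
    parsed = parse_query('wasabi "desktop search" -draft author:dockes group:work')
    # A back-end that knows 'author' but not 'group' silently ignores the latter.
    for clause in drop_unsupported(parsed, {"author"}):
        print(clause)
```

With something like this, a back-end adapter only has to walk a flat list,
and ignoring an unknown switch is a one-line filter instead of a parsing
problem in every adapter.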

Regards,
JF

PS: about Beagle and XML. There was a question about this somewhere, so I
had a look this morning. From what I could see, XML is used only
implicitly in Beagle as the serialization form for communication between
the front-end and the beagled daemon. As this kind of serialization is
supported directly by the standard C# libraries, it's more or less
transparent. The libbeagle C front-end library has to generate and parse
the XML "by hand". I don't think that there is anything other than this
ad-hoc use of XML, and I didn't see anything resembling a document
definition anywhere.

Fabrice Colin writes:
 > On 11/24/06, Jean-Francois Dockes <jean-francois.dockes at wanadoo.fr> wrote:
 > > Here follow my impressions after reading the Wasabi Draft document.
 > > ...
 > > Ok, enough for now, my only hope here is to restart thinking about the
 > > query language.
 > >
 >
 > I have given some thought to this over the weekend and here's what I reckon.
 > 
 > We do need a simple text string-based query language.  The way I see it,
 > the main goal of Wasabi is to allow to plug any personal search system
 > into existing applications (file managers, toolkits' file chooser
 > dialogs, cataloguing software, etc...). These applications typically
 > only have a basic search user interface, i.e. a text field and maybe
 > some knobs that can be tweaked.
 >
 > Once we have sorted out the dbus interface, these apps will only have to
 > make a couple of method calls and pass the string entered by the
 > user. We should try to make it as easy as possible to run searches; any
 > parsing/formatting that's necessary on the part of these apps will add
 > complexity. The more complex it is, the less widely it will be adopted.
 > Since most end-users are familiar with the query format supported by
 > popular Web engines, we should go for something similar.  Leo mentioned
 > Lucene's query language. While I agree we should avoid tying anything to
 > a particular search toolkit, a subset of that query language might make
 > sense. If need be, an ABNF grammar would remove ambiguities.
 >
 > 
 > On the other hand, I agree with Jean-Francois that a more powerful query
 > language is better in the medium to long term. I don't know which is the
 > most appropriate.
 > 
 > I think a dual approach, as proposed on the second draft, makes sense.
 > 
 > Fabrice