simple search api (was Re: mimetype standardisation by testsets)

Mon Nov 27 14:00:55 EET 2006

2006/11/26, Jean-Francois Dockes <jean-francois.dockes at wanadoo.fr>:
>
>
> It's quite amazing how parallel thinking can bring people to the same
> point over a few days. I am in quite complete agreement with Fabrice's
> message and most of Wasabi2 or the recent edits by Mikkel on Wasabi.
>
> The initial stated goal for Wasabi is/was quite broad, "Unified dbus api
> for desktop search" was the subject of the email I received. This is what
> made me react quite energetically to the proposal for a query language.
>
> After a few days of thinking, I now think that there are 2 distinct
> needs:
>
> 1- A need for trivial enabling of text search in any (non-search)
>     application, with minimal fuss, (better described by Fabrice in the
>     quoted message).
>
> 2- A more hypothetical need for an interface that would allow
>     *search tool* front-end writers to make use of different search
>     back-ends.
>
> The need for (1) is immediate and obvious, and maybe it should be made
> clearer that this is the initial goal for the Wasabi project.
>
> This would remove any major objection on my part about the simple query
> language on the WasabiDraft page.
>
> In this perspective, I would even be in favor of removing the
> specification in Wasabi2 that the on-bus language is structured. Let us
> just prefix the interface as Simple or Trivial or whatever, and let the
> backend deal with query string interpretation.
>
> As to (2), maybe we can put this on the back-burner for the time
> being. This is a more difficult project, and, if needed, it will be easy
> enough to extend Wasabi with an alternative query interface, while keeping
> compatibility for the non-search apps using the trivial interface.
> (There probably should be some mention of it somewhere on the project page
> though, to make it clear that we differentiate the two perspectives).
>
> Under the assumption that we are more or less in agreement, I have a few
> more comments about the simple interface:

You can count me in on that assumption :-)
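
To make the "trivial" interface concrete, here is a minimal sketch in
Python. Everything in it is hypothetical (the class names, the query()
method and its max_hits parameter are mine, not from either draft); the
only point is that the caller hands over an opaque string and the back-end
alone decides how to interpret it:

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SimpleSearch(ABC):
    """Hypothetical 'trivial' interface: the query is an opaque string."""

    @abstractmethod
    def query(self, text: str, max_hits: int = 10) -> List[str]:
        """Return at most max_hits URIs matching the user's string."""


class SubstringBackend(SimpleSearch):
    """Toy back-end that interprets the string as a plain substring."""

    def __init__(self, documents: Dict[str, str]) -> None:
        # Maps a URI to the text of the document (assumed in-memory data).
        self._documents = dict(documents)

    def query(self, text: str, max_hits: int = 10) -> List[str]:
        needle = text.lower()
        hits = [uri for uri, body in self._documents.items()
                if needle in body.lower()]
        return sorted(hits)[:max_hits]
```

A file chooser would then only need something like
backend.query(entry_text) and never has to parse anything itself.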

> - About the query language, and just for the record, the syntax described
>   on WasabiDraft is more the one from Beagle than the one from Lucene
>   (which defaults to ORing, not ANDing the terms I think). This is
>   probably and appropriately more intuitive for end-users.

Which is? AND or OR? This is kind of a religious thing I guess. At work we
had a huge flame fest about this :-) We ended up ANDing... This was the
ruling from our usability consultants. I work at a library, though, and the
same usability rules might not apply to desktop search...
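
To show what is at stake, here is a toy sketch in Python. The documents
and function names are made up for illustration, and this is not how
Beagle or Lucene implement it; it only shows that the same two-word query
gives quite different hit lists depending on the default operator:

```python
def build_index(docs):
    """Map each lower-cased word to the set of document names containing it."""
    index = {}
    for name, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index


def search(index, query, default_op="AND"):
    """Combine the per-term hit sets with the chosen default operator."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    if default_op == "AND":
        return set.intersection(*postings)
    if default_op == "OR":
        return set.union(*postings)
    raise ValueError("default_op must be 'AND' or 'OR'")


# Made-up sample data.
DOCS = {
    "a.txt": "holiday photos",
    "b.txt": "holiday budget",
    "c.txt": "tax budget",
}
INDEX = build_index(DOCS)
```

With this data, search(INDEX, "holiday budget", "AND") only returns b.txt,
while the "OR" variant returns all three files.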

> - I think that there are too many mandatory things in the 'switch'
>   list. For example, the 'group' concept, while interesting, is not
>   necessarily common back-end functionality. The 'author' switch might not
>   be so commonly supported either. I think that we should specify a set
>   of standard switches, specify also that back-ends should just ignore
>   switches they don't know or support, and let them do their best.

Yeah, I admit that the current requirements were a bit arbitrary, I just
needed to put something there...
The idea with author and title and such was to take Dublin Core into
consideration (which should probably not be mandatory for indexers).
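
The "ignore what you don't support" rule could look roughly like this on
the back-end side. This is a sketch only: the switch names are the ones
mentioned in this thread, while the function and the dict representation
are my assumptions and not part of any draft:

```python
# Switch names mentioned in this thread; the draft's real list may differ.
STANDARD_SWITCHES = frozenset({"author", "title", "group"})


def split_switches(requested, supported):
    """Return (honoured, ignored) for one back-end.

    requested: dict mapping switch name to value, as given by the caller.
    supported: set of switch names this back-end understands.
    """
    honoured = {}
    ignored = []
    for name, value in requested.items():
        if name in supported:
            honoured[name] = value
        else:
            # Unknown or unsupported switches are dropped, not an error.
            ignored.append(name)
    return honoured, sorted(ignored)


# A hypothetical back-end that cannot group files.
NO_GROUP_BACKEND = STANDARD_SWITCHES - {"group"}
```

A back-end without grouping would then run the query with the honoured
switches and do its best, instead of failing the whole request.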

The group switch is another deal. As I've pointed out before I think this is
a really, really useful switch for application developers, and as has been
pointed out elsewhere in this thread it is not always an easy task to group
files.

> - Should we provide one/several sample parsers to turn the query string
>   into a simple data structure? This might help back-end adapter writers,
>   and also help remove any possible ambiguity from the language
>   definition.

There are several ways to do this. One idea could be to provide standalone
libs for Qt and GObject... I.e. no "bindings", just pure implementations.
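
For the record, such a sample parser can be very small. Here is a sketch
in Python rather than Qt or GObject, just to keep it short. Note the
assumptions: I am guessing at a name:value form for switches and double
quotes for phrases, and the ParsedQuery structure is hypothetical; none of
it is taken from the WasabiDraft page:

```python
import re
from dataclasses import dataclass, field

# One token: optional "name:" prefix, then a "quoted phrase" or a bare word.
TOKEN = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|(\S+))')


@dataclass
class ParsedQuery:
    terms: list = field(default_factory=list)     # free-text words/phrases
    switches: dict = field(default_factory=dict)  # switch name -> values


def parse_query(text):
    """Split a query string into free-text terms and name:value switches."""
    parsed = ParsedQuery()
    for match in TOKEN.finditer(text):
        name, quoted, bare = match.groups()
        value = quoted if quoted is not None else bare
        if name:
            # A switch may be given several times, so keep a list of values.
            parsed.switches.setdefault(name.lower(), []).append(value)
        else:
            parsed.terms.append(value)
    return parsed
```

Parsing 'wasabi author:"Jean Dockes" title:draft' this way yields one free
term and two switches. A limitation of the sketch is that any word
containing a colon (a URL, say) is read as a switch, which is exactly the
kind of ambiguity a written grammar would have to settle.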

> PS: about Beagle and XML. There was a question about this somewhere, so I
> had a look this morning. From what I could see, XML is used only
> implicitly in Beagle as the serialization form for communication between
> the front-end and the beagled daemon. As this kind of serialization is
> supported directly by the standard C# libraries, it's more or less
> transparent. The libbeagle C front-end library has to generate and parse
> the XML "by hand". I don't think that there is anything else than this
> ad-hoc use of XML, and I didn't see anything resembling a document
> definition anywhere.

Cool. Good that you took the time to look at this. Now we can dismiss that
option with confidence. Thanks.

Cheers,
Mikkel

> Fabrice Colin writes:
> > On 11/24/06, Jean-Francois Dockes <jean-francois.dockes at wanadoo.fr> wrote:
> > > Here follow my impressions after reading the Wasabi Draft document.
> > > ...
> > > Ok, enough for now, my only hope here is to restart thinking about the
> > > query language.
> > >
> >
> > I have given some thought to this over the weekend and here's what I
> > reckon.
> >
> > We do need a simple text string-based query language. The way I see it,
> > the main goal of Wasabi is to allow any personal search system to be
> > plugged into existing applications (file managers, toolkits' file chooser
> > dialogs, cataloguing software, etc...). These applications typically
> > only have a basic search user interface, i.e. a text field and maybe
> > some knobs that can be tweaked.
> >
> > Once we have sorted out the dbus interface, these apps will only have to
> > make a couple of method calls and pass the string entered by the
> > user. We should try to make it as easy as possible to run searches; any
> > parsing/formatting that's necessary on the part of these apps will add
> > complexity. The more complex it is, the less widely it will be adopted.
> > Since most end-users are familiar with the query format supported by
> > popular Web engines, we should go for something similar.  Leo mentioned
> > Lucene's query language. While I agree we should avoid tying anything to
> > a particular search toolkit, a subset of that query language might make
> > sense. If need be, an ABNF grammar would remove ambiguities.
> >
> >
> > On the other hand, I agree with Jean-Francois that a more powerful query
> > language is better in the medium to long term. I don't know which is the
> > most appropriate.
> >
> > I think a dual approach, as proposed on the second draft, makes sense.
> >
> > Fabrice
> > _______________________________________________
> > xdg mailing list
> > xdg at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/xdg