simple search api (was Re: mimetype standardisation by testsets)

Mon Nov 20 22:39:22 EET 2006

2006/11/20, Jos van den Oever <jvdoever at gmail.com>:

> > > > org.freedesktop.search.simple.query ( in s query, in i offset,
> > > > in i limit, out as hits ):
> > > > What is the general consumer of this method? I don't see many. Only
> > > > stuff like deskbar-applet or a general search tool would use it.
> > > > Maybe adding a parameter to specify a list of groups the hits should
> > > > match (or maybe specifying mimetypes). This argument could be "*" or
> > > > something to get all kinds of results. I suggest changing the
> > > > signature to:
> > > > query ( in s query, in as groups, in i offset, in i limit, out as hits )
> > > Interesting suggestion. It does make things quite a bit more
> > > complicated. Because you'd need to define the groups. We've not talked
> > > about the query language yet ( we need to, but i'm assuming we're
> > > going to use something similar to what Beagle and Strigi already use,
> > > which is almost the same), but you could also just expand the query like
> > > this: "holiday" -> "holiday mimetype:video/*" before sending it to the
> > > search-engine. That seems much better defined than a list of vaguely
> > > termed groups. I do not object to having such names for the user to
> > > see though.
> >
> >
> > Yeah, there are a few decisions to make here: how much to put in the
> > query language and how much to put in the api. I think an expressive
> > query language is a good idea. However, your example above doesn't fit
> > all cases well. What if I want to search for all "Documents" containing
> > the word "parser"? The Documents group could, for example, be files of
> > these mime types:
> >
> > application/msword
> > application/pdf
> > application/postscript
> > application/vnd.ms-excel
> > application/vnd.oasis.opendocument.text
> > application/vnd.sun.xml.writer
> > text/plain
> > text/html
> >
> > Calling "parser mimetype:application/*" would probably not yield the
> > desired results. Maybe also having an option to search like
> > "parser group:documents" would be good? The Spotlight API has the notion
> > of groups like this, for example.
> This notion of groups is very valuable for a nice user interface. It
> is however not relevant for the simplest form of search engine. The
> group designation of a file is usually not stored directly in the
> database, but inferred from the mimetype. For complex groups the query
> might look something like (application/msword OR application/pdf OR
> ...). Making such a list part of a search API makes it hard to agree
> on the mimetypes. I do not oppose a wrapper API that knows about the
> groups and expands a group-enabled query, but I don't think we should
> put this in the simple API. The group(s) to which a file belongs is
> just another type of (inferred) metadata and I don't think we should
> treat it specially.

Given that it would be part of the search language it cannot be ruled out of
the simple api, unless we restrict the simple api to only support a subset
of the query language (which I don't think is a good idea).
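
To make the expansion being discussed concrete, here is a rough sketch of
how a "group:documents" term could be rewritten into a list of mimetypes
before the query reaches the engine. Everything in it is an assumption for
illustration only: the "group:" and "mimetype:" switches, the OR syntax, the
GROUPS table and the expand_groups() name are all made up, since no query
language has been agreed on yet. The mimetypes are the "Documents" example
quoted above.

```python
import re

# Hypothetical group table. The mimetypes are the "Documents" example from
# this thread; nothing here is an agreed-upon definition.
GROUPS = {
    "documents": [
        "application/msword",
        "application/pdf",
        "application/postscript",
        "application/vnd.ms-excel",
        "application/vnd.oasis.opendocument.text",
        "application/vnd.sun.xml.writer",
        "text/plain",
        "text/html",
    ],
}

# Matches an assumed "group:NAME" switch inside a query string.
_GROUP_SWITCH = re.compile(r"\bgroup:(\w+)")


def expand_groups(query, groups=GROUPS):
    """Replace each group:NAME term with an OR-list of mimetype: terms."""

    def replace(match):
        name = match.group(1).lower()
        if name not in groups:
            # Unknown group: leave the term untouched for the engine to handle.
            return match.group(0)
        terms = " OR ".join("mimetype:" + m for m in groups[name])
        return "(" + terms + ")"

    return _GROUP_SWITCH.sub(replace, query)


if __name__ == "__main__":
    print(expand_groups("parser group:documents"))
```

Whether this rewriting lives in a wrapper or in the engine itself, the
caller still has to be able to write "group:documents" in the query string,
which is why I see it as part of the query language.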

Which switches are supported in the language could be made introspectable,
for example through a method such as GetSupportedQuerySwitches(out as), but
that doesn't seem to fit in a "simple" api.
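
As a sketch of what such introspection would ask of a client: it would have
to compare the switches used in a query against the list the engine
reports. The snippet below assumes the list has already been fetched (for
instance from the GetSupportedQuerySwitches idea above, which is only a
suggestion and does not exist) and that switches are written as "name:" in
the query. The unsupported_switches() helper is hypothetical.

```python
import re

# Assumed syntax: a switch is a word followed by ':' at the start of a term.
_SWITCH = re.compile(r"(?:^|\s|\()(\w+):")


def unsupported_switches(query, supported):
    """Return the switches used in query that are not in supported."""
    missing = []
    for name in _SWITCH.findall(query):
        if name not in supported and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'supported' stands in for whatever the engine would report.
    supported = ["mimetype"]
    print(unsupported_switches("parser group:documents mimetype:text/plain",
                               supported))
```

Every client would need this kind of check before sending a query, which is
the extra burden I mean.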

Also, what about items that don't have a mimetype as such: conversations,
emails, attachments, contacts, etc.? How would an application search my
Contacts for "Jos"? If this called for an advanced api, that would seem
strange.

My concern is that we limit the simple api too much to be of any real value.

Cheers,
Mikkel