simple search api (was Re: mimetype standardisation by testsets)

Tue Nov 28 21:18:42 EET 2006

2006/11/28, Mikkel Kamstrup Erlandsen <mikkel.kamstrup at gmail.com>:
> 2006/11/28, Joe Shaw <joeshaw at novell.com>:
>
> > On Tue, 2006-11-28 at 07:15 +0100, Mikkel Kamstrup Erlandsen wrote:
> > > I think everybody wants that (at least I do). However, the idea behind
> > > org.freedesktop.search.simple was to have a *simple* interface. Here,
> > > "simple" applies only to end-user application developers.
> >
> > Yeah, I can certainly appreciate that.
> >
> > >  1) Targeted at apps where searching is not the main functionality,
> > > e.g. a music browser or file manager.
> > >  2) There should be no query building on the application end. Just
> > > pass the string as entered by the user to the query method.
> > >  3) Should be dead easy to drop into your app with only a few lines of
> > > code.
> >
> > I agree in principle, but I disagree with point #2 here.  One
> > reason is that it's generally not a good idea to programmatically build
> > a string query.  They can be error prone but more importantly they're
> > just slow.
> >
> > Take a look at the Beagle backend in Nautilus:
> >
> > http://cvs.gnome.org/viewcvs/nautilus/libnautilus-private/nautilus-search-engine-beagle.c?annotate=1.2.2.1
> >
> > (roughly around line 180)
> >
> > The API is still very simple, but we set some important details on the
> > query to limit its scope:
> >
> >         * We only want files; no emails, IM logs, addressbook contacts,
> >         etc.
> >
> >         * We want a maximum of 1000 hits.
> >
> >         * Optionally, we may only want a set of certain MIME types.
> >
> > And we also obviously set the string the user typed in, which is
> > actually a query language string, so they can add more advanced
> > properties, like 'author:"Charles Dickens"'.
>
>
>
> I think I see where the disagreement comes from. Currently the premise for
> the simple search api has been that it didn't need any language bindings.
> Applications would certainly wrap the interface in a native object allowing
> for mainloop integration, but it didn't need any code from the Wasabi
> project to work.
>
> >
> > > The problem is that 2) warrants defining a simple search language
> > > to make much sense... 3) is a bit against the nature of your
> > > suggestion.
> >
> > I think the simplicity of the Beagle API belies your assertion on point
> > 3.  Moreover, not making it asynchronous is going to drastically reduce
> > the usefulness in GUI apps.  There are dozens of examples in GNOME alone
> > that show this.
> >
> > Writing responsive GUI apps is Just Hard.  The Beagle API has tried to
> > make this as simple as possible.  You can implement a fully working
> > asynchronous Beagle client that fits in with the GLib main loop in about
> > 100 lines:
> >
> > http://cvs.gnome.org/viewcvs/beagle/libbeagle/examples/beagle-search.c?annotate=1.4
> >
> > We could add a synchronous API to this -- which wouldn't have the
> > benefits of Live Queries -- but to still have it work correctly in a GUI
> > app you would have to do all of your searches in a thread.  At least in
> > C, that's definitely harder to do than integrating with the main loop.
>
>
>
> Again, such an API needs bindings for the toolkit being used. I'm not against
> this (quite the contrary); I just think that we should also have an
> API that is easy to use even if you don't use the bindings for your
> platform.
>
> I still think it is possible to write a responsive UI with the paging
> queries instead of the fully async ones. An application need not request
> 1000 hits at a time; it could send 10 queries requesting the hit ranges n*100
> to (n+1)*100 for n=0..9. An example of this is the Tracker search tool and
> the Tracker deskbar plugin (actually they only perform one query, but
> still).
>
> Another concern I have (which might be based in ignorance) is that a D-Bus API
> with lots of temporary objects (queries) might be a lot of work for search
> engines that don't already have something close to this. Maybe it is not that
> hard to keep track of the objects; I really don't have much experience
> here...
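
To make the paging idea above concrete, here is a rough Python sketch of a
client that asks for hits 100 at a time instead of 1000 at once. Nothing here
is part of any proposed interface: page_ranges, fetch_paged and the
query_page(offset, count) callable are names made up for illustration, and I
am assuming the ranges are half-open (n*100 up to but not including
(n+1)*100).

```python
from typing import Callable, Iterator, List, Tuple


def page_ranges(total: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) hit ranges that together cover `total` hits."""
    for start in range(0, total, page_size):
        yield start, min(start + page_size, total)


def fetch_paged(
    query_page: Callable[[int, int], List[str]],  # hypothetical: (offset, count) -> hits
    total: int = 1000,
    page_size: int = 100,
) -> List[str]:
    """Collect up to `total` hits by issuing one small query per page."""
    hits: List[str] = []
    for start, end in page_ranges(total, page_size):
        page = query_page(start, end - start)
        hits.extend(page)
        # A short page means the engine has no more hits; stop asking.
        if len(page) < end - start:
            break
    return hits
```

In a real GUI the loop body would run once per idle callback (or the app
would only fetch the next page when the user scrolls), so the UI gets a
chance to redraw between pages.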

One thing about live queries is worth discussing: what happens when files
change? Beagle's live queries are nice for watching the filesystem
for updates. To implement such a feature efficiently, the daemon
should not have to remember all the results it has already sent,
because doing so could take a lot of memory.

So if we define an asynchronous query, we should allow the daemon to
send the same result more than once.
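
If the daemon is allowed to repeat itself, the client has to cope with
duplicates. A minimal Python sketch of how a client might do that follows. It
assumes every hit carries a unique "uri" key, which is my assumption and not
something we have agreed on; HitCollector and its method names are likewise
invented for the example.

```python
from typing import Dict, Iterable, List


class HitCollector:
    """Client-side view of a live query's results, keyed by URI."""

    def __init__(self) -> None:
        self._hits: Dict[str, dict] = {}

    def hits_added(self, hits: Iterable[dict]) -> List[dict]:
        """Store incoming hits; return only those the client had not seen."""
        fresh = []
        for hit in hits:
            uri = hit["uri"]  # assumed unique identifier for a hit
            if uri not in self._hits:
                fresh.append(hit)
            # A repeated hit simply replaces the older copy.
            self._hits[uri] = hit
        return fresh

    def hits_removed(self, uris: Iterable[str]) -> None:
        """Forget hits the daemon reports as gone; unknown URIs are ignored."""
        for uri in uris:
            self._hits.pop(uri, None)

    def current(self) -> List[dict]:
        """Return the hits currently known, in first-seen order."""
        return list(self._hits.values())
```

This moves the bookkeeping to the client, which only has to remember the
results of its own query, rather than making the daemon track every result
for every open query.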