shared wasabi implementation

Mikkel Kamstrup Erlandsen mikkel.kamstrup at
Sun Feb 18 03:32:34 PST 2007

2007/2/16, Joe Shaw <joeshaw at>:
> Hi,
> Mikkel Kamstrup Erlandsen wrote:
> >     For external filters, I have shamelessly borrowed Beagle's
> >     external-filters.xml
> >     with some modifications. Built-in filters register what MIME types
> >     they support
> >     when the corresponding dynamic library is loaded.
> >
> >
> > For reference:
> >
> > It still appears not to cover the two cases I mention - emails in a
> > database in a hidden directory, indexing of webpages+urls as you browse.
> > Anyway - a good starting point. Perhaps Joe can shed some light on why
> > this was left out..?
> The external filters were added so that people could index file types
> not supported internally without having to code up support for them.
> The two cases you mention aren't file types, they're data sources.
> (Mail is handled by our mail filter, and web pages by our HTML filter
> already.)

Yes, I was a bit unclear. What I really meant to say was "You can only
specify filters, not data sources".

> For people who want to index their data externally we provide
> an indexing service.  Apps can do one of two things: they can make an
> RPC call and pass in a document and metadata to be indexed, or they can
> drop the file into ~/.beagle/ToIndex with a control file that describes
> its metadata and Beagle will automatically index it.  (This latter
> method is how the Beagle Firefox extension works.)

What kind of RPC is available?

Dropping files in a special directory sounds like a thing that most indexers
could support. Perhaps this can be standardized. Is there a place where I
can find documentation/examples/code for this?
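
To make the drop-directory idea concrete, here is a minimal sketch in Python
of what a producer application might do. Everything specific in it is an
assumption made for illustration: the function name drop_for_indexing, the
sidecar naming (.<name>.meta) and the "key: value" control-file layout are
hypothetical and are not Beagle's actual format. The one technique it relies
on is writing each file under a temporary name and renaming it into place, so
that a watching indexer never picks up a half-written file.

```python
import os
import tempfile
from pathlib import Path


def drop_for_indexing(drop_dir, name, payload, metadata):
    """Write payload plus a sidecar control file into drop_dir.

    Hypothetical layout: '<name>' holds the data and '.<name>.meta' holds
    'key: value' lines. This is NOT Beagle's real control-file format.
    """
    drop = Path(drop_dir)
    drop.mkdir(parents=True, exist_ok=True)

    def write_atomically(target, data):
        # Write to a temp file in the same directory, then rename it, so
        # the final name only ever refers to a complete file.
        fd, tmp = tempfile.mkstemp(dir=drop, prefix=".tmp-")
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp, target)

    control = "".join(f"{key}: {value}\n" for key, value in metadata.items())
    meta_path = drop / f".{name}.meta"
    data_path = drop / name
    # Control file first, so the metadata exists when the payload appears.
    write_atomically(meta_path, control.encode("utf-8"))
    write_atomically(data_path, payload)
    return data_path, meta_path


if __name__ == "__main__":
    # Demo against a scratch directory instead of a real indexer directory.
    with tempfile.TemporaryDirectory() as scratch:
        data, meta = drop_for_indexing(
            scratch,
            "page.html",
            b"<html>hi</html>",
            {"uri": "http://example.org/", "mimetype": "text/html"},
        )
        print(meta.read_text())
```

If something like this were standardized, the spec would mostly need to pin
down the directory location, the control-file keys, and who deletes the files
once they are indexed.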

> We could maybe create an external data source backend, but since the
> sources are so specific, all it would amount to would be calling some
> sort of script that did the crawling and used one of the two methods
> above to signal Beagle.  Unlike the external filters, there hasn't been
> any demand for it, and fitting it in to the scheduler so that it didn't
> peg the indexer or fill up the disk would be tough to do externally.

I'm not sure I understand what you are saying. Is it that polling many
external data source "handles" would be too heavy?

More information about the xdg mailing list