simple search api (was Re: mimetype standardisation by testsets)
Mikkel Kamstrup Erlandsen
mikkel.kamstrup at gmail.com
Thu Nov 23 21:46:06 EET 2006
2006/11/23, Jean-Francois Dockes <jean-francois.dockes at wanadoo.fr>:
> mikkel.kamstrup at gmail.com (Mikkel Kamstrup Erlandsen) writes:
> > magnus.bergman at observer.net (Magnus Bergman) writes:
> > > One thing that English users seldom consider is the use of several
> > > languages. Which language is being used is important to know in order
> > > to decide which stemming rules to use, and which stop-words to use (in
> > > English "the" is a stop-word, while in Swedish it means tea and is
> > > something that is reasonable to search for). People using other
> > > languages are very often multilingual (using English as well).
> > > Therefore it is
> > > interesting to know which language the query is in (search engines
> > > might also be able to translate queries to search in documents written
> > > in different languages).
> > This is a good point. However I suggest leaving this up to the actual
> > implementations. After all, which stemmer to use when indexing a
> > document is an indexing-time question...
> This is not true. An indexer can choose to perform stem processing at query
> time. Recoll is one, but I don't think it's the only one. There are quite
> good reasons to do so.
Right. In my sleepy haze last night I was not thinking straight :-) I've put
some more detail in my answer to Fabrice's post.
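
To make the query-time case concrete, here is a minimal sketch of what stem processing at query time could look like. It is only an illustration, not how Recoll or any other engine actually does it: the suffix lists and stop-word sets are toy assumptions, and the names stem, build_index and search are hypothetical. The index stores raw terms, so the query language decides both the stop-words and the stemming rules.

```python
# Toy per-language rules (assumptions for illustration only);
# a real engine would use a proper stemmer and stop-word list.
STOP_WORDS = {
    "en": {"the", "a", "of"},
    "sv": {"och", "en", "att"},
}
SUFFIXES = {
    "en": ("ing", "es", "s"),
    "sv": ("arna", "erna", "ar", "er", "en"),
}


def stem(word, lang):
    """Strip the first matching suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES[lang]:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Index raw lowercase terms; no stemming or stop-word removal here."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, lang):
    """Drop stop-words and expand stems at query time, per query language."""
    hits = None
    for word in query.lower().split():
        if word in STOP_WORDS[lang]:
            continue
        target = stem(word, lang)
        matched = set()
        # Expand the query word to every indexed term sharing its stem.
        for term, doc_ids in index.items():
            if stem(term, lang) == target:
                matched |= doc_ids
        hits = matched if hits is None else hits & matched
    return hits or set()
```

With this layout the same index answers a query for "the" differently depending on the declared query language: as English it is dropped as a stop-word, as Swedish it is searched for. That is why the query language would have to be visible to the search API rather than fixed at indexing time.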