[Xesam] Why is vendor.maxhits read-only?

Mikkel Kamstrup Erlandsen mikkel.kamstrup at gmail.com
Tue Dec 18 12:31:51 PST 2007


On 18/12/2007, Joe Shaw <joe at joeshaw.org> wrote:
> Hi,
>
> On 12/18/07, Mikkel Kamstrup Erlandsen <mikkel.kamstrup at gmail.com> wrote:
> > Very valid use case indeed. However I am not sure why Beagle has
> > problems delivering more than 100 hits/search.
>
> It's simply the default.  It can be turned up, but that requires using
> the Beagle API.

Yeah, I understand that. I was wondering why the default is not just
MAXINT. Why does it require "quite a bit more memory and CPU", as you
say below? That has not been my experience using Lucene on several
million docs.


> Whether this is something that should be done in our adaptor or
> elsewhere is unclear.  Getting more documents requires quite a bit
> more memory and CPU, and Beagle doesn't have a "paging" model.  You
> get all the hits (asynchronously) up to the maximum number per
> backend.
>
> > Perhaps this is something that is of lesser issue when clients use the
> > xesam API - because the hit reading is sequential. The doc data can be
> > resolved on request since beagled knows how many docs are requested...
> > Perhaps a Beagle dev can enlighten us?
>
> I'm not sure I understand, sorry.

Here's another shot then :-) This gets semi-Lucene-technical, so hang on...

By "sequential" I mean that GetHits(count) is basically a read()
method that reads sequentially through the result set.
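
To make that concrete, here is a minimal sketch (in Java, with made-up
names rather than the real Xesam D-Bus signatures) of the contract I
have in mind: each call hands out the next chunk of the result set and
advances an internal cursor, just like read() on a stream.

import java.util.List;

/*
 * Illustrative only -- not the actual Xesam API. getHits(count) returns
 * the *next* `count` hits, so consecutive calls walk sequentially
 * through the result set.
 */
interface SequentialHitReader<H> {
    // Return up to `count` further hits; an empty list means we are done.
    List<H> getHits(int count);

    // Release the underlying result set (iterator, caches, ...).
    void close();
}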

When you receive your results from Lucene, you basically get an
iterator over the doc ids. From your description, I assume you then
fetch the relevant (stored?) fields for the first 100 hits in that
iterator.
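
Purely for illustration, in plain Lucene Java terms it would look
something like the snippet below (Beagle actually uses Lucene.Net and
the exact calls differ between versions, so the index path, query and
field name are just placeholders):

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

import java.nio.file.Paths;

public class FirstHundredHits {
    public static void main(String[] args) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(
                FSDirectory.open(Paths.get("/path/to/index")))) {       // placeholder path
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new TermQuery(new Term("content", "xesam"));  // placeholder query

            // Collecting the top-n doc ids is the cheap part.
            TopDocs top = searcher.search(query, 100);

            // Resolving stored fields is where the memory/CPU goes, so
            // only do it for the hits that will actually be returned.
            for (ScoreDoc sd : top.scoreDocs) {
                Document doc = searcher.doc(sd.doc);
                System.out.println(doc.get("uri"));  // "uri" is a made-up field name
            }
        }
    }
}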

When beagled/the xesam adaptor gets a GetHits(150), it can say "uh, I
don't have that many hits cached, I'd better build the next 50 too".
It can then hold on to the iterator in case more hits are requested
(or maybe even pre-fetch the next 100). When the search is closed, it
can discard the iterator and the cached hits.
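
Roughly something like this sketch (my illustration, not Beagle's
actual code; Lucene's TopDocs is not a true iterator, so this version
simply re-runs the search with a larger n, which is more or less what
the old Hits class did behind the scenes):

import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/*
 * Sketch of the caching scheme described above: resolved hits are
 * cached, a cursor gives getHits() its read()-like behaviour, and the
 * cache grows in chunks whenever a request reaches past what has been
 * resolved so far.
 */
public class PagedSearch {
    private static final int PREFETCH = 100;        // grow the cache in chunks of 100

    private final IndexSearcher searcher;
    private final Query query;
    private final List<Document> cache = new ArrayList<>(); // hits with stored fields resolved
    private int cursor = 0;                          // next hit to hand out

    public PagedSearch(IndexSearcher searcher, Query query) {
        this.searcher = searcher;
        this.query = query;
    }

    // Return the next `count` hits, resolving and caching more if needed.
    public List<Document> getHits(int count) throws IOException {
        int needed = cursor + count;
        if (needed > cache.size()) {
            // "Uh, I don't have that many hits cached, better build the next chunk too."
            int target = Math.max(needed, cache.size() + PREFETCH);
            TopDocs top = searcher.search(query, target);
            ScoreDoc[] window = top.scoreDocs;
            for (int i = cache.size(); i < Math.min(target, window.length); i++) {
                cache.add(searcher.doc(window[i].doc)); // resolve stored fields lazily
            }
        }
        int end = Math.min(needed, cache.size());
        List<Document> slice = new ArrayList<>(cache.subList(cursor, end));
        cursor = end;
        return slice;
    }

    // Called when the Xesam search is closed: throw everything away.
    public void close() {
        cache.clear();
        cursor = 0;
    }
}

With that in place, GetHits(100) followed by GetHits(50) resolves 100
documents up front, the second call extends the cache by the next
chunk and hands back hits 100-149, and closing the search drops the
lot.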

Cheers,
Mikkel

