Posted to dev@lucene.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2007/04/04 17:33:35 UTC
Re: [jira] Created: (LUCENE-851) Pruning
On Mar 29, 2007, at 7:44 PM, Ning Li wrote:
> If a query requires top-K results, isn't it
> sufficient to find top-K results in each segment and merge them to
> return the overall top-K results?
The per-segment results are merged by collecting them into a HitQueue.
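A minimal sketch of that merge step, for illustration only. It uses java.util.PriorityQueue as a stand-in for a hit queue; the class TopKMergeSketch, the Hit type, and mergeTopK() are hypothetical names invented here, not Lucene's HitQueue API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not Lucene's HitQueue implementation.
class TopKMergeSketch {
    // A scored hit. The field names are assumptions made for this sketch.
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Merge per-segment top-K lists into one overall top-K, best score first.
    static List<Hit> mergeTopK(List<List<Hit>> perSegment, int k) {
        List<Hit> out = new ArrayList<>();
        if (k <= 0) return out;
        // Min-heap on score: the head is the weakest hit currently kept.
        PriorityQueue<Hit> queue =
            new PriorityQueue<>(k, (a, b) -> Float.compare(a.score, b.score));
        for (List<Hit> segment : perSegment) {
            for (Hit hit : segment) {
                if (queue.size() < k) {
                    queue.add(hit);
                } else if (hit.score > queue.peek().score) {
                    queue.poll();   // evict the weakest hit
                    queue.add(hit);
                }
            }
        }
        out.addAll(queue);
        // Order of hits with equal scores is unspecified in this sketch.
        out.sort((a, b) -> Float.compare(b.score, a.score));
        return out;
    }
}
```

The queue never holds more than K hits, so the merge costs O(N log K) for N candidate hits across all segments.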
> Early termination happens in
> finding top-K results in one segment. Assuming each document has a
> static score, document ids are assigned in the same order of their
> static scores within a segment. If a top-K query is scored by the same
> static score, query processing on a segment can stop as soon as the
> first K results are found.
Indeed, that's exactly how the loop in Scorer_collect() works.
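To make the idea concrete, here is a rough sketch in Java. It is not the actual Scorer_collect() code; EarlyTerminationSketch and collectFirstK() are hypothetical names, and the sketch assumes that matching doc ids arrive in increasing order and that a lower doc id never has a lower static score than a higher one.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of early termination within one segment.
class EarlyTerminationSketch {
    // 'matches' yields the doc ids that match the query, in increasing order.
    // Assumption: doc ids were assigned in descending static-score order,
    // so the first K matches are already the top K for this segment.
    static List<Integer> collectFirstK(Iterator<Integer> matches, int k) {
        List<Integer> hits = new ArrayList<>();
        // Stop as soon as K hits are collected; later matches are never read.
        while (hits.size() < k && matches.hasNext()) {
            hits.add(matches.next());
        }
        return hits;
    }
}
```

Because the loop checks the hit count before advancing, the remaining postings in the segment are never touched once K hits have been found.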
> As to the indexing side, applications should be able to pick such a
> static score? If Lucene score function is used, norm is a good
> candidate? (One tricky thing with norm is that it is updatable.)
I would argue that only a single mechanism based on indexed, non-
tokenized fields should be used to determine sort order. Sort order
based upon norms is easy for the user to fake using a dedicated field
at a small cost, so library-level support is not needed.
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/