Posted to general@lucene.apache.org by lucenegal <hi...@yahoo.com> on 2009/01/22 02:43:23 UTC

Pagination in Lucene

Does Lucene support pagination of search results? Some of the
documentation suggests re-running the query for each page. My result
sets can contain 1M+ hits; what is the general recommendation in this
situation?


RE: Pagination in Lucene

Posted by Steven A Rowe <sa...@syr.edu>.
Hi lucenegal,

You'll get much quicker/better responses if you use the
java-user@lucene.apache.org list instead of this list, which has a
relatively small audience.

On 01/21/2009 at 8:43 PM, lucenegal wrote:
> Does the Lucene support pagination for search results ? Some of the
> documentation suggests to requery for each page. The results can be 1M +
> in my case , what is general recommendation in this situation ?

Lucene's (indirect) support for pagination of search results (the Hits
class) has been deprecated as of version 2.4.0 and will be removed in
version 3.0.0:

<http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Hits.html>

Hits provides an iterator over the results and caches a fixed-size
window of roughly 200 hits (I'm not sure of the exact number).  Once all
of the docs in the cache have been iterated over, the search is
performed again and the cache is populated with the next window of hits
from the complete list, if more hits are available.  In your case of
1M+ hits, iterating over everything would re-execute the query
1M+/200 = 5K+ times!
 
If you look at the top of the javadocs for the Hits class at the link
above, a non-deprecated alternative is given.  Essentially, you must
take control of the results caching/pagination yourself.
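
To make "taking control of pagination yourself" concrete, here is a
minimal sketch of the bookkeeping involved. It deliberately uses no
Lucene classes: Pager, hitsNeeded() and page() are hypothetical names
made up for illustration, and the ranked list stands in for whatever
top-N results your search returns. The assumption is that for a
zero-based page p you ask the searcher for the top (p + 1) * pageSize
hits and then display only the last pageSize of them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): slices a ranked hit list into pages. */
final class Pager {
    private Pager() {}

    /** How many top hits must be collected so that the zero-based page is fully covered. */
    static int hitsNeeded(int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(page + 1, pageSize);
    }

    /** Returns the entries of one zero-based page; empty if the page lies past the end. */
    static <T> List<T> page(List<T> topHits, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long first = (long) page * pageSize;                       // index of first hit on the page
        int start = (int) Math.min(first, topHits.size());         // clamp to the available hits
        int end = (int) Math.min(first + pageSize, topHits.size());
        return new ArrayList<>(topHits.subList(start, end));       // copy, so callers own the list
    }
}
```

With this scheme the cost of a page grows with its depth, since page p
needs (p + 1) * pageSize collected hits, so it suits users who browse
the first few pages rather than a scan of all 1M+ results.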

See an example of this in Lucene's SearchFiles demo, in the
doPagingSearch() method (at the bottom of the file):

<http://svn.apache.org/viewvc/lucene/java/tags/lucene_2_4_0/src/demo/org/apache/lucene/demo/SearchFiles.java?view=markup>

Mark Harwood has posted a class called HitPageCollector, which manages
some of the details for you, here:

<http://markmail.org/message/wlmoznq6mpxjkbav>

Steve