Posted to java-user@lucene.apache.org by "Satuluri, Venu_Madhav" <Ve...@deshaw.com> on 2006/03/21 13:18:26 UTC
Improving search performance
Hi,
I am looking for ways to improve the performance of Lucene search in our
app. Searches are visibly slow when a lot of documents match (search
time seems almost directly proportional to the number of documents
returned by Searcher). However, we show 20 results per page, so it
seems a waste of time to get all, say, 60,000 documents when all I need
is the nth 20 (i.e. if the user requests the 4th page of results, I
just need documents 61 to 80). I've tried using the Searcher.search()
method that returns TopFieldDocs. This method works much faster than
the ordinary Searcher.search() that returns all the results. The
trouble with this method is that I can't use it to get an arbitrary
portion of the results; I can only get the top few docs.
Our index size is around 50 MB. It's optimized every hour. I have
tried a RAMDirectory, but even though this improves performance by
up to 2 times, it's not good enough. Also, our index gets modified by
multiple processes, so keeping the RAMDirectory up to date is a hassle.
Thanks,
Venu
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Improving search performance
Posted by Grant Ingersoll <gs...@syr.edu>.
I am not sure why you are getting all 60k docs at a time. If you use
the Hits object, it caches the top 50 or so, but doesn't retrieve all
the documents at once.
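On getting only the nth page out of a top-N style search: one common approach is to ask for the top (page number x page size) hits and then skip everything before the requested page. Below is a minimal sketch of that idea in plain Java, not Lucene API. The class PageSlice, the method page(), and the scores array indexed by doc id are all hypothetical names made up for illustration; a real searcher would do the top-N collection for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene: illustrates "top-N, then slice".
final class PageSlice {

    /**
     * Returns the doc ids on the given 1-based page, best score first.
     * scores[doc] is assumed to hold the score of document number doc.
     */
    static List<Integer> page(final float[] scores, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // Page 4 with 20 per page needs only the top 80 hits, not all of them.
        int needed = Math.multiplyExact(pageNumber, pageSize);

        // Min-heap ordered "weakest first": lower score is weaker, and on a
        // score tie the higher doc id is weaker (so lower doc ids rank first).
        Comparator<Integer> weakestFirst = (a, b) -> {
            int byScore = Float.compare(scores[a], scores[b]);
            return byScore != 0 ? byScore : Integer.compare(b, a);
        };
        PriorityQueue<Integer> heap = new PriorityQueue<>(weakestFirst);

        for (int doc = 0; doc < scores.length; doc++) {
            heap.add(doc);
            if (heap.size() > needed) {
                heap.poll(); // drop the weakest; the heap never exceeds 'needed'
            }
        }

        // The heap drains weakest first, so fill the ranked array from the end.
        int kept = heap.size();
        Integer[] ranked = new Integer[kept];
        for (int i = kept - 1; i >= 0; i--) {
            ranked[i] = heap.poll();
        }

        // Keep only the requested page; empty if the page is past the last hit.
        int from = Math.min((pageNumber - 1) * pageSize, kept);
        int to = Math.min(needed, kept);
        return new ArrayList<>(Arrays.asList(ranked).subList(from, to));
    }
}
```

With this sketch, PageSlice.page(scores, 4, 20) returns the 61st through 80th best hits. The work grows with how deep the requested page is rather than with the total number of matches, so the first few pages stay cheap even for a 60k-hit query.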
Also, what is the size of your fields, and how many fields do you have
per document?
Have you done any profiling to find the bottlenecks? An index size of
50 MB is actually pretty small for Lucene; perhaps you can share more
about your setup.
-Grant
--
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886