You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by "Satuluri, Venu_Madhav" <Ve...@deshaw.com> on 2006/03/21 13:18:26 UTC

Improving search performance

Hi,

I am looking for ways to improve the performance of lucene search in our
app. Lucene performance is visibly slow when there are a lot of
documents to be returned (performance almost seems directly proportional
to the number of documents returned by Searcher). However, we show 20
results per page, so it seems to be a waste of time to get all, say,
60,000 documents when all I need are the nth 20 (i.e. if user requests
4th page of results, I just need documents 61 to 80).  I've tried using
the Searcher.search() method that returns TopFieldDocs. This method
works much faster than the ordinary Searcher.search() that returns all
the results. The trouble with this method is that I cant use it to get
an arbitrary portion of the results, I can only get the top few docs.

Our index size is around 50 MB. Its optimized every one hour. I have
tried a RAMDirectory, but even though using this improves performance by
upto 2 times, its not good enough. Also our index gets modified by
multiple processes so keeping the RAMDirectory up-to-date is a hassle.

Thanks,
Venu



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Improving search performance

Posted by Grant Ingersoll <gs...@syr.edu>.
I am not sure why you are getting all 60k docs at a time.  If you use 
the Hits object, it caches the top 50 or so, but doesn't retrieve all 
the documents at once.

Also, what are the size of your fields and how many fields do you have 
per document? 

Have you done any profiling to find the bottlenecks?  An index size of 
50mb is actually pretty small for Lucene, perhaps you can share more 
about your setup.

-Grant

Satuluri, Venu_Madhav wrote:
> Hi,
>
> I am looking for ways to improve the performance of lucene search in our
> app. Lucene performance is visibly slow when there are a lot of
> documents to be returned (performance almost seems directly proportional
> to the number of documents returned by Searcher). However, we show 20
> results per page, so it seems to be a waste of time to get all, say,
> 60,000 documents when all I need are the nth 20 (i.e. if user requests
> 4th page of results, I just need documents 61 to 80).  I've tried using
> the Searcher.search() method that returns TopFieldDocs. This method
> works much faster than the ordinary Searcher.search() that returns all
> the results. The trouble with this method is that I cant use it to get
> an arbitrary portion of the results, I can only get the top few docs.
>
> Our index size is around 50 MB. Its optimized every one hour. I have
> tried a RAMDirectory, but even though using this improves performance by
> upto 2 times, its not good enough. Also our index gets modified by
> multiple processes so keeping the RAMDirectory up-to-date is a hassle.
>
> Thanks,
> Venu
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org