Posted to java-user@lucene.apache.org by "Aigner, Thomas" <TA...@WescoDist.com> on 2006/03/22 22:33:53 UTC

Lookup Issues

Howdy all,
	I am having a performance issue.  When I run a search, retrieving
documents from the end of the result set takes a long time.
Ex. Say a query returns 1M hits (I know, why look for that many or even
allow it, but let's say we return 1M hits).  When the user wants to see
the last 25, it takes a LONG time to return (sometimes 45 seconds),
even though it only takes a second to bring back the Hits object.  I read
somewhere that the first 100 documents are stored in memory, but is
there some issue with looping through the records at the end?

Ex below would be fromHits = 990000 and toHits = 990025 so I would
return 25 records or so. 


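// Print the "ldesc" field of every hit in the requested window,
// stopping early if the window runs past the last hit.
for (int i = fromHits; i <= toHits && i < hits.length(); i++)
{
	  // This per-hit document fetch is where the time goes for large i.
	  Document doc = hits.doc(i);
	  System.out.println(doc.get("ldesc").replaceAll(" ~ ","~"));
}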

Thanks all,
Tom


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lookup Issues

Posted by Doug Cutting <cu...@apache.org>.
The Hits-based search API is optimized for returning earlier hits.  If 
you want the lowest-scoring matches, then you could reverse-sort the 
hits, so that these are returned first.  Or you could use the 
TopDocs-based API to retrieve hits up to your "toHits".  (Hits-based 
search is implemented using TopDocs-based search.)
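
To make the reverse-sort suggestion concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names (ReverseSortedTail, lowest) are hypothetical, and a bare array of scores stands in for the matching documents. The point it illustrates is that once the ordering is reversed, the "last 25" become the "first 25", so only 25 candidates need to be kept at any time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Lucene.
final class ReverseSortedTail {

    /** Returns the k lowest scores, weakest first, holding at most k candidates. */
    static List<Float> lowest(float[] scores, int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be >= 0");
        }
        // Max-heap: the head is the largest of the k lowest scores seen so far.
        PriorityQueue<Float> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (float s : scores) {
            if (heap.size() < k) {
                heap.add(s);
            } else if (k > 0 && s < heap.peek()) {
                heap.poll();   // evict the current largest candidate
                heap.add(s);   // keep the lower score instead
            }
        }
        List<Float> tail = new ArrayList<>(heap);
        Collections.sort(tail); // ascending, so the lowest-ranked hit comes first
        return tail;
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.1f, 0.5f, 0.7f, 0.3f};
        System.out.println(lowest(scores, 2)); // the two lowest-scoring entries
    }
}
```

In a real index the same effect would come from asking the search for a reversed sort order rather than from post-processing scores like this.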

Generally speaking, returning the 999,990-1,000,000th ranked hits is 
inherently much more expensive than returning the 0-9th ranked hits. 
TopDocs will minimize this expense.  This expense is (in part) the 
reason that web search engines won't show you more than the first 1000 hits.
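
A rough sketch of why the cost grows with rank, again in plain Java rather than Lucene's own code (TopNPaging and page are hypothetical names, and the float array stands in for per-document scores). To return the hits ranked from..to, a top-N collector has to keep the best "to" candidates, so a page near rank 1,000,000 needs a queue of about a million entries while the first page needs only ten.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; this is not how Lucene names or structures things.
final class TopNPaging {

    /** Returns the scores ranked [from, to) in descending order. */
    static List<Float> page(float[] scores, int from, int to) {
        if (from < 0 || to < from) {
            throw new IllegalArgumentException("need 0 <= from <= to");
        }
        // Min-heap bounded at `to` entries: the head is the weakest score
        // that still belongs to the best `to` seen so far.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float s : scores) {
            if (heap.size() < to) {
                heap.add(s);
            } else if (!heap.isEmpty() && s > heap.peek()) {
                heap.poll();
                heap.add(s);
            }
        }
        // Everything ranked before `from` was collected only to be discarded.
        List<Float> best = new ArrayList<>(heap);
        best.sort(Collections.reverseOrder());
        int start = Math.min(from, best.size());
        return new ArrayList<>(best.subList(start, best.size()));
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.1f, 0.5f, 0.7f, 0.3f};
        System.out.println(page(scores, 0, 2)); // first page: heap never exceeds 2 entries
        System.out.println(page(scores, 3, 5)); // last page: heap grows to all 5 entries
    }
}
```

Memory is proportional to "to" and the heap work is roughly n log(to) for n matches, which is why the deep page is the expensive one even though both calls return two entries.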

Doug

