Posted to java-user@lucene.apache.org by "Aigner, Thomas" <TA...@WescoDist.com> on 2006/03/22 22:33:53 UTC
Lookup Issues
Howdy all,
I am having a performance issue. When I run a search, retrieving the
later hits takes a long time.
For example, say a search returns 1M hits (I know, why look for that
many or even allow it). When the user wants to see the last 25, it
takes a LONG time to return (sometimes 45 seconds), even though it
only takes a second to bring back the Hits object. I read somewhere
that the first 100 documents are stored in memory, but is there some
issue with looping through the records at the end?
In the example below, fromHits = 990000 and toHits = 990025, so I
would return 25 records or so.
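// Print the "ldesc" field of each hit in the requested window,
// stopping early if the window runs past the last hit.
for (int i = fromHits; i <= toHits && i < hits.length(); i++) {
    Document doc = hits.doc(i);
    System.out.println(doc.get("ldesc").replaceAll(" ~ ", "~"));
}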
Thanks all,
Tom
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lookup Issues
Posted by Doug Cutting <cu...@apache.org>.
The Hits-based search API is optimized for returning earlier hits. If
you want the lowest-scoring matches, then you could reverse-sort the
hits, so that these are returned first. Or you could use the
TopDocs-based API to retrieve hits up to your "toHits". (Hits-based
search is implemented using TopDocs-based search.)
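To illustrate the reverse-sort idea, here is a minimal sketch in plain
Java. It does not use any Lucene classes: the class name
ReverseSortSketch, the method lowestK, and the float[] of scores
(indexed by document id) are all hypothetical stand-ins. It shows that
when the lowest-scoring matches are treated as the "first" results,
only k candidates need to be held at any time, however many documents
matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; no Lucene classes are involved.
class ReverseSortSketch {

    /** Returns the ids of the k lowest-scoring documents, lowest score first. */
    static List<Integer> lowestK(float[] scores, int k) {
        // Max-heap on score: the head is the highest score among the candidates kept.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                (a, b) -> Float.compare(scores[b], scores[a]));
        for (int doc = 0; doc < scores.length; doc++) {
            heap.add(doc);
            if (heap.size() > k) {
                heap.poll(); // discard the highest-scoring candidate
            }
        }
        // Order the k survivors by ascending score.
        List<Integer> result = new ArrayList<>(heap);
        result.sort((a, b) -> Float.compare(scores[a], scores[b]));
        return result;
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.1f, 0.5f, 0.3f, 0.7f};
        System.out.println(lowestK(scores, 2));
    }
}
```

The heap never grows beyond k + 1 entries, which is why asking for the
"last 25" as the first 25 of a reversed ordering is cheap.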
Generally speaking, returning the 999,990-1,000,000th ranked hits is
inherently much more expensive than returning the 0-9th ranked hits.
TopDocs will minimize this expense. This expense is (in part) the
reason that web search engines won't show you more than the first 1000 hits.
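The cost described above can be sketched in plain Java. This is an
assumption-laden illustration, not Lucene's implementation: the class
DeepPagingSketch, the method page, and the float[] of scores (indexed
by document id) are hypothetical. The point it shows is that producing
ranks from..to requires collecting the best "to" hits first, so the
work and memory grow with the end of the window, not with its size.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical illustration only; no Lucene classes are involved.
class DeepPagingSketch {

    /** Returns the doc ids at ranks [from, to), ordered by descending score. */
    static int[] page(float[] scores, int from, int to) {
        int n = Math.min(to, scores.length);
        // Min-heap on score holding at most n candidates: the head is the weakest kept hit.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                (a, b) -> Float.compare(scores[a], scores[b]));
        for (int doc = 0; doc < scores.length; doc++) {
            heap.add(doc);
            if (heap.size() > n) {
                heap.poll(); // discard the weakest candidate
            }
        }
        // Polling yields the lowest score first, so fill the ranking from the back.
        int[] ranked = new int[heap.size()];
        for (int i = ranked.length - 1; i >= 0; i--) {
            ranked[i] = heap.poll();
        }
        // Only now can the requested window be sliced out of the top-n ranking.
        return Arrays.copyOfRange(ranked, Math.min(from, ranked.length), ranked.length);
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.1f, 0.5f, 0.3f, 0.7f};
        System.out.println(Arrays.toString(page(scores, 3, 5)));
    }
}
```

For ranks 0-9 the heap holds 10 entries; for ranks 999,990-1,000,000 it
holds a million, even though both calls return the same number of hits.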
Doug