Posted to java-user@lucene.apache.org by headhunter <jl...@conet.de> on 2006/07/25 15:49:07 UTC

Limit number of search results

Hello,

I am looking for a way to limit the number of search results I retrieve when
searching.

I am only interested in (let's say) the first ten hits of a query. Maybe I
want to look at hits ten to twenty too, but usually only the first results are
important.

Right now Lucene searches through the entire index, returning far more than
the desired ten documents.

Any way to limit this?

Thanks for answers,
Johannes
-- 
View this message in context: http://www.nabble.com/Limit-number-of-search-results-tf1998377.html#a5485639
Sent from the Lucene - Java Users forum at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Limit number of search results

Posted by Miles Barr <mi...@magpie.net>.
headhunter wrote:

>I guess the recommended way to implement paging of results is to do your own
>query-results caching, right? Or does lucene also do this for me?
>

The other guys have covered caching of results in a general way, so I 
won't go into that.

For a search application I've written, I have a separate class that acts
as the model for a search. It lets you set how many results you want per
page and which page number to show. You can then have it emit XML for the
results that should be visible, including title, search summary, link,
etc. Similarly, it can provide the information needed to generate
pagination links.

This object holds the reference to the Hits object and is kept around
for the duration of that user session. So I do cache the results and
hence only execute the search once, but this is more a consequence of
how I modelled the interaction than a desire to cache.
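
As a rough illustration of such a model (a sketch, not the actual class
described above), the following Java keeps a ranked result list for the
session and exposes only the current page. The names SearchPageModel,
visibleResults, hasPrevious and hasNext are hypothetical, and a plain
List<String> stands in for the Lucene Hits object.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical paging model. The ranked list stands in for the Hits object
 * that a real application would keep for the duration of the user session.
 */
final class SearchPageModel {
    private final List<String> rankedResults; // assumption: best hit first
    private final int pageSize;
    private int pageNumber = 1;               // 1-based

    SearchPageModel(List<String> rankedResults, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        this.rankedResults = rankedResults;
        this.pageSize = pageSize;
    }

    /** Number of pages needed to show all results (0 if there are none). */
    int pageCount() {
        return (rankedResults.size() + pageSize - 1) / pageSize;
    }

    /** Sets the current page, clamped to the valid range. */
    void setPageNumber(int requested) {
        int last = Math.max(1, pageCount());
        this.pageNumber = Math.min(Math.max(1, requested), last);
    }

    int getPageNumber() {
        return pageNumber;
    }

    /** The slice of results that should be visible on the current page. */
    List<String> visibleResults() {
        int from = Math.min((pageNumber - 1) * pageSize, rankedResults.size());
        int to = Math.min(from + pageSize, rankedResults.size());
        return new ArrayList<>(rankedResults.subList(from, to));
    }

    // Enough information to render "previous" and "next" links.
    boolean hasPrevious() {
        return pageNumber > 1;
    }

    boolean hasNext() {
        return pageNumber < pageCount();
    }
}
```

With 25 results and a page size of 10, page 3 would show the last five
results and hasNext() would be false. Rendering to XML is left out here.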




Miles




Re: Limit number of search results

Posted by headhunter <jl...@conet.de>.

Chris Hostetter wrote:
> 
> [..]
> 
> : In the first case: there is no unnecessary work.  Lucene must look at
> : every matching docId in order to determine which docs should be the
> : first 10.
> [..]
> 
Yes, you are right. I hadn't thought of that :)

About the second point: you're right too. I can indeed do other
optimizations, which will work just fine!

Thanks for all your help!

Johannes




Re: Limit number of search results

Posted by Chris Hostetter <ho...@fucit.org>.
: I'm still a little worried about doing unnecessary work - this is totally
: different from what I know when working with DBMS.

What are you describing as "unnecessary work": examining every document
even though you only care about the first 10, or re-executing the search
when you want results 11-20?

In the first case: there is no unnecessary work.  Lucene must look at
every matching docId in order to determine which docs should be the first
10.
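
To make that concrete, here is a minimal sketch of top-k selection. It is
not Lucene's actual collector code: TopKSketch and topK are hypothetical
names, and array indices stand in for docIds. The point is that every
matching document's score has to be compared at least once, while only k
entries are ever held in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopKSketch {
    /**
     * Returns the indices (stand-ins for docIds) of the k highest scores,
     * best first. Ties are broken in favour of the lower index.
     */
    static List<Integer> topK(float[] scores, int k) {
        // Min-heap: the head is always the weakest hit kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) -> {
            int byScore = Float.compare(scores[a], scores[b]);
            return byScore != 0 ? byScore : Integer.compare(b, a);
        });

        // Every matching document is visited; there is no way to know the
        // best k without comparing each score at least once.
        for (int doc = 0; doc < scores.length; doc++) {
            heap.add(doc);
            if (heap.size() > k) {
                heap.poll(); // discard the current weakest hit
            }
        }

        // Drain weakest-first, then reverse to get best-first order.
        List<Integer> result = new ArrayList<>(heap.size());
        while (!heap.isEmpty()) {
            result.add(heap.poll());
        }
        Collections.reverse(result);
        return result;
    }
}
```

The loop runs once per match, but the heap never grows beyond k + 1
entries, which is why asking for fewer hits saves memory and sorting
effort rather than scoring work.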

In the second case: if you feel you need to cache the results of a
search so that fetching page #2 is fast, that's entirely your call,
but the mantra of performance optimization is: don't do it until you need
to.  Typically, questions about caching and performance can only be
answered once you have some detailed performance numbers about your
specific use cases (which requires documenting exactly what your specific
use cases are, and then testing them). Otherwise you're just guessing,
and your "optimizations" could very well make things worse.




-Hoss




Re: Limit number of search results

Posted by headhunter <jl...@conet.de>.
Hello Daniel,

Thank you for your answer.

I'm still a little worried about doing unnecessary work - this is totally
different from what I know from working with a DBMS.

Johannes


Re: Limit number of search results

Posted by Daniel Naber <lu...@danielnaber.de>.
On Wednesday 26 July 2006 08:24, headhunter wrote:

> Is it recommended to do the search again - discarding the uninteresting
> values - because lucene caches the results, or just because lucene is so
> damn fast?

Lucene is fast enough in 99% of cases. The only caching is done by the
operating system, at the I/O level.
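
A minimal sketch of that "search again and discard" approach, under
stated assumptions: PageFetcher, fetchPage and searchTopN are hypothetical
names, and searchTopN stands for whatever code runs the query and returns
up to n hits, best first. Nothing is cached between calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

final class PageFetcher {
    /**
     * Re-runs the search on every page request and keeps only the requested
     * slice. searchTopN is a hypothetical callback that returns up to n hits,
     * best first. Pages are 1-based.
     */
    static <T> List<T> fetchPage(IntFunction<List<T>> searchTopN, int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        int needed = page * pageSize;            // hits required to reach the end of this page
        List<T> top = searchTopN.apply(needed);  // the search is executed again here
        int from = Math.min((page - 1) * pageSize, top.size());
        int to = Math.min(needed, top.size());
        return new ArrayList<>(top.subList(from, to)); // earlier hits are discarded
    }
}
```

For page 2 with ten hits per page this asks for the top 20 and throws the
first ten away. Whether that is fast enough for a given application is
something only measurements can tell.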

Regards
 Daniel

-- 
http://www.danielnaber.de



Re: Limit number of search results

Posted by headhunter <jl...@conet.de>.
Hello,

This really doesn't answer my question ;)

I have indeed read the FAQ (though I couldn't believe this point ;).

Is it recommended to do the search again - discarding the uninteresting
values - because Lucene caches the results, or just because Lucene is so
damn fast?

Johannes


Re: Limit number of search results

Posted by Daniel Naber <lu...@danielnaber.de>.
On Wednesday 26 July 2006 07:55, headhunter wrote:

> I guess the recommended way to implement paging of results is to do your
> own query-results caching, right?

http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-81ddcb6ef8573197a77e0c7b56b44cb27e6d7f09

-- 
http://www.danielnaber.de



Re: Limit number of search results

Posted by headhunter <jl...@conet.de>.
Hello Miles,

Thanks for your answer.

I guess the recommended way to implement paging of results is to do your own
query-results caching, right? Or does Lucene also do this for me?


Johannes


Re: Limit number of search results

Posted by Miles Barr <mi...@magpie.net>.
headhunter wrote:

>I am looking for a way to limit the number of search results I retrieve when
>searching.
>
>I am only interested in (let's say) the first ten hits of a query. Maybe I
>want to look at hits ten to twenty too, but usually only the first results are
>important.
>
>Right now lucene searches through the entire index, returning way more than
>the desired ten documents. 
>
>Any way to limit this?
>  
>

Lucene only loads the first 100 hits from the index; the rest of the
results are lazily loaded. I don't think you can reduce this number
without changing the code.
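
The lazy loading can be pictured roughly as follows. This is a sketch
under assumptions, not the Hits source: LazyResults and searchTopN are
hypothetical names, and re-running the search with a doubled limit once
a caller asks past the loaded batch is a policy assumed here for
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

final class LazyResults {
    // Hypothetical callback: runs the query and returns up to n hits, best first.
    private final IntFunction<List<String>> searchTopN;
    private List<String> loaded = new ArrayList<>();
    private int requested = 0;   // limit used for the most recent search
    private int searchesRun = 0; // kept only to make the behaviour observable

    LazyResults(IntFunction<List<String>> searchTopN, int initialBatch) {
        if (initialBatch < 1) {
            throw new IllegalArgumentException("initialBatch must be >= 1");
        }
        this.searchTopN = searchTopN;
        load(initialBatch);
    }

    private void load(int limit) {
        loaded = searchTopN.apply(limit);
        requested = limit;
        searchesRun++;
    }

    /** Returns the hit at the given 0-based rank, or null past the last hit. */
    String get(int rank) {
        // Load more only when the caller goes past the current batch and the
        // last search came back full (so there may be further hits).
        while (rank >= requested && loaded.size() == requested) {
            load(requested * 2); // assumed policy: double the limit and search again
        }
        return rank < loaded.size() ? loaded.get(rank) : null;
    }

    int searchesRun() {
        return searchesRun;
    }
}
```

With an initial batch of 100, reading ranks 0 to 99 costs a single
search; asking for rank 150 triggers a second search with a limit of 200.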




Miles
