Posted to java-user@lucene.apache.org by Le...@emc.com on 2006/09/15 00:12:11 UTC

RE: best way to get specific results

Hi,

I have the same situation: I'm interested in returning a subset of
results from the whole set, such as results 500 to 550. However, I have
already implemented a Filter that returns the results I want with no
additional query processing (i.e. no need to use the
IndexSearcher.search(Query, Filter) method). I'm just wrapping a
ConstantScoreQuery around the filter and passing it into the
IndexSearcher.search(Query) method to get a Hits object. Then I'm
asking for the 500th to 550th docs in the Hits object.

Would such a case still cause Hits to re-execute over and over again on
higher-numbered results? Or is this different because I'm using a
ConstantScoreQuery, so it just uses the BitSet from the filter to
determine the results more quickly instead of executing a Query?

Would I still be better off using the TopDocs returned from
IndexSearcher.search(Weight, Filter, nDocs) to get the results? Since
TopDocs only returns the doc ids and I need field information, is the
usual approach to call IndexReader.document(id) for each doc id
returned by TopDocs?

Thanks!!
Gary

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: Thursday, May 25, 2006 11:22 AM
To: java-user@lucene.apache.org
Subject: Re: best way to get specific results


: if a query returns 1000 results, the user is interested only in the
: results between 500 & 550. the way I implemented it is to run a normal
: query using IndexSearcher.search(Query) and then get the specified
: documents out of the hits object. I am wondering if there is a more
: efficient way than this: is using TopDocs better than the hits object,
: knowing that some users may need more than 1000 docs back in one query?

generally speaking, yes, TopDocs (or TopFieldDocs) are better than Hits
if you plan on accessing more than the first 100 or so results ... Hits
will re-execute your search over and over as you ask for higher numbered
results, while with TopDocs your search is executed once, and you are
given only the doc IDs of the first N docs you asked for, with no other
processing done behind the scenes (in your case, it sounds like N would
be 550, and you'd start accessing the ScoreDoc[] at 500).




-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




RE: best way to get specific results

Posted by Le...@emc.com.
Thanks, I definitely missed this. That makes it a lot simpler to use.

Appreciate your help Chris.

Gary 

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: Monday, September 18, 2006 08:26 AM
To: java-user@lucene.apache.org
Subject: RE: best way to get specific results


: Thanks for the info on this. Since I should use the search function that
: returns TopDocs, I was wondering what was the proper way to create a
: Weight object to pass into the search function.

I think you are getting too hung up on the method summary section of the
IndexSearcher javadocs ... IndexSearcher also supports all of the
methods in the Searcher interface, like...

   TopDocs d = searcher.search(myQuery, (Filter)null, 100);




-Hoss




RE: best way to get specific results

Posted by Chris Hostetter <ho...@fucit.org>.
: Thanks for the info on this. Since I should use the search function that
: returns TopDocs, I was wondering what was the proper way to create a
: Weight object to pass into the search function.

I think you are getting too hung up on the method summary section of the
IndexSearcher javadocs ... IndexSearcher also supports all of the methods
in the Searcher interface, like...

   TopDocs d = searcher.search(myQuery, (Filter)null, 100);
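
For the 500 to 550 case raised earlier in the thread, the idea is to ask
for the top 550 once and read only the last 50 entries of the returned
array. The sketch below shows just that slicing step in plain Java, so it
runs without Lucene on the classpath. PageSlice and DocLoader are
hypothetical names: the int[] stands in for the doc ids taken from the
ScoreDoc[] of a TopDocs, and the loader stands in for a per-id fetch such
as IndexReader.document(id).

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Paging over a ranked top-N id array. Hypothetical helper, not part of Lucene:
 * in real code rankedDocIds would come from the ScoreDoc[] of a TopDocs, and
 * DocLoader.load(id) would fetch the stored fields for that doc id.
 */
final class PageSlice {

    /** Hypothetical stand-in for fetching a document by its doc id. */
    interface DocLoader<T> {
        T load(int docId);
    }

    /**
     * Loads the documents at ranks [start, end) of the ranked id array.
     * The range is clamped to the array length, since a search can match
     * fewer documents than were requested.
     */
    static <T> List<T> page(int[] rankedDocIds, int start, int end, DocLoader<T> loader) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        int from = Math.min(start, rankedDocIds.length);
        int to = Math.min(end, rankedDocIds.length);
        List<T> out = new ArrayList<>(to - from);
        for (int i = from; i < to; i++) {
            // Only the docs inside the requested page are ever loaded.
            out.add(loader.load(rankedDocIds[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ids = new int[550];            // pretend one search returned the top 550 ids
        for (int i = 0; i < ids.length; i++) {
            ids[i] = 10_000 + i;
        }
        List<String> pageOfDocs = page(ids, 500, 550, id -> "doc-" + id);
        System.out.println(pageOfDocs.size() + " docs, first = " + pageOfDocs.get(0));
    }
}
```

The search itself runs once with n = 550; documents ranked below 500 are
never loaded, only their ids are held in the array.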




-Hoss




RE: best way to get specific results

Posted by Le...@emc.com.
Thanks for the info on this. Since I should use the search function that
returns TopDocs, I was wondering what was the proper way to create a
Weight object to pass into the search function. 

There are two methods in the Query class that I see: createWeight and
weight, which both return a Weight object. Is there a difference between
the two, and which one should I use?

Once I have a Weight object named, for example, w, do I just call the
search function like this?

IndexSearcher is = new IndexSearcher(fsDir);
Query q = ...
Weight w = q.createWeight(is); // or: Weight w = q.weight(is);
is.search(w, null, 100); // no filter, want top 100 docs

For the case with a filter, would it be:
is.search(w, f, 100);

Thanks
Gary

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: Saturday, September 16, 2006 07:22 AM
To: java-user@lucene.apache.org
Subject: RE: best way to get specific results

: IndexSearcher.search(Query, Filter) method). I'm just wrapping a
: ConstantScoreQuery around the filter, and passing it into the
: IndexSearcher.search(Query) method to return a Hits object. Then I'm
: asking for the 500th to 550th doc in the Hits object.
:
: Would such a case still cause Hits to re-execute over and over again on
: higher numbered results? Or is this different because I'm using a

yes... it doesn't matter what type of query you use ... Hits is not a
good idea if you want results really far down the list.




-Hoss




RE: best way to get specific results

Posted by Chris Hostetter <ho...@fucit.org>.
: IndexSearcher.search(Query, Filter) method). I'm just wrapping a
: ConstantScoreQuery around the filter, and passing it into the
: IndexSearcher.search(Query) method to return a Hits object. Then I'm
: asking for the 500th to 550th doc in the Hits object.
:
: Would such a case still cause Hits to re-execute over and over again on
: higher numbered results? Or is this different because I'm using a

yes... it doesn't matter what type of query you use ... Hits is not a good
idea if you want results really far down the list.
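
To see why repeated re-execution gets expensive far down the list, here is
a toy model in plain Java. It is only a sketch: LazyHitsModel is a
hypothetical class, and the growth policy (on a cache miss, re-run the
whole search asking for twice the requested rank) is an assumption made
for illustration. The thread states only that Hits re-executes the search,
not how it sizes each re-run.

```java
/**
 * Toy model of a lazily growing result cache. Hypothetical class, not Lucene.
 * ASSUMPTION: on a cache miss the full search is re-run asking for twice the
 * requested rank. The real growth policy is not stated in this thread.
 */
final class LazyHitsModel {
    private int cached;        // how many top results are currently held
    private int searchesRun;   // how many times the query has been executed

    LazyHitsModel(int initialWindow) {
        if (initialWindow < 1) {
            throw new IllegalArgumentException("initialWindow must be >= 1");
        }
        runSearch(initialWindow);
    }

    /** Models one full execution of the query that keeps the top n results. */
    private void runSearch(int n) {
        cached = n;
        searchesRun++;
    }

    /** Simulates asking for the result at the given zero-based rank. */
    void get(int rank) {
        if (rank >= cached) {
            runSearch(rank * 2);   // cache miss: the whole search runs again
        }
    }

    int searchesRun() {
        return searchesRun;
    }

    public static void main(String[] args) {
        LazyHitsModel model = new LazyHitsModel(100);
        for (int rank = 0; rank < 550; rank++) {
            model.get(rank);
        }
        System.out.println("searches run: " + model.searchesRun());
    }
}
```

Under that assumed policy, walking ranks 0 through 549 runs the query
several times, while a single top-N search with N = 550 runs it once.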




-Hoss

