Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/09/17 23:51:35 UTC

Re: Retrieving a field from all result documents & couple of more queries

: You will need to get SolrIndexSearcher.java and modify following:-
: 
: public static final int GET_SCORES             =       0x01;

No.  Do not do that.  There is no reason for anyone to EVER modify that 
line of code. Absolutely NONE!!!!  

If you've made that change to your version of Solr, please start a new 
thread on solr-user explaining your goal, and what things you tried before 
ultimately making that change, because I guarantee you that if you are 
willing to modify Java files to change that line, there will be a more 
general-purpose, reusable way to solve your goal besides that (which won't 
silently break a lot of other functionality).

: > No, I don't wish to put a custom Similarity.  Rather, I want an
: > equivalent of HitCollector where I can bypass the scoring altogether.
: > And I prefer to do it by changing the configuration.

...there is no pure configuration way to obtain the same logic you could 
get from a custom HitCollector.  You haven't elaborated on what exactly 
your HitCollector looked like, but so far you've mentioned that it 
ignored the scores and used the FieldCache to get a field value w/o 
dealing with stored fields -- you can achieve something roughly 
functionally similar by writing a custom RequestHandler that uses 
SolrIndexSearcher.getDocSet (which skips scoring and sorting) and then 
iterates over that DocSet, fetching the values you want from the 
FieldCache.

Or you could write a RequestHandler that uses your HitCollector as is -- 
but then you aren't really leveraging any value from Solr at all.  The 
previous suggestion has the added value of utilizing Solr's filterCache for 
frequent queries (which can be really handy if your queries can be 
easily broken apart into pieces and dealt with using DocSet 
union/intersection operations -- the way q/fq are dealt with in 
SearchHandler).
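
To make the filterCache point concrete, here is a toy model in plain Java of 
caching one doc set per query clause and intersecting them, roughly the way 
q and fq are combined.  It is only a sketch under stated assumptions: 
java.util.BitSet stands in for a DocSet, and FilterCacheSketch, docSet, 
search and the matcher function are hypothetical names, not Solr classes or 
methods.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of caching per-clause doc sets and intersecting them.
// None of these names are Solr classes; BitSet stands in for a DocSet.
final class FilterCacheSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    // Hypothetical "evaluate this clause against the index" function.
    private final Function<String, BitSet> matcher;
    private int computed = 0;

    FilterCacheSketch(Function<String, BitSet> matcher) {
        this.matcher = matcher;
    }

    // Return the doc set for one clause, evaluating it only on a cache miss.
    BitSet docSet(String clause) {
        BitSet hit = cache.get(clause);
        if (hit == null) {
            hit = matcher.apply(clause);
            cache.put(clause, hit);
            computed++;
        }
        return hit;
    }

    // Intersect the main query's doc set with each filter's doc set.
    BitSet search(String q, String... filters) {
        BitSet result = (BitSet) docSet(q).clone(); // clone so cached sets are never mutated
        for (String fq : filters) {
            result.and(docSet(fq));
        }
        return result;
    }

    // Number of clauses that actually had to be evaluated (cache misses).
    int clausesComputed() {
        return computed;
    }

    // Build a BitSet from explicit doc ids (demo helper).
    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        Map<String, BitSet> index = new HashMap<>();
        index.put("content:solr", bits(0, 1, 2, 5));
        index.put("content:lucene", bits(1, 2, 3));
        index.put("type:pdf", bits(1, 5, 7));

        FilterCacheSketch cache = new FilterCacheSketch(index::get);
        System.out.println(cache.search("content:solr", "type:pdf"));   // prints {1, 5}
        System.out.println(cache.search("content:lucene", "type:pdf")); // prints {1}
        System.out.println(cache.clausesComputed());                    // prints 3
    }
}
```

The second search reuses the cached "type:pdf" set, so only three clauses 
are ever evaluated for the two requests.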


-Hoss

Re: Retrieving a field from all result documents & couple of more queries

Posted by Chris Hostetter <ho...@fucit.org>.

: As I mentioned previously, I prefer to do this with as little java
: code as possible. That's the motivation for me to take a look at solr.

I understand, but as I already said, "there is no pure configuration way to 
obtain the same logic you could get from a custom HitCollector".

You can get the same behavior you currently have, with the same 
existing efficiencies, plus take advantage of the Solr filter cache, by 
writing a custom RequestHandler (or SearchComponent) that would 
be about 5 lines long: get the DocSet from the searcher for your parsed 
query (you can reuse the existing Solr QueryParser framework and 
utilities), then iterate over the DocSet and add each FieldCache value to 
the response.
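
The core of that loop can be modelled in plain Java as below.  This is a 
sketch, not handler code: java.util.BitSet stands in for the DocSet, an 
int[] indexed by doc id stands in for the array a FieldCache lookup would 
return, and DocSetFieldValuesSketch and fieldValues are hypothetical names.  
A real handler would obtain both from the searcher and add the values to 
the response.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model of "get the DocSet, then read one value per doc from the FieldCache".
// BitSet stands in for the DocSet and int[] for the per-document array a
// FieldCache lookup would return; neither is a Solr or Lucene class.
final class DocSetFieldValuesSketch {

    // Collect the cached field value of every matching document.
    // No score is computed and no stored field is loaded.
    static List<Integer> fieldValues(BitSet matches, int[] valueByDocId) {
        List<Integer> out = new ArrayList<>(matches.cardinality());
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            out.add(valueByDocId[doc]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Index position is the internal doc id; the value is the field value.
        int[] documentIds = {100, 101, 102, 103, 104};

        BitSet matches = new BitSet(documentIds.length);
        matches.set(1);
        matches.set(3);
        matches.set(4);

        System.out.println(fieldValues(matches, documentIds)); // prints [101, 103, 104]
    }
}
```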

FWIW: I would encourage you to try using Solr as is, w/o any custom code 
or messing with the field cache, and just set "fl=yourField" and see if the 
performance is satisfactory to you.  It will still do scoring, but you 
might be surprised how fast stored fields can be returned (under the covers 
Solr uses a FieldSelector containing just the "fl" fields).
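
For reference, a request of that shape can be built as in the sketch below.  
The q, fl and rows parameters are standard Solr request parameters; the base 
URL, the example query, the rows value and the helper name selectUrl are 
assumptions made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Builds a Solr select URL that asks for a single field per document.
final class SelectUrlSketch {

    // baseUrl is assumed to point at a Solr instance, e.g. http://localhost:8983/solr
    static String selectUrl(String baseUrl, String query, String field, int rows) {
        try {
            return baseUrl + "/select"
                    + "?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&fl=" + URLEncoder.encode(field, "UTF-8")
                    + "&rows=" + rows;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host, query and row count.
        System.out.println(selectUrl("http://localhost:8983/solr", "content:lucene", "yourField", 1000));
        // prints http://localhost:8983/solr/select?q=content%3Alucene&fl=yourField&rows=1000
    }
}
```

rows has to be set high enough to cover every match, since only the first 
page of results comes back otherwise.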



-Hoss

Re: Retrieving a field from all result documents & couple of more queries

Posted by Shashikant Kore <sh...@gmail.com>.

Hoss,

As I mentioned previously, I prefer to do this with as little java
code as possible. That's the motivation for me to take a look at solr.

Here is the code snippet.

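// final so the anonymous HitCollector below can reference it
final OpenBitSet resultBitset = new OpenBitSet(this.searcher.maxDoc());

// Record every matching doc id; the score is deliberately ignored
this.searcher.search(query, new HitCollector() {
    @Override
    public void collect(int docID, float score) {
        resultBitset.set(docID);
    }
});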

Then I retrieve the stored field and look up the results present in
the resultBitset.

int[] docIDs = FieldCache.DEFAULT.getInts(this.luceneIndex.reader,
FIELD_DOCUMENT_ID);

I need to do this because I need all the matching results, but order is not
important (for this search). In the index, the content field has a term
vector, which I can't drop. There are other types of searches
where relevance ranking is required.

Can I achieve the same with Solr?

Thanks,

--shashi
