Posted to solr-user@lucene.apache.org by tedsolr <ts...@sciquest.com> on 2015/01/19 22:28:48 UTC

How to return custom collector info

I am investigating possible modifications to the CollapsingQParserPlugin that
will allow me to collapse documents based on multiple fields. In a quick
test I was able to make this happen with two fields, so I assume I can
expand that to N fields. 

What I'm missing now is the extra data I need per group - the count of
collapsed docs and a summation on one numeric field. With single field
collapsing I could get this info from the standard stats component by using
tagging/excluding on the post filter and setting a stats facet field. Once
there are multiple fields, I lose the "free" stats info since faceting only
works with one field.

So I'm looking for advice on where/when to collect the extra data, and how
to transport it back to the caller. My first thought is to compute the info
in the collect() method of the DelegatingCollector, and store it with the
filter (somehow) so it can be retrieved in a later custom SearchComponent.
But I've read it is NOT a good idea to get a document within the collect()
method. What is the right way (place) to access a doc field value (not the
ordinal)?
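
A minimal sketch of the per-group bookkeeping described above, written in plain Java with no Solr classes. GroupStatsAccumulator, Stats, add() and the sample field values are hypothetical names used only for illustration. It assumes the collapse-field values and the numeric value have already been read from a cache rather than from stored fields, and that add() is called once per matching document (for example from collect()).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: keeps a document count and a numeric
// sum for each distinct combination of collapse-field values.
class GroupStatsAccumulator {

    // Running totals for one group.
    static final class Stats {
        long count;
        double sum;
    }

    // The key is the list of field values, so N collapse fields work like one.
    private final Map<List<String>, Stats> groups = new LinkedHashMap<>();

    // Assumed to be called once per matching document.
    void add(List<String> fieldValues, double amount) {
        Stats s = groups.get(fieldValues);
        if (s == null) {
            s = new Stats();
            // Copy the key so later changes to the caller's list cannot corrupt the map.
            groups.put(new ArrayList<>(fieldValues), s);
        }
        s.count++;
        s.sum += amount;
    }

    // Returns the totals for one group, or null if no document had these values.
    Stats get(List<String> fieldValues) {
        return groups.get(fieldValues);
    }

    int groupCount() {
        return groups.size();
    }
}
```

Keying on the whole list of values rather than on one field is what removes the single-field limit. The trade-off is one map lookup per collected document.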

I read a post by Joel B. where he said you could get access to a
ResponseBuilder directly from a post filter via a static SolrRequestInfo
call. Does this mean I could compute the extra data I need in the post
filter, AND write it out to the response (from the finish() method I guess)?
No need for a custom SearchComponent? I was thinking I would have to follow
the ExpandComponent model to get the data from the filter, then write it out
in the process() method.
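
The collect-then-finish flow asked about above can be sketched in plain Java. This is a stand-in only: SummingCollectorSketch is a hypothetical class, the response is modeled as a java.util.Map rather than any Solr response type, and the "customStats" key is made up. It shows the shape of the idea (accumulate per document, publish once at the end), not Solr's actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for the collect()/finish() lifecycle of a post filter.
// None of these types are Solr classes; the response is modeled as a Map.
class SummingCollectorSketch {
    private final Map<String, Object> response; // stand-in for the response object
    private long count;
    private double sum;

    SummingCollectorSketch(Map<String, Object> response) {
        this.response = response;
    }

    // Called once per matching document: only accumulate here.
    void collect(double fieldValue) {
        count++;
        sum += fieldValue;
    }

    // Called once after the last document: publish the totals.
    void finish() {
        Map<String, Object> stats = new LinkedHashMap<>();
        stats.put("count", count);
        stats.put("sum", sum);
        response.put("customStats", stats);
    }
}
```

If the filter can reach the response object directly, the totals can be written in finish() and no separate component is needed to carry them back to the caller.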

This is my first attempt at customizing Solr so I may not be expressing
myself clearly. Thank you for any pointers you can provide.
(using Solr 4.9)



--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-return-custom-collector-info-tp4180502.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to return custom collector info

Posted by tedsolr <ts...@sciquest.com>.
I was confused because I couldn't believe my jars might be out of sync. But
of course they were. I had to create a new Eclipse project to sort it out,
but that exception has disappeared. Sorry for the confusing post.




Re: How to return custom collector info

Posted by tedsolr <ts...@sciquest.com>.
Joel,

Thank you for the links. The AnalyticsQuery is just the thing I need to
return custom stats in the response.

What I'm struggling with now is how to read the doc field values. I've been
following the CollapsingQParserPlugin model of accessing the field cache in
the Query class getAnalyticsCollector() method, then passing the values to
the delegating collector. Every time I try to read what I assume is a field
value I get a NoSuchMethodError:

(in the getAnalyticsCollector() method ...)
BinaryDocValues fieldValues =
FieldCache.DEFAULT.getTerms(searcher.getAtomicReader(), "ID", false);
("ID" happens to be my uniqueKey in the schema.xml - it's a string that is
stored, indexed, and single-valued)

(in the Collector ...)
BytesRef bRef = new BytesRef();
fieldValues.get(doc, bRef); // this line throws the error

I of course do not want to read stored values in the collect() method, so
where have I gone wrong trying to read the field cache (or doc values - I
have not tried to enable them and am not sure how to do so)?




Re: How to return custom collector info

Posted by Joel Bernstein <jo...@gmail.com>.
Here is actually a more useful link for understanding how the
AnalyticsQuery works:
http://heliosearch.org/solrs-new-analyticsquery-api/

Joel Bernstein
Search Engineer at Heliosearch

On Mon, Jan 19, 2015 at 4:57 PM, Joel Bernstein <jo...@gmail.com> wrote:

> You may want to take a look at the AnalyticsQuery:
> http://heliosearch.org/custom-analytics-engine/
>
> This is an extension to the PostFilter API that gives you direct access to
> the ResponseBuilder.
>
> Joel Bernstein
> Search Engineer at Heliosearch

Re: How to return custom collector info

Posted by Joel Bernstein <jo...@gmail.com>.
You may want to take a look at the AnalyticsQuery:
http://heliosearch.org/custom-analytics-engine/

This is an extension to the PostFilter API that gives you direct access to
the ResponseBuilder.

Joel Bernstein
Search Engineer at Heliosearch
