Posted to solr-user@lucene.apache.org by Brian Whitman <br...@echonest.com> on 2008/10/03 22:13:30 UTC

RequestHandler that passes along the query

Not sure if this is possible or easy: I want to make a requestHandler that
acts just like select but post-processes the output before returning it to
the client.
e.g. http://url/solr/myhandler?q=type:dog&sort=legsdesc&shards=dogserver1;dogserver2

When myhandler gets it, I'd like to take the results of that query as if I
sent it to select, then do stuff with the output before returning it. For
example, it would add a field to each returned document from an external
data store.

This is sort of like an UpdateRequestProcessor chain thing, but for the
select side. Is this possible?

Alternatively, I could have my custom RequestHandler do the query. But all I
have in the RequestHandler is a SolrQueryRequest. Can I pass that along to
something and get a SolrDocumentList back?
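
A minimal sketch of the "add a field to each returned document from an
external data store" step, kept independent of Solr so it runs on its own.
Documents are modeled as plain maps rather than SolrDocument, and
ExternalStore, DocEnricher, and the field name "owner" are hypothetical names
introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the external data store, keyed by document id.
interface ExternalStore {
    String lookup(String id); // returns null when the id is unknown
}

// Hypothetical helper: adds one extra field to every returned document.
class DocEnricher {
    static void enrich(List<Map<String, Object>> docs, ExternalStore store, String fieldName) {
        for (Map<String, Object> doc : docs) {
            Object id = doc.get("id");
            if (id == null) {
                continue; // nothing to look up for this document
            }
            String extra = store.lookup(id.toString());
            if (extra != null) {
                doc.put(fieldName, extra);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> backing = new HashMap<>();
        backing.put("dog-1", "alice");

        List<Map<String, Object>> docs = new ArrayList<>();
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "dog-1");
        docs.add(doc);

        // A Map lookup plays the role of the external store here.
        enrich(docs, backing::get, "owner");
        System.out.println(docs);
    }
}
```

In a real component the same loop would presumably run over the documents in
the response, with the lookup batched if the external store is slow.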

Re: RequestHandler that passes along the query

Posted by Brian Whitman <br...@echonest.com>.
The issue, I think, is that process() is never called in my component, only
distributedProcess().
The server that hosts the component is a separate Solr instance from the
shards, so my guess is that process() is only called when that particular Solr
instance has something to do with the index. distributedProcess() is called
for each stage of the distributed request, but the last stage it is called
for is GET_FIELDS.

But the WritingDistributedSearchComponents page did tip me off to a method
that was new to me, finishStage, which is called *after* each stage is done
and does exactly what I want:

  @Override
  public void finishStage(ResponseBuilder rb) {
    if (rb.stage == ResponseBuilder.STAGE_GET_FIELDS) {
      SolrDocumentList sd = (SolrDocumentList) rb.rsp.getValues().get("response");
      for (SolrDocument d : sd) {
        rb.rsp.add("second-id-list", d.getFieldValue("id").toString());
      }
    }
  }
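
To illustrate why finishStage can see the documents when distributedProcess
cannot, here is a toy model of the coordinator's stage loop. It is a
simplified sketch, not Solr's code: ToyComponent, ToyCoordinator,
PeekComponent, the stage numbers, and the map-based response are all stand-ins
invented here. The assumption it encodes, consistent with the behavior
described above, is that for each stage the coordinator first calls
distributedProcess, then merges the shard responses, and only then calls
finishStage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a search component; not Solr's SearchComponent.
abstract class ToyComponent {
    void distributedProcess(int stage, Map<String, Object> rsp) {}
    void finishStage(int stage, Map<String, Object> rsp) {}
}

// Hypothetical coordinator: per stage, call distributedProcess, merge the
// shard results into the response, then call finishStage.
class ToyCoordinator {
    static final int STAGE_EXECUTE_QUERY = 1; // illustrative values only
    static final int STAGE_GET_FIELDS = 2;

    static Map<String, Object> run(List<ToyComponent> components, List<String> shardDocIds) {
        Map<String, Object> rsp = new HashMap<>();
        for (int stage : new int[] {STAGE_EXECUTE_QUERY, STAGE_GET_FIELDS}) {
            for (ToyComponent c : components) {
                c.distributedProcess(stage, rsp);
            }
            if (stage == STAGE_GET_FIELDS) {
                // Merged shard documents only exist from this point on.
                rsp.put("response", new ArrayList<>(shardDocIds));
            }
            for (ToyComponent c : components) {
                c.finishStage(stage, rsp);
            }
        }
        return rsp;
    }
}

// Records whether each hook can see the merged response at GET_FIELDS.
class PeekComponent extends ToyComponent {
    boolean seenInDistributedProcess;
    boolean seenInFinishStage;

    @Override
    void distributedProcess(int stage, Map<String, Object> rsp) {
        if (stage == ToyCoordinator.STAGE_GET_FIELDS) {
            seenInDistributedProcess = rsp.containsKey("response");
        }
    }

    @Override
    void finishStage(int stage, Map<String, Object> rsp) {
        if (stage == ToyCoordinator.STAGE_GET_FIELDS) {
            seenInFinishStage = rsp.containsKey("response");
        }
    }

    public static void main(String[] args) {
        PeekComponent peek = new PeekComponent();
        List<ToyComponent> components = new ArrayList<>();
        components.add(peek);
        ToyCoordinator.run(components, Arrays.asList("1", "2"));
        System.out.println("distributedProcess saw response: " + peek.seenInDistributedProcess);
        System.out.println("finishStage saw response: " + peek.seenInFinishStage);
    }
}
```

In this model the GET_FIELDS call to distributedProcess runs before the merge,
which matches the observation that rb.rsp was still empty there.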

On Sat, Oct 4, 2008 at 1:37 PM, Ryan McKinley <ry...@gmail.com> wrote:

> I'm not totally on top of how distributed components work, but check:
> http://wiki.apache.org/solr/WritingDistributedSearchComponents
>
> and:
>  https://issues.apache.org/jira/browse/SOLR-680
>
> Do you want each of the shards to append values?  or just the final result?
>  If appending the values is not a big resource hog, it may make sense to
> only do that in the main "process" block.  If that is the case, I *think*
> you just implement: process(ResponseBuilder rb)
>
> ryan

Re: RequestHandler that passes along the query

Posted by Ryan McKinley <ry...@gmail.com>.
I'm not totally on top of how distributed components work, but check:
http://wiki.apache.org/solr/WritingDistributedSearchComponents

and:
  https://issues.apache.org/jira/browse/SOLR-680

Do you want each of the shards to append values, or just the final
result? If appending the values is not a big resource hog, it may
make sense to only do that in the main "process" block. If that is
the case, I *think* you just implement process(ResponseBuilder rb).

ryan


On Oct 4, 2008, at 1:06 PM, Brian Whitman wrote:

> Sorry for the extended question, but I am having trouble making
> SearchComponent that can actually get at the returned response in a
> distributed setup.
> In my distributedProcess:
>
>    public int distributedProcess(ResponseBuilder rb) throws IOException {
>
> How can I get at the returned results from all shards? I want to get at
> really the rendered response right before it goes back to the client so I
> can add some information based on what came back.
>
> The TermVector example seems to get at rb.resultIds (which is not public and
> I can't use in my plugin) and then sends a request back to the shards to get
> the stored fields (using ShardDoc.id, another field I don't have access to.)
> Instead of doing all of that I'd like to just "peek" into the response that
> is about to be written to the client.
>
> I tried getting at rb.rsp but the data is not filled in during the last
> stage (GET_FIELDS) that distributedProcess gets called for.


Re: RequestHandler that passes along the query

Posted by Brian Whitman <br...@echonest.com>.
Sorry for the extended question, but I am having trouble making a
SearchComponent that can actually get at the returned response in a
distributed setup.
In my distributedProcess:

    public int distributedProcess(ResponseBuilder rb) throws IOException {

How can I get at the returned results from all shards? What I really want
is the rendered response right before it goes back to the client, so I
can add some information based on what came back.

The TermVector example seems to get at rb.resultIds (which is not public, so
I can't use it in my plugin) and then sends a request back to the shards to
get the stored fields (using ShardDoc.id, another field I don't have access
to). Instead of doing all of that, I'd like to just "peek" into the response
that is about to be written to the client.

I tried getting at rb.rsp, but the data is not filled in during the last
stage (GET_FIELDS) that distributedProcess gets called for.



On Sat, Oct 4, 2008 at 10:12 AM, Brian Whitman <br...@echonest.com> wrote:

> Thanks grant and ryan, so far so good. But I am confused about one thing -
> when I set this up like:
>
>   public void process(ResponseBuilder rb) throws IOException {
>
> And put it as the last-component on a distributed search (a defaults shard
> is defined in the solrconfig for the handler), the component never does its
> thing. I looked at the TermVectorComponent implementation and it instead
> defines
>
>     public int distributedProcess(ResponseBuilder rb) throws IOException {
>
> And when I implemented that method it works. Is there a way to define just
> one method that will work with both distributed and normal searches?

Re: RequestHandler that passes along the query

Posted by Brian Whitman <br...@echonest.com>.
Thanks Grant and Ryan, so far so good. But I am confused about one thing.
When I set this up like:

  public void process(ResponseBuilder rb) throws IOException {

and put it as the last-component on a distributed search (a default shards
value is defined in the solrconfig for the handler), the component never does
its thing. I looked at the TermVectorComponent implementation and it instead
defines

    public int distributedProcess(ResponseBuilder rb) throws IOException {

When I implemented that method, it worked. Is there a way to define just
one method that will work with both distributed and normal searches?
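
One possible approach to "one method for both" is to keep the real work in a
private helper and call it from whichever hook fires. The sketch below is not
Solr code: DualModeComponent, annotate, the stage number, and the map-based
response are hypothetical stand-ins. It assumes that process() runs for a
normal search and that finishStage() runs on the coordinating instance after
each stage of a distributed one, as reported elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical component whose two hooks delegate to one shared helper.
class DualModeComponent {
    static final int STAGE_GET_FIELDS = 2; // illustrative value only

    // Assumed to be called for a non-distributed search.
    void process(Map<String, Object> rsp) {
        annotate(rsp);
    }

    // Assumed to be called on the coordinator after each distributed stage.
    void finishStage(int stage, Map<String, Object> rsp) {
        if (stage == STAGE_GET_FIELDS) {
            annotate(rsp);
        }
    }

    // Shared logic: copy every document id into a second list.
    private void annotate(Map<String, Object> rsp) {
        Object docs = rsp.get("response");
        if (!(docs instanceof List)) {
            return; // response not populated yet
        }
        List<String> ids = new ArrayList<>();
        for (Object doc : (List<?>) docs) {
            if (doc instanceof Map) {
                Object id = ((Map<?, ?>) doc).get("id");
                if (id != null) {
                    ids.add(id.toString());
                }
            }
        }
        rsp.put("second-id-list", ids);
    }
}
```

One caveat worth checking against a real setup: if process() is also invoked
on each shard for its part of a distributed request, the helper would run
there too, so a real component may need to tell those calls apart.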



On Fri, Oct 3, 2008 at 4:41 PM, Grant Ingersoll <gs...@apache.org> wrote:

> No need to even write a new ReqHandler if you're using 1.3:
> http://wiki.apache.org/solr/SearchComponent
>

Re: RequestHandler that passes along the query

Posted by Grant Ingersoll <gs...@apache.org>.
No need to even write a new ReqHandler if you're using 1.3: http://wiki.apache.org/solr/SearchComponent

On Oct 3, 2008, at 4:13 PM, Brian Whitman wrote:

> Not sure if this is possible or easy: I want to make a requestHandler that
> acts just like select but does stuff with the output before returning it to
> the client.
> e.g. http://url/solr/myhandler?q=type:dog&sort=legsdesc&shards=dogserver1;dogserver2
>
> When myhandler gets it, I'd like to take the results of that query as if I
> sent it to select, then do stuff with the output before returning it. For
> example, it would add a field to each returned document from an external
> data store.
>
> This is sort of like an UpdateRequestProcessor chain thing, but for the
> select side. Is this possible?
>
> Alternately, I could have my custom RequestHandler do the query. But all I
> have in the RequestHandler is a SolrQueryRequest. Can I pass that along to
> something and get a SolrDocumentList back?

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: RequestHandler that passes along the query

Posted by Ryan McKinley <ry...@gmail.com>.
Take a look at SearchComponent:
http://wiki.apache.org/solr/SearchComponent

With that, you can inject extra fields into each document before
passing them on.

ryan


On Oct 3, 2008, at 4:13 PM, Brian Whitman wrote:

> Not sure if this is possible or easy: I want to make a requestHandler that
> acts just like select but does stuff with the output before returning it to
> the client.
> e.g. http://url/solr/myhandler?q=type:dog&sort=legsdesc&shards=dogserver1;dogserver2
>
> When myhandler gets it, I'd like to take the results of that query as if I
> sent it to select, then do stuff with the output before returning it. For
> example, it would add a field to each returned document from an external
> data store.
>
> This is sort of like an UpdateRequestProcessor chain thing, but for the
> select side. Is this possible?
>
> Alternately, I could have my custom RequestHandler do the query. But all I
> have in the RequestHandler is a SolrQueryRequest. Can I pass that along to
> something and get a SolrDocumentList back?