Posted to solr-dev@lucene.apache.org by "Jan Høydahl (JIRA)" <ji...@apache.org> on 2010/01/21 17:49:54 UTC

[jira] Commented: (SOLR-1093) A RequestHandler to run multiple queries in a batch

    [ https://issues.apache.org/jira/browse/SOLR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803356#action_12803356 ] 

Jan Høydahl commented on SOLR-1093:
-----------------------------------

Parallel execution of multiple queries is just one use case in a family of many others, and I agree with Lance's post on the list that it would be better to build an extensible component.

Other similar use cases that are often requested: multi-source federation, factoring in an ad service, selecting sources based on query analysis, selecting sources based on results, non-Solr sources, result modification based on content in the result, and a query abstraction/templating layer.

The common goal is an abstraction layer on top of the search sources that handles functionality close to search, so that it does not need to be implemented in every front-end. Other products that try to fill this role include FAST Unity, Comperio Front and Sesat (sesat.no).

Perhaps the /multi request handler could be the start of such a framework, where the first plugin to implement is the parallel queries use case.

To be able to handle a high count for "n" without hitting HTTP GET limitations, and to get a cleaner syntax for complex cases, the handler could accept the request as a POST. Pseudo POST content (could be JSON or a custom format):
<steps>
  <branch type="list">
    <src name="web">qt=dismax&amp;q=$q&amp;rows=10&amp;facet=true&amp;facet.field=mimetype</src>
    <src name="google">q=$q</src>
    <src name="yp">q=category:$q^10 OR company:$q&amp;rows=3</src>
    <src name="wp">q=$q&amp;rows=3</src>
    <src name="ads">q=$q</src>
  </branch>
</steps>

The result list would then consist of five entries named web, google, yp, wp and ads.
Each "branch" and "src" would be pre-defined in config, specifying the implementing class and any defaults. Indeed, the whole POST could be pre-configured, so that a request only needs to supply a &steps= param to identify which "template" to use, with $variables for q etc.
The class implementing "steps" simply calls each sub-step in sequence, passing the request and response objects. This provides a simple framework for future extensions such as pre- or post-processing.
The class implementing "branch" of type "list" would spawn all sub queries as threads and include each source result in a list.
Another implementation type of "branch" could merge (federate) results instead of stacking them.
The class implementing a "src" would be a thin wrapper which simply dispatches the query to the Search RequestHandler. Other implementations of "src" could be wrappers for external engines like Google or ad servers.
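
To make the roles of "steps", "branch" and "src" a bit more concrete, here is a minimal sketch in plain Java. It is only an illustration under assumptions: Step, Steps, Src and ListBranch are hypothetical names and not existing Solr classes, the request and response are simplified to maps, and the backend function stands in for whatever actually executes the query (the search request handler or an external engine).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical plugin contract: every element of the pipeline is a Step.
interface Step {
    // Reads the request parameters and adds its output to the response.
    void process(Map<String, String> request, Map<String, Object> response) throws Exception;
}

// "steps": calls each sub-step in sequence with the same request and response.
class Steps implements Step {
    private final List<Step> children;

    Steps(List<Step> children) {
        this.children = children;
    }

    public void process(Map<String, String> request, Map<String, Object> response) throws Exception {
        for (Step child : children) {
            child.process(request, response);
        }
    }
}

// "src": thin wrapper that expands $q in its query template and dispatches it.
// The backend is an assumption standing in for the real search call.
class Src implements Step {
    private final String name;
    private final String template;
    private final Function<String, Object> backend;

    Src(String name, String template, Function<String, Object> backend) {
        this.name = name;
        this.template = template;
        this.backend = backend;
    }

    public void process(Map<String, String> request, Map<String, Object> response) {
        // Plain text substitution; a real implementation would also escape the value.
        String query = template.replace("$q", request.getOrDefault("q", ""));
        response.put(name, backend.apply(query));
    }
}

// "branch" of type "list": runs every source on its own thread and
// stacks the results under the source names, in configuration order.
class ListBranch implements Step {
    private final List<Src> sources;

    ListBranch(List<Src> sources) {
        this.sources = sources;
    }

    public void process(Map<String, String> request, Map<String, Object> response) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sources.size()));
        try {
            List<Future<Map<String, Object>>> pending = new ArrayList<>();
            for (Src src : sources) {
                Callable<Map<String, Object>> task = () -> {
                    // Each source writes into its own map to avoid sharing state across threads.
                    Map<String, Object> partial = new LinkedHashMap<>();
                    src.process(request, partial);
                    return partial;
                };
                pending.add(pool.submit(task));
            }
            // Collect in submission order so the result list follows the config order.
            for (Future<Map<String, Object>> future : pending) {
                response.putAll(future.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

A branch that merges (federates) results would implement the same Step interface and differ only in how it combines the partial results, which is the point of keeping the contract this small.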

My intention is not to suggest a huge component, but to consider whether a smart interface design could enable very powerful extension possibilities that would be useful in almost all portal-type applications.

> A RequestHandler to run multiple queries in a batch
> ---------------------------------------------------
>
>                 Key: SOLR-1093
>                 URL: https://issues.apache.org/jira/browse/SOLR-1093
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Noble Paul
>             Fix For: 1.5
>
>
> It is a common requirement that a single page needs to fire multiple queries that are independent of each other. A handler that takes in multiple queries, runs them in parallel and sends the response as one big chunk would be useful.
> Let us say the handler is MultiRequestHandler
> {code}
> <requestHandler name="/multi" class="solr.MultiRequestHandler"/>
> {code}
> h2.Query Syntax
> The request must specify the number of queries as count=n.
> Each request parameter must be prefixed with a number that denotes the query index. Optionally, it may also specify the handler name.
> Example:
> {code}
> /multi?count=2&1.handler=/select&1.q=a:b&2.handler=/select&2.q=a:c
> {code}
> The default handler can be '/select', so the equivalent request is
> {code} 
> /multi?count=2&1.q=a:b&2.q=a:c
> {code}
> h2.The response
> The response will be a List<NamedList> where each NamedList will be a response to a query. 
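
The numbered-prefix syntax in the quoted description could be parsed roughly as follows. This is a hedged sketch rather than the proposed implementation: MultiParams and split are hypothetical names, the parameters are assumed to arrive already decoded in a map, count is assumed to be present and numeric, and '/select' is used as the default handler as the description suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Solr: groups "<n>.<name>" parameters by query index.
class MultiParams {
    static final String DEFAULT_HANDLER = "/select";

    static Map<Integer, Map<String, String>> split(Map<String, String> params) {
        // Assumption: count is mandatory; a missing or non-numeric value throws.
        int count = Integer.parseInt(params.get("count"));

        // Pre-create one parameter set per query, each starting with the default handler.
        Map<Integer, Map<String, String>> queries = new TreeMap<>();
        for (int i = 1; i <= count; i++) {
            Map<String, String> query = new LinkedHashMap<>();
            query.put("handler", DEFAULT_HANDLER);
            queries.put(i, query);
        }

        for (Map.Entry<String, String> entry : params.entrySet()) {
            String key = entry.getKey();
            int dot = key.indexOf('.');
            if (dot <= 0) {
                continue; // no numeric prefix, e.g. "count"
            }
            int index;
            try {
                index = Integer.parseInt(key.substring(0, dot));
            } catch (NumberFormatException e) {
                continue; // prefix is not a query index
            }
            Map<String, String> query = queries.get(index);
            if (query != null) {
                // Strip the prefix: "2.q" becomes "q" for query 2.
                query.put(key.substring(dot + 1), entry.getValue());
            }
        }
        return queries;
    }
}
```

With this grouping, /multi?count=2&1.q=a:b&2.q=a:c yields two parameter sets that both carry handler=/select, and each set can then be dispatched independently and its response appended to the result list.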
