Posted to solr-dev@lucene.apache.org by Thorsten Scherler <th...@juntadeandalucia.es> on 2006/12/22 13:29:10 UTC

filter input from multiple fields

Hi all, 

I am looking for advice on the best way to write a
MultipleFieldRequestHandler.java.

My use case is that I have a form (the next version of
http://andaluciajunta.es/portal/aj-bojaBuscador/0,22815,,00.html) where
you limit the query via a couple of different fields.

Imagine I have a form like this:
<form action="select/" method="get">
      <input name="start" value="0" type="hidden"/>
      <input name="rows" value="10" type="hidden"/> term:
      <input type="text" name="q"/>
      <br/>
      <table>
        <tr>
          <td colspan="3">between dates:</td>
        </tr>
        <tr>
          <td>
            <input type="text" name="startDate"/>
          </td>
          <td>y</td>
          <td>
            <input type="text" name="endDate"/>
          </td>
        </tr>
      </table>
      <input type="submit" value="buscar"></input> </form>

Using the StandardRequestHandler without prior processing, "startDate"
and "endDate" would be ignored, since they are not part of the query
string and are not standard Solr params that get processed.

Meaning I can either construct the query string (including this filter)
on the client side or with a request Handler, right?

JavaScript on the client side is not an option for my client, so here
I am, about to write my first RequestHandler.

I could write a request handler specific to my client, but I think a
more generic solution would be more beneficial for all of us.

So my idea is to define:
<requestHandler name="dateRange"
  class="solr.MultipleFieldRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="filterFields">startDate endDate</str>
    <str name="startDate">*</str>
    <str name="endDate">*</str>
    <str name="fq">date:[${startDate} TO ${endDate}]</str>
  </lst>
</requestHandler>

Where filterFields defines the array of fields that can be used to limit
the superset of docs. If these fields are null for the given request,
then we use their configured defaults, e.g. <str name="startDate">*</str>,
in the fq. Within the fq, each ${...} has to be replaced with the actual
value of the variable. Since we defined default values, that should work
fine, right?

The only thing is that a simpler, more basic solution could be to just
tweak the standard handler to
a) get all params and store them in a hashmap (already done)
b) change SolrPluginUtils.parseFilterQueries to parse the fq string and
replace ${startDate} with the value of req.getParam(key) for the
corresponding key="startDate". Generally speaking, implement a variable
substitution parser (which I would have to do for the MFRH as well).

Which do you think is the better approach: write a RequestHandler,
extend the standard one, or is there an even easier way?

TIA for any info.

salu2
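
The "use the configured default when the form field is empty" rule from the message above can be sketched in a few lines. This is only an illustration under stated assumptions: EffectiveParams and resolve are hypothetical names, not Solr classes, and plain Maps stand in for the request params and the <lst name="defaults"> block. A text input left empty is submitted as an empty string, so the sketch treats blank values like missing ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Solr class): overlays request values on configured defaults. */
final class EffectiveParams {
  private EffectiveParams() {}

  /** Returns the defaults, overridden by every non-blank request value. */
  static Map<String, String> resolve(Map<String, String> defaults, Map<String, String> request) {
    Map<String, String> effective = new LinkedHashMap<>(defaults);
    for (Map.Entry<String, String> e : request.entrySet()) {
      String value = e.getValue();
      // An empty text input arrives as "", so blank is treated like missing.
      if (value != null && !value.trim().isEmpty()) {
        effective.put(e.getKey(), value.trim());
      }
    }
    return effective;
  }
}
```

With defaults startDate=* and endDate=*, a request that only fills in startDate keeps endDate=*, so the fq template always has a value for each variable.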


Re: variable substitution in lucene query params (was Re: filter input from multiple fields)

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Wed, 2006-12-27 at 22:53 -0800, Chris Hostetter wrote:
> : > directly to have a new "void setParamVariables(SolrParams p)" method.  if
> : > it's called (with non null input), then any string that SolrQueryParser
> : > instance is asked to parse would first be preprocessed looking for the ${}
> : > pattern and pulling the values out of the SolrParams instance.
> : >
> :
> : When does the setParamVariables(SolrParams p) get called? What should
> : happen in this method?
> 
> i was thinking it would be called by the request handler just after
> construction -- it would modify the internal state of the QueryParser just
> like some of the other setters do, for use in the parse method.
> 
> : not sure whether I understand.
> :
> : You mean
> 
> bingo ... and then just add the code to handleRequest that uses
> "substitution" if non null.
> 
> 

Ok, thanks. I will have a look and submit a patch.

Thanks for your feedback.

salu2
thorsten

> 
> -Hoss
> 


Re: variable substitution in lucene query params (was Re: filter input from multiple fields)

Posted by Chris Hostetter <ho...@fucit.org>.
: > directly to have a new "void setParamVariables(SolrParams p)" method.  if
: > it's called (with non null input), then any string that SolrQueryParser
: > instance is asked to parse would first be preprocessed looking for the ${}
: > pattern and pulling the values out of the SolrParams instance.
: >
:
: When does the setParamVariables(SolrParams p) get called? What should
: happen in this method?

i was thinking it would be called by the request handler just after
construction -- it would modify the internal state of the QueryParser just
like some of the other setters do, for use in the parse method.

: not sure whether I understand.
:
: You mean

bingo ... and then just add the code to handleRequest that uses
"substitution" if non null.



-Hoss
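
The setter idea discussed in this message can be sketched roughly as follows. It is a stand-in under stated assumptions, not the real SolrQueryParser: SubstitutingParser is a hypothetical wrapper, a Function plays the role of the actual parse step, and a Map replaces SolrParams so the example compiles on its own. If setParamVariables is never called, strings are passed through untouched.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a query parser with a setParamVariables-style setter. */
final class SubstitutingParser<Q> {
  private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

  private final Function<String, Q> delegate; // stands in for the real parse step
  private Map<String, String> variables;      // null means: no substitution

  SubstitutingParser(Function<String, Q> delegate) {
    this.delegate = delegate;
  }

  /** Intended to be called by the request handler right after construction. */
  void setParamVariables(Map<String, String> variables) {
    this.variables = variables;
  }

  Q parse(String query) {
    if (variables == null) {
      return delegate.apply(query);
    }
    Matcher m = VARIABLE.matcher(query);
    StringBuffer expanded = new StringBuffer();
    while (m.find()) {
      String value = variables.get(m.group(1));
      // Unknown variables are left as written so the delegate can reject them.
      m.appendReplacement(expanded, Matcher.quoteReplacement(value != null ? value : m.group()));
    }
    m.appendTail(expanded);
    return delegate.apply(expanded.toString());
  }
}
```

Keeping the variables as parser state means every string handed to parse goes through the same preprocessing, which is the property the setter proposal is after.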


variable substitution in lucene query params (was Re: filter input from multiple fields)

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Sat, 2006-12-23 at 18:41 -0800, Chris Hostetter wrote:
> : I did a small hack and it works like a charm without the above-mentioned
> : handler. I only activated variable substitution for the FQ for testing;
> : if you think that is a nice feature, I can activate it for the rest.
> 
> As i said in my other reply ... i think you should reconsider the approach
> you are taking towards your end goal -- 

Yes, I will. Thanks for the heads-up.


> but in general, this idea of allowing
> variable substitution in the lucene query params seems pretty slick to me
> ... a more general solution might be to modify the SolrQueryParser
> directly to have a new "void setParamVariables(SolrParams p)" method.  if
> it's called (with non null input), then any string that SolrQueryParser
> instance is asked to parse would first be preprocessed looking for the ${}
> pattern and pulling the values out of the SolrParams instance.
> 

When does the setParamVariables(SolrParams p) get called? What should
happen in this method?

> request handlers could then either pass their main params (if they wanted
> to allow kitchen sink param substitution) or if they want to be more
> robust (ie: Standard and DisMax), they could do what you describe: have a
> configured list of param names that would be used to construct a new
> instance of SolrParams explicitly for the SolrQueryParser -- but i would
> think that would be a good use for a new separate init param in the
> solrconfig, it's not the kind of thing you'd ever want to let the client
> specify.
> 

not sure whether I understand.

You mean
<requestHandler name="standard" class="solr.StandardRequestHandler">
    <!-- default values for the SolrQueryParser -->
     <lst name="substitution">
        <str name="startDate">*</str>
        <str name="endDate">*</str>
      </lst>
    <!-- default values for query parameters -->
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <str name="fq">bojaDate:[${startDate} TO ${endDate}]</str>
      </lst>
  </requestHandler>

...
public class StandardRequestHandler implements SolrRequestHandler, SolrInfoMBean {

  ...
  public void init(NamedList args) {
    ...
    o = args.get("substitution");
    if (o != null && o instanceof NamedList) {
      substitution = SolrParams.toSolrParams((NamedList)o);
    }
    ...
  }
}

> The reason this really seems cool to me is because the format/params
> passing could work in either order: the format could be specified in the
> config with params coming from the client, or the config could list a big
> long list of "constant" params that the client could then use however they
> want by specifying a format that used them.

Yeah, I will open an issue.

Cheers for the feedback.

salu2


Re: filter input from multiple fields

Posted by Chris Hostetter <ho...@fucit.org>.
: I did a small hack and it works like a charm without the above-mentioned
: handler. I only activated variable substitution for the FQ for testing;
: if you think that is a nice feature, I can activate it for the rest.

As i said in my other reply ... i think you should reconsider the approach
you are taking towards your end goal -- but in general, this idea of allowing
variable substitution in the lucene query params seems pretty slick to me
... a more general solution might be to modify the SolrQueryParser
directly to have a new "void setParamVariables(SolrParams p)" method.  if
it's called (with non null input), then any string that SolrQueryParser
instance is asked to parse would first be preprocessed looking for the ${}
pattern and pulling the values out of the SolrParams instance.

request handlers could then either pass their main params (if they wanted
to allow kitchen sink param substitution) or if they want to be more
robust (ie: Standard and DisMax), they could do what you describe: have a
configured list of param names that would be used to construct a new
instance of SolrParams explicitly for the SolrQueryParser -- but i would
think that would be a good use for a new separate init param in the
solrconfig, it's not the kind of thing you'd ever want to let the client
specify.

The reason this really seems cool to me is because the format/params
passing could work in either order: the format could be specified in the
config with params coming from the client, or the config could list a big
long list of "constant" params that the client could then use however they
want by specifying a format that used them.



-Hoss
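
The "more robust" option described above, where only a configured list of param names is exposed for substitution, might look something like this. It is a sketch with assumptions: AllowedVariables and restrict are hypothetical names, the list of names is assumed to come from an init param in solrconfig rather than from the client, and Maps stand in for SolrParams.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Solr class): exposes only configured params for substitution. */
final class AllowedVariables {
  private AllowedVariables() {}

  /**
   * Copies only the params whose names appear in allowedNames.
   * allowedNames is assumed to be read from the handler's init args, never from the request.
   */
  static Map<String, String> restrict(Map<String, String> requestParams, List<String> allowedNames) {
    Map<String, String> allowed = new LinkedHashMap<>();
    for (String name : allowedNames) {
      String value = requestParams.get(name);
      if (value != null) {
        allowed.put(name, value);
      }
    }
    return allowed;
  }
}
```

The restricted map is what would then be handed to the parser, so a client cannot pull arbitrary request params such as q or fq into a template.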


Re: filter input from multiple fields

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Fri, 2006-12-22 at 13:29 +0100, Thorsten Scherler wrote:
> Hi all, 
> 
> I am looking for some information about which is the best way to write a
> MultipleFieldRequestHandler.java.


I did a small hack and it works like a charm without the above-mentioned
handler. I only activated variable substitution for the FQ for testing;
if you think that is a nice feature, I can activate it for the rest.

Index: src/java/org/apache/solr/util/SolrPluginUtils.java
===================================================================
--- src/java/org/apache/solr/util/SolrPluginUtils.java  (revision 489649)
+++ src/java/org/apache/solr/util/SolrPluginUtils.java  (working copy)
@@ -59,7 +59,10 @@
  */
 public class SolrPluginUtils {

-  /**
+  private static final String VARIABLE_DETERMINATOR_CLOSE = "}";
+  private static final String VARIABLE_DETERMINATOR_OPEN = "${";
+
+/**
    * Set defaults on a SolrQueryRequest.
    *
    * RequestHandlers can use this method to ensure their defaults are
@@ -819,13 +822,24 @@
     SolrQueryParser qp = new SolrQueryParser(s.getSchema(), null);
     for (String q : in) {
       if (null != q && 0 != q.trim().length()) {
-        out.add(qp.parse(q));
+        out.add(qp.parse(substitute(q, req)));
       }
     }
     return out;
   }

-  /**
+  private static String substitute(String q, SolrQueryRequest req) {
+    if (q.contains(VARIABLE_DETERMINATOR_OPEN)){
+        String beforeVariable =q.substring(0,q.indexOf(VARIABLE_DETERMINATOR_OPEN)) ;
+        String variable = q.substring(q.indexOf(VARIABLE_DETERMINATOR_OPEN)+VARIABLE_DETERMINATOR_OPEN.length(),q.indexOf(VARIABLE_DETERMINATOR_CLOSE)) ;
+        String afterVariable =q.substring(q.indexOf(VARIABLE_DETERMINATOR_CLOSE)+VARIABLE_DETERMINATOR_CLOSE.length()) ;
+        String variableValue= req.getParams().get(variable);
+        q = substitute(beforeVariable+variableValue+afterVariable, req);
+    }
+    return q;
+}
+
+/**

salu2
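
A note on the substitute() method in the patch above: as written, a param that is missing from the request is concatenated as the text "null", a "${" without a closing "}" makes substring throw, and because the method recurses on the already substituted string, a submitted value that itself contains "${...}" gets expanded again. A single-pass variant avoids all three. The sketch below keeps the patch's constant names but is otherwise an assumption: VariableSubstitution is a hypothetical class, and a Map stands in for req.getParams().

```java
import java.util.Map;

/** Sketch only: a single-pass variant of the substitute() method from the patch. */
final class VariableSubstitution {
  private static final String VARIABLE_DETERMINATOR_OPEN = "${";
  private static final String VARIABLE_DETERMINATOR_CLOSE = "}";

  private VariableSubstitution() {}

  /** params stands in for req.getParams(); a plain Map keeps the sketch self-contained. */
  static String substitute(String q, Map<String, String> params) {
    StringBuilder out = new StringBuilder();
    int pos = 0;
    while (pos < q.length()) {
      int open = q.indexOf(VARIABLE_DETERMINATOR_OPEN, pos);
      int close = open < 0 ? -1
          : q.indexOf(VARIABLE_DETERMINATOR_CLOSE, open + VARIABLE_DETERMINATOR_OPEN.length());
      if (close < 0) {
        break; // no further complete ${...}; the rest is copied verbatim below
      }
      String name = q.substring(open + VARIABLE_DETERMINATOR_OPEN.length(), close);
      String value = params.get(name);
      if (value == null) {
        throw new IllegalArgumentException("No value for variable: " + name);
      }
      out.append(q, pos, open).append(value); // inserted values are never re-scanned
      pos = close + VARIABLE_DETERMINATOR_CLOSE.length();
    }
    return out.append(q, pos, q.length()).toString();
  }
}
```

Throwing on a missing variable is one possible policy; falling back to a configured default would fit the original proposal just as well.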



Re: filter input from multiple fields

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Sat, 2006-12-23 at 18:27 -0800, Chris Hostetter wrote:
> : Imagine I have a form like
> : <form action="select/" method="get">
> 	...
> :             <input type="text" name="startDate"/>
> 	..
> :             <input type="text" name="endDate"/>
> 
> : Using the StandardRequestHandler without prior processing, "startDate"
> : and "endDate" would be ignored, since they are not part of the query
> : string and are not standard Solr params that get processed.
> 
> Typically it's not recommended to have your front end users/clients
> hitting Solr directly as part of an HTML form submit ... 

Yeah, I am starting to better understand the architecture behind Solr
from reading the ml.


> the more
> conventional way to think of it is that Solr is a backend service, which
> your application can talk to over HTTP -- if you were dealing with a
> database, you wouldn't expect that you could generate an HTML form for
> your clients and then have them submit that form in some way that resulted
> in their browser using JDBC (or ODBC) to communicate directly with your
> database, their client would communicate with your App, which would
> validate their input, impose some security checks on the input, and then
> execute the underlying query to your database -- working with Solr should
> be very similar, it just so happens that instead of using JDBC or some
> other binary protocol, Solr uses HTTP, and you *can* talk to it directly
> from a web browser, but that's really more of a debugging feature than
> anything else.

Thanks for this clarification, I will incorporate it into the
documentation. 

salu2

> 
> 
> 
> -Hoss
> 


Re: filter input from multiple fields

Posted by Chris Hostetter <ho...@fucit.org>.
: Imagine I have a form like
: <form action="select/" method="get">
	...
:             <input type="text" name="startDate"/>
	..
:             <input type="text" name="endDate"/>

: Using the StandardRequestHandler without prior processing, "startDate"
: and "endDate" would be ignored, since they are not part of the query
: string and are not standard Solr params that get processed.

Typically it's not recommended to have your front end users/clients
hitting Solr directly as part of an HTML form submit ... the more
conventional way to think of it is that Solr is a backend service, which
your application can talk to over HTTP -- if you were dealing with a
database, you wouldn't expect that you could generate an HTML form for
your clients and then have them submit that form in some way that resulted
in their browser using JDBC (or ODBC) to communicate directly with your
database, their client would communicate with your App, which would
validate their input, impose some security checks on the input, and then
execute the underlying query to your database -- working with Solr should
be very similar, it just so happens that instead of using JDBC or some
other binary protocol, Solr uses HTTP, and you *can* talk to it directly
from a web browser, but that's really more of a debugging feature than
anything else.



-Hoss
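
To make the "application in front of Solr" arrangement concrete, here is a rough sketch of the step the application would perform between the HTML form and Solr: validate the two date inputs, build the fq itself, and only then assemble the request URL. Everything specific is an assumption for illustration: SearchFrontEnd is a hypothetical class, the form is assumed to deliver dates as yyyy-MM-dd, the field is called "date" as in the first example, its values are assumed to be ISO-8601 UTC timestamps, and the base URL is supplied by the caller. It needs Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;

/** Hypothetical application-side helper: validates form input before Solr sees it. */
final class SearchFrontEnd {
  private SearchFrontEnd() {}

  /** Blank input means an open bound; anything else must parse as yyyy-MM-dd. */
  static String dateBound(String input) {
    if (input == null || input.trim().isEmpty()) {
      return "*";
    }
    LocalDate day = LocalDate.parse(input.trim()); // throws DateTimeParseException on bad input
    return day + "T00:00:00Z"; // assumed timestamp format of the date field
  }

  /** Builds the filter query on the server side, so the client never writes query syntax. */
  static String filterQuery(String startDate, String endDate) {
    return "date:[" + dateBound(startDate) + " TO " + dateBound(endDate) + "]";
  }

  /** Assembles the URL the application would request; baseUrl is an assumed deployment detail. */
  static String selectUrl(String baseUrl, String q, String startDate, String endDate) {
    return baseUrl + "select/?start=0&rows=10"
        + "&q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
        + "&fq=" + URLEncoder.encode(filterQuery(startDate, endDate), StandardCharsets.UTF_8);
  }
}
```

Both bounds are mapped to the start of the given day, which is a simplification; an application that wants the whole end day included would adjust the upper bound.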