Posted to solr-dev@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2006/09/08 20:53:42 UTC

additive params

I was talking to a coworker yesterday who has recently started using the
DisMax handler ... she was annoyed that the "fq" param could only be
specified once (which was changed as part of the SimpleFacet patch so
that's all good) but she was also annoyed that having an "fq" default in
the solrconfig was completely overridden by an "fq" in the URL --
something that's still true with the recent changes.  There's currently no
way to specify some params in the config that can be "added to" by the
params in the URL.

Generalized support for being able to say "The default value list for
param P is A B C, and at query time I want to add X and Y and eliminate B"
seems like a really hard problem to solve, but i was thinking of a
simpler approach that might still achieve the more common case where we
want values specified in a config to always be "appended" to multivalue
params using something like this pseudo-java...

public class MergedParams extends DefaultSolrParams {
  public String[] getParams(String param) {
    return params.getParams(param).addAll(defaults.getParams(param));
  }
}

and request handlers could support this using something like...

  <requestHandler name="dismax" class="solr.DisMaxRequestHandler" >
    <lst name="appended">
      <str name="fq">inStock:true</str>
    </lst>
    <lst name="default">
      <str name="fq">color:red</str>
      <str name="fq">price:[* TO 100]</str>
    </lst>
 </requestHandler>

...with the existing DefaultSolrParams wrapped in a MergedParams using
the "appended" section.  Now a request for "/select/?qt=dismax&q=foo"
would get all three "fqs" applied, but a request for
"/select/?qt=dismax&q=foo&fq=color:blue" would only have two filters
applied (must be inStock, must be blue) ... this would be particularly
handy when trying to use "fq" to globally partition your index, as well as
provide "drill down" filtering options.
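
A minimal, self-contained sketch of the lookup order described above, offered
as an illustration rather than as the actual patch. Plain maps stand in for
SolrParams, and the class name MergedParamsSketch and its resolve() method are
hypothetical. It assumes request values replace the "defaults" values outright
(the existing DefaultSolrParams behavior described in this thread), after which
the "appended" values are added unconditionally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed MergedParams; plain maps replace SolrParams.
class MergedParamsSketch {
  /** Request values override defaults; appended values are always added at the end. */
  static List<String> resolve(String name,
                              Map<String, List<String>> request,
                              Map<String, List<String>> defaults,
                              Map<String, List<String>> appended) {
    // Existing step: the request wins whenever it supplies the param at all.
    List<String> base = request.containsKey(name)
        ? request.get(name)
        : defaults.getOrDefault(name, List.of());
    // Proposed step: config-level "appended" values are added regardless.
    List<String> result = new ArrayList<>(base);
    result.addAll(appended.getOrDefault(name, List.of()));
    return result;
  }

  public static void main(String[] args) {
    Map<String, List<String>> defaults =
        Map.of("fq", List.of("color:red", "price:[* TO 100]"));
    Map<String, List<String>> appended = Map.of("fq", List.of("inStock:true"));
    // /select/?qt=dismax&q=foo  (three filters)
    System.out.println(resolve("fq", Map.of("q", List.of("foo")), defaults, appended));
    // /select/?qt=dismax&q=foo&fq=color:blue  (two filters)
    System.out.println(resolve("fq",
        Map.of("q", List.of("foo"), "fq", List.of("color:blue")), defaults, appended));
  }
}
```

With the config shown earlier, the first call yields the two default filters
plus inStock:true, and the second yields color:blue plus inStock:true.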

The same MergedParams could even support a <lst name="prepended"> set
of params which would work exactly the same for multivalue params (like
fq), but would allow the config to dictate the values of single value
params (like "rows", "facet") so that they can't be overridden at query
time.
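
One way to picture the "prepended" idea is sketched below. This is a
hypothetical illustration, not a proposed implementation: plain maps stand in
for SolrParams, the class and method names are made up, and it assumes that a
single-value lookup simply returns the first value in the list. Under that
assumption, putting the config values first is enough to stop the request from
overriding them, while multivalue params still see every value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; plain maps stand in for SolrParams.
class PrependedParamsSketch {
  /** Multivalue view: config-level "prepended" values come first, then the rest. */
  static List<String> getAll(String name,
                             Map<String, List<String>> prepended,
                             Map<String, List<String>> requestOrDefaults) {
    List<String> result = new ArrayList<>(prepended.getOrDefault(name, List.of()));
    result.addAll(requestOrDefaults.getOrDefault(name, List.of()));
    return result;
  }

  /** Single-value view: the first value wins, so a prepended value cannot be overridden. */
  static String getOne(String name,
                       Map<String, List<String>> prepended,
                       Map<String, List<String>> requestOrDefaults) {
    List<String> all = getAll(name, prepended, requestOrDefaults);
    return all.isEmpty() ? null : all.get(0);
  }

  public static void main(String[] args) {
    Map<String, List<String>> prepended =
        Map.of("rows", List.of("10"), "fq", List.of("inStock:true"));
    Map<String, List<String>> request =
        Map.of("rows", List.of("5000"), "fq", List.of("color:blue"));
    System.out.println(getOne("rows", prepended, request)); // the config value is kept
    System.out.println(getAll("fq", prepended, request));   // both filters apply
  }
}
```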


What do you guys think?


-Hoss


Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: We should at least have some cautionary comments in the solrconfig.xml
: explaining that "appends" can't be overridden.

oh no doubt about that ... i wouldn't even consider putting any in the
"standard" requestHandler -- just include them on other instances of
StandardRequestHandler.



-Hoss


Re: additive params

Posted by Yonik Seeley <yo...@apache.org>.
On 9/10/06, Chris Hostetter <ho...@fucit.org> wrote:
>
> : The usecase you gave of setting reasonable append defaults for lists
> : seems like the most common one.  If the only configuration option
> : available to people is to have non-overridable appends, it will get
> : used as convenience (or error) and end up being a pain.
> :
> : I guess I'd like to see both problems solved so one mechanism isn't abused.
>
> solving both would be nice .. but one has a solution that is both easy to
> express and easy to implement -- the other is complex enough that we
> aren't sure what a good way of expressing it would be, nor do we have an
> easy way to implement it .. should we not do the first until we've
> figured out the second?

I think one needs to understand the second enough to assess how it
fits with the first.  I think it's OK in this case.

We should at least have some cautionary comments in the solrconfig.xml
explaining that "appends" can't be overridden.

-Yonik

Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: The usecase you gave of setting reasonable append defaults for lists
: seems like the most common one.  If the only configuration option
: available to people is to have non-overridable appends, it will get
: used as convenience (or error) and end up being a pain.
:
: I guess I'd like to see both problems solved so one mechanism isn't abused.

solving both would be nice .. but one has a solution that is both easy to
express and easy to implement -- the other is complex enough that we
aren't sure what a good way of expressing it would be, nor do we have an
easy way to implement it .. should we not do the first until we've
figured out the second?




-Hoss


Re: additive params

Posted by Yonik Seeley <yo...@apache.org>.
On 9/8/06, Chris Hostetter <ho...@fucit.org> wrote:
> : > The main use case is to allow configuration of a requestHandler that the
> : > clients don't have to be aware of at all it "just works" ..
> :
> : That sounds fine then for things that are optionally appended, but not
> : being able to override that doesn't make sense unless it's for some
> : security purpose.
>
> But if i'm the owner of the index shouldn't that be my decision?

Yes, if that's what they really want.

The usecase you gave of setting reasonable append defaults for lists
seems like the most common one.  If the only configuration option
available to people is to have non-overridable appends, it will get
used as convenience (or error) and end up being a pain.

I guess I'd like to see both problems solved so one mechanism isn't abused.

> if i
> want people to be able to have free rein over my index, i'll do this...
>
>   <requestHandler name="dismax" class="solr.DisMaxRequestHandler" />

Yeah, that would work, but most people will take an evolutionary
approach, adding defaults to existing handlers I think.  Having
separate handlers for this is also a take-it-or-leave-it approach...
if you want to override just one useful default, you need to supply
them all.


> : Parameters that must always be appended seem to be the exception
> : rather than the rule.
> : But if that's the only mechanism to provide good defaults for
> : appendable lists, people might use it and unnecessarily lock down the
> : view of their data, even if it's not what they *really* meant to do.
>
> why would it be the only mechanism?

Because we don't currently have any mechanism to specify an additive
default.  If you added something like <appends>, it *would* be the
only mechanism ;-)

-Yonik

Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: > The main use case is to allow configuration of a requestHandler that the
: > clients don't have to be aware of at all it "just works" ..
:
: That sounds fine then for things that are optionally appended, but not
: being able to override that doesn't make sense unless it's for some
: security purpose.

But if i'm the owner of the index shouldn't that be my decision?  if i
want people to be able to have free rein over my index, i'll do this...

  <requestHandler name="dismax" class="solr.DisMaxRequestHandler" />

...now they can specify anything they want.  if i want to give them
helpful defaults to make their life easy, i can do this...

  <requestHandler name="easy" class="solr.DisMaxRequestHandler" >
    <lst name="defaults">
     <float name="tie">0.01</float>
     <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5</str>
     <str name="pf">text^0.2 features^1.1 name^1.5 manu^1</str>
     <str name="bf">ord(poplarity)^0.5</str>
     <str name="fq">inStock:true</str>
    </lst>
   </requestHandler>

...and they can use it as is, or "tweak" any of the params at query time.

If i have a client that says "i want an easy way to search all the
products that people can buy, and i need to let the user add options to
filter by price" i can set this up for them...

  <requestHandler name="buyable" class="solr.DisMaxRequestHandler" >
    <lst name="defaults">
     <float name="tie">0.01</float>
     <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5</str>
     <str name="pf">text^0.2 features^1.1 name^1.5 manu^1</str>
     <str name="bf">ord(poplarity)^0.5</str>
    </lst>
    <lst name="append">
     <str name="fq">inStock:true</str>
    </lst>
   </requestHandler>

...and tell them "use '/select/?qt=buyable&q=userinput'; if the user wants
to filter on price, add '&fq=price:[A TO B]'".
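
As a rough illustration of what that instruction means for a client, here is a
small hypothetical helper (the class and method names are invented for this
sketch). It assumes a handler registered as "buyable" and a numeric "price"
field, as in the example config; the only filter it ever builds is the price
range, and it knows nothing about the inStock:true partition, which stays in
the config.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client helper; "buyable" and "price" come from the example config.
class BuyableClientSketch {
  static String enc(String s) {
    return URLEncoder.encode(s, StandardCharsets.UTF_8);
  }

  /** Builds the request path; min/max may be null when the user picked no price range. */
  static String searchUrl(String userInput, Integer minPrice, Integer maxPrice) {
    StringBuilder url = new StringBuilder("/select/?qt=buyable&q=").append(enc(userInput));
    if (minPrice != null && maxPrice != null) {
      // The only filter the client knows about; inStock:true is added on the server.
      url.append("&fq=").append(enc("price:[" + minPrice + " TO " + maxPrice + "]"));
    }
    return url.toString();
  }

  public static void main(String[] args) {
    System.out.println(searchUrl("ipod", null, null));
    System.out.println(searchUrl("ipod", 10, 100));
  }
}
```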

If i don't have the "append" params option, then i have to tell the client
"if you want to filter on price, you'll have to put '&fq=inStock:true' in
every request" at which point they now have to know something they didn't
know before (that there are products in the index which aren't in
stock and thus not for sale) and they have to do something they don't want
to have to mess with (putting an extra parameter in).

If i want to change my index later by adding a new field of "forSale"
(independent of "inStock"; maybe our biz model now allows us to stock up
on products that we don't want to sell, or start pre-selling products
that we don't have in stock yet) I need to go to all of my clients using
qt=buyable and say "remember that rule i told you about putting
'&fq=inStock:true' in the request whenever you filtered, you need to
change your code to use '&fq=forSale:true'" ... that's not necessary with
"appended" init params, because i can just change the solrconfig.

: Although inStock:true makes sense as a default, it doesn't make sense
: not being able to ask about all items or items that aren't in stock if
: you need to.  The convenience of not having to know about some of the
: params (and providing them by default), is much different than locking
: things down by always adding those params.

that's the beauty of being able to configure the same request handler in
the solrconfig more than once with different names -- the person
configuring the index can provide multiple options.  In my example above,
the "dismax" instance, the "easy" instance and the "buyable" instance can
all coexist in harmony.  If a client wants defaults but wants to be able
to override them -- they can do that (as long as the index owner
configures it that way).

: Parameters that must always be appended seem to be the exception
: rather than the rule.
: But if that's the only mechanism to provide good defaults for
: appendable lists, people might use it and unnecessarily lock down the
: view of their data, even if it's not what they *really* meant to do.

why would it be the only mechanism? ... i'm not suggesting we get rid of
defaults, i'm suggesting we add another way to specify "default-ish"
params that get added to what is currently used (regardless of whether
what's currently being used comes from a default or a query param)

: Does that make sense (putting the power in the index config) for
: anything other than security filters?

the instock vs forsale example i gave above isn't really a security issue
as much as it is a "changing partition" issue ... currently the data is
partitioned one way (instock true or false) and in the future it might be
partitioned in another way (forsale true or false) and we want to be able
to change it without our clients needing to know.

the boosting example i gave in my last message also isn't really about
security -- it's about imposing biz rules on scoring that can be augmented
by user options but never completely overridden.

: Security filters would normally be added at a higher level based on
: user-id or something, *or* could be based on custom logic in a custom
: request handler.

hey wait a minute, i thought i was the guy always arguing that people
should just write their own request handlers? :) ... if you have complex
rules (security or otherwise) that require your own custom request handler
then by all means people should write them ... but this seems like a
pretty straightforward way to let people get more power out of the existing
handlers through configuration.



-Hoss


Re: additive params

Posted by Yonik Seeley <yo...@apache.org>.
On 9/8/06, Chris Hostetter <ho...@fucit.org> wrote:
>
> : What's the usecase here, and the downside to having to provide the
> : full fq list in the URL?
> : Is this simply to shorten the request URLs?
>
> The main use case is to allow configuration of a requestHandler that the
> clients don't have to be aware of at all it "just works" ..

That sounds fine then for things that are optionally appended, but not
being able to override that doesn't make sense unless it's for some
security purpose.

Although inStock:true makes sense as a default, it doesn't make sense
not being able to ask about all items or items that aren't in stock if
you need to.  The convenience of not having to know about some of the
params (and providing them by default), is much different than locking
things down by always adding those params.

> : Of course once you make it so that something is always appended, that
> : will annoy someone else who wants to completely override it :-)
>
> True -- but that becomes a decision the person defining the configs can
> make.  If they want it to be overridable they put it in a "defaults"
> list as they can now, if they want it to always be appended to the query
> params they can put it in a (new) separate list.

Parameters that must always be appended seem to be the exception
rather than the rule.
But if that's the only mechanism to provide good defaults for
appendable lists, people might use it and unnecessarily lock down the
view of their data, even if it's not what they *really* meant to do.

> ignoring the issues of URL escaping and what exactly it would look like,
> something that would always be possible if we add a method to SolrParams
> to get all param names (HttpServletRequest already has this) but it puts the
> power in the hands of the client generating the request -- what i'm
> suggesting is putting more power in the hands of the people configuring
> the index.

Does that make sense (putting the power in the index config) for
anything other than security filters?

> There's certainly no reason we can't do both -- what you're describing
> would just affect the relationship of the request params with the
> <lst name="defaults"> when they get bundled into a DefaultSolrParams;
> what i'm describing would be the relationship between *that* set of params and a
> new <lst name="appended"> set.

I guess I'm still not understanding the usecases... If one has
defaults that can be appended to, when would one use <appended>?
Security filters would normally be added at a higher level based on
user-id or something, *or* could be based on custom logic in a custom
request handler.

-Yonik

Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: "more like this", and so on.  Maybe what we want is a way to have
: named sets of parameter settings in the configuration file and allow
: a request to use a pre-set by a simple name, keeping the HTTP request
: slimmer.

we can do that right now by registering the same handler many times with
different defaults...

: You know, like pressing "1" on the radio to get WNRN 91.9 :)  (except
: with volume, treble, bass, etc settings built into that button also)

  <requestHandler name="WNRN" class="my.RadioRequestHandler">
   <lst name="defaults">
     <str name="station">WNRN</str>
     <int name="volume">8</int>
     <float name="treble">0.65</float>
     ...
   </lst>
  </requestHandler>
  <requestHandler name="KROQ" class="my.RadioRequestHandler">
   <lst name="defaults">
     <str name="station">KROQ</str>
     <int name="volume">11</int>
     <float name="treble">0.35</float>
     ...
   </lst>
  </requestHandler>




-Hoss


Re: additive params

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 8, 2006, at 5:27 PM, Chris Hostetter wrote:
> : i'm +1 on this idea that the server should be configurable in ways
> : the client cannot affect, if so desired, or configured such that the
> : client has control over all the parameters - but let's make the
> : controllability of parameters configurable itself.  surely there are
>
> You totally lost me there ... "controllability of parameters configurable
> itself" ... can you clarify what you mean by that?

Sorry, sounds awkward in re-reading what I wrote myself.  I think you  
and I are saying the same thing... allow parameters to have default  
configuration server-side, allowing them to be overridden by request  
parameters is how it currently works (right?) but perhaps some  
parameters don't make sense in some uses for a client to control, so  
locking them off to strictly the server-side setting should be  
possible.   I'm not sure where this use case would be either, but we  
are approaching quite a lot of parameters a client would be sending  
to get search, sort, start, limit, highlighting, faceting, and soon  
"more like this", and so on.  Maybe what we want is a way to have  
named sets of parameter settings in the configuration file and allow  
a request to use a pre-set by a simple name, keeping the HTTP request  
slimmer.

You know, like pressing "1" on the radio to get WNRN 91.9 :)  (except  
with volume, treble, bass, etc settings built into that button also)

	Erik


Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: i'm +1 on this idea that the server should be configurable in ways
: the client cannot affect, if so desired, or configured such that the
: client has control over all the parameters - but let's make the
: controllability of parameters configurable itself.  surely there are

You totally lost me there ... "controllability of parameters configurable
itself" ... can you clarify what you mean by that?



-Hoss


Re: additive params

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 8, 2006, at 4:40 PM, Chris Hostetter wrote:
> ignoring the issues of URL escaping and what exactly it would look like,
> something that would always be possible if we add a method to SolrParams
> to get all param names (HttpServletRequest already has this) but it puts the
> power in the hands of the client generating the request -- what i'm
> suggesting is putting more power in the hands of the people configuring
> the index.

i'm +1 on this idea that the server should be configurable in ways  
the client cannot affect, if so desired, or configured such that the  
client has control over all the parameters - but let's make the  
controllability of parameters configurable itself.  surely there are  
some projects (spring and/or hivemind perhaps?) that have this  
concept?  ant has some very nice outside-in configurability with  
properties and "overriding" for example, but you can generally always  
control ant properties from the command-line as the last say on  
things.  this provides a very layered mechanism.  but not quite the  
model we want for solr parameterization.

> There's certainly no reason we can't do both -- what you're describing
> would just affect the relationship of the request params with the
> <lst name="defaults"> when they get bundled into a DefaultSolrParams;
> what i'm describing would be the relationship between *that* set of
> params and a new <lst name="appended"> set.

i have to plead that i'm not following all of these parameter issues  
closely, other than to adjust my ruby code to match the changes with  
the highlighter, and i've just tried the (wonderful!) faceted  
feature.  i'd like to add a "more like this" facility along these  
lines, at least i have that within my custom request handler.  this  
architecture pleases me!

	Erik




Re: additive params

Posted by Chris Hostetter <ho...@fucit.org>.
: What's the usecase here, and the downside to having to provide the
: full fq list in the URL?
: Is this simply to shorten the request URLs?

The main use case is to allow configuration of a requestHandler that the
clients don't have to be aware of at all it "just works" ... in the case
of "fq" it can be for partitioning the index as opposed to filtering down
the results from user options.  (ie: a requestHandler with a permanent
"fq=inStock:true" which the client can be ignorant of when adding
additional "fq" params based on user selected facets to filter on like
price and color)

FQ is the biggest example, but the same mentality could be applied to other
multivalue params.  Imagine if we make the "bf" param of dismax multivalue
(it was always intended to be but SolrQueryRequest didn't support that
when i wrote it and i haven't quite figured out the best backwards
compatible way to change it now) you could configure a hardcoded boost on
"date" to reflect a biz logic bias for "newer" documents you want to
instill in the results, and the client could allow users to specify
additional boosts based on functions of price, or size, etc -- without the
client needing to even be aware of the original "date" boost function.


You could think of it as just shortening the URLs, but we're finding that
in practice it's more important than that.  As it stands now the front end
clients have to be completely aware of the current requestHandler configs
in order to "add" to them -- which means either that information needs to
be on the server and in the client, or it only lives in the client (but
now any new client has to know about that rule as well), or the clients
need to use debugQuery=1 and parse the info about which filter queries are
currently being applied.

: Of course once you make it so that something is always appended, that
: will annoy someone else who wants to completely override it :-)

True -- but that becomes a decision the person defining the configs can
make.  If they want it to be overridable they put it in a "defaults"
list as they can now, if they want it to always be appended to the query
params they can put it in a (new) separate list.

It gives the person configuring the server the ability to say "for
each requestHandler: these things can be overridden, these things can be
added to (and possibly: these things are fixed and can not be modified by
the client)".

: An alternative is to provide some sort of syntactic support for
: specifying the append operation rather than the overwrite operation
: w.r.t. defaults.
:
: fq+=color:red
:
: I'm not sure how hard it would be to support that exact syntax w/
: SolrParams and HttpServletRequest though.

ignoring the issues of URL escaping and what exactly it would look like,
something that would always be possible if we add a method to SolrParams
to get all param names (HttpServletRequest already has this) but it puts the
power in the hands of the client generating the request -- what i'm
suggesting is putting more power in the hands of the people configuring
the index.

There's certainly no reason we can't do both -- what you're describing
would just affect the relationship of the request params with the
<lst name="defaults"> when they get bundled into a DefaultSolrParams;
what i'm describing would be the relationship between *that* set of params and a
new <lst name="appended"> set.



-Hoss


Re: additive params

Posted by Yonik Seeley <yo...@apache.org>.
On 9/8/06, Chris Hostetter <ho...@fucit.org> wrote:
> I was talking to a coworker yesterday who has recently started using the
> DisMax handler ... she was annoyed that the "fq" param could only be
> specified once (which was changed as part of the SimpleFacet patch so
> that's all good) but she was also annoyed that having an "fq" default in
> the solrconfig was completely overridden by an "fq" in the URL --

What's the usecase here, and the downside to having to provide the
full fq list in the URL?
Is this simply to shorten the request URLs?

> something that's still true with the recent changes.  There's currently no
> way to specify some params in the config that can be "added to" by the
> params in the URL.

Of course once you make it so that something is always appended, that
will annoy someone else who wants to completely override it :-)

An alternative is to provide some sort of syntactic support for
specifying the append operation rather than the overwrite operation
w.r.t. defaults.

fq+=color:red

I'm not sure how hard it would be to support that exact syntax w/
SolrParams and HttpServletRequest though.

It could also be done more at the "application level", putting some
sort of indicator in the value that the param should be appended, not
overwritten.

fq=+color:red

Of course '+' is reserved by Lucene, and by url encoding, but you get the idea.
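
To make the "fq+=color:red" idea concrete, here is a rough sketch of how such a
marker could be interpreted. Everything in it is hypothetical (class name,
method name, and the syntax itself); it is not an existing Solr or servlet
feature. It assumes the handler can see the raw, still-encoded query string,
because once a container has URL-decoded the key, the trailing '+' has already
turned into a space.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical parser for the "fq+=color:red" idea; not an existing Solr feature.
class AppendSyntaxSketch {
  /** Applies a raw (still URL-encoded) query string on top of config defaults. */
  static Map<String, List<String>> apply(String rawQuery,
                                         Map<String, List<String>> defaults) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    defaults.forEach((k, v) -> result.put(k, new ArrayList<>(v)));
    Set<String> overwritten = new HashSet<>();
    for (String pair : rawQuery.split("&")) {
      int eq = pair.indexOf('=');
      if (eq < 0) continue; // skip anything that is not key=value
      String rawKey = pair.substring(0, eq);
      // Look for the marker before decoding, since a decoded '+' becomes a space.
      boolean append = rawKey.endsWith("+");
      if (append) rawKey = rawKey.substring(0, rawKey.length() - 1);
      String key = URLDecoder.decode(rawKey, StandardCharsets.UTF_8);
      String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
      if (!append && overwritten.add(key)) {
        result.put(key, new ArrayList<>()); // first plain "key=" discards the defaults
      }
      result.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }
    return result;
  }
}
```

With a default of fq=inStock:true, the raw query "q=foo&fq+=color%3Ared" keeps
the default and adds color:red, while "q=foo&fq=color%3Ared" replaces it.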

-Yonik

> Generalized support for being able to say "The default value list for
> param P is A B C, and at query time I want to add X and Y and eliminate B"
> seems like a really hard problem to solve, but i was thinking of a
> simpler approach that might still achieve the more common case where we
> want values specified in a config to always be "appended" to multivalue
> params using something like this pseudo-java...
>
> public class MergedParams extends DefaultSolrParams {
>   public String[] getParams(String param) {
>     return params.getParams(param).addAll(defaults.getParams(param));
>   }
> }
>
> and request handlers could support this using something like...
>
>   <requestHandler name="dismax" class="solr.DisMaxRequestHandler" >
>     <lst name="appended">
>       <str name="fq">inStock:true</str>
>     </lst>
>     <lst name="default">
>       <str name="fq">color:red</str>
>       <str name="fq">price:[* TO 100]</str>
>     </lst>
>  </requestHandler>
>
> ...with the existing DefaultSolrParams wrapped in a MergedParams using
> the "appended" section.  Now a request for "/select/?qt=dismax&q=foo"
> would get all three "fqs" applied, but a request for
> "/select/?qt=dismax&q=foo&fq=color:blue" would only have two filters
> applied (must be inStock, must be blue) ... this would be particularly
> handy when trying to use "fq" to globally partition your index, as well as
> provide "drill down" filtering options.
>
> The same MergedParams could even support a <lst name="prepended"> set
> of params which would work exactly the same for multivalue params (like
> fq), but would allow the config to dictate the values of single value
> params (like "rows", "facet") so that they can't be overridden at query
> time.
>
>
> What do you guys think?
>
>
> -Hoss