Posted to solr-dev@lucene.apache.org by Ryan McKinley <ry...@gmail.com> on 2007/02/22 05:28:42 UTC

sorting parameter changes?

Currently, the standard query handler takes a "smart" sort parameter:

  q=query; sort

and dismax requires a separate sort parameter:

 q=query&sort=sort

Any reason not to have the standard handler accept sort=?  The change
of parameters is a needless gotcha for switching between the two
handlers.  (well, it got me anyway!)

I vote we add the "sort=" parameter and perhaps deprecate the ';' syntax

thoughts?

ryan

Re: sorting parameter changes?

Posted by Ryan McKinley <ry...@gmail.com>.
>
> i think it will impact a lot of people ... everyone currently using the
> standard request handler (which is certainly more than the number of
> people using json, and we made sure there was a backwards compatible
> option for them)
>
> i'm certainly okay however with saying that if they want the backwards
> compatibility, they have to add <bool name="semiColonSort">true</bool> to
> the defaults in their solrconfig.xml.
>

sounds good to me

Re: sorting parameter changes?

Posted by Chris Hostetter <ho...@fucit.org>.
: > i'm certainly okay however with saying that if they want the backwards
: > compatibility, they have to add <bool name="semiColonSort">true</bool> to
: > the defaults in their solrconfig.xml.
:
: i haven't read all the messages on this thread, so forgive me if this
: was already mentioned... how about a <legacy> or <deprecated> area of
: the config file where this setting could live.

it's something you'd probably want to control on a per handler instance
basis ... you might want qt=standard to use it, but not qt=instock where
instock is defined as StandardRequestHandler with an appended fq.
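
A rough sketch of what "per handler instance" could mean in code, assuming each handler carries its own map of defaults. The class and method names are hypothetical and this is not Solr's configuration API; only the "semiColonSort" name comes from the thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-handler defaults read from solrconfig.xml. */
final class PerHandlerLegacySortSketch {
    // handler name -> the defaults configured for that handler instance
    private final Map<String, Map<String, String>> defaultsByHandler = new HashMap<>();

    void configure(String handlerName, Map<String, String> defaults) {
        defaultsByHandler.put(handlerName, new HashMap<>(defaults));
    }

    /** True only when this handler's own defaults set semiColonSort=true. */
    boolean usesSemicolonSort(String handlerName) {
        Map<String, String> defaults = defaultsByHandler.get(handlerName);
        // Boolean.parseBoolean(null) is false, so an unset flag means "off".
        return defaults != null && Boolean.parseBoolean(defaults.get("semiColonSort"));
    }

    public static void main(String[] args) {
        PerHandlerLegacySortSketch config = new PerHandlerLegacySortSketch();
        config.configure("standard", Collections.singletonMap("semiColonSort", "true"));
        config.configure("instock", Collections.singletonMap("fq", "inStock:true"));
        System.out.println("standard: " + config.usesSemicolonSort("standard"));
        System.out.println("instock:  " + config.usesSemicolonSort("instock"));
    }
}
```

With the flag stored alongside each handler's other defaults, two handlers backed by the same class can differ in whether ';' is parsed.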



-Hoss


Re: sorting parameter changes?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 23, 2007, at 12:20 AM, Chris Hostetter wrote:

> i'm certainly okay however with saying that if they want the backwards
> compatibility, they have to add <bool name="semiColonSort">true</bool> to
> the defaults in their solrconfig.xml.

i haven't read all the messages on this thread, so forgive me if this  
was already mentioned... how about a <legacy> or <deprecated> area of  
the config file where this setting could live.

Re: sorting parameter changes?

Posted by Chris Hostetter <ho...@fucit.org>.
: > it could .. but it wouldn't need to be specified as part of every request
: > it could be baked into the solrconfig .. we could even make the default
: > behavior be to not bother with the ";" syntax, and document in the
: > CHANGES.txt that people who want the old behavior need to explicitly add
: > some default param to get it (we already have a comment to the same effect
: > regarding json.nl=map)
: >
:
: If you all are up for it, i think the cleanest thing is to require the
: sort param and not support the ';'
:
: but I don't have a sense of who or how this will affect anyone.

i think it will impact a lot of people ... everyone currently using the
standard request handler (which is certainly more than the number of
people using json, and we made sure there was a backwards compatible
option for them)

i'm certainly okay however with saying that if they want the backwards
compatibility, they have to add <bool name="semiColonSort">true</bool> to
the defaults in their solrconfig.xml.



-Hoss


Re: sorting parameter changes?

Posted by Ryan McKinley <ry...@gmail.com>.
>
> it could .. but it wouldn't need to be specified as part of every request
> it could be baked into the solrconfig .. we could even make the default
> behavior be to not bother with the ";" syntax, and document in the
> CHANGES.txt that people who want the old behavior need to explicitly add
> some default param to get it (we already have a comment to the same effect
> regarding json.nl=map)
>

If you all are up for it, i think the cleanest thing is to require the
sort param and not support the ';'

but I don't have a sense of who or how this will affect anyone.

Re: sorting parameter changes?

Posted by Chris Hostetter <ho...@fucit.org>.
: I agree that a user of the sort parameter wouldn't want their query
: string munged as a side effect of not supplying sort, or setting it to
: blank, on some request.  But having an explicit parameter to control
: every aspect of legacy behavior could certainly lead to nasty parameter
: bloat...

it could .. but it wouldn't need to be specified as part of every request
it could be baked into the solrconfig .. we could even make the default
behavior be to not bother with the ";" syntax, and document in the
CHANGES.txt that people who want the old behavior need to explicitly add
some default param to get it (we already have a comment to the same effect
regarding json.nl=map)

: How about officially deprecating the ;sort behavior, bumping the current
: version attribute to 2.3 or 2.5, and using that to control whether the
: sort==null fork is chosen?  I haven't been able to keep track of the

"version" is a param that (to date) only influences the XmlResponseWriter
(aka: standard); it has never been used to influence execution logic.



-Hoss


Re: sorting parameter changes?

Posted by "J.J. Larrea" <jj...@panix.com>.
At 9:42 PM -0800 2/21/07, Chris Hostetter wrote:
>yeah ... the semantics discussed previously for backwards compatibility
>were...
>
> 1) if sort param, use it and do no special ; parsing
> 2) if no sort param, check for ; and extract sort

FWIW, in an in-progress refactoring of StandardRequestHandler and DisMaxRequestHandler, I made (but did not have a chance to test) the following trivial change:

      // get sort specification, if any, from sort=<sort> param or ;<sort> appended to query
      Sort sort = U.getSort(req);
      if( sort == null ) {
        // legacy mode, where sreq is query;sort
        List<String> commands = StrUtils.splitSmart(sreq,';');
        qs = commands.size() >= 1 ? commands.get(0) : "";
        if (commands.size() >= 2) {
          QueryParsing.SortSpec sortSpec = QueryParsing.parseSort(commands.get(1), req.getSchema());
          if (sortSpec != null) {
            sort = sortSpec.getSort();
          }
        }
      }
     

>..but i'm wondering if we want an explicit param indicating that the old
>behavior should be suppressed -- that way having no sort param can default
>to sort on score just like with dismax (and it won't break if the query
>happens to have a ; in it).

I agree that a user of the sort parameter wouldn't want their query string munged as a side effect of not supplying sort, or setting it to blank, on some request.  But having an explicit parameter to control every aspect of legacy behavior could certainly lead to nasty parameter bloat...

How about officially deprecating the ;sort behavior, bumping the current version attribute to 2.3 or 2.5, and using that to control whether the sort==null fork is chosen?  I haven't been able to keep track of the zillions of improvements in the past few weeks submitted by Ryan et alia, but perhaps there are other API-busting changes which could benefit from being lumped into a version bump?

The fly in the ointment is that in the sample solrconfig the example request handlers have no version default (or it has been commented out).  This is too bad because it means that version-dependent code changes like this will break for a person using an existing solrconfig e.g. customized from the sample.  The sample file really should set version explicitly in all RH definitions in order to establish a contract with the user that the behavior it defines will not change.  Perhaps there should be a top-level solrconfig parameter setting a global request-handler version default, and then the documentation could say "you must set the version to > xxx either globally or on a per-handler or per-request basis in order to use the following features" or vice-versa.
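
To make the version idea concrete, here is a minimal sketch, assuming a version can be set per request, per handler, or globally, with the most specific setting winning. Everything here is hypothetical: the class, the method names, and the 2.3 cutoff (the thread only floats "2.3 or 2.5") are illustration, not existing Solr behavior.

```java
/** Hypothetical sketch of gating the legacy ';' sort behind a version number. */
final class VersionGateSketch {
    // Assumed cutoff, chosen for illustration only.
    static final float SORT_PARAM_ONLY_VERSION = 2.3f;

    /** Most specific setting wins: request, then handler default, then global default. */
    static float effectiveVersion(Float requestVersion, Float handlerDefault, float globalDefault) {
        if (requestVersion != null) {
            return requestVersion;
        }
        if (handlerDefault != null) {
            return handlerDefault;
        }
        return globalDefault;
    }

    /** Versions below the cutoff keep the legacy "query;sort" parsing. */
    static boolean legacySemicolonSort(Float requestVersion, Float handlerDefault, float globalDefault) {
        return effectiveVersion(requestVersion, handlerDefault, globalDefault) < SORT_PARAM_ONLY_VERSION;
    }

    public static void main(String[] args) {
        System.out.println(legacySemicolonSort(null, null, 2.2f));  // nothing overrides the global default
        System.out.println(legacySemicolonSort(null, 2.5f, 2.2f));  // handler default overrides global
        System.out.println(legacySemicolonSort(2.2f, 2.5f, 2.5f));  // request overrides both
    }
}
```

An explicit global default is what would keep an existing solrconfig on its old behavior after an upgrade, which is the contract described above.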

- J.J.

Re: sorting parameter changes?

Posted by Chris Hostetter <ho...@fucit.org>.
: "uncool" comment was, er, uncool, but the motivation was that AFAIK in
: no other place does SOLR make this distinction, an assumption based on
: SolrParams not having an API to test for param existence and being
: explicitly coded around nonexistence and null being equatable; meanwhile
: elsewhere in the code (e.g. parseSort) null and "" are equated.
:
: So this would constitute a policy change, and one that required a

agreed ... trying to drive behavior based on the distinction between a
param being blank and a param not being specified at all is a bad idea ...
particularly since we've observed inconsistencies in the way some servlet
containers deal with empty params...

	http://issues.apache.org/jira/browse/SOLR-12


-Hoss


Re: sorting parameter changes?

Posted by "J.J. Larrea" <jj...@panix.com>.
At 11:30 PM -0800 2/21/07, Ryan McKinley wrote:
>The question is do we want to add *another* parameter to say "don't
>parse the ; sort even if i don't specify the sort parameter"?
>
>Yes, testing the existence/non-existence of a param is not great - but
>I don't think adding another field is worth it for something this
>small that *can* be accomplished with an empty (or explicit
>'sort=score') parameter.  It seems like the effort involved in
>explaining a new parameter is more than saying "if you don't want ';'
>to get parsed as a sorting parameter, make sure to specify a 'sort'
>parameter."

Ryan, I agree with you that adding an extra control parameter is overkill for something so small.

Another question is whether the ; behavior is considered deprecated with the introduction of sort=, or maintained ad infinitum as a parallel approach for sort specification (for StandardRH only, or perhaps for DisMaxRH too if we introduce a parameter to control the q escaping and ; is not escaped).

If deprecated, then I think version= may be the most elegant way to do that, per my prior email.

If parallel, then I agree that existence/nonexistence would be the simplest way to toggle the behavior (from a user perspective).  My "uncool" comment was, er, uncool, but the motivation was that AFAIK in no other place does SOLR make this distinction, an assumption based on SolrParams not having an API to test for param existence and being explicitly coded around nonexistence and null being equatable; meanwhile elsewhere in the code (e.g. parseSort) null and "" are equated.

So this would constitute a policy change, and one that required a fundamental (albeit small) API and logic change; for example, in MapSolrParams and MultiMapSolrParams, one would need to convert present but null-valued Map entries into "" or String[]{}, respectively, either upon initialization or on demand in get(key), so things like DefaultSolrParams and the various SolrParams.getXXX(key, default) can continue to use null as a flag for not-present.  Alternately one could introduce an exists(key) method, but then every one of those checks for null indicating "use the default" would need to be changed so null becomes a first-class citizen.  Or something like that.
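
A minimal sketch of the two options just described, assuming a simple single-valued map of parameters. PresenceAwareParams is a hypothetical class for illustration and is not the SolrParams API: it stores present-but-null entries as "" so that null keeps meaning "not present", and it also offers an exists(key) check.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical params wrapper that separates "absent" from "present but empty". */
final class PresenceAwareParams {
    private final Map<String, String> values = new HashMap<>();

    PresenceAwareParams(Map<String, String> raw) {
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            // A key that is present with a null value is stored as "", so a null
            // result from get() can keep meaning "not present, use the default".
            values.put(entry.getKey(), entry.getValue() == null ? "" : entry.getValue());
        }
    }

    /** Returns null only when the key was not supplied at all. */
    String get(String key) {
        return values.get(key);
    }

    /** Falls back to the default only when the key is absent, not when it is empty. */
    String get(String key, String defaultValue) {
        String value = values.get(key);
        return value == null ? defaultValue : value;
    }

    /** The explicit presence test. */
    boolean exists(String key) {
        return values.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("q", "id:55");
        raw.put("sort", null); // e.g. a container that hands back null for "sort="
        PresenceAwareParams params = new PresenceAwareParams(raw);
        System.out.println("sort present: " + params.exists("sort"));
        System.out.println("rows present: " + params.exists("rows"));
    }
}
```

Normalizing at construction time leaves existing "null means use the default" call sites alone, which is the smaller of the two changes.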

So while such a global policy change would probably be a good thing in the long run, I fear it's not trivial (unless once again I missed something).

- J.J.

Re: sorting parameter changes?

Posted by Ryan McKinley <ry...@gmail.com>.
> >
> >so, one way to disable ; parsing for all queries would be to add a
> >default empty parameter for "sort"
>
> Don't you have that backwards?

I don't think so (but it happens so often)

with the existing parser:

 q=id:55;
 q=id:55
 q=id:55;score asc

are all equivalent.  I'm suggesting we change so that:

 q=id:55&sort=&...
 q=id:55;
 q=id:55
 q=id:55;score

are equal, but not:

 q=id:55&sort=&...
 q=id:55;&sort=&...

The question is do we want to add *another* parameter to say "don't
parse the ; sort even if i don't specify the sort parameter"?

Yes, testing the existence/non-existence of a param is not great - but
I don't think adding another field is worth it for something this
small that *can* be accomplished with an empty (or explicit
'sort=score') parameter.  It seems like the effort involved in
explaining a new parameter is more than saying "if you don't want ';'
to get parsed as a sorting parameter, make sure to specify a 'sort'
parameter."

ryan

Re: sorting parameter changes?

Posted by "J.J. Larrea" <jj...@panix.com>.
At 10:46 PM -0800 2/21/07, Ryan McKinley wrote:
>the 'explicit' parameter would be sending an empty string for 'sort'
>
>In the existing implementation of QueryParsing.parseSort( ):
> null, "", and "score"  all return a 'null' (default) SortSpec
>
>so, one way to disable ; parsing for all queries would be to add a
>default empty parameter for "sort"

Don't you have that backwards?  QueryParsing.parseSort takes the standard approach of equating missing/null and empty/"".  Since it also equates those to "score" (something which breaks the snippet I just sent) it seems to me there is no way to explicitly specify default sort without ; parsing, without changing that logic by differentiating present/non-present from empty (uncool IMHO) or introducing another control parameter as Hoss suggested.  Or did I miss something?
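
The problem can be shown with a simplified stand-in for the parsing step. This is a hedged sketch, not the real QueryParsing.parseSort: it only reproduces the behavior described in this thread, where null, "" and "score" all yield a null (default) result, and the names are hypothetical.

```java
/** Simplified stand-in showing why "sort == null" cannot tell the three cases apart. */
final class SortNormalizationSketch {

    /** Returns null for "use the default sort", otherwise the trimmed sort spec. */
    static String parseSort(String spec) {
        if (spec == null) {
            return null;
        }
        String trimmed = spec.trim();
        if (trimmed.isEmpty() || trimmed.equals("score")) {
            return null;
        }
        return trimmed;
    }

    /** Same shape as a check of the form: if (sort == null) { legacy ';' parsing }. */
    static boolean fallsBackToSemicolonParsing(String sortParam) {
        return parseSort(sortParam) == null;
    }

    public static void main(String[] args) {
        String[] candidates = {null, "", "score", "price asc"};
        for (String sortParam : candidates) {
            System.out.println("sort=" + sortParam + " -> legacy parsing: "
                    + fallsBackToSemicolonParsing(sortParam));
        }
    }
}
```

Under this stand-in, an absent sort, sort= and sort=score all take the legacy path, so the decision would have to be made on the raw parameter before parsing.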

Re: sorting parameter changes?

Posted by Ryan McKinley <ry...@gmail.com>.
>
>  1) if sort param, use it and do no special ; parsing
>  2) if no sort param, check for ; and extract sort
>
> ..but i'm wondering if we want an explicit param indicating that the old
> behavior should be suppressed -- that way having no sort param can default
> to sort on score just like with dismax (and it won't break if the query
> happens to have a ; in it).
>

the 'explicit' parameter would be sending an empty string for 'sort'

In the existing implementation of QueryParsing.parseSort( ):
  null, "", and "score"  all return a 'null' (default) SortSpec

so, one way to disable ; parsing for all queries would be to add a
default empty parameter for "sort"

Re: sorting parameter changes?

Posted by Chris Hostetter <ho...@fucit.org>.
: > Any reason not to have the standard handler accept sort=?
:
: Nope, it's been discussed before I think, and everyone agreed it was a
: good idea.

yeah ... the semantics discussed previously for backwards compatibility
were...

 1) if sort param, use it and do no special ; parsing
 2) if no sort param, check for ; and extract sort

..but i'm wondering if we want an explicit param indicating that the old
behavior should be suppressed -- that way having no sort param can default
to sort on score just like with dismax (and it won't break if the query
happens to have a ; in it).
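
The two rules above can be sketched as follows. This is a minimal illustration, assuming the raw q and sort values arrive as plain strings; SortFallbackSketch and resolve are hypothetical names, and the naive split at the first ';' ignores quoting and escaping, which a real implementation would have to handle.

```java
/** Hypothetical helper illustrating the two backwards-compatibility rules. */
final class SortFallbackSketch {

    /**
     * Returns {query, sort}.
     * Rule 1: a non-null sort param is used as-is and q is left untouched.
     * Rule 2: with no sort param, q is split at the first ';' and the
     *         remainder is treated as the sort spec.
     */
    static String[] resolve(String q, String sortParam) {
        if (sortParam != null) {
            return new String[] {q, sortParam};
        }
        int semicolon = q.indexOf(';'); // naive: no quote or escape handling
        if (semicolon < 0) {
            return new String[] {q, ""};
        }
        return new String[] {q.substring(0, semicolon), q.substring(semicolon + 1).trim()};
    }

    public static void main(String[] args) {
        String[] legacy = resolve("id:55;price asc", null);
        String[] explicit = resolve("title:a;b", "price desc");
        System.out.println("legacy   query=[" + legacy[0] + "] sort=[" + legacy[1] + "]");
        System.out.println("explicit query=[" + explicit[0] + "] sort=[" + explicit[1] + "]");
    }
}
```

Under rule 1 a ';' inside the query survives, which is the case the explicit param is meant to protect.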


-Hoss


Re: sorting parameter changes?

Posted by Yonik Seeley <yo...@apache.org>.
On 2/21/07, Ryan McKinley <ry...@gmail.com> wrote:
> Currently, the standard query handler takes a "smart" sort parameter:
>
>   q=query; sort
>
> and dismax requires a separate sort parameter:
>
>  q=query&sort=sort
>
> Any reason not to have the standard handler accept sort=?

Nope, it's been discussed before I think, and everyone agreed it was a
good idea.

-Yonik