Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2018/01/15 07:34:00 UTC

[jira] [Commented] (SOLR-11854) multiValued PrimitiveFieldType should implicitly sort on min/max based on the asc/desc keyword

    [ https://issues.apache.org/jira/browse/SOLR-11854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325960#comment-16325960 ] 

Hoss Man commented on SOLR-11854:
---------------------------------

 
In the attached patch, I've beefed up the existing (and added some new) helper methods in {{FieldType}} related to sorting to depend on a new method that subclasses can override...

{code}
  /**
   * Method for indicating which {@link MultiValueSelector} (if any) should be used when
   * sorting on a multivalued field of this type for the specified direction (asc/desc).  
   * The default implementation returns <code>null</code> (for all inputs).
   *
   * @param field The SchemaField (of this type) in question
   * @param reverse false if this is an ascending sort, true if this is a descending sort.
   * @return the implicit selector to use for this direction, or null if implicit sorting on the specified direction is not supported and should return an error.
   * @see MultiValueSelector
   */
  public MultiValueSelector getDefaultMultiValueSelectorForSort(SchemaField field, boolean reverse) {
    // trivial base case
    return null;
  }
{code}

...and then I've overridden it in {{PrimitiveFieldType}} to look like this...

{code}
  public MultiValueSelector getDefaultMultiValueSelectorForSort(SchemaField field, boolean reverse) {
    return reverse ? MultiValueSelector.MAX : MultiValueSelector.MIN;
  }
{code}


...so by default, arbitrary field types will not support this implicit min/max selection based on the asc/desc keyword -- but {{PrimitiveFieldType}} subclasses (numerics, boolean, str, etc...) will support it.  Custom field types can also override this method if they wish -- they can even override it to flip the mapping such that asc->max and desc->min.
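
To make the override hook concrete, here is a minimal standalone sketch of the three cases described above (no implicit selector, the primitive default, and a flipped mapping). Everything in it is a simplified stand-in I made up for illustration: the {{Sketch*}} classes are hypothetical, the {{SchemaField}} parameter is dropped, and the enum only mimics {{MultiValueSelector}} -- it is not the code in the patch.

```java
// Simplified stand-ins for illustration only; these are NOT the real Solr classes.
enum MultiValueSelector { MIN, MAX }

class SketchFieldType {
  // Base case: no implicit selector, so an implicit sort on a
  // multivalued field of this type would be reported as an error.
  MultiValueSelector getDefaultMultiValueSelectorForSort(boolean reverse) {
    return null;
  }
}

class SketchPrimitiveFieldType extends SketchFieldType {
  // Mapping described in the comment: asc -> MIN, desc -> MAX.
  @Override
  MultiValueSelector getDefaultMultiValueSelectorForSort(boolean reverse) {
    return reverse ? MultiValueSelector.MAX : MultiValueSelector.MIN;
  }
}

// Hypothetical custom type that flips the mapping: asc -> MAX, desc -> MIN.
class SketchFlippedFieldType extends SketchPrimitiveFieldType {
  @Override
  MultiValueSelector getDefaultMultiValueSelectorForSort(boolean reverse) {
    return reverse ? MultiValueSelector.MIN : MultiValueSelector.MAX;
  }
}

class SelectorSketchDemo {
  public static void main(String[] args) {
    SketchFieldType[] types = {
      new SketchFieldType(), new SketchPrimitiveFieldType(), new SketchFlippedFieldType()
    };
    for (SketchFieldType t : types) {
      System.out.println(t.getClass().getSimpleName()
          + " asc=" + t.getDefaultMultiValueSelectorForSort(false)
          + " desc=" + t.getDefaultMultiValueSelectorForSort(true));
    }
  }
}
```

The point of returning {{null}} rather than throwing in the base case is that the caller decides how to report the unsupported sort.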

The rest of the patch consists of:
* refactoring a lot of redundant/common code related to sorting into helper methods.  Notably: I refactored a lot of the details related to the logic of which numeric values to use for {{sortMissingLast}} and {{sortMissingFirst}} depending on the {{asc|desc}} choice into the existing {{NumberType}} enum so that they could be removed from a lot of concrete type classes.
* adding {{StrField.getSingleValueSource}} -- in my opinion I dropped the ball by not including this in SOLR-2522.  Adding it now allows the same explicit sort syntax like {{sort=field(my_str,min) desc}} to work, and along with the other changes above, the new implicit multivalued sorting works automatically as well.
* dealing with some unique special case "sort missing" behavior in {{enum}} field types.
* beefing up tests of the explicit function syntax on strings, as well as the implicit sort syntax for all primitive types
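
For anyone unfamiliar with the {{sortMissingLast}} / {{sortMissingFirst}} logic mentioned in the first bullet, the general idea can be sketched as below. This is only an illustration under my own assumptions: {{SortMissingSketch}} and {{missingValue}} are hypothetical names, the sketch only covers {{long}} values, and it is not the actual {{NumberType}} code in the patch. The idea is that a document with no value gets a substitute value chosen so that it lands at the requested end of the results once the sort direction is taken into account.

```java
class SortMissingSketch {
  /**
   * Hypothetical helper (not Solr's API): picks the value to substitute for
   * documents that have no value in the sort field.
   *
   * @param sortMissingLast true to put such docs last, false to put them first
   * @param reverse false for an ascending sort, true for a descending sort
   */
  static long missingValue(boolean sortMissingLast, boolean reverse) {
    // asc  + last  -> largest value sorts at the end
    // desc + last  -> smallest value sorts at the end once the order is reversed
    // asc  + first -> smallest value sorts at the start
    // desc + first -> largest value sorts at the start once the order is reversed
    boolean useMax = (sortMissingLast != reverse);
    return useMax ? Long.MAX_VALUE : Long.MIN_VALUE;
  }

  public static void main(String[] args) {
    for (boolean last : new boolean[] {true, false}) {
      for (boolean reverse : new boolean[] {false, true}) {
        System.out.println((last ? "sortMissingLast" : "sortMissingFirst")
            + (reverse ? " desc -> " : " asc -> ")
            + missingValue(last, reverse));
      }
    }
  }
}
```

Keeping this decision in one place is what lets the individual concrete type classes drop their own copies of it.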

----

I think this patch is solid & pretty much good to go ... anyone have any concerns?



 

> multiValued PrimitiveFieldType should implicitly sort on min/max based on the asc/desc keyword
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11854
>                 URL: https://issues.apache.org/jira/browse/SOLR-11854
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>            Priority: Major
>         Attachments: SOLR-11854.patch
>
>
> Back in SOLR-2522, I added new syntax for (numeric) fields such that the {{field(someMultivaluedFieldName,min|max)}} syntax could be used to select either the min or max value of a multivalued (docvalues) field for use in other functions -- or for sorting.
> A little while back, it occurred to me that a good "default" behavior for all primitive multivalued fields would be:
> * automatically use the "min" value when {{sort=someMultivaluedFieldName asc}} is attempted
> * automatically use the "max" value when {{sort=someMultivaluedFieldName desc}} is attempted
> These defaults seem like they would be a big improvement over the current "throw an error" default behavior -- especially since it naturally reduces down in the trivial case where all docs have at most 1 value anyway -- and would align in practice with how most people I've talked to seem to think "sorting on a multivalued field" should work in theory.   If users don't like these defaults, they can always use the explicit {{field(foo,min|max)}} syntax instead (ex: if users always want multivalued fields to sort on the 'min' value, regardless of the asc|desc selector)
> I've been experimenting with this off and on for a while, working up a POC patch -- I think it's worth doing (details to follow in comment)


