Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2015/09/01 19:22:46 UTC

[jira] [Commented] (SOLR-2522) add syntax for selecting the min or max of a multivalued field in value source functions

    [ https://issues.apache.org/jira/browse/SOLR-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725722#comment-14725722 ] 

Hoss Man commented on SOLR-2522:
--------------------------------

bq. The code there is populating "exists" and is then populating the "value". According to a recent interaction I had with Adrien on some issue or another on fillValue, looking up "exists" is potentially one disk seek and looking up the "value" is another. ...

Maybe I'm misunderstanding what you're saying...

I'm pretty sure what you describe is only true in the case where NumericDocValues are used under the covers (such as in your example: LongFieldSource) -- because the only way to know in that case if a value exists is to independently call DocValues.getDocsWithField to get a Bits instance.

But that's not applicable here, where the underlying "on disk" representation is SortedSetDocValues. In this case the "exists" information comes directly from the ord value -- values that don't exist have an ord of -1 (note the implementation of the exists() methods in the new code added by this issue). The new ValueFillers all follow the same pattern as DocTermsIndexDocValues, first using the ordinal value to determine if a value exists before trying to assign it. (The only difference here is delegating to the exists() method, which already encapsulates the ordVal check.)
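
To make that pattern concrete, here is a minimal, self-contained sketch. It is not the code from the patch and it does not use the Lucene API: ToyOrdValues, MutableValue and fillValue are hypothetical stand-ins, assumed only for illustration. The point it shows is that a single ord lookup answers both "does a value exist?" and "which value is it?".

```java
/**
 * Illustrative sketch only: every type here is a hypothetical stand-in,
 * not a Lucene or Solr class.
 */
public class OrdExistsSketch {

    /** Sentinel ordinal meaning "this document has no value". */
    static final int MISSING_ORD = -1;

    /** Toy per-document ordinal view over a sorted term dictionary. */
    static final class ToyOrdValues {
        final String[] dictionary; // terms, addressed by ordinal
        final int[] ordByDoc;      // selected ordinal per doc, or MISSING_ORD

        ToyOrdValues(String[] dictionary, int[] ordByDoc) {
            this.dictionary = dictionary;
            this.ordByDoc = ordByDoc;
        }

        int ord(int doc) {
            return ordByDoc[doc];
        }

        /** Existence is derived from the ordinal; no second structure is consulted. */
        boolean exists(int doc) {
            return ord(doc) != MISSING_ORD;
        }

        String lookup(int ord) {
            return dictionary[ord];
        }
    }

    /** Toy holder for the result of a fill operation. */
    static final class MutableValue {
        boolean exists;
        String value;
    }

    /** Check exists() first, and only resolve the value when there is one. */
    static void fillValue(ToyOrdValues values, int doc, MutableValue out) {
        out.exists = values.exists(doc);
        out.value = out.exists ? values.lookup(values.ord(doc)) : null;
    }

    public static void main(String[] args) {
        ToyOrdValues values = new ToyOrdValues(
                new String[] {"apple", "pear"},
                new int[] {1, MISSING_ORD, 0});
        MutableValue mv = new MutableValue();
        for (int doc = 0; doc < 3; doc++) {
            fillValue(values, doc, mv);
            System.out.println("doc " + doc + ": exists=" + mv.exists + " value=" + mv.value);
        }
    }
}
```

In the NumericDocValues case there is no such sentinel, so the existence check has to come from a separate structure, which is where the extra lookup would come from.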

bq. In TrieLongField's longVal() it should check if the bytes is 0 length and if so return 0 instead of attempting to decode, which will fail (I tried).

Hmmm... the start of your comment mentioned that you thought these changes would be a perf improvement, but the wording at the end of your comment ("will fail (I tried)") sounds like you're saying you found/demonstrated a bug ... but it's not clear to me exactly what the bug is or how to reproduce it?

If there is a bug, can you please open a new jira (since this feature was already released in 5.3) with either a test case or an example of how to reproduce it?

(FWIW: I'm working on a blog post about this new feature with some benchmarks comparing it to sorting on a single valued field. Even if I misunderstood about whether you found a bug, if you can whip up a patch demonstrating your perf improvement idea -- even if it's just a single field type -- I'm happy to test it as well, and flesh it out to all field types.)


> add syntax for selecting the min or max of a multivalued field in value source functions
> ----------------------------------------------------------------------------------------
>
>                 Key: SOLR-2522
>                 URL: https://issues.apache.org/jira/browse/SOLR-2522
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Bill Bell
>            Assignee: Hoss Man
>             Fix For: 5.3, Trunk
>
>         Attachments: SOLR-2522.patch, SOLR-2522.patch, SOLR-2522.patch
>
>
> Initial request...
> bq. Switch max() and min() functions to work on multiValued fields so we can 
> do sort=min(fieldname) asc and the sort would work on multiValued fields...
> ...this specific syntax has been spun off into SOLR-7853, but the underlying functionality is being implemented here using a new optional second argument to the {{field()}} function: {{field(multivalued_field_name,min)}} and {{field(multivalued_field_name,max)}}.


