Posted to dev@lucene.apache.org by "Amrit Sarkar (JIRA)" <ji...@apache.org> on 2018/07/23 20:02:00 UTC

[jira] [Commented] (SOLR-12572) Reuse fieldvalues computed while sorting at writing in ExportWriter

    [ https://issues.apache.org/jira/browse/SOLR-12572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553333#comment-16553333 ] 

Amrit Sarkar commented on SOLR-12572:
-------------------------------------

Attached WIP patch, which reuses docValues for "numeric" field types. Initial tests showed an improvement of up to 30% in execution time against the export handler, for 2.5M docs and sorting on 8 fields.

I am hitting an IndexOutOfBounds exception for the string and boolean data types (boolean implements the string logic), because "ordinals" are computed while sorting rather than "exact" values. I will detail this further after performing a few more tests.
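
To illustrate the difference between an ordinal and an exact value, here is a minimal sketch in plain Java. It does not use the Lucene or Solr APIs and is not the patch's code: SegmentDictionary, ordOf and lookupOrd are hypothetical names, and the assumption that an ordinal gets resolved against the wrong dictionary is only one plausible way such an exception could arise, not a confirmed diagnosis.

```java
import java.util.Arrays;

final class OrdinalVsValueSketch {

    /** Hypothetical per-segment dictionary: the distinct terms of a field, sorted. */
    static final class SegmentDictionary {
        private final String[] sortedTerms;

        SegmentDictionary(String... terms) {
            sortedTerms = terms.clone();
            Arrays.sort(sortedTerms);
        }

        /** Position of the term in this dictionary (negative if absent). */
        int ordOf(String term) {
            return Arrays.binarySearch(sortedTerms, term);
        }

        /** Resolves an ordinal back to the term; only valid for this dictionary. */
        String lookupOrd(int ord) {
            return sortedTerms[ord];
        }
    }

    public static void main(String[] args) {
        SegmentDictionary segmentA = new SegmentDictionary("info", "warn", "error");
        SegmentDictionary segmentB = new SegmentDictionary("debug");

        // While sorting, only the ordinal is kept, not the term itself.
        int ord = segmentA.ordOf("warn");

        // Resolving the ordinal against the dictionary it came from works.
        System.out.println(segmentA.lookupOrd(ord));

        // Resolving it against another dictionary is out of range here.
        try {
            System.out.println(segmentB.lookupOrd(ord));
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ordinal " + ord + " is not valid in segmentB");
        }
    }
}
```

A numeric value can be carried from the sort phase to the write phase as-is, whereas an ordinal is only meaningful together with the dictionary that produced it, so reusing it requires either keeping that association or resolving the term up front.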

> Reuse fieldvalues computed while sorting at writing in ExportWriter
> -------------------------------------------------------------------
>
>                 Key: SOLR-12572
>                 URL: https://issues.apache.org/jira/browse/SOLR-12572
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>            Reporter: Amrit Sarkar
>            Assignee: Varun Thacker
>            Priority: Minor
>         Attachments: SOLR-12572.patch, SOLR-12572.patch
>
>
> Today, while exporting results through the "/export" handler,
> {code:java}
> http://localhost:8983/solr/core_name/export?q=my-query&sort=severity+desc,timestamp+desc&fl=severity,timestamp,msg
> {code}
> Doc-values are sought for all the {{sort}} fields defined (in this example 'severity', 'timestamp'). When we stream out docs, we again make doc-value seeks against the {{fl}} fields ('severity', 'timestamp', 'msg').
> In the most common use cases the {{fl}} fields equal the {{sort}} fields, or at least the sort fields are a subset of the {{fl}} fields, so if we *pre-collect* the values while sorting, we can reduce the doc-value seeks, potentially bringing a *speed improvement*.
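
The idea in the description can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not ExportWriter's actual code: CountingDocValues, SortDoc and export are hypothetical names, all fields are treated as numeric (the third field, "code", is a made-up numeric stand-in for a non-sort {{fl}} field), and the "seeks" counter merely simulates doc-value lookups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortValueReuseSketch {

    /** Hypothetical stand-in for a doc-values reader; counts every lookup. */
    static final class CountingDocValues {
        private final Map<String, long[]> columns;
        int seeks = 0;

        CountingDocValues(Map<String, long[]> columns) {
            this.columns = columns;
        }

        long get(String field, int docId) {
            seeks++;
            return columns.get(field)[docId];
        }
    }

    /** A document id plus the sort-field values collected while sorting. */
    static final class SortDoc {
        final int docId;
        final Map<String, Long> sortValues = new LinkedHashMap<>();

        SortDoc(int docId) {
            this.docId = docId;
        }
    }

    static List<Map<String, Long>> export(CountingDocValues dv, int numDocs,
                                          List<String> sortFields, List<String> fl) {
        // Sort phase: one lookup per (doc, sort field); the value is retained.
        List<SortDoc> docs = new ArrayList<>();
        for (int d = 0; d < numDocs; d++) {
            SortDoc sd = new SortDoc(d);
            for (String f : sortFields) {
                sd.sortValues.put(f, dv.get(f, d));
            }
            docs.add(sd);
        }
        // Descending on every sort field, ties broken by doc id.
        docs.sort((a, b) -> {
            for (String f : sortFields) {
                int c = Long.compare(b.sortValues.get(f), a.sortValues.get(f));
                if (c != 0) {
                    return c;
                }
            }
            return Integer.compare(a.docId, b.docId);
        });

        // Write phase: reuse retained values; look up only the remaining fl fields.
        List<Map<String, Long>> rows = new ArrayList<>();
        for (SortDoc sd : docs) {
            Map<String, Long> row = new LinkedHashMap<>();
            for (String f : fl) {
                Long cached = sd.sortValues.get(f);
                row.put(f, cached != null ? cached : dv.get(f, sd.docId));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, long[]> columns = new HashMap<>();
        columns.put("severity", new long[] {2, 5, 5});
        columns.put("timestamp", new long[] {100, 200, 300});
        columns.put("code", new long[] {7, 8, 9});
        CountingDocValues dv = new CountingDocValues(columns);

        List<Map<String, Long>> rows = export(dv, 3,
                Arrays.asList("severity", "timestamp"),
                Arrays.asList("severity", "timestamp", "code"));

        rows.forEach(System.out::println);
        System.out.println("doc-value lookups: " + dv.seeks);
    }
}
```

In this toy setup with 3 docs, 2 sort fields and 3 {{fl}} fields, the write phase only looks up the one field that was not part of the sort, so the total is 9 simulated lookups rather than the 15 that would be needed if every {{fl}} field were fetched again.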



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
