Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2016/11/21 23:37:59 UTC

[jira] [Commented] (LUCENE-7568) Optimize merge when index sorting is used but the index is already sorted

    [ https://issues.apache.org/jira/browse/LUCENE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685113#comment-15685113 ] 

Michael McCandless commented on LUCENE-7568:
--------------------------------------------

This looks great!  Thanks [~jim.ferenczi].

Maybe we could improve the new tests a bit to:

  * Allow merging, using {{newLogMergePolicy}}, which keeps docs in order but randomizes how merges are done; we want to make sure this optimization still applies when a newly merged segment is then picked for another merge

  * Assert that the resulting {{MergeState.needsIndexSort}} is always {{false}}

I think this would increase test coverage since {{MultiSorter.sort}} is only part of the logic in computing that boolean.

For the second part, I think you could write a simple {{FilterCodec}} that overrides one of the formats, e.g. {{PointsFormat}}, so that it can intercept the {{merge}} call and check the boolean at that point?
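
A rough sketch of the intercept-and-delegate idea, written against stand-in types so it compiles without Lucene on the classpath. {{FakeMergeState}}, {{FakeWriter}} and {{InterceptingWriter}} are hypothetical names invented for illustration and are not Lucene classes; in a real test the wrapper would presumably be the writer returned by the overridden format of the {{FilterCodec}}, and the recorded flag would be {{MergeState.needsIndexSort}}.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a merge state; only the flag under test is modeled. */
final class FakeMergeState {
    final boolean needsIndexSort;

    FakeMergeState(boolean needsIndexSort) {
        this.needsIndexSort = needsIndexSort;
    }
}

/** Hypothetical stand-in for the writer side of a codec format. */
interface FakeWriter {
    void merge(FakeMergeState state);
}

/** Records the flag seen on every merge call, then forwards to the wrapped writer. */
final class InterceptingWriter implements FakeWriter {
    private final FakeWriter delegate;
    final List<Boolean> observed = new ArrayList<>();

    InterceptingWriter(FakeWriter delegate) {
        this.delegate = delegate;
    }

    @Override
    public void merge(FakeMergeState state) {
        observed.add(state.needsIndexSort); // intercept: remember what the merge reported
        delegate.merge(state);              // delegate: the wrapped writer still does the work
    }

    /** True if any intercepted merge reported that a sort was needed. */
    boolean sawSortingMerge() {
        return observed.contains(Boolean.TRUE);
    }
}
```

A test built this way would run its indexing and merges, then check that at least one merge was intercepted and that {{sawSortingMerge()}} is false. Recording the flag rather than asserting inside {{merge}} is just one option; asserting directly at the call site would work as well.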

> Optimize merge when index sorting is used but the index is already sorted
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-7568
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7568
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Ferenczi Jim
>         Attachments: LUCENE-7568.patch
>
>
> When index sorting is defined, many optimizations are disabled during the merge. For instance, the bulk merge of compressing stored fields is disabled because documents are not merged sequentially. However, index sorting can be enabled while the index is already in sorted order (the sort field is not filled, or is filled with the same value for all documents). In such a case we can detect that the sort is not needed and activate the merge optimizations.
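
To illustrate the detection described in the issue, here is a minimal sketch that is not the patch's actual logic. It assumes each segment is represented only by its documents' sort keys in doc-ID order, and that a missing sort value is modeled as one shared constant. {{SortCheck}} and {{needsSort}} are hypothetical names.

```java
/** Hypothetical helper: decides whether merging segments in order requires re-sorting. */
final class SortCheck {
    private SortCheck() {}

    /**
     * Returns true if the sort keys, read segment by segment in doc-ID order,
     * ever decrease. If they never decrease, the concatenation is already
     * sorted and a sequential (bulk) merge would preserve the index order.
     */
    static boolean needsSort(long[][] segments) {
        boolean first = true;
        long previous = 0L;
        for (long[] segment : segments) {
            for (long key : segment) {
                if (!first && key < previous) {
                    return true; // out-of-order key found: a real sort is required
                }
                previous = key;
                first = false;
            }
        }
        return false; // empty input or non-decreasing keys: no sort needed
    }
}
```

With every key equal (the "same value for all documents" case), the method returns false, which corresponds to the situation where the merge optimizations can stay enabled.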



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)