Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2022/05/31 15:47:00 UTC

[jira] [Reopened] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

     [ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand reopened LUCENE-10574:
-----------------------------------

Reopening: the change to LogMergePolicy isn't good.

The change ignores unbalanced merges and moves on to the next possible merge, which doesn't work well with LogMergePolicy's existing logic. I now understand what is happening, but it's quite complex to explain in words because it's a function of multiple factors.

For instance, here is the distribution of segment sizes that you will get if you do 100 1-document flushes:
[19, 1, 1, 1, 1, 1, 1, 1, 1, 1, 19, 1, 1, 1, 1, 1, 1, 1, 1, 1, 19, 1, 1, 1, 1, 1, 1, 1, 1, 1, 10, 1, 1, 1, 1, 1, 1]
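
As a quick sanity check on the list above, here is a small Python sketch (not part of the original message; the variable names are illustrative) that tallies the quoted sizes. It only summarizes the numbers as given and does not attempt to reproduce LogMergePolicy's merge selection.

```python
from collections import Counter

# Segment sizes (in documents) exactly as quoted in the message above.
segment_sizes = [
    19, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    19, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    19, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    10, 1, 1, 1, 1, 1, 1,
]

total_docs = sum(segment_sizes)           # documents across all segments
segment_count = len(segment_sizes)        # number of segments left in the index
size_histogram = Counter(segment_sizes)   # how many segments have each size

print(f"total docs: {total_docs}")
print(f"segments:   {segment_count}")
for size, count in sorted(size_histogram.items()):
    print(f"  size {size:>2}: {count} segment(s)")
```

The sizes add up to the 100 flushed documents, spread over 37 segments, 33 of which still hold a single document.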


> Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-10574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10574
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: 9.3
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Remove {{floorSegmentBytes}} parameter, or change lucene's default to a merge policy that doesn't merge in an O(n^2) way.
> I have the feeling it might have to be the latter, as folks seem really wed to this crazy O(n^2) behavior.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
