Posted to issues@solr.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2022/09/28 20:30:00 UTC

[jira] [Commented] (SOLR-12730) Implement staggered SPLITSHARD requests in IndexSizeTrigger

    [ https://issues.apache.org/jira/browse/SOLR-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610720#comment-17610720 ] 

David Smiley commented on SOLR-12730:
-------------------------------------

IndexSizeTrigger is gone, but SplitShardCmd nonetheless still takes a `splitFuzz` parameter.

I'd like to argue that this size fuzzing doesn't need to be applied at the SplitShardCmd level or lower.  A simpler approach that accomplishes the same goal (avoiding coinciding splits) is to have the triggering mechanism (formerly IndexSizeTrigger, maybe now in SOLR-16348) lower its split threshold by a per-shard fuzz factor.  So that the same factor is applied each time the threshold is re-evaluated, it can use a random number generator seeded from the shard's name.
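
A minimal sketch of that idea, for illustration only. The class name, method name, and the `maxFuzz` parameter are hypothetical and not part of Solr. It assumes the trigger knows a base threshold in bytes and the shard's name. It uses `SplittableRandom` rather than `java.util.Random`, since sibling shard names such as `shard1_0` and `shard1_1` hash to adjacent seeds and `java.util.Random` can return similar first values for nearby seeds.

```java
import java.util.SplittableRandom;

// Hypothetical helper, not a Solr class: computes a per-shard split threshold.
final class FuzzedSplitThreshold {

    /**
     * Lowers baseThresholdBytes by a pseudo-random fraction in [0, maxFuzz).
     * The fraction depends only on the shard name, so re-evaluating the
     * threshold for the same shard always gives the same answer.
     */
    static long fuzzedThresholdBytes(String shardName, long baseThresholdBytes, double maxFuzz) {
        if (maxFuzz < 0.0 || maxFuzz >= 1.0) {
            throw new IllegalArgumentException("maxFuzz must be in [0, 1)");
        }
        // String.hashCode() is specified by the JDK, so the seed is stable across JVMs.
        SplittableRandom random = new SplittableRandom(shardName.hashCode());
        double fuzz = random.nextDouble() * maxFuzz;
        return (long) (baseThresholdBytes * (1.0 - fuzz));
    }

    public static void main(String[] args) {
        long base = 10L * 1024 * 1024 * 1024; // assumed 10 GiB base threshold
        for (String shard : new String[] {"shard1", "shard1_0", "shard1_1"}) {
            System.out.println(shard + " -> " + fuzzedThresholdBytes(shard, base, 0.2));
        }
    }
}
```

With this shape, the split command itself can keep splitting evenly; the staggering comes only from each shard crossing a slightly different threshold.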

> Implement staggered SPLITSHARD requests in IndexSizeTrigger
> -----------------------------------------------------------
>
>                 Key: SOLR-12730
>                 URL: https://issues.apache.org/jira/browse/SOLR-12730
>             Project: Solr
>          Issue Type: Improvement
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 8.1
>
>
> Simulated large-scale tests uncovered an interesting scenario that also occurs in real clusters where {{IndexSizeTrigger}} is used for controlling the maximum shard size.
> As index size grows and the number of shards grows, if document assignment is more or less even, then at equal intervals (on a {{log2}} scale) there will be an avalanche of SPLITSHARD operations, because all shards will reach the critical size at approximately the same time.
> A hundred or more split shard operations running in parallel may severely affect the cluster performance.
> One possible approach to reduce the likelihood of this situation is to split shards not exactly in half but rather fudge the proportions around 60/40% in a random sequence, so that the resulting sub-sub-sub…shards would reach the thresholds at different times. This would require modifications to the SPLITSHARD command to allow this randomization.
> Another approach would be to simply limit the maximum number of parallel split shard operations. However, this would slow down the process of reaching the balance (increase lag) and possibly violate other operational constraints due to some shards waiting too long for the split and significantly exceeding their max size.
