Posted to issues@lucene.apache.org by "Megan Carey (Jira)" <ji...@apache.org> on 2021/01/29 00:27:00 UTC

[jira] [Comment Edited] (SOLR-15119) Make LINK splitMethod the default for SplitShardCmd

    [ https://issues.apache.org/jira/browse/SOLR-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17274053#comment-17274053 ] 

Megan Carey edited comment on SOLR-15119 at 1/29/21, 12:26 AM:
---------------------------------------------------------------

[~anshum] - in a sense, that's true. The IndexWriter enqueues half of the documents to be deleted from each sub-shard, but it can take a while for those deletes to complete. I assume (maybe incorrectly) that most users of SplitShardCmd trigger splits based on index size (either via the IndexSizeTrigger in branch_8x or via another custom process). In my case, it's best for splits to be fast, even at the cost of disk usage. I haven't had issues with LINK splits in terms of achieving the desired replication factor or rebalancing - can you elaborate on that part?

REWRITE splits can take hours, depending on the size of the shard at the time of the split. During that time, the TLOG also grows pretty large as it buffers updates (so REWRITE is perhaps also hard on disk usage?). It's definitely a trade-off. I'll bring this discussion to Slack in case more folks want to weigh in.

[Edit] Discussion here: https://the-asf.slack.com/archives/CEKUCUNE9/p1611878168059800


> Make LINK splitMethod the default for SplitShardCmd
> ---------------------------------------------------
>
>                 Key: SOLR-15119
>                 URL: https://issues.apache.org/jira/browse/SOLR-15119
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: master (9.0)
>            Reporter: Megan Carey
>            Priority: Major
>              Labels: easy-fix
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> REWRITE splitMethod is still the default in SplitShardCmd [1], despite LINK being much faster. IndexSizeTrigger in branch_8x already uses LINK by default [2], and we have found LINK to be reliable and performant at scale. This work will just update the default in SplitShardCmd to make LINK the default overall.
>  
>  [1][https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java#L88]
>  [2][https://github.com/apache/lucene-solr/blob/branch_8x/solr/core/src/java/org/apache/solr/cloud/autoscaling/IndexSizeTrigger.java#L186]
>  
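
Until the default changes, a caller who wants a LINK split has to ask for it explicitly. As a rough sketch (not taken from the issue), the Python snippet below builds a Collections API SPLITSHARD request that passes splitMethod=link. The helper name split_shard_url, the host, and the collection and shard names are hypothetical placeholders; the splitMethod parameter and its link/rewrite values are assumed to behave as described in the Solr Reference Guide.

```python
from urllib.parse import urlencode

# Values of splitMethod assumed to be accepted by SPLITSHARD.
VALID_SPLIT_METHODS = ("link", "rewrite")


def split_shard_url(base_url, collection, shard, split_method="link"):
    """Build a SPLITSHARD request URL with an explicit splitMethod.

    This helper is hypothetical and only assembles the URL; it does not
    contact Solr.
    """
    if split_method not in VALID_SPLIT_METHODS:
        raise ValueError(f"unsupported splitMethod: {split_method!r}")
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "splitMethod": split_method,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host, collection, and shard names.
    print(split_shard_url("http://localhost:8983/solr", "my_collection", "shard1"))
```

Passing split_method="rewrite" produces the request for the current default behaviour, which makes it easy to compare the two methods on the same shard.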


