Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2016/08/05 17:42:21 UTC

[jira] [Updated] (SOLR-6465) CDCR: fall back to whole-index replication when tlogs are insufficient

     [ https://issues.apache.org/jira/browse/SOLR-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-6465:
----------------------------------------
    Attachment: SOLR-6465.patch

Major changes:
# Hardened the bootstrap and bootstrap status request code paths. The bootstrap is still done only once during init, but I wrote chaos-monkey-style tests to exercise this path. Also see SOLR-9364.
# tlog replication can be disabled via a parameter, which target clusters use during bootstrap. This prevents tlogs on source leaders from being replicated to target leaders.
# Assert that we are the leader before starting the bootstrap process
# Bootstrap uses the same lock as recovery to avoid racing with recovery and potentially corrupting the index
# CdcrReplicatorState is initialized eagerly rather than waiting for bootstrap, so that the QUEUES action works
# Added a new test CdcrBootstrapTest#testBootstrapWithContinousIndexingOnSourceCluster to stress bootstrap during indexing load
# All existing tests pass and precommit passes
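
To illustrate the leader assertion and the shared recovery lock from the list above, here is a minimal sketch. It is not the code in the patch: the class BootstrapGuard, its methods, and the choice of a non-blocking tryLock are all assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; none of these names are Solr's actual classes. */
class BootstrapGuard {
  // One lock shared by recovery and bootstrap so the two never run concurrently.
  private final ReentrantLock recoveryLock = new ReentrantLock();
  // Assumed callback that reports whether this replica is currently the leader.
  private final BooleanSupplier isLeader;

  BootstrapGuard(BooleanSupplier isLeader) {
    this.isLeader = isLeader;
  }

  /** Runs the bootstrap task; returns false if another thread holds the lock. */
  boolean tryBootstrap(Runnable bootstrapTask) {
    if (!isLeader.getAsBoolean()) {
      throw new IllegalStateException("bootstrap requested on a non-leader replica");
    }
    return runUnderLock(bootstrapTask);
  }

  /** Runs the recovery task under the same lock that bootstrap uses. */
  boolean tryRecovery(Runnable recoveryTask) {
    return runUnderLock(recoveryTask);
  }

  private boolean runUnderLock(Runnable task) {
    if (!recoveryLock.tryLock()) {
      return false; // the lock is held elsewhere; skip rather than race
    }
    try {
      task.run();
      return true;
    } finally {
      recoveryLock.unlock();
    }
  }
}
```

Whether to block on the lock or skip when it is already held is a design choice; the sketch skips, which suits a caller that polls bootstrap status and retries.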

The current patch implements the goal of this ticket, which is to fall back to whole-index replication when tlogs are insufficient. Therefore, this patch does not remove CdcrUpdateLog and related functionality, which can be a bit complicated, as Renaud had pointed out. This patch also does not allow updates to be sent while a bootstrap is in progress. Doing that opens a can of worms because you need to carefully coordinate with the leader the order of the hard commit and the start of buffering to avoid losing documents. Unless the source cluster has very high update rates, the replicator thread should be able to catch up even without this head start.
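
The fallback decision itself can be pictured roughly as follows. This is only a sketch under stated assumptions: ReplicationPlanner, the NO_TLOGS sentinel, and the version-comparison rule are hypothetical and do not reflect how the patch detects insufficient tlogs.

```java
/** Hypothetical sketch of choosing between tlog replay and a full bootstrap. */
class ReplicationPlanner {
  enum Strategy { TLOG_REPLAY, FULL_INDEX_BOOTSTRAP }

  // Assumed sentinel meaning "the source has no tlog entries at all".
  static final long NO_TLOGS = -1L;

  /**
   * @param oldestTlogVersion oldest update version still present in the source
   *                          tlogs, or NO_TLOGS if none remain
   * @param targetCheckpoint  last update version the target has already applied
   */
  static Strategy choose(long oldestTlogVersion, long targetCheckpoint) {
    if (oldestTlogVersion == NO_TLOGS) {
      return Strategy.FULL_INDEX_BOOTSTRAP; // nothing to replay from
    }
    if (targetCheckpoint < oldestTlogVersion) {
      // Updates between the checkpoint and the oldest tlog entry may have been
      // pruned, so replaying tlogs could silently skip documents.
      return Strategy.FULL_INDEX_BOOTSTRAP;
    }
    return Strategy.TLOG_REPLAY; // tlogs still cover everything after the checkpoint
  }
}
```

The rule is deliberately conservative: whenever the tlogs may not cover the target's checkpoint, it prefers the more expensive whole-index copy over risking missed updates.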

I plan to commit this patch as-is and open follow-up issues for refactoring and other improvements.

> CDCR: fall back to whole-index replication when tlogs are insufficient
> ----------------------------------------------------------------------
>
>                 Key: SOLR-6465
>                 URL: https://issues.apache.org/jira/browse/SOLR-6465
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
>         Attachments: SOLR-6465.patch, SOLR-6465.patch, SOLR-6465.patch
>
>
> When the peer-shard doesn't have transaction logs to forward all the needed updates to bring a peer up to date, we need to fall back to normal replication.


