Posted to issues@kudu.apache.org by "Mike Percy (JIRA)" <ji...@apache.org> on 2017/10/05 20:20:00 UTC

[jira] [Created] (KUDU-2169) Allow replicas that do not exist to vote

Mike Percy created KUDU-2169:
--------------------------------

             Summary: Allow replicas that do not exist to vote
                 Key: KUDU-2169
                 URL: https://issues.apache.org/jira/browse/KUDU-2169
             Project: Kudu
          Issue Type: Sub-task
          Components: consensus
            Reporter: Mike Percy


In certain scenarios it is desirable for replicas that do not exist on a tablet server to be able to vote. Since the implementation of KUDU-871, tombstoned tablets are able to vote. However, there are circumstances (at least in a pre-KUDU-1037 world) where a voter that has no copy of a replica at all (neither running nor tombstoned) would need to vote to ensure availability in certain edge-case failure scenarios.

The quick justification for why it would be safe for a non-existent replica to vote is that it would be equivalent to a replica that has simply not yet replicated any WAL entries, in which case it would be legal to vote for any candidate. Of course, a candidate would only ask such a replica to vote for it if it believed that replica to be a voter in its config.
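
The equivalence argument above can be sketched roughly as follows. This is a minimal illustration in Python, not Kudu's actual C++ implementation: the names OpId, ReplicaState, MINIMUM_OP_ID and handle_vote_request are hypothetical, and the sketch ignores how (or whether) a vote cast by a non-existent replica would be persisted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class OpId:
    """Hypothetical (term, index) pair; ordered by term, then index."""
    term: int
    index: int


# A replica that has replicated no WAL entries has this as its last-logged op.
MINIMUM_OP_ID = OpId(term=0, index=0)


@dataclass
class ReplicaState:
    """Hypothetical per-replica consensus state."""
    current_term: int = 0
    voted_for: Optional[str] = None
    last_logged: OpId = MINIMUM_OP_ID


def handle_vote_request(state: Optional[ReplicaState],
                        candidate: str,
                        candidate_term: int,
                        candidate_last_logged: OpId) -> bool:
    """Return True if the vote would be granted.

    state is None when the tablet server has no replica of the tablet,
    neither running nor tombstoned.
    """
    if state is None:
        # Assumption from the text above: treat a non-existent replica
        # like one that has not yet replicated any WAL entries.
        state = ReplicaState()
    if candidate_term < state.current_term:
        return False  # stale candidate
    if candidate_term == state.current_term and state.voted_for not in (None, candidate):
        return False  # already voted for someone else in this term
    # Standard Raft check: the candidate's log must be at least as
    # up to date as the voter's. An empty log never fails this check.
    return candidate_last_logged >= state.last_logged
```

With state=None the log comparison always succeeds, which is the sense in which it would be legal for such a replica to vote for any candidate.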

Some additional discussion can be found here: https://github.com/apache/kudu/blob/master/docs/design-docs/raft-tablet-copy.md#should-a-server-be-allowed-to-vote-if-it-does_not_exist-or-is-deleted

What follows is an example of a scenario where "non-existent" replicas being able to vote would be desired:

In a 3-2-3 re-replication paradigm, the leader (A) of a 3-replica config {A, B, C} evicts one replica (C), leaving {A, B}. Then, the leader (A) adds a new voter (D). Before A is able to replicate this config change to B or D, A is partitioned from the rest of the cluster. However, A does write this config change to its local WAL. After this, the entire cluster is brought down, the network is restored, and the entire cluster is restarted. However, B fails to come back online due to a hardware failure.

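A's latest config is now {A, B, D}, so A needs two of the three votes to win an election, and with B down the only vote available besides its own is D's. The only way to recover automatically in this scenario is therefore to allow D, which has no record of the tablet being discussed, to vote for A. Once elected leader, A can then tablet copy to D and make the tablet available for writes.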
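
The vote counting in the scenario above can be checked with a small sketch. It is illustrative only: majority_size and can_elect are hypothetical helpers, and the sketch assumes every peer that is online and permitted to vote grants its vote to the candidate.

```python
from typing import Set


def majority_size(num_voters: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return num_voters // 2 + 1


def can_elect(candidate: str,
              config: Set[str],
              online: Set[str],
              has_replica: Set[str],
              allow_nonexistent_vote: bool) -> bool:
    """Return True if the candidate can gather a majority of config.

    has_replica holds the servers that have a running or tombstoned
    copy of the tablet. Servers in config but not in has_replica are
    the "non-existent" replicas discussed in this issue.
    """
    votes = 0
    for peer in config:
        if peer not in online:
            continue  # e.g. B, lost to a hardware failure
        if peer == candidate or peer in has_replica or allow_nonexistent_vote:
            votes += 1
    return votes >= majority_size(len(config))


# The scenario from this issue: A's local WAL holds the config {A, B, D}.
config = {"A", "B", "D"}
online = {"A", "D"}      # B never comes back
has_replica = {"A"}      # D has no copy of the tablet at all

stuck = can_elect("A", config, online, has_replica, allow_nonexistent_vote=False)
recovered = can_elect("A", config, online, has_replica, allow_nonexistent_vote=True)
```

Here stuck evaluates to False (A only has its own vote, one of the two required) and recovered evaluates to True, since D's vote completes the majority.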



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)