Posted to issues@kudu.apache.org by "Alexey Serbin (Jira)" <ji...@apache.org> on 2020/06/02 21:07:00 UTC

[jira] [Resolved] (KUDU-2169) Allow replicas that do not exist to vote

     [ https://issues.apache.org/jira/browse/KUDU-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin resolved KUDU-2169.
---------------------------------
    Fix Version/s: n/a
       Resolution: Won't Do

We now have the 3-4-3 replica management scheme and no longer use the 3-2-3 scheme.

With the 3-4-3 scheme there are scenarios where the system first evicts a replica and then adds a new non-voter replica: that happens when the replica to be evicted falls behind the WAL segment GC threshold or experiences a disk failure.  In very rare cases a tablet might end up with leader replica A that cannot replicate or commit the change in the Raft configuration as described.

On the other hand, such a newly added replica D is a non-voter under the 3-4-3 scheme, so by definition it cannot vote.

In other words, some manual intervention would be necessary in the described scenario, but not in the way this JIRA proposes.
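
To illustrate why a non-voter D cannot help here, the following is a minimal sketch, not Kudu code. It assumes a majority is counted over voter replicas only; the role strings, the helper names, and the config {A, B voters; D non-voter} are hypothetical and chosen only to mirror the scenario above.

```python
# Minimal sketch (not Kudu's implementation): a majority is computed
# over voter replicas only, so non-voters never contribute to a quorum.
# The role strings below are illustrative labels, not a real API.

def voters(config):
    """Return the set of peers whose role is VOTER."""
    return {peer for peer, role in config.items() if role == "VOTER"}

def can_elect(config, reachable):
    """True if the reachable peers include a strict majority of voters."""
    v = voters(config)
    return len(v & reachable) > len(v) // 2

# Assumed state after one voter was evicted and non-voter D was added.
config = {"A": "VOTER", "B": "VOTER", "D": "NON_VOTER"}

b_is_down = can_elect(config, {"A", "D"})  # D is reachable but cannot vote
b_is_up = can_elect(config, {"A", "B"})
```

With B unavailable, A holds only one of the two voter slots, so no leader can be elected regardless of what D does.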

Closing as 'Won't Do'.

> Allow replicas that do not exist to vote
> ----------------------------------------
>
>                 Key: KUDU-2169
>                 URL: https://issues.apache.org/jira/browse/KUDU-2169
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: consensus
>            Reporter: Mike Percy
>            Priority: Major
>             Fix For: n/a
>
>
> In certain scenarios it is desirable for replicas that do not exist on a tablet server to be able to vote. After the implementation of KUDU-871, tombstoned tablets are now able to vote. However, there are circumstances (at least in a pre-KUDU-1097 world) where voters that do not have a copy of a replica (running or tombstoned) would be needed to vote to ensure availability in certain edge-case failure scenarios.
> The quick justification for why it would be safe for a non-existent replica to vote is that it would be equivalent to a replica that has simply not yet replicated any WAL entries, in which case it would be legal to vote for any candidate. Of course, a candidate would only ask such a replica to vote for it if it believed that replica to be a voter in its config.
> Some additional discussion can be found here: https://github.com/apache/kudu/blob/master/docs/design-docs/raft-tablet-copy.md#should-a-server-be-allowed-to-vote-if-it-does_not_exist-or-is-deleted
> What follows is an example of a scenario where "non-existent" replicas being able to vote would be desired:
> In a 3-2-3 re-replication paradigm, the leader (A) of a 3-replica config {A, B, C} evicts one replica (C). Then, the leader (A) adds a new voter (D). Before A is able to replicate this config change to B or D, A is partitioned from a network perspective. However, A writes this config change to its local WAL. After this, the entire cluster is brought down, the network is restored, and the entire cluster is restarted. However, B fails to come back online due to a hardware failure.
> The only way to automatically recover in this scenario is to allow D, which has no concept of the tablet being discussed, to vote for A to become leader, which will then tablet copy to D and make the tablet available for writes.
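
The argument in the quoted description can be sketched as follows. This is a simplified illustration, not Kudu code: it models only Raft's log comparison (term and already-voted checks are omitted), and `OpId`, `EMPTY_LOG`, the helper names, and A's last log position are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OpId:
    term: int
    index: int

# Hypothetical log position of a replica that has replicated nothing,
# which is how the proposal treats a non-existent replica.
EMPTY_LOG = OpId(term=0, index=0)

def log_is_up_to_date(candidate_last, voter_last):
    """Raft log comparison: the candidate's last entry must not be older."""
    return (candidate_last.term, candidate_last.index) >= (
        voter_last.term, voter_last.index)

def has_majority(config, granted):
    """True if the granted votes form a strict majority of the config."""
    return len(granted & config) > len(config) // 2

# A's pending config after evicting C and adding D as a voter; B is down.
config = {"A", "B", "D"}
a_last = OpId(term=1, index=5)  # assumed value, for illustration only

granted = {"A"}  # A votes for itself
without_d = has_majority(config, granted)

# D has no replica, so it compares the candidate against an empty log.
if log_is_up_to_date(a_last, EMPTY_LOG):
    granted.add("D")
with_d = has_majority(config, granted)
```

Any candidate's log is at least as up to date as an empty one, so under these assumptions D would grant its vote, which gives A two of three votes.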



--
This message was sent by Atlassian Jira
(v8.3.4#803005)