Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2018/01/11 19:06:00 UTC

[jira] [Created] (KUDU-2257) Add tool to recover a tablet that has lost a majority of replicas

Will Berkeley created KUDU-2257:
-----------------------------------

             Summary: Add tool to recover a tablet that has lost a majority of replicas
                 Key: KUDU-2257
                 URL: https://issues.apache.org/jira/browse/KUDU-2257
             Project: Kudu
          Issue Type: Improvement
            Reporter: Will Berkeley


In the unfortunate case where a tablet has lost a majority of its replicas, the way to recover is to rewrite the config on each of the remaining healthy members so that the unhealthy members are excluded, then wait for normal re-replication. Today this takes k similar commands, one on each of the k remaining healthy servers. We could simplify the process by adding a tool, "kudu tablet unsafe_promote_minority", that does the rewrite in one step.
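
For illustration only, here is a rough sketch of the manual procedure described above, i.e. the k commands the proposed tool would replace. It assumes the existing per-server rewrite has the shape "kudu remote_replica unsafe_change_config <tserver_address> <tablet_id> <peer uuids>..."; the helper name, the Replica type, and the addresses and UUIDs are hypothetical and not part of Kudu.

```python
from typing import List, NamedTuple


class Replica(NamedTuple):
    address: str  # tablet server RPC address (hypothetical value below)
    uuid: str     # tablet server UUID (hypothetical value below)


def build_rewrite_commands(tablet_id: str, healthy: List[Replica]) -> List[List[str]]:
    """Return one config-rewrite command per remaining healthy replica.

    Every command names only the healthy UUIDs as the new config, which is
    what excludes the unhealthy members.
    Assumption: the command shape is
    `kudu remote_replica unsafe_change_config <address> <tablet_id> <uuids...>`.
    """
    if not healthy:
        raise ValueError("no healthy replicas to promote")
    uuids = [r.uuid for r in healthy]
    return [
        ["kudu", "remote_replica", "unsafe_change_config", r.address, tablet_id, *uuids]
        for r in healthy
    ]


if __name__ == "__main__":
    # Hypothetical tablet with two surviving replicas out of five.
    survivors = [Replica("ts1.example.com:7050", "uuid-a"),
                 Replica("ts2.example.com:7050", "uuid-b")]
    for cmd in build_rewrite_commands("tablet-123", survivors):
        print(" ".join(cmd))
```

The sketch only builds the argument lists and does not run anything, since the rewrite is unsafe; a one-step tool would presumably issue the equivalent requests itself.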

The tool could use ksck to try to identify the healthy minority, or it could rely on the user to specify it.
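
A minimal sketch of the selection step, whichever way the health information is obtained. It assumes per-replica health is already available as a mapping from UUID to a boolean (from ksck or from the user); it does not parse ksck output, and the function name is hypothetical. It also interprets "lost a majority" as "the healthy replicas can no longer form a majority", which is an assumption rather than something this issue specifies.

```python
from typing import Dict, List


def healthy_minority(replica_health: Dict[str, bool]) -> List[str]:
    """Return the UUIDs of the healthy replicas, sorted, if they are a minority.

    Raises ValueError when there is nothing to promote or when the healthy
    replicas can still form a majority, in which case the unsafe rewrite
    should not be needed.
    """
    total = len(replica_health)
    healthy = sorted(uuid for uuid, ok in replica_health.items() if ok)
    majority = total // 2 + 1  # smallest count that is a majority of `total`
    if not healthy:
        raise ValueError("no healthy replicas remain")
    if len(healthy) >= majority:
        raise ValueError("healthy replicas still form a majority")
    return healthy


if __name__ == "__main__":
    # Hypothetical 3-replica tablet with a single survivor.
    print(healthy_minority({"uuid-a": True, "uuid-b": False, "uuid-c": False}))
```

Refusing to proceed when a majority is still healthy keeps the unsafe path limited to the last-resort case.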

The tool should also make it very clear that it may cause data loss and that its use is a last resort.



