Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/02 21:32:00 UTC

[jira] [Updated] (KUDU-2257) Add tool to recover a tablet that has lost a majority of replicas

     [ https://issues.apache.org/jira/browse/KUDU-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke updated KUDU-2257:
------------------------------
    Labels: supportability  (was: )

> Add tool to recover a tablet that has lost a majority of replicas
> -----------------------------------------------------------------
>
>                 Key: KUDU-2257
>                 URL: https://issues.apache.org/jira/browse/KUDU-2257
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: William Berkeley
>            Priority: Major
>              Labels: supportability
>
> In the unfortunate case where a tablet has lost a majority of its replicas, the way to recover is to rewrite the config of the remaining healthy replicas so that the unhealthy replicas are excluded, then wait for normal re-replication. This involves k similar commands, one on each of the k remaining healthy servers. We could simplify the process by adding a tool, "kudu tablet unsafe_promote_minority", that does the rewrite in one step.
> The tool could use ksck to try to identify the healthy minority, or it could rely on the user to specify it.
> The tool should also make it very clear that it may cause data loss and that its use is a last resort.
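
To illustrate the "k similar commands" the description refers to, here is a minimal sketch that only builds the command lines and does not run them. It assumes the existing manual step is "kudu remote_replica unsafe_change_config <tserver_address> <tablet_id> <peer uuids>..."; the helper name build_unsafe_config_commands and the example addresses and UUIDs are hypothetical and not part of Kudu.

```python
from typing import List, Tuple


def build_unsafe_config_commands(
    tablet_id: str, healthy: List[Tuple[str, str]]
) -> List[List[str]]:
    """Build one config-rewrite command per healthy replica.

    `healthy` is a list of (tserver_address, tserver_uuid) pairs that the
    operator (or ksck) has identified as the healthy minority. This helper
    is hypothetical; it only assembles argument lists and runs nothing.
    """
    if not healthy:
        raise ValueError("at least one healthy replica is required")

    # The new config contains only the healthy replicas' UUIDs.
    uuids = [uuid for _, uuid in healthy]

    # Assumed CLI shape:
    #   kudu remote_replica unsafe_change_config <addr> <tablet_id> <uuids>...
    return [
        ["kudu", "remote_replica", "unsafe_change_config", addr, tablet_id, *uuids]
        for addr, _ in healthy
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    commands = build_unsafe_config_commands(
        "tablet-id-placeholder",
        [("tserver-a:7050", "uuid-a"), ("tserver-b:7050", "uuid-b")],
    )
    for argv in commands:
        print(" ".join(argv))
```

A one-step tool like the one proposed would in effect issue this whole set itself, after warning the user that the operation is unsafe and may lose data.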
