Posted to issues@ozone.apache.org by "Ethan Rose (Jira)" <ji...@apache.org> on 2021/02/01 22:45:00 UTC

[jira] [Updated] (HDDS-4775) Avoid OM split brain going from 1 node OM to 3 node ratis without bootstrap

     [ https://issues.apache.org/jira/browse/HDDS-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Rose updated HDDS-4775:
-----------------------------
    Description: 
The expected flow to add more OMs to a single-node OM Ratis cluster is to adjust the configurations to add the new OMs, and then start the new OMs in bootstrap mode (supply the --bootstrap flag on startup) so that they will get all transaction history from the original OM. However, it is possible for the user to mistakenly adjust the configs and start the new OMs without the --bootstrap flag. This will cause the two new OMs to form their own Ratis ring with a 2/3 majority that can service write requests, while the original OM is still the leader of a single-node Ratis ring and is also servicing write requests. This leads to a split-brain scenario that is difficult to detect without inspecting the logs, because all OMs appear functional and the cluster is writeable.

This Jira aims to detect such a scenario and shut down the OMs when it occurs, instructing the user to bootstrap the new OMs on startup instead.
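
The quorum arithmetic behind this scenario can be sketched as follows. This is only an illustrative model, not Ozone or Ratis code: the names RingView, quorum, writable, and is_split_brain are hypothetical, and the sketch assumes the usual strict-majority rule for committing writes.

```python
from dataclasses import dataclass


def quorum(ring_size: int) -> int:
    """Smallest vote count that is a strict majority of ring_size peers."""
    return ring_size // 2 + 1


@dataclass(frozen=True)
class RingView:
    """Hypothetical model of one group of OMs and the ring they believe in."""
    configured: frozenset  # peers this group believes belong to the ring
    voting: frozenset      # peers in this group that actually vote together


def writable(view: RingView) -> bool:
    """A group can commit writes if its voters reach a majority of its ring."""
    voters = len(view.voting & view.configured)
    return voters >= quorum(len(view.configured))


def is_split_brain(views) -> bool:
    """More than one independently writable group means split brain."""
    return sum(1 for v in views if writable(v)) > 1


# Original OM: still the sole member and leader of its single-node ring.
original = RingView(frozenset({"om1"}), frozenset({"om1"}))

# New OMs started without bootstrap: configured for three peers, but only
# the two of them vote together, which is already a 2/3 majority.
new_oms = RingView(frozenset({"om1", "om2", "om3"}), frozenset({"om2", "om3"}))

# After a correct bootstrap there is a single three-node ring.
bootstrapped = RingView(frozenset({"om1", "om2", "om3"}),
                        frozenset({"om1", "om2", "om3"}))

print(is_split_brain([original, new_oms]))  # both groups accept writes
print(is_split_brain([bootstrapped]))       # one ring, no split brain
```

In this model both groups pass the majority check at the same time, which is why every OM appears functional even though the two rings no longer share a transaction history.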

  was:The expected flow to add more OMs to a single node OM ratis cluster is to adjust the configurations to add the new OMs, and then start the new OMs in bootstrap mode (supply the --bootstrap flag on startup) so that they will get all transaction history from the original OM. However, it is possible for the user to mistakenly adjust the configs and not start the new OMs with the --bootstrap command. This will cause the two new OMs to form their own Ratis ring with a 2/3 majority that can service write requests, while the original OM is still the leader of a single node Ratis ring and also servicing write requests. This leads to a split brain scenario that is difficult to detect without inspecting the logs, because all OMs appear functional and the cluster is writeable.


> Avoid OM split brain going from 1 node OM to 3 node ratis without bootstrap
> ---------------------------------------------------------------------------
>
>                 Key: HDDS-4775
>                 URL: https://issues.apache.org/jira/browse/HDDS-4775
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Ethan Rose
>            Priority: Major
>
> The expected flow to add more OMs to a single-node OM Ratis cluster is to adjust the configurations to add the new OMs, and then start the new OMs in bootstrap mode (supply the --bootstrap flag on startup) so that they will get all transaction history from the original OM. However, it is possible for the user to mistakenly adjust the configs and start the new OMs without the --bootstrap flag. This will cause the two new OMs to form their own Ratis ring with a 2/3 majority that can service write requests, while the original OM is still the leader of a single-node Ratis ring and is also servicing write requests. This leads to a split-brain scenario that is difficult to detect without inspecting the logs, because all OMs appear functional and the cluster is writeable.
> This Jira aims to detect such a scenario and shut down the OMs when it occurs, instructing the user to bootstrap the new OMs on startup instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
