Posted to issues@ozone.apache.org by "Kirill Sizov (Jira)" <ji...@apache.org> on 2023/01/16 11:37:00 UTC

[jira] [Resolved] (HDDS-7706) OM Bootstrap is unable to add a node to a raft ring

     [ https://issues.apache.org/jira/browse/HDDS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirill Sizov resolved HDDS-7706.
--------------------------------
    Fix Version/s: 1.4.0
       Resolution: Duplicate

> OM Bootstrap is unable to add a node to a raft ring
> ---------------------------------------------------
>
>                 Key: HDDS-7706
>                 URL: https://issues.apache.org/jira/browse/HDDS-7706
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM HA
>    Affects Versions: 1.3.0
>            Reporter: Kirill Sizov
>            Assignee: Kirill Sizov
>            Priority: Critical
>             Fix For: 1.4.0
>
>
> During the lifetime of an OM HA cluster, nodes might receive peer list update messages. This usually happens when a node goes down and then comes back up. As long as all the nodes of the cluster are present in this list, everything is fine.
> However, if we bootstrap a new node (om4 in the following example) and it replays the raft log on its side, such a message is fatal and causes the node to exit.
> {noformat}
> 2022-12-22 19:45:17,386 [Thread[Thread-21,5,main]] INFO security.OzoneDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> 2022-12-22 19:45:17,386 [Listener at om4address/9862] INFO om.OzoneManager: Version File has different layout version (3) than OM DB (null). That is expected if this OM has never been finalized to a newer layout version.
> 2022-12-22 19:45:17,387 [om4@group-942F8267F22A-StateMachineUpdater] INFO ratis.OzoneManagerStateMachine: Received Configuration change notification from Ratis. New Peer list:
> [id: "om1"
> address: "om1address:9872"
> , id: "om3"
> address: "om3address:9872"
> , id: "om2"
> address: "om2address:9872"
> ]
> 2022-12-22 19:45:17,387 [om4@group-942F8267F22A-StateMachineUpdater] ERROR om.OzoneManager: Fatal Error: Shutting down as OM has been decommissioned.
> 2022-12-22 19:45:17,388 [om4@group-942F8267F22A-StateMachineUpdater] ERROR om.OzoneManager: Terminating with exit status 1: Shutting down as OM has been decommissioned.
> {noformat}
> *Expected behavior:*
> The new node should be able to replay the raft log correctly, without shutting down.
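
To make the failure mode in the description concrete, here is a minimal Java sketch of the condition the log appears to show: om4 replays an old configuration entry listing only om1, om2 and om3, and concludes that it has been decommissioned. The class and method names (PeerListCheck, naiveIsDecommissioned, guardedIsDecommissioned) and the wasEverMember flag are hypothetical and are not taken from the Ozone or Ratis code base. The guarded variant is only one possible way to tell the two cases apart; it is not the fix applied in the issue this one duplicates.

```java
import java.util.List;

/** Hypothetical illustration; none of these names come from the Ozone code base. */
final class PeerListCheck {

  /** Naive rule: a node that is absent from the new peer list is treated as decommissioned. */
  static boolean naiveIsDecommissioned(String selfId, List<String> newPeers) {
    return !newPeers.contains(selfId);
  }

  /**
   * Illustrative guard (an assumption, not the actual fix): absence from the peer list
   * only counts as decommissioning if this node has already been a member of the ring.
   * A freshly bootstrapped node replaying old log entries has never been a member.
   */
  static boolean guardedIsDecommissioned(String selfId, List<String> newPeers,
                                         boolean wasEverMember) {
    return wasEverMember && !newPeers.contains(selfId);
  }

  public static void main(String[] args) {
    // Peer list from the replayed configuration entry in the log above.
    List<String> replayedPeers = List.of("om1", "om3", "om2");

    // om4 is being bootstrapped and is not in the old configuration.
    System.out.println("naive, om4:   " + naiveIsDecommissioned("om4", replayedPeers));
    System.out.println("guarded, om4: " + guardedIsDecommissioned("om4", replayedPeers, false));

    // A node that was a member and is then removed is still detected.
    System.out.println("guarded, om3: "
        + guardedIsDecommissioned("om3", List.of("om1", "om2"), true));
  }
}
```

With the naive rule, the bootstrapping node om4 is classified as decommissioned, which matches the fatal error in the log. With the guarded rule it is not, while a former member that has been removed from the ring is still detected.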



