Posted to issues@ozone.apache.org by "Kirill Sizov (Jira)" <ji...@apache.org> on 2023/01/16 11:37:00 UTC
[jira] [Resolved] (HDDS-7706) OM Bootstrap is unable to add a node to a raft ring
[ https://issues.apache.org/jira/browse/HDDS-7706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirill Sizov resolved HDDS-7706.
--------------------------------
Fix Version/s: 1.4.0
Resolution: Duplicate
> OM Bootstrap is unable to add a node to a raft ring
> ---------------------------------------------------
>
> Key: HDDS-7706
> URL: https://issues.apache.org/jira/browse/HDDS-7706
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM HA
> Affects Versions: 1.3.0
> Reporter: Kirill Sizov
> Assignee: Kirill Sizov
> Priority: Critical
> Fix For: 1.4.0
>
>
> During the lifetime of an OM HA cluster, nodes may receive peer list update messages. This usually happens when a node goes down and then comes back up. As long as all the nodes of the cluster are present in this list, everything is fine.
> However, if we bootstrap a new node (om4 in the following example) and it replays the raft log on its side, such a message is fatal and causes the node to exit.
> {noformat}
> 2022-12-22 19:45:17,386 [Thread[Thread-21,5,main]] INFO security.OzoneDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> 2022-12-22 19:45:17,386 [Listener at om4address/9862] INFO om.OzoneManager: Version File has different layout version (3) than OM DB (null). That is expected if this OM has never been finalized to a newer layout version.
> 2022-12-22 19:45:17,387 [om4@group-942F8267F22A-StateMachineUpdater] INFO ratis.OzoneManagerStateMachine: Received Configuration change notification from Ratis. New Peer list:
> [id: "om1"
> address: "om1address:9872"
> , id: "om3"
> address: "om3address:9872"
> , id: "om2"
> address: "om2address:9872"
> ]
> 2022-12-22 19:45:17,387 [om4@group-942F8267F22A-StateMachineUpdater] ERROR om.OzoneManager: Fatal Error: Shutting down as OM has been decommissioned.
> 2022-12-22 19:45:17,388 [om4@group-942F8267F22A-StateMachineUpdater] ERROR om.OzoneManager: Terminating with exit status 1: Shutting down as OM has been decommissioned.
> {noformat}
> *Expected behavior:*
> The new node should be able to replay the raft log correctly.
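
The failure mode described above can be illustrated with a minimal sketch. This is not Ozone or Ratis code: the class name, both method names, and the log indices 10 and 20 are hypothetical and chosen only for illustration. The sketch assumes that the check in the report treats any peer list without the local node as a decommission, and it shows one possible alternative in which a peer list is acted on only if it is not older than the latest known configuration. The fix tracked by the duplicate issue may work differently.

```java
import java.util.Set;

// Hypothetical sketch, not Ozone code: all names and indices are illustrative.
final class PeerListReplaySketch {

    // Behavior as described in the report: any peer list that does not
    // contain this node is treated as "this OM has been decommissioned".
    static boolean naiveShouldShutDown(String selfId, Set<String> peers) {
        return !peers.contains(selfId);
    }

    // Assumed alternative: a peer list replayed from an older log entry
    // (entryIndex < latestConfIndex) is historical and is ignored.
    // Only the latest configuration can decommission the node.
    static boolean replayAwareShouldShutDown(String selfId, Set<String> peers,
                                             long entryIndex, long latestConfIndex) {
        boolean isCurrent = entryIndex >= latestConfIndex;
        return isCurrent && !peers.contains(selfId);
    }

    public static void main(String[] args) {
        // Peer list from the log excerpt, written before om4 was added.
        Set<String> beforeBootstrap = Set.of("om1", "om2", "om3");

        // The naive check makes om4 exit while replaying the old entry.
        System.out.println(naiveShouldShutDown("om4", beforeBootstrap));

        // With made-up indices: the old entry is at 10, the entry adding om4 is at 20.
        System.out.println(replayAwareShouldShutDown("om4", beforeBootstrap, 10L, 20L));
    }
}
```

In this sketch om4 keeps running while it replays the entry at index 10, but it would still shut down if the latest configuration itself omitted it, which preserves the intended decommission behavior.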