Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/06 19:19:00 UTC

[jira] [Commented] (HDFS-16792) Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement

    [ https://issues.apache.org/jira/browse/HDFS-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17684883#comment-17684883 ] 

ASF GitHub Bot commented on HDFS-16792:
---------------------------------------

ashutoshcipher commented on PR #4968:
URL: https://github.com/apache/hadoop/pull/4968#issuecomment-1419616776

   @ZanderXu @slfan1989 Please help in reviewing the PR. Thanks.




> Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16792
>                 URL: https://issues.apache.org/jira/browse/HDFS-16792
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: journal-node
>    Affects Versions: 3.3.3, 3.3.4
>            Reporter: Ashutosh Gupta
>            Assignee: Ashutosh Gupta
>            Priority: Major
>              Labels: pull-request-available
>
> Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement.
>  
> Hadoop currently has two limitations when formatting journal nodes. First, all journal nodes must be present when the name node is formatted. Second, journal nodes added later cannot be formatted unless all of them are reformatted, which destroys all HDFS edit logs.
>  
> $ hdfs namenode -initializeSharedEdits -newEditsOnly
> There are two cases here:
> 1) The replaced master instance has a name node running on it: the above command is run only once, on the existing name node.
> 2) The replaced master instance does not have a name node running on it: the above command is run twice. The command is idempotent as long as it is not issued concurrently, which the instance controller can guarantee when reconfiguring master instances. The reason is that Hadoop name node commands such as format, bootstrap, and initializeSharedEdits are not designed to be thread safe; they rely on an external mechanism (manual work by the Hadoop admin on on-premise clusters) to ensure that they are not executed concurrently. We keep the existing behavior of the initializeSharedEdits command to minimize code and logic changes.
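
The "external mechanism" described in case 2 could look roughly like the sketch below. It is only an illustration and not part of the PR: the helper name run_exclusively and the lock file path are hypothetical, and an advisory file lock serializes callers on a single host only. Preventing concurrent runs across hosts would still need a coordinator such as the instance controller mentioned above.

```python
import fcntl
import subprocess


def run_exclusively(lock_path, command):
    """Run `command` while holding an exclusive advisory lock on `lock_path`.

    Returns the command's exit code, or None if another caller already
    holds the lock (in which case the command is not started).
    Hypothetical helper for illustration; works on POSIX systems only.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking: fail immediately instead of waiting for the lock.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        try:
            return subprocess.run(command).returncode
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A caller might wrap the command from this issue as run_exclusively("/tmp/initialize-shared-edits.lock", ["hdfs", "namenode", "-initializeSharedEdits", "-newEditsOnly"]), where the lock path is an assumption and -newEditsOnly is the option proposed here, so it is available only in builds that include this change.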


