Posted to hdfs-issues@hadoop.apache.org by "Steve Vaughan (Jira)" <ji...@apache.org> on 2022/08/23 19:37:00 UTC

[jira] [Updated] (HDFS-16690) Automatically format new unformatted JournalNodes using JournalNodeSyncer

     [ https://issues.apache.org/jira/browse/HDFS-16690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Vaughan updated HDFS-16690:
---------------------------------
     Target Version/s: 3.4.0, 3.3.9
    Affects Version/s: 3.4.0
                       3.3.9

> Automatically format new unformatted JournalNodes using JournalNodeSyncer 
> --------------------------------------------------------------------------
>
>                 Key: HDFS-16690
>                 URL: https://issues.apache.org/jira/browse/HDFS-16690
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: journal-node
>    Affects Versions: 3.4.0, 3.3.9
>         Environment: Demonstrated in a Kubernetes environment running Java 11.
>  # Start a new cluster, but short 1 JN (minimum quorum, and the missing JN won’t resolve). VERIFY:
>  - NN formats the 2 existing JNs and stabilizes. NOTE: Formatting with only a quorum will be a separate submission.
>  - Messages show sync between JN-0 and JN-1, and NN -> JN.
>  # Scale the JN stateful set to add the missing JN. VERIFY:
>  - New JN starts.
>  - All other JNs and all NNs report the IP address change (IP address resolution). NOTE: requires HADOOP-18365 and HDFS-16688.
>  - Messages show sync between all JNs, and NN -> JN.
>  - New JN receives at least one format request (possibly from multiple other JNs).
>  - New JN storage directory is formatted only once.
>  - New JN joins the cluster (lastWriterEpoch is non-zero).
>            Reporter: Steve Vaughan
>            Assignee: Steve Vaughan
>            Priority: Major
>
> If an unformatted JournalNode is added to an existing JournalNode set, instances of the JournalNodeSyncer are unable to sync to the new node. When a sync receives a JournalNotFormattedException, we can initiate a format operation and then retry the synchronization.
> Conceptually, this means that the JournalNodes and their data can be managed independently of the rest of the system, because the existing JournalNodes will incorporate new JournalNode instances on their own. Once the new JournalNode is formatted, it can participate in shared edits from the NameNodes.
> I've been testing an update to the InterQJournalProtocol that adds a format call like the one used by the NameNode. Current tests include starting an HA cluster from scratch, but with only 2 JournalNode instances. Once the cluster is up, I add the 3rd JournalNode (which is unformatted); the other 2 JournalNodes eventually attempt to sync with it, which triggers a format and a subsequent sync.
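
The format-then-retry flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: RemoteJournal, FakeJournal, and syncWithAutoFormat are hypothetical names, the exception class is a local stand-in for the real JournalNotFormattedException, and the real format call would go through InterQJournalProtocol with arguments that are not shown in this message.

```java
// Hypothetical sketch: none of these types are the real HDFS classes.
final class FormatOnSyncSketch {

    /** Local stand-in for HDFS's JournalNotFormattedException. */
    static class JournalNotFormattedException extends Exception {
        JournalNotFormattedException(String message) {
            super(message);
        }
    }

    /** Hypothetical view of a peer JournalNode as seen by a syncer. */
    interface RemoteJournal {
        void syncEdits() throws JournalNotFormattedException;

        void format();
    }

    /** In-memory fake of a newly added, unformatted JournalNode. */
    static class FakeJournal implements RemoteJournal {
        boolean formatted = false;
        int formatRequests = 0;   // may be > 1 when several peers race
        int storageFormats = 0;   // the storage directory is formatted once
        int successfulSyncs = 0;

        @Override
        public synchronized void syncEdits() throws JournalNotFormattedException {
            if (!formatted) {
                throw new JournalNotFormattedException("journal storage is not formatted");
            }
            successfulSyncs++;
        }

        @Override
        public synchronized void format() {
            formatRequests++;
            if (formatted) {
                return; // already formatted by another peer: do nothing
            }
            formatted = true;
            storageFormats++;
        }
    }

    /**
     * Attempts a sync; if the peer reports that it is unformatted,
     * requests a format and retries the sync once.
     */
    static boolean syncWithAutoFormat(RemoteJournal peer) {
        try {
            peer.syncEdits();
            return true;
        } catch (JournalNotFormattedException e) {
            peer.format();
        }
        try {
            peer.syncEdits();
            return true;
        } catch (JournalNotFormattedException e) {
            return false; // still unformatted; leave it for a later sync cycle
        }
    }

    public static void main(String[] args) {
        FakeJournal newJournal = new FakeJournal();
        boolean synced = syncWithAutoFormat(newJournal);
        System.out.println("synced=" + synced
            + " storageFormats=" + newJournal.storageFormats);
    }
}
```

In this sketch the format operation is idempotent on the receiving side, which mirrors the verification steps: several existing JournalNodes may each request a format, but the new node's storage directory is formatted only once.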


