Posted to common-dev@hadoop.apache.org by "Roman Valls (JIRA)" <ji...@apache.org> on 2008/11/25 14:11:44 UTC

[jira] Commented: (HADOOP-1212) Data-nodes should be formatted when the name-node is formatted.

    [ https://issues.apache.org/jira/browse/HADOOP-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650556#action_12650556 ] 

Roman Valls commented on HADOOP-1212:
-------------------------------------

I can confirm the bug: upgrading from 0.17 to 0.18.1 did not work. I even deleted the HDFS data on the nodes and reformatted the name-node (I'm running a small 9-node cluster):

$ bin/hadoop/stop-all && cluster-fork rm -rf /state/partition1/hdfs/hadoop/*
$ hadoop namenode -format

In addition, I tried a rough variant on Jared's solution that did not work either:

$ cp /state/partition1/hdfs/hadoop/dfs/data/current/VERSION /shared/apps/VERSION
$ cluster-fork cp -a /shared/apps/VERSION /state1/partition1/hdfs/hadoop/dfs/data/current/VERSION

Is there a reliable way to make it work right away? Can this VERSION file (or its namespaceID) be forced to be identical on every node?
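
To see whether the IDs really differ, something like the following could be used to compare the namespaceID recorded in two VERSION files. This is only a minimal sketch: it assumes each VERSION file contains a line of the form namespaceID=<number>, and the function names and the name-node path in the usage comment are hypothetical, not part of Hadoop.

```shell
# Print the namespaceID recorded in a VERSION file.
# Assumes a properties-style line "namespaceID=<number>".
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Succeed (exit status 0) only when both files carry the same,
# non-empty namespaceID.
same_namespace() {
    nn_id=$(namespace_id "$1")
    dn_id=$(namespace_id "$2")
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}

# Example usage (the name-node path is an assumption for illustration):
#   same_namespace /state/partition1/hdfs/hadoop/dfs/name/current/VERSION \
#                  /state/partition1/hdfs/hadoop/dfs/data/current/VERSION \
#       || echo "namespaceID mismatch"
```

Run on each node (for example through cluster-fork), this would only report a mismatch; it does not change any VERSION file.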

> Data-nodes should be formatted when the name-node is formatted.
> ---------------------------------------------------------------
>
>                 Key: HADOOP-1212
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1212
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.2
>            Reporter: Konstantin Shvachko
>
> The upgrade feature HADOOP-702 requires data-nodes to store persistently the namespaceID 
> in their version files and verify during startup that it matches the one stored on the name-node.
> When the name-node reformats it generates a new namespaceID.
> Now if the cluster starts with a reformatted name-node but data-nodes that were not reformatted,
> the data-nodes will fail with
> java.io.IOException: Incompatible namespaceIDs ...
> Data-nodes should be reformatted whenever the name-node is. I see 2 approaches here:
> 1) In order to reformat the cluster we call "start-dfs -format" or make a special script "format-dfs".
> This would format the cluster components all together. The question is whether it should start
> the cluster after formatting?
> 2) Format the name-node only. When data-nodes connect to the name-node it will tell them to
> format their storage directories if it sees that the namespace is empty and its cTime=0.
> The drawback of this approach is that we can lose blocks of a data-node from another cluster
> if it connects by mistake to the empty name-node.
