Posted to user@hadoop.apache.org by mg <us...@gmail.com> on 2012/11/13 12:03:23 UTC
Namenode fails to start after upgrade from CDH 4.0.1 to 4.1.2
Hi,
We just upgraded a cluster from CDH 4.0.1 to 4.1.2 on a number of nodes
running Ubuntu 12.04 (Precise).
We first upgraded Cloudera Manager (now 4.1.0), then ran apt-get
dist-upgrade on all nodes, started CM, checked and updated the
configuration and attempted to start the cluster.
However, the HDFS NameNode fails to start with the exception appended below.
There is sufficient space on all partitions. We do not bind against
wildcard addresses (at least not yet).
Any ideas? Stack trace follows.
Cheers,
Martin
FATAL org.apache.hadoop.hdfs.server.namenode.NameNode
Exception in namenode join
java.lang.NumberFormatException: null
    at java.lang.Long.parseLong(Long.java:375)
    at java.lang.Long.valueOf(Long.java:525)
    at org.apache.hadoop.hdfs.util.PersistentLongFile.readFile(PersistentLongFile.java:93)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.readTransactionIdFile(NNStorage.java:425)
    at org.apache.hadoop.hdfs.server.namenode.FSImageTransactionalStorageInspector.inspectDirectory(FSImageTransactionalStorageInspector.java:71)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.inspectStorageDirs(NNStorage.java:1039)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.readAndInspectDirs(NNStorage.java:1093)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:598)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:267)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
Re: Namenode fails to start after upgrade from CDH 4.0.1 to 4.1.2
Posted by Arun C Murthy <ac...@hortonworks.com>.
Please don't cross-post, this belongs to just CDH lists.
On Nov 13, 2012, at 5:18 AM, mg wrote:
> Meanwhile we found that the seen_txid files are empty in 4 of 5 replicated namenode directories.
>
> The edits_inprogress_... files are identical in all 5 dirs with the tx id from the one non-empty seen_txid file.
>
> The fsimage files are identical, too.
>
> Otherwise, any two of the 5 dirs differ as far as the edits_ files are concerned.
>
> Is it safe to copy the one non-empty seen_txid file over into the other 4 nn directories?
>
> Cheers,
> Martin
>
> On 13.11.2012 12:03, mg wrote:
>> Hi,
>>
>> we just upgraded a cluster from CDH 4.0.1 to 4.1.2 on a number of nodes
>> running on Ubuntu 12.04 (Precise).
>>
>> We first upgraded Cloudera Manager (now 4.1.0), then ran apt-get
>> dist-upgrade on all nodes, started CM, checked and updated the
>> configuration and attempted to start the cluster.
>>
>> However, the HDFS NameNode fails to start with the exception appended
>> below.
>>
>> There is sufficient space on all partitions. We do not bind against
>> wildcard addresses (at least not yet).
>>
>> Any ideas? Stacktrace follows.
>>
>> Cheers,
>> Martin
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
Re: Namenode fails to start after upgrade from CDH 4.0.1 to 4.1.2
Posted by mg <us...@gmail.com>.
Meanwhile we found that the seen_txid files are empty in 4 of 5
replicated namenode directories.
The edits_inprogress_... files are identical in all 5 dirs with the tx
id from the one non-empty seen_txid file.
The fsimage files are identical, too.
Otherwise, any two of the 5 dirs differ as far as the edits_ files are
concerned.
Is it safe to copy the one non-empty seen_txid file over into the other
4 nn directories?
Cheers,
Martin
On 13.11.2012 12:03, mg wrote:
> Hi,
>
> we just upgraded a cluster from CDH 4.0.1 to 4.1.2 on a number of nodes
> running on Ubuntu 12.04 (Precise).
>
> We first upgraded Cloudera Manager (now 4.1.0), then ran apt-get
> dist-upgrade on all nodes, started CM, checked and updated the
> configuration and attempted to start the cluster.
>
> However, the HDFS NameNode fails to start with the exception appended
> below.
>
> There is sufficient space on all partitions. We do not bind against
> wildcard addresses (at least not yet).
>
> Any ideas? Stacktrace follows.
>
> Cheers,
> Martin
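[Editor's note: before copying the one non-empty seen_txid anywhere, it helps to record exactly what each storage directory holds; in Hadoop 2.x the file lives at <name.dir>/current/seen_txid. A small inspection sketch follows; the mount points are hypothetical placeholders for the dfs.namenode.name.dir entries, and this is not a supported tool.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SeenTxidReport {
    // Summarize the seen_txid state of one NameNode storage directory:
    // "missing", "EMPTY", or "txid <value>".
    static String statusOf(Path nameDir) throws IOException {
        Path f = nameDir.resolve("current").resolve("seen_txid");
        if (!Files.exists(f)) {
            return "missing";
        }
        String content = new String(Files.readAllBytes(f)).trim();
        return content.isEmpty() ? "EMPTY" : "txid " + content;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical mount points; substitute the real dfs.namenode.name.dir entries.
        List<Path> nameDirs = Arrays.asList(
                Paths.get("/data/1/dfs/nn"),
                Paths.get("/data/2/dfs/nn"));
        for (Path dir : nameDirs) {
            System.out.println(dir + ": " + statusOf(dir));
        }
    }
}
```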