Posted to common-user@hadoop.apache.org by Jay Vyas <ja...@gmail.com> on 2011/10/29 16:57:11 UTC
can't format namenode....
Hi guys: In order to fix some issues I'm having (recently posted), I've
decided to try to make sure my name node is formatted, but the formatting
fails (see 1 below).
To trace the failure, I figured I would grep through all the log files
for exceptions.
I've curated the results here; does this look familiar to anyone?
Clearly, something is very wrong with my CDH Hadoop installation.
1) To attempt to solve this, I figured I would format my namenode. Oddly,
when I run "hadoop namenode -format" I get the following stack trace:
11/10/29 14:39:37 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.localdomain/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2-cdh3u1
STARTUP_MSG: build = file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r
bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638; compiled by 'root' on Mon Jul 18
09:40:22 PDT 2011
************************************************************/
Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name ? (Y or
N) Y
11/10/29 14:39:40 INFO util.GSet: VM type = 64-bit
11/10/29 14:39:40 INFO util.GSet: 2% max memory = 19.33375 MB
11/10/29 14:39:40 INFO util.GSet: capacity = 2^21 = 2097152 entries
11/10/29 14:39:40 INFO util.GSet: recommended=2097152, actual=2097152
11/10/29 14:39:40 INFO namenode.FSNamesystem: fsOwner=cloudera
11/10/29 14:39:40 INFO namenode.FSNamesystem: supergroup=supergroup
11/10/29 14:39:40 INFO namenode.FSNamesystem: isPermissionEnabled=false
11/10/29 14:39:40 INFO namenode.FSNamesystem:
dfs.block.invalidate.limit=1000
11/10/29 14:39:40 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/10/29 14:39:41 ERROR namenode.NameNode: java.io.IOException: Cannot
remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:303)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1244)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1263)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1100)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1217)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)
11/10/29 14:39:41 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
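The stack trace points at clearing the old current/ directory, which usually means a permissions problem: the user running the format cannot delete what an earlier run (as a different user) created. A minimal sketch of the check, using a sandbox path so it is safe to run anywhere (the real directory on this install is /var/lib/hadoop-0.20/cache/hadoop/dfs/name):

```shell
# Sandbox stand-in for the name directory; on the real cluster substitute
# /var/lib/hadoop-0.20/cache/hadoop/dfs/name instead.
NAME_DIR="$(mktemp -d)/dfs/name"
mkdir -p "$NAME_DIR/current"

# "hadoop namenode -format" has to clear current/; the user running the
# format must own it, or at least have write access to it.
owner="$(ls -ld "$NAME_DIR/current" | awk '{print $3}')"
echo "owner of current/: $owner"
if [ -w "$NAME_DIR/current" ]; then echo "writable: yes"; else echo "writable: no"; fi
```

If the real current/ turns out to be owned by another user (e.g. hdfs), re-running the format as that user, or via sudo, is the usual fix.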
2) Here are the exceptions (abridged; I removed repetitive parts regarding
"replicated to 0 nodes, instead of 1"):
2011-10-28 22:36:52,669 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 8020, call
addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
DFSClient_-134960056, null) from 127.0.0.1:35163: error:
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info
could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info
could only be replicated to 0 nodes, instead of 1
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:03,413 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
.......................... REPEATED SEVERAL TIMES ......................
2011-10-28 22:36:52,716 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:29:48,131 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException:
gcrc15.uchc.net: gcrc15.uchc.net
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:03,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 8020, call
addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
DFSClient_157537399, null) from 127.0.0.1:33519: error: java.io.IOException:
File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/
jobtracker.info could only be replicated to 0 nodes, instead of 1
...............................(repeated several times)
.................................
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:05,066 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
start task tracker because java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
SHUTDOWN_MSG: Shutting down TaskTracker at java.net.UnknownHostException:
gcrc15.uchc.net: gcrc15.uchc.net
Re: can't format namenode....
Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
----- Original Message -----
From: Jay Vyas <ja...@gmail.com>
Date: Saturday, October 29, 2011 8:27 pm
Subject: can't format namenode....
To: common-user@hadoop.apache.org
> 1) To attempt to solve this, I figured I would format my namenode.
> [snip]
> 11/10/29 14:39:41 ERROR namenode.NameNode: java.io.IOException: Cannot
> remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
> [snip]
Are you able to remove this directory explicitly?
/var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
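A sketch of that removal, done against a sandbox path so it is safe to try as written (on the real install the target is /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current, and the NameNode should be stopped before touching it):

```shell
# Sandbox stand-in mimicking a stale name directory left by a prior run.
NAME_DIR="$(mktemp -d)/dfs/name"
mkdir -p "$NAME_DIR/current"
touch "$NAME_DIR/current/fsimage"

# Removing current/ needs write permission on its parent; if this fails
# with "Permission denied" on the real path, re-run as the directory's
# owner (or root), then re-run the namenode format.
rm -rf "$NAME_DIR/current"
if [ -d "$NAME_DIR/current" ]; then echo "still present"; else echo "removed"; fi
```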
>
> 2) Here are the exceptions (abridged; I removed repetitive parts
> regarding "replicated to 0 nodes instead of 1"):
This is because the file could not be replicated to the minimum replication factor (1): no DataNodes were available to hold the block.
>
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> 2011-10-28 22:30:05,066 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
> start task tracker because java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> [snip]
>
Did you configure the host mappings correctly in the /etc/hosts file?
Also, can you please check whether the DataNodes are sending heartbeats to the NameNode?
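A sketch of the /etc/hosts check: the UnknownHostException means the machine's own hostname (gcrc15.uchc.net) has no entry. This runs against a temporary file standing in for /etc/hosts, and the IP below is illustrative, not the machine's real address:

```shell
# Stand-in for /etc/hosts; a minimal static mapping for the failing host
# would look like the second line (IP is a made-up example).
hosts_file="$(mktemp)"
cat > "$hosts_file" <<'EOF'
127.0.0.1    localhost localhost.localdomain
192.168.1.15 gcrc15.uchc.net gcrc15
EOF

# Does the failing hostname resolve via this file?
if grep -qw 'gcrc15.uchc.net' "$hosts_file"; then
  echo "gcrc15.uchc.net: mapped"
else
  echo "gcrc15.uchc.net: NOT mapped"
fi
```

On the live machine, compare the output of `hostname` against /etc/hosts (or run `getent hosts "$(hostname)"`); once name resolution works, `hadoop dfsadmin -report` will show whether any live DataNodes are heartbeating to the NameNode.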
Regards,
Uma