Posted to common-user@hadoop.apache.org by Jay Vyas <ja...@gmail.com> on 2011/10/29 16:57:11 UTC

can't format namenode....

Hi guys: In order to fix some issues I'm having (recently posted), I've
decided to try to make sure my name node is formatted... But the formatting
fails (see 1 below).

So, to trace the failure, I figured I would grep through all the log files
for exceptions.
I've curated the results here... does this look familiar to anyone?

Clearly, something is very wrong with my CDH Hadoop installation.

1) To attempt to solve this, I figured I would format my namenode. Oddly,
when I run "hadoop namenode -format" I get the following stack trace:

11/10/29 14:39:37 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2-cdh3u1
STARTUP_MSG:   build = file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r
bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638; compiled by 'root' on Mon Jul 18
09:40:22 PDT 2011
************************************************************/
Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name ? (Y or
N) Y
11/10/29 14:39:40 INFO util.GSet: VM type       = 64-bit
11/10/29 14:39:40 INFO util.GSet: 2% max memory = 19.33375 MB
11/10/29 14:39:40 INFO util.GSet: capacity      = 2^21 = 2097152 entries
11/10/29 14:39:40 INFO util.GSet: recommended=2097152, actual=2097152
11/10/29 14:39:40 INFO namenode.FSNamesystem: fsOwner=cloudera
11/10/29 14:39:40 INFO namenode.FSNamesystem: supergroup=supergroup
11/10/29 14:39:40 INFO namenode.FSNamesystem: isPermissionEnabled=false
11/10/29 14:39:40 INFO namenode.FSNamesystem:
dfs.block.invalidate.limit=1000
11/10/29 14:39:40 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/10/29 14:39:41 ERROR namenode.NameNode: java.io.IOException: Cannot
remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
    at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:303)
    at
org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1244)
    at
org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1263)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1100)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1217)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)

11/10/29 14:39:41 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/


2) Here are the exceptions (abridged; I removed repetitive parts regarding
"replicated to 0 nodes, instead of 1"):

2011-10-28 22:36:52,669 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 8020, call
addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
DFSClient_-134960056, null) from 127.0.0.1:35163: error:
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/
jobtracker.info could only be replicated to 0 nodes, instead of 1
STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:03,413 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
.......................... REPEATED SEVERAL TIMES ......................
2011-10-28 22:36:52,716 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
be replicated to 0 nodes, instead of 1
STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:29:48,131 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException:
gcrc15.uchc.net: gcrc15.uchc.net
STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:03,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 8020, call
addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
DFSClient_157537399, null) from 127.0.0.1:33519: error: java.io.IOException:
File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/
jobtracker.info could only be replicated to 0 nodes, instead of 1
...............................(repeated several times)
.................................
STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
2011-10-28 22:30:05,066 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
start task tracker because java.net.UnknownHostException: gcrc15.uchc.net:
gcrc15.uchc.net
SHUTDOWN_MSG: Shutting down TaskTracker at java.net.UnknownHostException:
gcrc15.uchc.net: gcrc15.uchc.net

Re: can't format namenode....

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
----- Original Message -----
From: Jay Vyas <ja...@gmail.com>
Date: Saturday, October 29, 2011 8:27 pm
Subject: can't format namenode....
To: common-user@hadoop.apache.org

> Hi guys: In order to fix some issues I'm having (recently posted), I've
> decided to try to make sure my name node is formatted... But the formatting
> fails (see 1 below).
> 
> So, to trace the failure, I figured I would grep through all the log files
> for exceptions.
> I've curated the results here... does this look familiar to anyone?
> 
> Clearly, something is very wrong with my CDH Hadoop installation.
> 
> 1) To attempt to solve this, I figured I would format my namenode. Oddly,
> when I run "hadoop namenode -format" I get the following stack trace:
> 
> 11/10/29 14:39:37 INFO namenode.NameNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
> STARTUP_MSG:   args = [-format]
> STARTUP_MSG:   version = 0.20.2-cdh3u1
> STARTUP_MSG:   build = file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r
> bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638; compiled by 'root' on Mon Jul 18
> 09:40:22 PDT 2011
> ************************************************************/
> Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name 
> ? (Y or
> N) Y
> 11/10/29 14:39:40 INFO util.GSet: VM type       = 64-bit
> 11/10/29 14:39:40 INFO util.GSet: 2% max memory = 19.33375 MB
> 11/10/29 14:39:40 INFO util.GSet: capacity      = 2^21 = 2097152 entries
> 11/10/29 14:39:40 INFO util.GSet: recommended=2097152, actual=2097152
> 11/10/29 14:39:40 INFO namenode.FSNamesystem: fsOwner=cloudera
> 11/10/29 14:39:40 INFO namenode.FSNamesystem: supergroup=supergroup
> 11/10/29 14:39:40 INFO namenode.FSNamesystem: isPermissionEnabled=false
> 11/10/29 14:39:40 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
> 11/10/29 14:39:40 INFO namenode.FSNamesystem: isAccessTokenEnabled=false
> accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
> 11/10/29 14:39:41 ERROR namenode.NameNode: java.io.IOException: Cannot
> remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
>     at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:303)
>    at
> org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1244)
>    at
> org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1263)
>    at
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1100)
>    at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1217)
>    at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)
> 
> 11/10/29 14:39:41 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
> ************************************************************/
> 

Are you able to remove this directory explicitly?
  /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
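For example, a quick check and manual cleanup could look like the following
(just a sketch: the init-script name is assumed from CDH3 packaging and may
differ on your machine, and the removal should be run as whichever user owns
the directory):

    # See who owns the storage directory and whether it is writable
    ls -ld /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current

    # Stop the NameNode first so nothing is holding the directory
    # (service name assumed from CDH3 packaging)
    sudo /etc/init.d/hadoop-0.20-namenode stop

    # Remove the stale directory by hand, then re-run the format
    sudo rm -rf /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
    hadoop namenode -format

If the directory is owned by a different user than the one running the
format, that alone would explain the "Cannot remove current directory" error.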
> 
> 2) Here are the exceptions (abridged; I removed repetitive parts regarding
> "replicated to 0 nodes, instead of 1"):
This is because the file could not be replicated to the minimum replication factor (1); no live DataNode was available to accept the block.
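If it helps, one way to see whether any DataNodes are registered and alive
(and therefore able to hold a replica) is the dfsadmin report; a minimal
sketch:

    # Lists configured capacity plus live/dead DataNodes; a report showing
    # no available DataNodes would explain the "replicated to 0 nodes,
    # instead of 1" errors quoted below
    hadoop dfsadmin -report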
> 
> 2011-10-28 22:36:52,669 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 9 on 8020, call
> addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
> DFSClient_-134960056, null) from 127.0.0.1:35163: error:
> java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be
> replicated to 0 nodes, instead of 1
> java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/
> jobtracker.info could only be replicated to 0 nodes, instead of 1
> STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> 2011-10-28 22:30:03,413 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
> be replicated to 0 nodes, instead of 1
> .......................... REPEATED SEVERAL TIMES ......................
> 2011-10-28 22:36:52,716 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
> Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only
> be replicated to 0 nodes, instead of 1
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info 
> could only
> be replicated to 0 nodes, instead of 1
> STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> 2011-10-28 22:29:48,131 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException:
> gcrc15.uchc.net: gcrc15.uchc.net
> STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> 2011-10-28 22:30:03,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 8020, call
> addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info,
> DFSClient_157537399, null) from 127.0.0.1:33519: error: java.io.IOException:
> File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be
> replicated to 0 nodes, instead of 1
> java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/
> jobtracker.info could only be replicated to 0 nodes, instead of 1
> ...............................(repeated several times)
> .................................
> STARTUP_MSG:   host = java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
> 2011-10-28 22:30:05,066 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
> start task tracker because java.net.UnknownHostException: gcrc15.uchc.net:
> gcrc15.uchc.net
> SHUTDOWN_MSG: Shutting down TaskTracker at java.net.UnknownHostException:
> gcrc15.uchc.net: gcrc15.uchc.net
> 
Did you configure the host mappings correctly in the /etc/hosts file?
Also, can you please check whether the DNs are sending heartbeats to the NN?
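As a sketch of what a working mapping could look like (the address below is
only an example; use the machine's real IP rather than 127.0.0.1 if other
nodes need to reach it):

    # /etc/hosts -- example entries so gcrc15.uchc.net resolves
    127.0.0.1      localhost.localdomain   localhost
    192.168.1.15   gcrc15.uchc.net         gcrc15

Once the lookup works you can confirm it with getent hosts gcrc15.uchc.net
(or hostname -f), and then check whether the DataNodes have registered with
the NameNode using hadoop dfsadmin -report as above.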

Regards,
Uma