Posted to hdfs-user@hadoop.apache.org by Terry Healy <th...@bnl.gov> on 2012/05/04 15:56:10 UTC

Unable to restart 1.0.2 NN, DFS

Running an Apache Hadoop 1.0.2 primary NN on Ubuntu 11.10 with 8
datanodes. When I came in today I ran stop-all.sh, since one of the
datanodes was not showing on the dfsnodelist.jsp?whatNodes=LIVE status
page. I then ran start-all.sh, and the NN dies on a
NullPointerException. I looked at the source but cannot determine what
is going wrong. Below is the trace; any explanation of the cause, or
suggestions for getting the system back up, would be appreciated.

** I have removed the prefix "org.apache.hadoop." from the list below in
the interest of saving space. **

-Terry

2012-05-04 09:45:02,046 INFO hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = xxxx/xxxxxxxx
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-05-04 09:45:02,229 INFO metrics2.impl.MetricsConfig: loaded
properties from hadoop-metrics2.properties
2012-05-04 09:45:02,244 INFO metrics2.impl.MetricsSourceAdapter: MBean
for source MetricsSystem,sub=Stats registered.
2012-05-04 09:45:02,246 INFO metrics2.impl.MetricsSystemImpl: Scheduled
snapshot period at 10 second(s).
2012-05-04 09:45:02,246 INFO metrics2.impl.MetricsSystemImpl: NameNode
metrics system started
2012-05-04 09:45:02,483 INFO metrics2.impl.MetricsSourceAdapter: MBean
for source ugi registered.
2012-05-04 09:45:02,488 WARN metrics2.impl.MetricsSystemImpl: Source
name ugi already exists!
2012-05-04 09:45:02,494 INFO metrics2.impl.MetricsSourceAdapter: MBean
for source jvm registered.
2012-05-04 09:45:02,496 INFO metrics2.impl.MetricsSourceAdapter: MBean
for source NameNode registered.
2012-05-04 09:45:02,530 INFO hdfs.util.GSet: VM type       = 64-bit
2012-05-04 09:45:02,530 INFO hdfs.util.GSet: 2% max memory = 17.77875 MB
2012-05-04 09:45:02,530 INFO hdfs.util.GSet: capacity      = 2^21 =
2097152 entries
2012-05-04 09:45:02,530 INFO hdfs.util.GSet: recommended=2097152,
actual=2097152
2012-05-04 09:45:02,560 INFO hdfs.server.namenode.FSNamesystem:
fsOwner=thealy
2012-05-04 09:45:02,560 INFO hdfs.server.namenode.FSNamesystem:
supergroup=supergroup
2012-05-04 09:45:02,560 INFO hdfs.server.namenode.FSNamesystem:
isPermissionEnabled=true
2012-05-04 09:45:02,567 INFO hdfs.server.namenode.FSNamesystem:
dfs.block.invalidate.limit=100
2012-05-04 09:45:02,567 INFO hdfs.server.namenode.FSNamesystem:
isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
accessTokenLifetime=0 min(s)
2012-05-04 09:45:02,749 INFO hdfs.server.namenode.FSNamesystem:
Registered FSNamesystemStateMBean and NameNodeMXBean
2012-05-04 09:45:02,765 INFO hdfs.server.namenode.NameNode: Caching file
names occuring more than 10 times
2012-05-04 09:45:02,775 INFO hdfs.server.common.Storage: Number of files
= 11
2012-05-04 09:45:02,783 INFO hdfs.server.common.Storage: Number of files
under construction = 0
2012-05-04 09:45:02,783 INFO hdfs.server.common.Storage: Image file of
size 11087 loaded in 0 seconds.
2012-05-04 09:45:02,785 ERROR hdfs.server.namenode.NameNode:
java.lang.NullPointerException
	at hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1094)
	at hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1106)
	at hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1009)
	at
hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
	at hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:626)
	at hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1015)
	at hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:833)
	at hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:372)
	at hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
	at hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
	at hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
	at hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
	at hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
	at hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
	at hdfs.server.namenode.NameNode.main(NameNode.java:1288)
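[Editor's note: when a 1.x NameNode dies while replaying its edit log like this, one common recovery path is to restart it from the Secondary NameNode's last checkpoint, accepting the loss of any edits made since that checkpoint. This is only a sketch; the directory paths below are assumptions and must match dfs.name.dir and fs.checkpoint.dir in this cluster's hdfs-site.xml/core-site.xml.]

```shell
# Sketch of checkpoint-based recovery on Hadoop 1.x. Paths are examples only.

# 1. Preserve the current (possibly corrupt) name directory before touching it.
cp -a /data/dfs/name /data/dfs/name.bak.$(date +%Y%m%d)

# 2. Empty the name directory; -importCheckpoint refuses to run otherwise.
rm -rf /data/dfs/name/*

# 3. Start the NameNode from the Secondary NameNode's checkpoint
#    (read from fs.checkpoint.dir on the node where the SNN runs).
hadoop namenode -importCheckpoint
```

Any edits logged after the last checkpoint are lost with this approach, so it is a last resort after backing everything up.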

Re: Unable to restart 1.0.2 NN, DFS

Posted by Terry Healy <th...@bnl.gov>.
Not a lot of interest, I suppose, but the problem is believed to have
been caused by not following the recommended upgrade procedure when
going from 1.0.1 to 1.0.2.
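[Editor's note: for reference, the recommended HDFS upgrade flow on a 1.x cluster looks roughly like the sketch below. This is not the official guide, just an outline of the documented steps.]

```shell
# Sketch of the documented Hadoop 1.x HDFS upgrade procedure.
stop-all.sh                               # stop the old version cleanly
# ...install the new Hadoop release on all nodes...
start-dfs.sh -upgrade                     # start HDFS with an on-disk layout upgrade
hadoop dfsadmin -upgradeProgress status   # poll until the upgrade completes
hadoop dfsadmin -finalizeUpgrade          # finalize once the cluster looks healthy
```

Until -finalizeUpgrade is run, the previous on-disk state is retained and the upgrade can be rolled back; starting the new version without -upgrade is what tends to leave the image and edits in an inconsistent state.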


-- 
Terry Healy / thealy@bnl.gov
Cyber Security Operations
Brookhaven National Laboratory
Building 515, Upton N.Y. 11973