Posted to hdfs-user@hadoop.apache.org by Jameson Li <ho...@gmail.com> on 2013/08/08 15:52:09 UTC

Re: Secondary namenode is crashing and complaining about non-existing files

Refer to https://issues.apache.org/jira/browse/HDFS-2827
An operation such as: hadoop fs -mv /a/b / may reproduce this issue.
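
The quoted message below compares the output of "strings edits.new" by eye to spot paths under both /backup/hbase-bak and /backup/base-bak. A minimal sketch of the same check in Python follows, assuming the edits file is simply read as raw bytes. It does not parse the HDFS edit log format, and the function names (extract_strings, count_paths) are hypothetical helpers, not part of Hadoop.

```python
import re

# Runs of 4 or more printable ASCII bytes, similar to the default
# behaviour of the Unix "strings" tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(data: bytes) -> list[str]:
    """Return every printable ASCII run of length >= 4 found in data."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def count_paths(data: bytes, prefixes: list[str]) -> dict[str, int]:
    """Count how many extracted strings contain each path prefix.

    A substring match is used because a length byte that happens to be
    printable can end up glued to the front of a path, as in the
    "a/backup/hbase-bak/..." line of the strings output.
    """
    found = extract_strings(data)
    return {p: sum(1 for s in found if p in s) for p in prefixes}


def report(edits_path: str) -> None:
    """Print per-prefix counts for a local copy of an edits file."""
    with open(edits_path, "rb") as f:
        data = f.read()
    counts = count_paths(data, ["/backup/hbase-bak", "/backup/base-bak"])
    for prefix, n in counts.items():
        print(f"{prefix}: {n} matching string(s)")
```

Calling report("edits.new") on a local copy of the file would print one count per prefix. A non-zero count for /backup/base-bak would match the mismatch described in the quoted message, but it is only a hint and not a diagnosis.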

2012/5/10 Alex Levin <al...@gmail.com>

> Hi,
>
> I have an issue with the secondary namenode crashing after a simple move
> operation.
> I would appreciate any ideas on how to resolve it.
>
> Details below:
> I was moving old backups to a separate folder, exact command:
>
>     sudo -u hdfs hadoop fs -mv /hbase-bak /backup/
>
> and shortly after the command, the secondary namenode crashed with the
> following message:
>
>  2012-05-09 09:37:44,168 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Edits file
> /NNBak/current/edits of size 7680232 edits # 45318 loaded in 1
> seconds.
> 2012-05-09 09:37:44,232 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> transactions: 0 Total time for transactions(ms): 0Number of
> transactions batched in S
> 2012-05-09 09:37:45,449 ERROR
> org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for
> /NNBak
> java.io.IOException: saveLeases found path
>
> /backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
> but no matching entry in namespace.
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5449)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1070)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1172)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1120)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
>        at java.lang.Thread.run(Thread.java:662)
> 2012-05-09 09:37:45,450 WARN
> org.apache.hadoop.hdfs.server.common.Storage: Removing storage dir
> /NNBak
> 2012-05-09 09:37:45,450 FATAL
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No edit streams
> are accessible
> java.lang.Exception: No edit streams are accessible
>        at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:410)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:429)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.open(FSEditLog.java:374)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1158)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:731)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:628)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:505)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:469)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:333)
>        at
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
>        at java.lang.Thread.run(Thread.java:662)
> 2012-05-09 09:37:45,451 INFO
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode:
> SHUTDOWN_MSG:
>
>
> It looks like it is expecting the file
>
> /backup/base-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
> which never existed,
> but there is
> /backup/hbase-bak/.logs/data1,60020,1304443405002/data1%3A60020.1308869024750
> which was moved.
>
>
> fsck / and fsck /backup return no issues.
> I can back up fsimage and edits from
>  http://namenode:50070/getimage?getimage=1
>  http://namenode:50070/getimage?getedit=1
>
>
> but all attempts to start the secondary namenode result in the same
> crash.
>
>
> On the primary namenode, all edits go to edits.new and edits is not
> being updated.
>
> Looking at the output of "strings edits.new" I see lines like:
>
>  /backup/base-bak/.logs
> 1336585759935
> hdfs
> supergroup
> :/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002
> 1336585759963
> hdfs
> supergroup
> a/backup/hbase-bak/.logs/data1.inadco.gg,60020,1304443405002/
> data1.inadco.gg%3A60020.1308869024750
> 1336585759995
> `/backup/base-bak/.logs/data1.inadco.gg,60020,1304443405002/
> data1.inadco.gg%3A60020.1308869024750
> 1336585760017
> 1336585760017
> 67108864
>
> with entries under both /backup/hbase-bak and /backup/base-bak
>
>
>
> Thanks
> -- Alex
>



-- 


Thanks & Regards,
李剑 Jameson Li
Focus on Hadoop,Mysql