Posted to common-dev@hadoop.apache.org by "yajun.don" <do...@gmail.com> on 2009/09/24 05:24:24 UTC

How to recover the state from the previous checkpoint in HDFS when the cluster is down and the edits file is corrupted?

Hi, the NN was unable to process any requests because it kept throwing exceptions.
Now the NN cannot be started at all, because fseditlog reads an unknown opcode
while the NN is loading the edits file.

The dfs.name.dir directory now contains:

[yuchao@d1950 ~]$ ll /home/yuchao/hadoop/name/current/
total 4332
-rw-rw-r-- 1 yuchao yuchao   14289 Sep 23 16:13 edits
-rw-rw-r-- 1 yuchao yuchao 1049088 Sep 23 17:08 edits.new
-rw-rw-r-- 1 yuchao yuchao 2179387 Sep 23 15:13 fsimage
-rw-rw-r-- 1 yuchao yuchao 2184070 Sep 23 21:36 fsimage.ckpt
-rw-rw-r-- 1 yuchao yuchao       8 Sep 23 15:13 fstime
-rw-rw-r-- 1 yuchao yuchao     100 Sep 23 15:13 VERSION

and the fs.checkpoint.dir directory contains:

[yuchao@d1950 ~]$ ll fordim/tmp/dfs/namesecondary/current/
total 4288
-rw-rw-r-- 1 yuchao yuchao   14289 Sep 23 18:03 edits
-rw-rw-r-- 1 yuchao yuchao       4 Sep 23 18:03 edits.new
-rw-rw-r-- 1 yuchao yuchao 2179387 Sep 23 18:03 fsimage
-rw-rw-r-- 1 yuchao yuchao 2178424 Sep 23 18:03 fsimage.ckpt