Posted to hdfs-user@hadoop.apache.org by Nitin Goyal <ni...@gmail.com> on 2014/07/04 09:29:31 UTC

In progress edit log from last run not being played in case of a cluster (HA) restart

Hi All,

I am running Hadoop 2.4.0. I am trying to restart my HA cluster, but since
there isn't a way to gracefully shut down the NN (AFAIK), I am running into
a (sort of) race condition. A client issues a delete command and the NN
successfully deletes the requested file (the in-progress edit logs across
the NN and JNs are updated and the DNs physically delete the blocks). But
before the current in-progress edit log segment can be closed, the NN is
stopped. When the NN is started again, it reads the edit logs from the JNs
but does not consider the last in-progress edit log from the previous run.
As a result, the NN expects more blocks to be reported than the DNs
actually have. Sometimes this difference is large enough (relative to
dfs.namenode.safemode.threshold-pct) to leave the NN in safemode forever.
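
To make the safemode arithmetic concrete, here is a minimal sketch of how I
understand the threshold check. It is not the actual NameNode code: the
class and method names are hypothetical, and the block counts and the 0.999
threshold are example values only.

```java
public class SafeModeThresholdSketch {

    // Hypothetical helper: number of blocks that must be reported before
    // safemode can be left, for a given threshold-pct (assumed to be a
    // fraction between 0 and 1).
    static long requiredBlocks(long expectedBlocks, double thresholdPct) {
        return (long) (expectedBlocks * thresholdPct);
    }

    // Hypothetical helper: true once enough blocks have been reported.
    static boolean canLeaveSafeMode(long expectedBlocks, long reportedBlocks,
                                    double thresholdPct) {
        return reportedBlocks >= requiredBlocks(expectedBlocks, thresholdPct);
    }

    public static void main(String[] args) {
        double thresholdPct = 0.999;  // example value for the threshold
        long expected = 100_000;      // blocks the NN still believes exist
        long deleted = 500;           // blocks the DNs already removed
        long reported = expected - deleted;

        // The delete was never replayed, so the NN waits for blocks that
        // no DN can ever report.
        System.out.println("stale namespace, can leave safemode: "
                + canLeaveSafeMode(expected, reported, thresholdPct));

        // Had the delete been replayed, the expected count would match.
        System.out.println("replayed namespace, can leave safemode: "
                + canLeaveSafeMode(reported, reported, thresholdPct));
    }
}
```

With these example numbers the stale namespace is short by 500 blocks, far
more than the roughly 100 missing blocks the threshold tolerates, so the
check never passes no matter how long the NN waits.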

This problem looks generic to me. Can someone please confirm whether this
is indeed a bug, or point out where I may be wrong (either in my process or
in my understanding)?


I modified the NN code to also read the in-progress edit log from the JNs,
and my problem was resolved. But I am not sure what implications this might
have. Here is the code change I made:

diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImag
index e78153f..b864ec1 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
@@ -623,7 +623,7 @@ private boolean loadFSImage(FSNamesystem target, StartupOption startOpt,
       }
       editStreams = editLog.selectInputStreams(
           imageFiles.get(0).getCheckpointTxId() + 1,
-          toAtLeastTxId, recovery, false);
+          toAtLeastTxId, recovery, true);
     } else {
       editStreams = FSImagePreTransactionalStorageInspector
         .getEditLogStreams(storage);
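
To illustrate what I believe the changed boolean controls (whether
in-progress segments may be selected for replay), here is a toy model. The
Segment class, the select and lastReplayedTxId methods, and all transaction
IDs are hypothetical and are not Hadoop code; the sketch only shows how the
flag changes which transactions get replayed. It says nothing about whether
replaying the in-progress tail is safe.

```java
import java.util.ArrayList;
import java.util.List;

public class InProgressSelectionSketch {

    // Hypothetical stand-in for an edit log segment; not a Hadoop class.
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    // Toy analogue of the selection step: keep segments that contain
    // transactions at or after fromTxId, and skip in-progress segments
    // unless inProgressOk is set.
    static List<Segment> select(List<Segment> all, long fromTxId,
                                boolean inProgressOk) {
        List<Segment> selected = new ArrayList<>();
        for (Segment s : all) {
            if (s.lastTxId < fromTxId) {
                continue; // already covered by the checkpoint image
            }
            if (s.inProgress && !inProgressOk) {
                continue; // never finalized, so ignored by default
            }
            selected.add(s);
        }
        return selected;
    }

    // Highest transaction ID that would be applied on top of the checkpoint.
    static long lastReplayedTxId(List<Segment> selected, long checkpointTxId) {
        long last = checkpointTxId;
        for (Segment s : selected) {
            last = Math.max(last, s.lastTxId);
        }
        return last;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(1, 100, false));
        segments.add(new Segment(101, 200, false));
        // Holds the delete; the NN was stopped before this was finalized.
        segments.add(new Segment(201, 230, true));

        long checkpointTxId = 100;
        long from = checkpointTxId + 1;

        System.out.println("inProgressOk=false replays up to txid "
                + lastReplayedTxId(select(segments, from, false), checkpointTxId));
        System.out.println("inProgressOk=true replays up to txid "
                + lastReplayedTxId(select(segments, from, true), checkpointTxId));
    }
}
```

In this model the delete lives in transactions 201 to 230, so it is only
applied when in-progress segments are allowed, which matches what I saw
after the change.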

-- 
Regards
Nitin Goyal