Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/03/04 20:41:56 UTC

[jira] Commented: (HADOOP-5399) Simulated datanodes crashes NameNode

    [ https://issues.apache.org/jira/browse/HADOOP-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678873#action_12678873 ] 

Hairong Kuang commented on HADOOP-5399:
---------------------------------------

It turns out this bug is caused by HADOOP-5384. Simulated datanodes send block reports to the NN that contain a block with an invalid generation stamp, GenerationStamp.WILDCARD_STAMP. The NN finds that the block does not belong to any file and marks it invalid. Then ReplicationMonitor schedules the block to be deleted on its datanode by adding it to the invalidateSet of its DatanodeDescriptor, which is a TreeSet. Adding the block to the invalidateSet triggers a call to Block#compareTo, which throws IllegalStateException on a wildcard generation stamp. ReplicationMonitor calls System.exit to shut down the NN when it catches a RuntimeException, so the NN crashes.

A simple solution is for block report processing to filter out blocks with a wildcard generation stamp.
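
To make the failure mode concrete, here is a minimal standalone sketch of it, not the actual Hadoop code. The class and method names (WildcardStampSketch, SketchBlock, filterReport, addAllThrows) are hypothetical stand-ins; the wildcard value 1 is taken from the "generationStamp (=1)" log line in the issue description. It shows a TreeSet insert reaching compareTo and throwing, and the proposed filter avoiding that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class WildcardStampSketch {
    // Assumption: the wildcard value is 1, as printed in the log in this issue.
    static final long WILDCARD_STAMP = 1L;

    // Hypothetical stand-in for Block: comparing rejects wildcard stamps.
    static final class SketchBlock implements Comparable<SketchBlock> {
        final long blockId;
        final long generationStamp;

        SketchBlock(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }

        private static void validate(long stamp) {
            if (stamp == WILDCARD_STAMP) {
                throw new IllegalStateException(
                    "generationStamp (=" + stamp + ") == WILDCARD_STAMP");
            }
        }

        @Override
        public int compareTo(SketchBlock other) {
            validate(this.generationStamp);
            validate(other.generationStamp);
            int c = Long.compare(blockId, other.blockId);
            return c != 0 ? c : Long.compare(generationStamp, other.generationStamp);
        }
    }

    // Proposed fix, sketched: drop wildcard-stamped blocks from a report
    // before any of them can reach a sorted collection.
    static List<SketchBlock> filterReport(List<SketchBlock> report) {
        List<SketchBlock> kept = new ArrayList<>();
        for (SketchBlock b : report) {
            if (b.generationStamp != WILDCARD_STAMP) {
                kept.add(b);
            }
        }
        return kept;
    }

    // Stand-in for adding blocks to the invalidateSet (a TreeSet).
    // Returns true if an insert throws IllegalStateException.
    static boolean addAllThrows(List<SketchBlock> blocks) {
        TreeSet<SketchBlock> invalidateSet = new TreeSet<>();
        try {
            for (SketchBlock b : blocks) {
                invalidateSet.add(b); // TreeSet orders elements via compareTo
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<SketchBlock> report = Arrays.asList(
            new SketchBlock(447L, 2L),
            new SketchBlock(448L, WILDCARD_STAMP));
        System.out.println("unfiltered throws: " + addAllThrows(report));
        System.out.println("filtered throws:   " + addAllThrows(filterReport(report)));
    }
}
```

In this sketch the exception is caught so both cases can be compared; in the NN the same exception propagates to ReplicationMonitor, which is what leads to the shutdown described above.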

> Simulated datanodes crashes NameNode
> ------------------------------------
>
>                 Key: HADOOP-5399
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5399
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_448_1 on
> XX size 10 does not belong to any file.
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_448 is added
> to invalidSet of XX
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_447_1 on
> XX size 10 does not belong to any file.
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_447 is added
> to invalidSet of XX
> WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received
> Runtime exception. java.lang.IllegalStateException: generationStamp (=1) == GenerationStamp.WILDCARD_STAMP
> INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at YY
> ************************************************************/
