Posted to hdfs-dev@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2014/07/30 22:26:39 UTC
[jira] [Resolved] (HDFS-1231) Generation Stamp mismatches, leading to failed append

     [ https://issues.apache.org/jira/browse/HDFS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-1231.
------------------------------------

    Resolution: Won't Fix

Append was overhauled in 2.x. Closing.

> Generation Stamp mismatches, leading to failed append
> -----------------------------------------------------
>
>                 Key: HDFS-1231
>                 URL: https://issues.apache.org/jira/browse/HDFS-1231
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> - Summary: recoverBlock is not atomic, so a retry fails when a failure
> occurs partway through recovery.
>  
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = crash
> + When/where failure happens = (see below)
>  
> - Details:
> Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1, with dn1 as the primary.
> When appending, the client first calls dn1.recoverBlock() so that all the datanodes in the
> pipeline agree on the new Generation Stamp (GS1) and the length of the block.
> The client then sends a data packet to dn3. dn3 forwards this packet to the downstream
> datanodes (dn2 and dn1) and starts writing to its own disk, then crashes AFTER writing to
> the block file but BEFORE writing to the meta file. The client notices the crash and calls
> dn1.recoverBlock().
> dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo on dn2 and dn1).
> dn1 then calls NameNode.getNextGS() to get a new Generation Stamp (GS2).
> It then calls dn2.updateBlock(), which returns successfully.
> Next, dn1 starts its own updateBlock and crashes after renaming
> blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
> From the client's point of view, dn1.recoverBlock() therefore fails,
> but the GS for the corresponding block has already been incremented in the NameNode (GS2).
> The client retries by calling dn2.recoverBlock() with the old GS (GS1), which does not match
> the new GS at the NameNode (GS2) --> exception, so the append fails.
>  
> After all of this, the state is:
> - in dn3 (which has crashed)
> tmp/blk_X
> tmp/blk_X_GS1.meta
> - in dn2
> current/blk_X
> current/blk_X_GS2.meta
> - in dn1:
> current/blk_X
> current/blk_X_GS1.meta_tmpGS2
> - in NameNode, the block X has generation stamp GS1 (because dn1 has not called
> commitSyncronization yet).
>  
> Therefore, when the crashed datanodes restart, the block at dn1 is invalid because
> there is no meta file. At dn3, the block file and meta file are finalized, but the
> block is corrupt because of a CRC mismatch. At dn2, the GS of the block is GS2,
> which does not equal the generation stamp recorded for the block in the NameNode.
> Hence, the block blk_X is inaccessible.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)
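
The failure sequence in the quoted report can be sketched as a small simulation. This is a toy Python model, not HDFS code: the classes NameNode and DataNode, the functions recover_block and run_scenario, and both exception types are hypothetical names made up for illustration. It also assumes the stale-stamp check simply compares the client's stamp with the latest stamp the NameNode has handed out. dn3 is left out because it has already crashed by the time recovery starts.

```python
class GenerationStampMismatch(Exception):
    """Raised when the caller's generation stamp is stale (hypothetical)."""


class SimulatedCrash(Exception):
    """Stands in for a datanode crashing partway through an update."""


class NameNode:
    """Toy NameNode that only tracks the latest stamp handed out."""

    def __init__(self):
        self.latest_gs = 1  # GS1

    def next_gs(self):
        self.latest_gs += 1
        return self.latest_gs


class DataNode:
    """Toy datanode that only tracks the name of its meta file."""

    def __init__(self, name, crash_mid_update=False):
        self.name = name
        self.meta_file = "blk_X_GS1.meta"
        self.crash_mid_update = crash_mid_update

    def update_block(self, old_gs, new_gs):
        # Step 1: rename the meta file to a temporary name.
        self.meta_file = f"blk_X_GS{old_gs}.meta_tmpGS{new_gs}"
        if self.crash_mid_update:
            raise SimulatedCrash(self.name)
        # Step 2: rename the temporary file to its final name.
        self.meta_file = f"blk_X_GS{new_gs}.meta"


def recover_block(primary, peers, namenode, client_gs):
    """Non-atomic recovery: the stamp bump is never rolled back on failure."""
    if client_gs != namenode.latest_gs:
        raise GenerationStampMismatch(
            f"client has GS{client_gs}, NameNode is at GS{namenode.latest_gs}")
    new_gs = namenode.next_gs()  # side effect survives a later crash
    for dn in peers:
        dn.update_block(client_gs, new_gs)
    primary.update_block(client_gs, new_gs)  # the primary updates itself last
    return new_gs


def run_scenario():
    nn = NameNode()
    dn1 = DataNode("dn1", crash_mid_update=True)  # primary, crashes mid-rename
    dn2 = DataNode("dn2")
    client_gs = 1  # the client never learns about GS2
    outcomes = []
    try:
        recover_block(dn1, [dn2], nn, client_gs)
    except SimulatedCrash:
        outcomes.append("first attempt: primary crashed")
    try:
        recover_block(dn2, [], nn, client_gs)  # retry with the old stamp
    except GenerationStampMismatch:
        outcomes.append("retry: generation stamp mismatch")
    return outcomes, dn1.meta_file, dn2.meta_file, nn.latest_gs
```

In this model the retry can never succeed, because the only record of GS2 that the client could have used was lost with the failed first call. Making the stamp bump and the per-datanode updates a single all-or-nothing step, or letting the retry rediscover the current stamp, would avoid the mismatch in the sketch; whether either applies to the real code path is not something the report states.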



--
This message was sent by Atlassian JIRA
(v6.2#6252)