Posted to hdfs-dev@hadoop.apache.org by "Lukas Majercak (JIRA)" <ji...@apache.org> on 2017/03/24 18:38:41 UTC

[jira] [Created] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval

Lukas Majercak created HDFS-11576:
-------------------------------------

             Summary: Block recovery will fail indefinitely if recovery time > heartbeat interval
                 Key: HDFS-11576
                 URL: https://issues.apache.org/jira/browse/HDFS-11576
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode, hdfs, namenode
    Affects Versions: 3.0.0-alpha2, 3.0.0-alpha1, 2.7.3, 2.7.2, 2.7.1
            Reporter: Lukas Majercak
            Assignee: Lukas Majercak
            Priority: Critical


Block recovery will fail indefinitely if the time to recover a block is always longer than the heartbeat interval. Scenario:
1. DN sends heartbeat 
2. NN sends a recovery command to DN, recoveryID=X
3. DN starts recovery
4. DN sends another heartbeat
5. NN sends a recovery command to DN, recoveryID=X+1
6. After the first recovery succeeds, DN calls commitBlockSynchronization on NN, which fails because X < X+1
... 
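
The scenario above can be sketched as a small discrete-time simulation. This is only a toy model, not HDFS code: the function name simulate, the integer time steps, and the parameter names are made up for illustration. It assumes that NN issues a new recovery command with an incremented recoveryID on every heartbeat while the block is unrecovered, and that a commit is accepted only if its recoveryID is the newest one NN has issued.

```python
def simulate(recovery_time, heartbeat_interval, max_time=100):
    """Toy model of the race between block recovery and heartbeats.

    Returns the time step at which a commit is accepted, or None if no
    commit is accepted by max_time. All names here are hypothetical.
    """
    latest_id = 0    # newest recoveryID the toy NN has issued
    in_flight = []   # (finish_time, recovery_id) for recoveries running on the toy DN

    for now in range(max_time + 1):
        # DN finishes any recovery due now and tries to commit it.
        for finish, rid in list(in_flight):
            if finish == now:
                in_flight.remove((finish, rid))
                # Assumption: NN rejects a commit whose recoveryID is stale.
                if rid == latest_id:
                    return now

        # DN heartbeats; assumption: NN answers with a new recovery command.
        if now % heartbeat_interval == 0:
            latest_id += 1
            in_flight.append((now + recovery_time, latest_id))

    return None


# Recovery slower than the heartbeat interval: every commit arrives stale.
print(simulate(recovery_time=5, heartbeat_interval=3))
# Recovery faster than the heartbeat interval: the first commit is accepted.
print(simulate(recovery_time=2, heartbeat_interval=3))
```

In this model, whenever recovery_time exceeds heartbeat_interval, a newer recoveryID has always been issued before any recovery finishes, so no commit is ever accepted within the simulated window.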


