Posted to common-issues@hadoop.apache.org by "Xi Fang (JIRA)" <ji...@apache.org> on 2013/07/10 00:37:49 UTC

[jira] [Created] (HADOOP-9714) Branch-1-win TestReplicationPolicy failed caused by stale data node handling

Xi Fang created HADOOP-9714:
-------------------------------

             Summary: Branch-1-win TestReplicationPolicy failed caused by stale data node handling
                 Key: HADOOP-9714
                 URL: https://issues.apache.org/jira/browse/HADOOP-9714
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1-win
            Reporter: Xi Fang
            Assignee: Xi Fang
             Fix For: 1-win


Condor-Branch-1 TestReplicationPolicy failed on the following tests:
* testChooseTargetWithMoreThanAvailableNodes()
* testChooseTargetWithStaleNodes()
* testChooseTargetWithHalfStaleNodes()

The root cause of the testChooseTargetWithMoreThanAvailableNodes() failure is the following.
In BlockPlacementPolicyDefault#chooseTarget():
{code}
      chooseRandom(numOfReplicas, NodeBase.ROOT, excludedNodes,
          blocksize, maxNodesPerRack, results);
    } catch (NotEnoughReplicasException e) {
      FSNamesystem.LOG.warn("Not able to place enough replicas, still in need of " + numOfReplicas);
{code}
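However, numOfReplicas is an int (a primitive type in Java), so it is passed into chooseRandom() by value. Any update to numOfReplicas inside chooseRandom() does not change the value in chooseTarget(), so the warning in the catch block logs the original count rather than the number of replicas still needed.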
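
The pass-by-value behaviour described above can be reproduced outside Hadoop. The following is a minimal sketch, not the actual Hadoop code: PassByValueDemo, NotEnoughException, chooseByValue() and the two helper methods are hypothetical names made up for illustration. It shows that the caller's local variable is unchanged after the call, and one possible way (an assumption, not necessarily the fix applied for this issue) to get the remaining count back to the caller by carrying it in the exception.

```java
// Hypothetical sketch; none of these names come from Hadoop.
class PassByValueDemo {

    /** Illustrative stand-in for an exception that carries the remaining count. */
    static class NotEnoughException extends Exception {
        final int stillNeeded;

        NotEnoughException(int stillNeeded) {
            super("still in need of " + stillNeeded);
            this.stillNeeded = stillNeeded;
        }
    }

    // Decrements its own copy of 'needed'; the caller's variable is untouched.
    static void chooseByValue(int needed, int available) throws NotEnoughException {
        while (needed > 0 && available > 0) {
            needed--;
            available--;
        }
        if (needed > 0) {
            throw new NotEnoughException(needed);
        }
    }

    // Returns the value the caller would log if it read its own local variable.
    static int remainingSeenByCaller(int requested, int available) {
        int numOfReplicas = requested;
        try {
            chooseByValue(numOfReplicas, available);
        } catch (NotEnoughException e) {
            // numOfReplicas still holds the requested count at this point.
        }
        return numOfReplicas;
    }

    // Returns the remaining count as reported by the callee through the exception.
    static int remainingFromException(int requested, int available) {
        try {
            chooseByValue(requested, available);
            return 0;
        } catch (NotEnoughException e) {
            return e.stillNeeded;
        }
    }

    public static void main(String[] args) {
        System.out.println("caller's local: " + remainingSeenByCaller(5, 3));
        System.out.println("from exception: " + remainingFromException(5, 3));
    }
}
```

With 5 replicas requested and 3 slots available, the caller's local variable still reads 5, while the exception reports 2. Returning the updated count from the callee would be another way to achieve the same effect.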

The root cause for testChooseTargetWithStaleNodes() and testChooseTargetWithHalfStaleNodes() is that the current BlockPlacementPolicyDefault#chooseTarget() does not check whether a node is stale.
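
To illustrate what a staleness check during target selection might look like, here is a minimal sketch. It is not the Hadoop implementation: StaleNodeFilterDemo, Node, isStale() and chooseTargets() are hypothetical names. Two things are assumptions made for the example: that a node counts as stale when its last heartbeat is older than a configured interval, and that stale nodes are used only as a fallback when there are not enough fresh ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; Node, isStale and the stale interval are illustrative assumptions.
class StaleNodeFilterDemo {

    static class Node {
        final String name;
        final long lastHeartbeatMillis;

        Node(String name, long lastHeartbeatMillis) {
            this.name = name;
            this.lastHeartbeatMillis = lastHeartbeatMillis;
        }

        // Assumed definition: stale if no heartbeat within the given interval.
        boolean isStale(long nowMillis, long staleIntervalMillis) {
            return nowMillis - lastHeartbeatMillis >= staleIntervalMillis;
        }
    }

    // First pass picks only non-stale nodes; a second pass falls back to
    // stale nodes if the first pass did not find enough targets.
    static List<String> chooseTargets(List<Node> candidates, int numOfReplicas,
                                      long nowMillis, long staleIntervalMillis) {
        List<String> results = new ArrayList<>();
        for (Node n : candidates) {
            if (results.size() == numOfReplicas) {
                break;
            }
            if (!n.isStale(nowMillis, staleIntervalMillis)) {
                results.add(n.name);
            }
        }
        for (Node n : candidates) {
            if (results.size() == numOfReplicas) {
                break;
            }
            if (n.isStale(nowMillis, staleIntervalMillis)) {
                results.add(n.name);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        List<Node> nodes = Arrays.asList(
            new Node("a", 99_000L),   // fresh
            new Node("b", 40_000L),   // stale: 60 s since last heartbeat
            new Node("c", 98_000L));  // fresh
        System.out.println(chooseTargets(nodes, 2, now, 30_000L));
        System.out.println(chooseTargets(nodes, 3, now, 30_000L));
    }
}
```

In this sketch, asking for two targets skips the stale node "b" entirely, while asking for three falls back to it after the fresh nodes are used up.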
