Posted to hdfs-dev@hadoop.apache.org by "Chris Nauroth (JIRA)" <ji...@apache.org> on 2013/07/18 07:34:48 UTC

[jira] [Resolved] (HDFS-5001) Branch-1-Win TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup failed caused by 1) old APIs and 2) incorrect value of depthOfAllLeaves

     [ https://issues.apache.org/jira/browse/HDFS-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved HDFS-5001.
---------------------------------

    Resolution: Fixed
    
> Branch-1-Win TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup failed caused by 1) old APIs and 2) incorrect value of depthOfAllLeaves
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5001
>                 URL: https://issues.apache.org/jira/browse/HDFS-5001
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1-win
>            Reporter: Xi Fang
>            Assignee: Xi Fang
>             Fix For: 1-win
>
>         Attachments: HDFS-5001.patch
>
>
> After the backport patch of HDFS-4975 was committed, TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup failed. 
> TestReplicationPolicyWithNodeGroup fails because part of the HDFS-3941 patch is missing. Our patch for HADOOP-495 causes superclass methods to be called incorrectly. More specifically, HDFS-4975 backported HDFS-4350, HDFS-4351, and HDFS-3912 to enable the method parameter "boolean avoidStaleNodes", and updated the APIs in BlockPlacementPolicyDefault. However, the overriding methods in ReplicationPolicyWithNodeGroup weren't updated to the new signatures.
> The cause for the failure of TestAzureBlockPlacementPolicy is similar.
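
The signature mismatch described above can be illustrated with a minimal sketch. The class and method names below are hypothetical stand-ins, not the real Hadoop classes or signatures. The assumption is only that a superclass method gained a boolean parameter while a subclass kept the old parameter list, which turns the subclass method into an overload that the superclass never calls.

```java
// Hypothetical stand-in for a base placement policy; not the real Hadoop API.
class DefaultPolicy {
    // New API: the method gained a boolean avoidStaleNodes parameter.
    String chooseLocalNode(String node, boolean avoidStaleNodes) {
        return "default:" + node;
    }

    // Entry point that always dispatches to the two-argument method.
    String chooseTarget(String node) {
        return chooseLocalNode(node, true);
    }
}

// Hypothetical subclass that was NOT updated after the API change.
class StaleNodeGroupPolicy extends DefaultPolicy {
    // Old parameter list: this is now an overload, not an override,
    // so chooseTarget() in the superclass never reaches it.
    String chooseLocalNode(String node) {
        return "nodegroup:" + node;
    }
}

// Hypothetical subclass updated to the new signature.
class FixedNodeGroupPolicy extends DefaultPolicy {
    // @Override makes the compiler reject any future signature drift.
    @Override
    String chooseLocalNode(String node, boolean avoidStaleNodes) {
        return "nodegroup:" + node;
    }
}
```

Annotating the subclass methods with @Override is one way to surface this kind of drift at compile time rather than in a failing test.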
> In addition, TestAzureBlockPlacementPolicy has an error. Here is the error info.
> Testcase: testPolicyWithDefaultRacks took 0.005 sec
> Caused an ERROR
> Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
> org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
> at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:396)
> at org.apache.hadoop.hdfs.server.namenode.TestAzureBlockPlacementPolicy.testPolicyWithDefaultRacks(TestAzureBlockPlacementPolicy.java:779)
> The error is caused by a check in NetworkTopology#add(Node node)
> {code}
> if (depthOfAllLeaves != node.getLevel()) {
>   LOG.error("Error: can't add leaf node at depth " +
>       node.getLevel() + " to topology:\n" + oldTopoStr);
>   throw new InvalidTopologyException("Invalid network topology. " +
>       "You cannot have a rack and a non-rack node at the same " +
>       "level of the network topology.");
> }
> {code}
> The problem with this check is that removing a node from the cluster with NetworkTopology#remove(Node node) does not change depthOfAllLeaves. As a result, we can't reset NetworkTopology#depthOfAllLeaves for a cluster's old topology just by removing all of its DataNodes. See TestAzureBlockPlacementPolicy#testPolicyWithDefaultRacks():
> {code}
> // clear the old topology
> for (Node node : dataNodes) {
>   cluster.remove(node);
> }
> {code}
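
The stale-depth behaviour described above can be reproduced with a simplified model. ToyTopology below is a hypothetical class, not org.apache.hadoop.net.NetworkTopology. It assumes the leaf depth is recorded on the first add and never cleared by remove. The resetOnEmpty flag shows one possible remedy (clearing the recorded depth once the last leaf is gone), which is not necessarily what HDFS-5001.patch does.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of the check discussed above;
// not the real NetworkTopology implementation.
class ToyTopology {
    private final Set<String> leaves = new HashSet<>();
    private int depthOfAllLeaves = -1; // -1 means "no leaf recorded yet"
    private final boolean resetOnEmpty;

    ToyTopology(boolean resetOnEmpty) {
        this.resetOnEmpty = resetOnEmpty;
    }

    // Depth is the number of path components, e.g. "/rack1/dn1" has depth 2.
    static int depthOf(String path) {
        return path.split("/").length - 1;
    }

    void add(String path) {
        int depth = depthOf(path);
        if (depthOfAllLeaves == -1) {
            depthOfAllLeaves = depth; // first leaf fixes the expected depth
        } else if (depthOfAllLeaves != depth) {
            throw new IllegalStateException(
                "can't add leaf node at depth " + depth);
        }
        leaves.add(path);
    }

    void remove(String path) {
        leaves.remove(path);
        // Assumed remedy: forget the recorded depth once no leaves remain.
        if (resetOnEmpty && leaves.isEmpty()) {
            depthOfAllLeaves = -1;
        }
    }
}
```

With resetOnEmpty set to false, adding "/rack1/dn1", removing it, and then adding "/dc1/rack1/dn1" throws, which mirrors the failure in testPolicyWithDefaultRacks. With resetOnEmpty set to true, the same sequence succeeds.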

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira