Posted to hdfs-dev@hadoop.apache.org by "dragon (JIRA)" <ji...@apache.org> on 2016/03/15 09:16:36 UTC

[jira] [Created] (HDFS-10106) CLONE - Erasure coding: fix priority level of UnderReplicatedBlocks for striped block

dragon created HDFS-10106:
-----------------------------

             Summary: CLONE - Erasure coding: fix priority level of UnderReplicatedBlocks for striped block
                 Key: HDFS-10106
                 URL: https://issues.apache.org/jira/browse/HDFS-10106
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: dragon
            Assignee: Walter Su
             Fix For: HDFS-7285


Issue 1: correctly mark corrupted blocks.
Issue 2: distinguish highest-risk priority from normal-risk priority.
{code:title=UnderReplicatedBlocks.java}
  private int getPriority(int curReplicas,
  ...
    } else if (curReplicas == 1) {
      // only one replica - risk of loss
      // highest priority
      return QUEUE_HIGHEST_PRIORITY;
  ...
{code}
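For striped blocks, we should return QUEUE_HIGHEST_PRIORITY when curReplicas == 6 (assuming a 6+3 schema), because 6 internal blocks is the minimum from which the block group can still be reconstructed.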
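
A minimal sketch of the proposed rule, assuming a schema with a fixed number of data and parity blocks. The class, enum, and method names are hypothetical and not taken from UnderReplicatedBlocks.java; mapping "fewer live internal blocks than data blocks" to a corrupt state is also an assumption made for illustration.

```java
// Hypothetical sketch: names here are illustrative, not the real HDFS API.
final class StripedPrioritySketch {

    /** Simplified risk levels; the real method returns int queue constants. */
    enum Risk { CORRUPT, HIGHEST, NORMAL }

    /**
     * Classifies an under-replicated striped block group.
     *
     * @param curInternalBlocks live internal blocks (curReplicas in the issue)
     * @param dataBlocks        data blocks in the schema (6 for 6+3)
     * @param parityBlocks      parity blocks in the schema (3 for 6+3)
     */
    static Risk stripedRisk(int curInternalBlocks, int dataBlocks, int parityBlocks) {
        if (dataBlocks <= 0 || parityBlocks < 0) {
            throw new IllegalArgumentException("invalid schema");
        }
        if (curInternalBlocks < 0 || curInternalBlocks > dataBlocks + parityBlocks) {
            throw new IllegalArgumentException("invalid live block count");
        }
        if (curInternalBlocks < dataBlocks) {
            // Too few internal blocks remain to reconstruct the group (assumed: corrupt).
            return Risk.CORRUPT;
        }
        if (curInternalBlocks == dataBlocks) {
            // One more loss makes the group unrecoverable: highest risk.
            return Risk.HIGHEST;
        }
        // At least one spare internal block remains: normal risk.
        return Risk.NORMAL;
    }
}
```

Under these assumptions, a 6+3 group with 6 live internal blocks is classified as highest risk, the striped analogue of curReplicas == 1 for a replicated block.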

This is important because of the following check:
{code:title=BlockManager.java}
DatanodeDescriptor[] chooseSourceDatanodes(BlockInfo block,
  ...
     if(priority != UnderReplicatedBlocks.QUEUE_HIGHEST_PRIORITY 
          && !node.isDecommissionInProgress() 
          && node.getNumberOfBlocksToBeReplicated() >= maxReplicationStreams)
      {
        continue; // already reached replication limit
      }
  ...
{code}
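When the block is not at the highest priority, this check skips busy nodes, so it may return too few source DNs (under a 6+3 schema, perhaps only 5 when 6 are needed), and recovery fails.
A busy node should not be skipped if a block has the highest risk/priority. The issue is that a striped block is never assigned the highest priority.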
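
The effect of the skip condition can be sketched with a simplified model of source selection. The Node class and chooseSources method below are hypothetical stand-ins, not the real DatanodeDescriptor or BlockManager API; only the shape of the condition follows the snippet quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a simplified model of the busy-node skip condition.
final class SourceSelectionSketch {

    /** Stand-in for a datanode holding one internal block of the group. */
    static final class Node {
        final String name;
        final int blocksToBeReplicated;
        final boolean decommissionInProgress;

        Node(String name, int blocksToBeReplicated, boolean decommissionInProgress) {
            this.name = name;
            this.blocksToBeReplicated = blocksToBeReplicated;
            this.decommissionInProgress = decommissionInProgress;
        }
    }

    /** Returns the nodes that remain usable as recovery sources. */
    static List<Node> chooseSources(List<Node> holders,
                                    boolean highestPriority,
                                    int maxReplicationStreams) {
        List<Node> sources = new ArrayList<>();
        for (Node node : holders) {
            if (!highestPriority
                    && !node.decommissionInProgress
                    && node.blocksToBeReplicated >= maxReplicationStreams) {
                continue; // already reached replication limit
            }
            sources.add(node);
        }
        return sources;
    }
}
```

With 6 holders of which one is busy, calling chooseSources with highestPriority set to false yields 5 sources, too few to reconstruct a 6+3 group, while passing true keeps all 6.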



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)