Posted to hdfs-dev@hadoop.apache.org by "Kitti Nanasi (JIRA)" <ji...@apache.org> on 2018/07/26 15:04:00 UTC

[jira] [Created] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

Kitti Nanasi created HDFS-13770:
-----------------------------------

             Summary: dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
                 Key: HDFS-13770
                 URL: https://issues.apache.org/jira/browse/HDFS-13770
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 2.7.7
            Reporter: Kitti Nanasi
            Assignee: Kitti Nanasi


The "Missing blocks (with replication factor 1)" metric is not always decreased when a file is deleted.

When a file is deleted, the remove function of UnderReplicatedBlocks can be called with the wrong priority hint (UnderReplicatedBlocks.LEVEL). In that case the corruptReplOneBlocks metric is not decreased, even though the block is removed from the priority queue that contains it.

The corresponding code:
{code:java}
/** remove a block from an under replication queue */
synchronized boolean remove(BlockInfo block,
    int oldReplicas,
    int oldReadOnlyReplicas,
    int decommissionedReplicas,
    int oldExpectedReplicas) {
  final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
      decommissionedReplicas, oldExpectedReplicas);
  boolean removedBlock = remove(block, priLevel);
  if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
      oldExpectedReplicas == 1 &&
      removedBlock) {
    corruptReplOneBlocks--;
    assert corruptReplOneBlocks >= 0 :
        "Number of corrupt blocks with replication factor 1 " +
        "should be non-negative";
  }
  return removedBlock;
}

/**
 * Remove a block from the under replication queues.
 *
 * The priLevel parameter is a hint of which queue to query
 * first: if negative or >= {@link #LEVEL} this shortcutting
 * is not attempted.
 *
 * If the block is not found in the nominated queue, an attempt is made to
 * remove it from all queues.
 *
 * <i>Warning:</i> This is not a synchronized method.
 * @param block block to remove
 * @param priLevel expected priority level
 * @return true if the block was found and removed from one of the priority queues
 */
boolean remove(BlockInfo block, int priLevel) {
  if (priLevel >= 0 && priLevel < LEVEL
      && priorityQueues.get(priLevel).remove(block)) {
    NameNode.blockStateChangeLog.debug(
        "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
        " from priority queue {}", block, priLevel);
    return true;
  } else {
    // Try to remove the block from all queues if the block was
    // not found in the queue for the given priority level.
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        NameNode.blockStateChangeLog.debug(
            "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
            " {} from priority queue {}", block, i);
        return true;
      }
    }
  }
  return false;
}
{code}
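For context, the wrong priority hint comes from the deletion path, which asks for removal from all queues instead of recomputing the block's priority. A minimal sketch of such a call site follows; the method name and the neededReplications field are illustrative assumptions, not an exact quote of the branch-2 BlockManager:
{code:java}
// Sketch only: method and field names are assumptions; the point is that the
// two-argument remove() is invoked with LEVEL as the priority hint on delete.
void removeBlockOnDelete(BlockInfo block) {
  // LEVEL means "search all queues". The two-argument remove() never touches
  // corruptReplOneBlocks; only the five-argument overload does, and only when
  // the recomputed priority equals QUEUE_WITH_CORRUPT_BLOCKS. So the block is
  // dropped from QUEUE_WITH_CORRUPT_BLOCKS by the fallback loop, but the
  // "missing blocks (with replication factor 1)" counter stays stale.
  neededReplications.remove(block, UnderReplicatedBlocks.LEVEL);
}
{code}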
This is already fixed on trunk by HDFS-10999, but that ticket also introduces new metrics, which I think shouldn't be backported to branch-2.
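A possible branch-2-only fix, sketched below, would let the all-queues fallback path keep the existing counter consistent instead of backporting the new metrics. This is only an illustration: the extra oldExpectedReplicas parameter and the placement of the decrement are assumptions, not the committed patch.
{code:java}
// Hypothetical sketch, not the committed patch: pass the old expected
// replication down so the fallback loop can correct corruptReplOneBlocks
// when the priority hint (e.g. LEVEL) did not identify the corrupt queue.
boolean remove(BlockInfo block, int priLevel, int oldExpectedReplicas) {
  if (priLevel >= 0 && priLevel < LEVEL
      && priorityQueues.get(priLevel).remove(block)) {
    NameNode.blockStateChangeLog.debug(
        "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
        " from priority queue {}", block, priLevel);
    return true;
  }
  // Fall back to searching every queue. If the block is actually found in
  // QUEUE_WITH_CORRUPT_BLOCKS and its file had replication factor 1,
  // decrement the counter here as well, so deletions keep the
  // "missing blocks (with replication factor 1)" metric accurate.
  for (int i = 0; i < LEVEL; i++) {
    if (i != priLevel && priorityQueues.get(i).remove(block)) {
      if (i == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
        corruptReplOneBlocks--;
      }
      NameNode.blockStateChangeLog.debug(
          "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
          " {} from priority queue {}", block, i);
      return true;
    }
  }
  return false;
}
{code}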

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org