Posted to hdfs-dev@hadoop.apache.org by "Yuanbo Liu (Jira)" <ji...@apache.org> on 2022/07/13 02:35:00 UTC

[jira] [Created] (HDFS-16657) Changing pool-level lock to volume-level lock for invalidation of blocks

Yuanbo Liu created HDFS-16657:
---------------------------------

             Summary: Changing pool-level lock to volume-level lock for invalidation of blocks
                 Key: HDFS-16657
                 URL: https://issues.apache.org/jira/browse/HDFS-16657
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Yuanbo Liu
         Attachments: image-2022-07-13-10-25-37-383.png, image-2022-07-13-10-27-01-386.png, image-2022-07-13-10-27-44-258.png

Recently we observed that DataNode (DN) heartbeats became slow in a very busy cluster. Here is the chart:

!image-2022-07-13-10-25-37-383.png!

 

After taking a jstack dump of the DN, we found that the DN heartbeat was stuck in block invalidation:

!image-2022-07-13-10-27-01-386.png!

!image-2022-07-13-10-27-44-258.png!

The key code is:
{code:java}
try {
  File blockFile = new File(info.getBlockURI());
  if (blockFile != null && blockFile.getParentFile() == null) {
    errors.add("Failed to delete replica " + invalidBlks[i]
        +  ". Parent not found for block file: " + blockFile);
    continue;
  }
} catch(IllegalArgumentException e) {
  LOG.warn("Parent directory check failed; replica " + info
      + " is not backed by a local file");
}
{code}
The DN tries to locate the parent path of the block file, so disk I/O happens while the pool-level lock is held. When the disk is very busy with high I/O wait, all pending threads are blocked on the pool-level lock, and heartbeat latency becomes high. We propose changing the pool-level lock to a volume-level lock for block invalidation.
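
To illustrate the idea, here is a minimal, self-contained sketch of per-volume locking. It is not the HDFS implementation: the class VolumeLockSketch, the method names, and the IDs "BP-1", "vol-A" and "vol-B" are all hypothetical. It only shows that when each (block pool, volume) pair has its own lock, a thread stalled on one volume does not block work on another volume of the same pool.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: none of these names come from the HDFS code base.
public class VolumeLockSketch {
  // One lock per (block pool, volume) pair instead of one lock per block pool.
  private final ConcurrentHashMap<String, ReentrantLock> volumeLocks =
      new ConcurrentHashMap<>();

  private ReentrantLock lockFor(String bpid, String volumeId) {
    return volumeLocks.computeIfAbsent(bpid + "/" + volumeId,
        k -> new ReentrantLock());
  }

  /** Runs the work while holding only the lock of the given volume. */
  public void withVolumeLock(String bpid, String volumeId, Runnable work) {
    ReentrantLock lock = lockFor(bpid, volumeId);
    lock.lock();
    try {
      work.run();
    } finally {
      lock.unlock();
    }
  }

  /**
   * Simulates a thread stalled on vol-A (standing in for slow disk I/O) and
   * checks whether another thread can still finish its work on vol-B.
   */
  public static boolean otherVolumeStaysAvailable() {
    VolumeLockSketch locks = new VolumeLockSketch();
    CountDownLatch slowHoldsLock = new CountDownLatch(1);
    CountDownLatch releaseSlow = new CountDownLatch(1);
    CountDownLatch fastDone = new CountDownLatch(1);

    // Holds the vol-A lock until releaseSlow is counted down.
    Thread slow = new Thread(() -> locks.withVolumeLock("BP-1", "vol-A", () -> {
      slowHoldsLock.countDown();
      try {
        releaseSlow.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }));
    // Needs only the vol-B lock, so it should not wait for the slow thread.
    Thread fast = new Thread(
        () -> locks.withVolumeLock("BP-1", "vol-B", fastDone::countDown));

    try {
      slow.start();
      slowHoldsLock.await();
      fast.start();
      return fastDone.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      releaseSlow.countDown(); // let the slow thread finish
    }
  }

  public static void main(String[] args) {
    System.out.println("vol-B usable while vol-A is busy: "
        + otherVolumeStaysAvailable());
  }
}
```

With a single pool-level lock, the second thread in this sketch would have to wait until the first one released the lock. The trade-off is that any code path needing a consistent view across all volumes of a pool would have to take a coarser lock or several volume locks in a fixed order.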

cc: [~hexiaoqiao] [~Aiphag0] 


