Posted to hdfs-dev@hadoop.apache.org by "HuangTao (Jira)" <ji...@apache.org> on 2020/04/13 03:53:00 UTC

[jira] [Created] (HDFS-15274) NN doesn't remove the blocks from the failed DatanodeStorageInfo

HuangTao created HDFS-15274:
-------------------------------

             Summary: NN doesn't remove the blocks from the failed DatanodeStorageInfo
                 Key: HDFS-15274
                 URL: https://issues.apache.org/jira/browse/HDFS-15274
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
            Reporter: HuangTao
            Assignee: HuangTao
             Fix For: 3.4.0


In our federation cluster, we found that the failed-volume state was inconsistent between two namespaces. The following logs are from the two namespaces, respectively.

NS1 received the failed storage info and removed the blocks associated with the failed storage.
{code:java}
[INFO] [IPC Server handler 76 on 8021] : Number of failed storages changes from 0 to 1
[INFO] [IPC Server handler 76 on 8021] : [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:NORMAL:X.X.X.X:50010:/data0/dfs failed.
[INFO] [org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@4fb57fb3] : Removed blocks associated with storage [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs from DataNode X.X.X.X:50010
[INFO] [IPC Server handler 73 on 8021] : Removed storage [DISK]DS-298de29e-9104-48dd-a674-5443a6126969:FAILED:X.X.X.X:50010:/data0/dfs from DataNode X.X.X.X:50010{code}
NS2, however, only recorded the failed-storage count and never removed the associated blocks.
{code:java}
[INFO] [IPC Server handler 87 on 8021] : Number of failed storages changes from 0 to 1  {code}

After digging into the code and simulating a disk failure with
{code:bash}
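# take the SCSI device offline, then delete it, to simulate a disk failure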
echo offline > /sys/block/sda/device/state
echo 1 > /sys/block/sda/device/delete
# re-mount the failed disk
rescan-scsi-bus.sh -a
systemctl daemon-reload
mount /data0
{code}

I found that the root cause is an inconsistency between the StorageReport[] and the VolumeFailureSummary collected in BPServiceActor#sendHeartBeat: they are fetched at different times, so a volume can fail after the storage reports are taken but before the failure summary is built. The NN then sees the storage as healthy in the reports even though the summary counts it as failed, and never removes its blocks.

{code:java}
StorageReport[] reports =
    dn.getFSDataset().getStorageReports(bpos.getBlockPoolId());
...
// the disk may fail before the next line executes,
// leaving the reports and the summary out of sync
VolumeFailureSummary volumeFailureSummary = dn.getFSDataset()
    .getVolumeFailureSummary();
int numFailedVolumes = volumeFailureSummary != null ?
    volumeFailureSummary.getFailedStorageLocations().length : 0;
{code}
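
To make the race concrete, here is a minimal, self-contained sketch (toy types and names only, not the real Hadoop classes) of how a volume failing between the two calls yields a heartbeat whose storage reports still list the volume as healthy while the failure summary already counts it as failed:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the heartbeat race in BPServiceActor#sendHeartBeat;
// the real types live in org.apache.hadoop.hdfs.server.protocol.
public class HeartbeatRaceDemo {
    // Volumes the DataNode currently considers healthy.
    static final Set<String> healthyVolumes = ConcurrentHashMap.newKeySet();

    // Stand-in for getStorageReports(): snapshot taken at time T1.
    static List<String> getStorageReports() {
        return new ArrayList<>(healthyVolumes);
    }

    // Stand-in for getVolumeFailureSummary(): computed at time T2 > T1.
    static int getNumFailedVolumes(int totalVolumes) {
        return totalVolumes - healthyVolumes.size();
    }

    public static void main(String[] args) {
        healthyVolumes.add("/data0/dfs");
        healthyVolumes.add("/data1/dfs");

        // T1: the heartbeat collects storage reports; /data0 still looks healthy.
        List<String> reports = getStorageReports();

        // The disk fails in the window flagged by the comment above.
        healthyVolumes.remove("/data0/dfs");

        // T2: the failure summary now disagrees with the reports.
        int numFailedVolumes = getNumFailedVolumes(2);

        System.out.println("reports say healthy: " + reports);          // still lists /data0/dfs
        System.out.println("failed volume count: " + numFailedVolumes); // 1
        // The NN receives a heartbeat whose StorageReport[] and
        // VolumeFailureSummary contradict each other.
    }
}
{code}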

To solve this, I improved the tolerance for this inconsistency in the NN's DatanodeDescriptor#updateStorageStats.
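
The gist of that tolerance, as a minimal sketch with toy types and a hypothetical isActuallyFailed rule (not the actual HDFS change): when a storage report and the volume failure summary disagree about a location, trust the newer failure summary so the block-removal path above still runs.
{code:java}
import java.util.List;
import java.util.Set;

// Toy stand-in for the real StorageReport / VolumeFailureSummary types.
record ToyStorageReport(String location, boolean failed) {}

public class ToleranceSketch {
    // Hypothetical tolerance rule: a report that still claims a location
    // is healthy loses to a failure summary that lists that location.
    static boolean isActuallyFailed(ToyStorageReport report,
                                    Set<String> failedLocations) {
        return report.failed() || failedLocations.contains(report.location());
    }

    public static void main(String[] args) {
        // The stale reports from the race above: both volumes still "healthy".
        List<ToyStorageReport> reports = List.of(
            new ToyStorageReport("/data0/dfs", false), // collected before the failure
            new ToyStorageReport("/data1/dfs", false));
        // Locations listed by the (newer) volume failure summary.
        Set<String> failedLocations = Set.of("/data0/dfs");

        for (ToyStorageReport r : reports) {
            System.out.println(r.location() + " -> failed="
                + isActuallyFailed(r, failedLocations));
        }
        // /data0/dfs -> failed=true  : the NN can now remove its blocks
        // /data1/dfs -> failed=false
    }
}
{code}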



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org