Posted to hdfs-issues@hadoop.apache.org by "Luominghui (Jira)" <ji...@apache.org> on 2022/08/23 19:57:00 UTC

[jira] [Created] (HDFS-16739) EC: Reconstruction failed when file has specified StoragePolicy

Luominghui created HDFS-16739:
---------------------------------

             Summary: EC: Reconstruction failed when file has specified StoragePolicy
                 Key: HDFS-16739
                 URL: https://issues.apache.org/jira/browse/HDFS-16739
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 3.1.3
            Reporter: Luominghui
             Fix For: 3.1.3


We found that BlockReconstructionWork uses the same chooseTarget path as replicated (redundancy) blocks, so the number of targets returned can be larger than the real additionalReplRequired because extra targets are chosen to satisfy the storage policy. This causes all kinds of exceptions when the DataNode performs the EC reconstruction work.
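As a rough, self-contained illustration of the mismatch (hypothetical names and a simplified placement rule, not the actual BlockManager/chooseTarget code), the following sketch shows how the scheduled target count can reach 9 for an RS-6-3 group with one missing internal block when every live storage violates the ALL_SSD policy, while the DataNode side only expects at most parityBlkNum targets:
{code:java}
// Hypothetical, simplified sketch of the counting mismatch described above.
public class EcTargetCountSketch {
  public static void main(String[] args) {
    int dataBlkNum = 6, parityBlkNum = 3;            // RS-6-3
    int liveInternalBlocks = 8;                      // index 3 is missing
    int additionalReplRequired =
        (dataBlkNum + parityBlkNum) - liveInternalBlocks;   // = 1

    // Assumption: a policy-aware placement also proposes a new (SSD) target
    // for every live storage that violates the ALL_SSD policy.
    int policyViolatingStorages = 8;                 // all live storages are DISK
    int chosenTargets = additionalReplRequired + policyViolatingStorages; // = 9

    System.out.println("additionalReplRequired = " + additionalReplRequired);
    System.out.println("chosenTargets          = " + chosenTargets);
    System.out.println("chosenTargets <= parityBlkNum ? "
        + (chosenTargets <= parityBlkNum));          // false -> DN rejects the task
  }
}
{code}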

One of the exceptions on the DN is as follows:
{code:java}
2022-08-24 03:01:39,534 WARN [Command processor] org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to reconstruct striped block blk_-9223372032283192848_35319673088
java.lang.IllegalArgumentException: Too much missed striped blocks.
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedWriter.<init>(StripedWriter.java:87)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.<init>(StripedBlockReconstructor.java:45)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:134)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:797)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:680)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1306)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1344)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1280)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1267) {code}
The file's EC policy is RS-6-3-1024k. Here is the internal block info: blk_-9223372032283192845 (index 3) needs to be reconstructed, and all storages are DISK, but the file's storage policy is ALL_SSD.
{code:java}
[blk_-9223372032283192848:DatanodeInfoWithStorage[10.x.x.33:50010,DS-e1435341-f43c-42ef-806f-90fsddfsfdcd,DISK],
 blk_-9223372032283192847:DatanodeInfoWithStorage[10.x.x.35:50010,DS-a6dsd16a-676a-4fed-8ffe-fsdfscw23445,DISK],
 blk_-9223372032283192846:DatanodeInfoWithStorage[10.x.x.34:50010,DS-40cdc124-e2e0-40f6-aa47-4d2bdsf3e8e5,DISK],
 blk_-9223372032283192844:DatanodeInfoWithStorage[10.x.x.21:50010,DS-ef9dee4f-dfb2-495c-872a-974dfscds58e,DISK],
 blk_-9223372032283192843:DatanodeInfoWithStorage[10.x.x.40:50010,DS-6dsedfa7-8291-46bb-964d-dfsf34567655,DISK],
 blk_-9223372032283192842:DatanodeInfoWithStorage[10.x.x.36:50010,DS-2dddc387-c38b-427d-9925-15a664d3472b,DISK],
 blk_-9223372032283192841:DatanodeInfoWithStorage[10.x.x.151:50010,DS-fds91a7-89ad-4899-bc44-675dfs32f58e,DISK],
 blk_-9223372032283192840:DatanodeInfoWithStorage[10.x.x.27:50010,DS-77dfs4c1-c23c-4b26-baa3-aadsfdff4118,DISK]] {code}
Here is the BlockECReconstructionInfo. Because none of the internal blocks' storages satisfies the storage policy (ALL_SSD), the target length is 9 rather than 1.
{code:java}
2022-08-24 03:01:39,534 INFO [Command processor] org.apache.hadoop.hdfs.server.datanode.DataNode: processErasureCodingTasks  BlockECReconstructionInfo(
  Recovering BP-390041874-10.x.x.x-1550651014658:blk_-9223372032283192848_35319673088 From: [10.x.x.33:50010, 10.x.x.35:50010, 10.x.x.34:50010, 10.x.x.21:50010, 10.x.x.40:50010, 10.x.x.36:50010, 10.x.x.151:50010, 10.x.x.27:50010] To: [[10.x.x.37:50010, 10.x.x.21:50010, 10.x.x.32:50010, 10.x.x.27:50010, 10.x.x.28:50010, 10.x.x.23:50010, 10.x.x.23:50010, 10.x.x.101:50010, 10.x.x.32:50010])
 Block Indices: [0, 1, 2, 4, 5, 6, 7, 8] {code}
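For reference, a small standalone sketch (hypothetical helper, not Hadoop code) that derives the missing internal block index from the live indices reported above, confirming that only one reconstruction target should actually be needed:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: recompute the missing internal block indices of an
// RS-6-3 block group from the "Block Indices" line in the log above.
public class MissingIndexSketch {
  public static void main(String[] args) {
    int totalBlkNum = 6 + 3;                       // data + parity for RS-6-3
    int[] liveIndices = {0, 1, 2, 4, 5, 6, 7, 8};  // from the log above
    boolean[] present = new boolean[totalBlkNum];
    for (int i : liveIndices) {
      present[i] = true;
    }
    List<Integer> missing = new ArrayList<>();
    for (int i = 0; i < totalBlkNum; i++) {
      if (!present[i]) {
        missing.add(i);
      }
    }
    // Prints [3]: only one internal block is missing, yet the "To:" list
    // above contains 9 reconstruction targets.
    System.out.println("missing indices = " + missing);
  }
}
{code}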
When the StripedWriter is initialized in the DataNode's StripedBlockReconstructor, the check targetIndices.length <= parityBlkNum fails (9 <= 3), so this striped block will never be reconstructed successfully.
{code:java}
targetIndices = new short[targets.length];
Preconditions.checkArgument(targetIndices.length <= parityBlkNum,
    "Too much missed striped blocks."); {code}
 
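A minimal standalone reproduction of that check with the numbers from this report (assuming only Guava's Preconditions on the classpath), showing why every retry of this task fails with the IllegalArgumentException seen in the stack trace above:
{code:java}
import com.google.common.base.Preconditions;

public class StripedWriterCheckSketch {
  public static void main(String[] args) {
    int parityBlkNum = 3;    // RS-6-3
    int targetsLength = 9;   // number of targets handed to the DN in this report

    short[] targetIndices = new short[targetsLength];
    // Same check as in StripedWriter: 9 > 3, so this throws
    // IllegalArgumentException("Too much missed striped blocks.") and the
    // reconstruction task is dropped every time it is scheduled.
    Preconditions.checkArgument(targetIndices.length <= parityBlkNum,
        "Too much missed striped blocks.");
  }
}
{code}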



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
