Posted to hdfs-dev@hadoop.apache.org by "Akira Ajisaka (Jira)" <ji...@apache.org> on 2022/06/20 02:34:00 UTC
[jira] [Resolved] (HDFS-16064) HDFS-721 causes DataNode decommissioning to get stuck indefinitely

     [ https://issues.apache.org/jira/browse/HDFS-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved HDFS-16064.
----------------------------------
    Fix Version/s: 3.4.0
                   3.3.4
       Resolution: Fixed

Merged the PR into trunk and branch-3.3.

> HDFS-721 causes DataNode decommissioning to get stuck indefinitely
> ------------------------------------------------------------------
>
>                 Key: HDFS-16064
>                 URL: https://issues.apache.org/jira/browse/HDFS-16064
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 3.2.1
>            Reporter: Kevin Wikant
>            Assignee: Kevin Wikant
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.4
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> It seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a non-issue under the assumption that, if the namenode & a datanode get into an inconsistent state for a given block pipeline, there should be another datanode available to replicate the block to.
> While testing datanode decommissioning using "dfs.hosts.exclude", I encountered a scenario where the decommissioning gets stuck indefinitely.
> Below is the progression of events:
>  * there are initially 4 datanodes: DN1, DN2, DN3, DN4
>  * scale-down is started by adding DN1 & DN2 to "dfs.hosts.exclude"
>  * HDFS block pipelines on DN1 & DN2 must now be replicated to DN3 & DN4 in order to satisfy their minimum replication factor of 2
>  * during this replication process https://issues.apache.org/jira/browse/HDFS-721 is encountered which causes the following inconsistent state:
>  ** DN3 thinks it has the block pipeline in FINALIZED state
>  ** the namenode does not think DN3 has the block pipeline
> {code:java}
> 2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode (DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 dst: /DN3:9866; org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
> {code}
>  * the replication is attempted again, but:
>  ** DN4 has the block
>  ** DN1 and/or DN2 have the block, but don't count towards the minimum replication factor because they are being decommissioned
>  ** from the namenode's point of view DN3 does not have the block, & the block cannot be replicated to DN3 because of HDFS-721
>  * the namenode repeatedly tries to replicate the block to DN3 & repeatedly fails; this continues indefinitely
>  * therefore DN4 is the only live datanode with the block & the minimum replication factor of 2 cannot be satisfied
>  * because the minimum replication factor cannot be satisfied for the block(s) being moved off DN1 & DN2, the datanode decommissioning can never be completed 
> {code:java}
> 2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 2, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , Current Datanode: DN1:9866, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
> ...
> 2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): Block: blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 2, maintenance replicas: 0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , Current Datanode: DN2:9866, Is current datanode decommissioning: true, Is current datanode entering maintenance: false
> {code}
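
The progression above can be reproduced in a toy model. This is only a rough sketch under stated assumptions, not HDFS code: the classes DataNode and NameNode, the method try_replicate and the exception ReplicaAlreadyExistsError are hypothetical stand-ins, and the target selection is simplified to "first non-decommissioning node the namenode believes lacks the block".

```python
from dataclasses import dataclass, field


class ReplicaAlreadyExistsError(Exception):
    """Toy stand-in for the ReplicaAlreadyExistsException seen in the DN3 log."""


@dataclass
class DataNode:
    name: str
    decommissioning: bool = False
    finalized: set = field(default_factory=set)  # blocks FINALIZED on this node

    def write_block(self, block):
        # A datanode refuses to receive a block it already holds as FINALIZED.
        if block in self.finalized:
            raise ReplicaAlreadyExistsError(f"{block} already exists in state FINALIZED")
        self.finalized.add(block)


@dataclass
class NameNode:
    # block -> names of the datanodes the namenode believes hold a replica
    block_map: dict = field(default_factory=dict)

    def live_replicas(self, block, nodes):
        # Replicas on decommissioning nodes do not count as live.
        return sum(1 for n in self.block_map[block] if not nodes[n].decommissioning)

    def try_replicate(self, block, nodes):
        """One replication attempt; returns True only if the namenode's view changed."""
        targets = [n for n in nodes.values()
                   if not n.decommissioning and n.name not in self.block_map[block]]
        if not targets:
            return False
        try:
            targets[0].write_block(block)
        except ReplicaAlreadyExistsError:
            return False  # the namenode's view is unchanged, so it will retry
        self.block_map[block].add(targets[0].name)
        return True


# Scenario from the report: every node holds blk_XXX on disk, but the
# namenode does not know about the replica on DN3.
nodes = {n: DataNode(n) for n in ("DN1", "DN2", "DN3", "DN4")}
nodes["DN1"].decommissioning = nodes["DN2"].decommissioning = True
block = "blk_XXX"
for node in nodes.values():
    node.finalized.add(block)
nn = NameNode({block: {"DN1", "DN2", "DN4"}})

attempts = [nn.try_replicate(block, nodes) for _ in range(5)]
live = nn.live_replicas(block, nodes)
print(attempts, live)
```

In this model every attempt targets DN3, is rejected, and leaves the live replica count at 1, which matches the "live replicas: 1, decommissioning replicas: 2" counters in the log above.
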
> Being stuck in the decommissioning state forever is not the intended behavior of DataNode decommissioning.
> A few potential solutions:
>  * Address the root cause of the problem, which is an inconsistent state between namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
>  * Detect when datanode decommissioning is stuck due to lack of available datanodes for satisfying the minimum replication factor, then recover by re-enabling the datanodes being decommissioned
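
As an illustration of the detection half of the second option, the BlockStateChange lines logged by DatanodeAdminMonitor could be scanned from outside the namenode. The sketch below is an assumption-laden example, not the fix that was merged: the regular expression is derived only from the two log lines quoted in this report, and find_stuck_blocks and its 15-minute threshold are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Pattern derived from the log lines quoted in this report; other Hadoop
# versions may format these lines differently.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO BlockStateChange .*?"
    r"Block: (?P<block>[^,\s]+), Expected Replicas: (?P<expected>\d+), "
    r"live replicas: (?P<live>\d+),.*?decommissioning replicas: (?P<decom>\d+),"
)


def find_stuck_blocks(lines, min_stuck=timedelta(minutes=15)):
    """Return {block: duration} for blocks that stayed under-replicated while
    replicas sat on decommissioning nodes for at least min_stuck."""
    first_seen, last_seen = {}, {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        block = m["block"]
        under = int(m["live"]) < int(m["expected"]) and int(m["decom"]) > 0
        if not under:
            # The block recovered, so forget any earlier observations.
            first_seen.pop(block, None)
            last_seen.pop(block, None)
            continue
        first_seen.setdefault(block, ts)
        last_seen[block] = ts
    return {b: last_seen[b] - first_seen[b]
            for b in first_seen
            if last_seen[b] - first_seen[b] >= min_stuck}


# The two log lines from this report, rebuilt from a shared template.
TEMPLATE = (
    "{ts} INFO BlockStateChange (DatanodeAdminMonitor-0): Block: blk_XXX, "
    "Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, "
    "decommissioned replicas: 0, decommissioning replicas: 2, "
    "maintenance replicas: 0, live entering maintenance replicas: 0, "
    "excess replicas: 0, Is Open File: false, "
    "Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , "
    "Current Datanode: {dn}:9866, Is current datanode decommissioning: true, "
    "Is current datanode entering maintenance: false"
)
sample = [
    TEMPLATE.format(ts="2021-06-06 10:39:10,106", dn="DN1"),
    TEMPLATE.format(ts="2021-06-06 10:57:10,105", dn="DN2"),
]

stuck = find_stuck_blocks(sample)
print(stuck)
```

For the two quoted lines this reports blk_XXX as under-replicated for 18 minutes. Recovery (re-enabling the datanodes being decommissioned) is left out, since it depends on how the exclude list is managed in a given cluster.
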
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
