Posted to issues@ozone.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2023/08/16 21:05:00 UTC
[jira] [Resolved] (HDDS-7533) Intermittent failure in Decommissioning Ozone Datanode
[ https://issues.apache.org/jira/browse/HDDS-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell resolved HDDS-7533.
-------------------------------------
Resolution: Invalid
Closing this for now, as it does not appear to be a valid test. If the issue is reproducible and occurs again, please reopen with more details.
> Intermittent failure in Decommissioning Ozone Datanode
> ------------------------------------------------------
>
> Key: HDDS-7533
> URL: https://issues.apache.org/jira/browse/HDDS-7533
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Varsha Ravi
> Priority: Major
>
> Decommissioning an Ozone datanode gets stuck and does not complete even after several hours.
> STEPS TO REPRODUCE:
> ---------------------------
> # Start only 3 DNs.
> # Create a non-EC directory and write a significant amount of data to it.
> # Shut down these 3 DNs.
> # Start another set of DNs for writing EC data.
> # Create an EC directory and write a significant amount of data to it.
> # Start 1 DN from the 1st set of 3 DNs.
> # Decommission 2 DNs from the other set of EC DNs.
> SCM logs while decommissioning is stuck:
> {noformat}
> 4:58:30.828 PM ERROR UnderReplicatedProcessor
> Error processing under replicated container ContainerInfo{id=#4, state=CLOSED, pipelineID=PipelineID=e0019753-3738-473b-96b5-2338ce586a18, stateEnterTime=2022-11-18T10:33:33.353Z, owner=om2}
> org.apache.hadoop.hdds.scm.exceptions.SCMException: Not enough healthy nodes to allocate container. 2 datanodes required. Found 1
> at org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.chooseDatanodesInternal(SCMCommonPlacementPolicy.java:218)
> at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRandom.chooseDatanodesInternal(SCMContainerPlacementRandom.java:81)
> at org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.chooseDatanodes(SCMCommonPlacementPolicy.java:175)
> at org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.chooseDatanodes(SCMCommonPlacementPolicy.java:117)
> at org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler.getTargetDatanodes(ECUnderReplicationHandler.java:303)
> at org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler.processAndCreateCommands(ECUnderReplicationHandler.java:186)
> at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processUnderReplicatedContainer(ReplicationManager.java:471)
> at org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processContainer(UnderReplicatedProcessor.java:99)
> at org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processAll(UnderReplicatedProcessor.java:83)
> at org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.run(UnderReplicatedProcessor.java:138)
> at java.base/java.lang.Thread.run(Thread.java:834)
> 4:58:30.829 PM ERROR SCMCommonPlacementPolicy
> Not enough healthy nodes to allocate container. 2 datanodes required. Found 1{noformat}
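The failure mode in the log above is a placement request that needs more healthy datanodes than the cluster currently has. As a hedged illustration only (the class, exception, and method names below are simplified stand-ins, not the actual Ozone SCM implementation in `SCMCommonPlacementPolicy`), the check that produces this error message looks roughly like:

```java
import java.util.List;

public class PlacementCheck {
    // Illustrative stand-in for org.apache.hadoop.hdds.scm.exceptions.SCMException.
    static class NotEnoughNodesException extends Exception {
        NotEnoughNodesException(String msg) { super(msg); }
    }

    // Returns a set of target nodes, or throws when fewer healthy nodes
    // are available than the replication request requires.
    static List<String> chooseDatanodes(List<String> healthyNodes, int required)
            throws NotEnoughNodesException {
        if (healthyNodes.size() < required) {
            throw new NotEnoughNodesException(
                "Not enough healthy nodes to allocate container. "
                + required + " datanodes required. Found " + healthyNodes.size());
        }
        return healthyNodes.subList(0, required);
    }

    public static void main(String[] args) {
        // Mirrors the reported state: 2 targets needed, only 1 healthy DN up.
        try {
            chooseDatanodes(List.of("dn1"), 2);
        } catch (NotEnoughNodesException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Since the replication manager retries the same under-replicated container on each pass, this exception repeats indefinitely until enough healthy nodes join, which matches the "stuck for hours" symptom described above.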
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org