Posted to issues@ozone.apache.org by "Mark Gui (Jira)" <ji...@apache.org> on 2021/09/08 07:48:00 UTC

[jira] [Created] (HDDS-5726) Skip remove for already removed pipeline.

Mark Gui created HDDS-5726:
------------------------------

             Summary: Skip remove for already removed pipeline.
                 Key: HDDS-5726
                 URL: https://issues.apache.org/jira/browse/HDDS-5726
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Datanode
            Reporter: Mark Gui
            Assignee: Mark Gui


Suspicious logs seen on the datanode side while executing decommission:

{code:java}
[ozoneadmin@3d6bc06ffe3d logs]$ grep -nr "Received SCM close pipeline request" *
ozone-ozoneadmin-datanode-3d6bc06ffe3d.log.1:1021548:2021-09-07 06:56:48,927 [EndpointStateMachine task thread for /17.16.10.51:9861 - 0 ] DEBUG org.apache.hadoop.ozone.container.common.states.endpoint.HeartbeatEndpointTask: Received SCM close pipeline request PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1
ozone-ozoneadmin-datanode-3d6bc06ffe3d.log.1:1021550:2021-09-07 06:56:48,927 [EndpointStateMachine task thread for /17.16.10.51:9861 - 0 ] DEBUG org.apache.hadoop.ozone.container.common.states.endpoint.HeartbeatEndpointTask: Received SCM close pipeline request PipelineID=98792470-c118-4462-8978-e4edf9b38ba3
ozone-ozoneadmin-datanode-3d6bc06ffe3d.log.1:1021757:2021-09-07 06:56:50,006 [EndpointStateMachine task thread for /17.16.10.51:9861 - 0 ] DEBUG org.apache.hadoop.ozone.container.common.states.endpoint.HeartbeatEndpointTask: Received SCM close pipeline request PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1
ozone-ozoneadmin-datanode-3d6bc06ffe3d.log.1:1021758:2021-09-07 06:56:50,007 [EndpointStateMachine task thread for /17.16.10.51:9861 - 0 ] DEBUG org.apache.hadoop.ozone.container.common.states.endpoint.HeartbeatEndpointTask: Received SCM close pipeline request PipelineID=98792470-c118-4462-8978-e4edf9b38ba3
{code}
The datanode receives duplicate close commands for the same pipeline, so one pipeline close succeeds and the other fails.

I checked the SCM log and found that one command comes from StartAdminOnNodeForStartDatanodeAdminHandler and the other from PipelineReportForPipelineReportHandler.

Decommission has already closed the pipeline, but the datanode's pipeline report was sent before that happened. When SCM handles the report, it finds no such pipeline on its side and delivers a second close pipeline command to the datanode.

{code:java}
logs/ozone-ozoneadmin-scm-3d6bc06ffe3d.log:70775:2021-09-07 06:56:36,938 [EventQueue-StartAdminOnNodeForStartDatanodeAdminHandler] INFO org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send pipeline:PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1 close command to datanode 62687ea7-7043-4b7c-889a-6a27b1586df9
logs/ozone-ozoneadmin-scm-3d6bc06ffe3d.log:70777:2021-09-07 06:56:36,938 [EventQueue-StartAdminOnNodeForStartDatanodeAdminHandler] INFO org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send pipeline:PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1 close command to datanode 160bb7aa-dc9b-4817-9910-05fb20c7b2fc
logs/ozone-ozoneadmin-scm-3d6bc06ffe3d.log:70779:2021-09-07 06:56:36,938 [EventQueue-StartAdminOnNodeForStartDatanodeAdminHandler] INFO org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider: Send pipeline:PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1 close command to datanode 9e5d6a30-9149-4d8b-9455-e48db254bfa5
{code}
 
{code:java}
2021-09-07 06:56:48,925 [EventQueue-PipelineReportForPipelineReportHandler] INFO org.apache.hadoop.hdds.scm.pipeline.PipelineReportHandler: Reported pipeline PipelineID=c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1 is not found
2021-09-07 06:56:48,925 [EventQueue-PipelineReportForPipelineReportHandler] DEBUG org.apache.hadoop.hdds.server.events.EventQueue: Delivering [event=Datanode_Command] to executor/handler DatanodeCommandForSCMNodeManager: CommandForDatanode
{code}
So we should check for possible duplicate close commands on the datanode side and skip the remove when the pipeline has already been removed.
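
A minimal sketch of what such a check could look like, assuming the datanode tracks its open pipelines in a set. The class IdempotentPipelineClose, its methods, and the Result enum are hypothetical names for illustration only; they are not the actual Ozone datanode classes, and the real handler would operate on the Ratis server state rather than a plain set.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the datanode's close pipeline handling. */
class IdempotentPipelineClose {

  /** Outcome of handling a single close pipeline command. */
  enum Result { REMOVED, SKIPPED_ALREADY_REMOVED }

  // Pipelines this datanode still participates in (assumed bookkeeping).
  private final Set<UUID> activePipelines = ConcurrentHashMap.newKeySet();

  void addPipeline(UUID pipelineId) {
    activePipelines.add(pipelineId);
  }

  Result handleClose(UUID pipelineId) {
    // remove() returns false when the ID is absent, so a duplicate
    // command is detected and skipped instead of failing.
    if (!activePipelines.remove(pipelineId)) {
      return Result.SKIPPED_ALREADY_REMOVED;
    }
    // The real pipeline removal work would happen here.
    return Result.REMOVED;
  }

  public static void main(String[] args) {
    IdempotentPipelineClose datanode = new IdempotentPipelineClose();
    // Pipeline ID taken from the logs above.
    UUID id = UUID.fromString("c21f0d3e-a62d-4d34-97a5-9e95b8fbf9f1");
    datanode.addPipeline(id);

    // First command (from decommission) and second command
    // (from the stale pipeline report) for the same pipeline.
    System.out.println(datanode.handleClose(id)); // prints REMOVED
    System.out.println(datanode.handleClose(id)); // prints SKIPPED_ALREADY_REMOVED
  }
}
```

With a check like this, the second command becomes a no-op, so only one close attempt per pipeline does any work.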
