Posted to issues@ozone.apache.org by "Nanda kumar (Jira)" <ji...@apache.org> on 2020/10/05 05:38:00 UTC

[jira] [Resolved] (HDDS-4304) Close Container event can fail if pipeline is removed

     [ https://issues.apache.org/jira/browse/HDDS-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar resolved HDDS-4304.
-------------------------------
    Fix Version/s: 1.1.0
       Resolution: Fixed

> Close Container event can fail if pipeline is removed
> -----------------------------------------------------
>
>                 Key: HDDS-4304
>                 URL: https://issues.apache.org/jira/browse/HDDS-4304
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 1.1.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> If you call `pipelineManager.finalizeAndDestroyPipeline()` with onTimeout=false, the finalizePipeline call fires a closeContainer event for every container on the pipeline. These events are handled asynchronously.
> However, the `destroyPipeline(...)` call is made immediately afterwards, and it removes the pipeline details from the various maps / stores.
> When the closeContainer events are then processed, they attempt to remove the container from the pipeline. However, as the pipeline has already been destroyed, this throws an exception and the close container events are never sent to the DNs:
> {code}
> 2020-10-01 15:44:18,838 [EventQueue-CloseContainerForCloseContainerEventHandler] INFO container.CloseContainerEventHandler: Close container Event triggered for container : #2
> 2020-10-01 15:44:18,842 [EventQueue-CloseContainerForCloseContainerEventHandler] ERROR container.CloseContainerEventHandler: Failed to close the container #2.
> org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException: PipelineID=59e5ae16-f1fe-45ff-9044-dd237b0e91c6 not found
> 	at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.removeContainerFromPipeline(PipelineStateMap.java:372)
> 	at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager.removeContainerFromPipeline(PipelineStateManager.java:111)
> 	at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.removeContainerFromPipeline(SCMPipelineManager.java:413)
> 	at org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:352)
> 	at org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:331)
> 	at org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler.onMessage(CloseContainerEventHandler.java:66)
> 	at org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler.onMessage(CloseContainerEventHandler.java:45)
> 	at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor
> {code}
> The simple solution is to catch the exception and ignore it.
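
A minimal, self-contained sketch of the "catch and ignore" idea follows. It does not use the real Ozone classes: `ToyPipelineMap`, `handleCloseContainer`, and the local `PipelineNotFoundException` are hypothetical stand-ins that only mimic the behaviour described above, and where the catch belongs in the actual SCM code is an assumption, not something this issue states.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CloseContainerSketch {

  /** Hypothetical stand-in for the real PipelineNotFoundException. */
  static class PipelineNotFoundException extends Exception {
    PipelineNotFoundException(String message) {
      super(message);
    }
  }

  /** Toy pipeline-to-containers map; not the real PipelineStateMap. */
  static class ToyPipelineMap {
    private final Map<String, Set<Long>> containersByPipeline = new HashMap<>();

    void addContainer(String pipelineId, long containerId) {
      containersByPipeline
          .computeIfAbsent(pipelineId, k -> new HashSet<>())
          .add(containerId);
    }

    /** Mimics destroyPipeline(...): the pipeline entry disappears entirely. */
    void destroyPipeline(String pipelineId) {
      containersByPipeline.remove(pipelineId);
    }

    /** Throws if the pipeline is gone, like the call at the top of the trace. */
    void removeContainerFromPipeline(String pipelineId, long containerId)
        throws PipelineNotFoundException {
      Set<Long> containers = containersByPipeline.get(pipelineId);
      if (containers == null) {
        throw new PipelineNotFoundException(
            "PipelineID=" + pipelineId + " not found");
      }
      containers.remove(containerId);
    }
  }

  /**
   * Hypothetical close-container handler. Returns true when processing reaches
   * the point where the close command would be queued for the DNs.
   */
  static boolean handleCloseContainer(
      ToyPipelineMap pipelines, String pipelineId, long containerId) {
    try {
      pipelines.removeContainerFromPipeline(pipelineId, containerId);
    } catch (PipelineNotFoundException e) {
      // Assumption: the pipeline was already destroyed, so there is nothing
      // left to detach. Ignore the exception and keep closing the container.
    }
    // In the real handler, this is where the close command would be sent.
    return true;
  }

  public static void main(String[] args) {
    ToyPipelineMap pipelines = new ToyPipelineMap();
    pipelines.addContainer("pipeline-1", 2L);

    // destroyPipeline runs before the asynchronous close event is processed.
    pipelines.destroyPipeline("pipeline-1");

    boolean queued = handleCloseContainer(pipelines, "pipeline-1", 2L);
    System.out.println("close command queued: " + queued);
  }
}
```

Without the try/catch, the exception would propagate out of the handler before the close command is reached, which is the failure shown in the log above.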



