Posted to issues@ozone.apache.org by "Bharat Viswanadham (Jira)" <ji...@apache.org> on 2021/08/23 10:33:00 UTC

[jira] [Commented] (HDDS-5655) SCM terminates when allocatecontainer happens on closed pipeline

    [ https://issues.apache.org/jira/browse/HDDS-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403111#comment-17403111 ] 

Bharat Viswanadham commented on HDDS-5655:
------------------------------------------


{code:java}
2021-08-22 20:25:38,392 ERROR org.apache.ratis.statemachine.StateMachine: Terminating with exit status 1: Cannot add container to pipeline=PipelineID=cc2029cb-bf76-4db0-9d91-9d0223644ef9 in closed state
java.io.IOException: Cannot add container to pipeline=PipelineID=cc2029cb-bf76-4db0-9d91-9d0223644ef9 in closed state
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.addContainerToPipeline(PipelineStateMap.java:101)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManagerV2Impl.addContainerToPipeline(PipelineStateManagerV2Impl.java:114)
        at jdk.internal.reflect.GeneratedMethodAccessor88.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invokeLocal(SCMHAInvocationHandler.java:83)
        at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invoke(SCMHAInvocationHandler.java:68)
        at com.sun.proxy.$Proxy14.addContainerToPipeline(Unknown Source)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineManagerV2Impl.addContainerToPipeline(PipelineManagerV2Impl.java:240)
        at org.apache.hadoop.hdds.scm.container.ContainerStateManagerImpl.lambda$addContainer$1(ContainerStateManagerImpl.java:308)
        at org.apache.hadoop.hdds.scm.ha.ExecutionUtil.execute(ExecutionUtil.java:59)
        at org.apache.hadoop.hdds.scm.container.ContainerStateManagerImpl.addContainer(ContainerStateManagerImpl.java:312)
        at jdk.internal.reflect.GeneratedMethodAccessor87.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.process(SCMStateMachine.java:168)
        at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.applyTransaction(SCMStateMachine.java:139)
        at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1690)
        at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:234)
        at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:179)
        at java.base/java.lang.Thread.run(Thread.java:834)
2021-08-22 20:25:38,399 INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG
{code}


> SCM terminates when allocatecontainer happens on closed pipeline
> ----------------------------------------------------------------
>
>                 Key: HDDS-5655
>                 URL: https://issues.apache.org/jira/browse/HDDS-5655
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>
> Scenario:
> 1. AllocateContainer selects an open pipeline (at this point it is still open).
> 2. PipelineActionHandler then closes the pipeline by calling pipelineManager#closePipeline, which calls stateManager#updatePipelineState(pipelineID.getProtobuf(),
>             HddsProtos.PipelineState.PIPELINE_CLOSED);
> 3. When containerStateManager#addContainer(containerInfo) is then called, it calls pipelineManager.addContainerToPipeline(pipelineID, containerID), which fails because the pipeline is now closed (see the code below). Because containerStateManager#addContainer is a replicated call that is applied in the StateMachine, any exception thrown there terminates SCM.
> {code:java}
>   void addContainerToPipeline(PipelineID pipelineID, ContainerID containerID)
>       throws IOException {
>     Preconditions.checkNotNull(pipelineID,
>         "Pipeline Id cannot be null");
>     Preconditions.checkNotNull(containerID,
>         "Container Id cannot be null");
>     Pipeline pipeline = getPipeline(pipelineID);
>     if (pipeline.isClosed()) {
>       throw new IOException(String
>           .format("Cannot add container to pipeline=%s in closed state",
>               pipelineID));
>     }
>     pipeline2container.get(pipelineID).add(containerID);
>   }
> {code}
> *Proposed Solution:*
> Acquire the pipelineManager lock during allocateContainer, so that pipelineState cannot be updated between pipeline selection and addContainer.
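
The proposed solution can be illustrated with a minimal, self-contained sketch. This is not the actual HDDS-5655 patch and does not use Ozone classes: PipelineLockSketch, createPipeline, allocateContainer and containerCount are hypothetical names, and a single ReentrantLock is assumed to stand in for whatever lock pipelineManager exposes. The point is only that pipeline selection and the add to pipeline2container happen under one lock hold, and closePipeline takes the same lock.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for SCM pipeline bookkeeping; not Ozone code.
class PipelineLockSketch {
  private final ReentrantLock lock = new ReentrantLock();
  // pipeline id -> true while the pipeline is open
  private final Map<String, Boolean> open = new LinkedHashMap<>();
  private final Map<String, Set<Long>> pipeline2container = new LinkedHashMap<>();

  void createPipeline(String pipelineId) {
    lock.lock();
    try {
      open.put(pipelineId, true);
      pipeline2container.put(pipelineId, new HashSet<>());
    } finally {
      lock.unlock();
    }
  }

  // Takes the same lock as allocateContainer, so a close cannot slip in
  // between pipeline selection and the add.
  void closePipeline(String pipelineId) {
    lock.lock();
    try {
      open.replace(pipelineId, false);
    } finally {
      lock.unlock();
    }
  }

  // Selects an open pipeline and registers the container under one lock hold.
  // Returns the id of the pipeline that received the container.
  String allocateContainer(long containerId) throws IOException {
    lock.lock();
    try {
      String pipelineId = null;
      for (Map.Entry<String, Boolean> e : open.entrySet()) {
        if (e.getValue()) {
          pipelineId = e.getKey();
          break;
        }
      }
      if (pipelineId == null) {
        // Fails before any container is recorded anywhere.
        throw new IOException("No open pipeline available");
      }
      pipeline2container.get(pipelineId).add(containerId);
      return pipelineId;
    } finally {
      lock.unlock();
    }
  }

  int containerCount(String pipelineId) {
    lock.lock();
    try {
      Set<Long> containers = pipeline2container.get(pipelineId);
      return containers == null ? 0 : containers.size();
    } finally {
      lock.unlock();
    }
  }
}
```

In this sketch, an allocation that arrives after closePipeline fails at selection time, on the caller's side, instead of failing later inside a replicated call. Whether the real change should use a read lock, a write lock, or a different scope is left open here.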


