Posted to issues@ozone.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2023/06/27 17:32:00 UTC

[jira] [Updated] (HDDS-8934) SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit

     [ https://issues.apache.org/jira/browse/HDDS-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Doroszlai updated HDDS-8934:
-----------------------------------
    Summary: SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit  (was: SCM throws SingleThreadExecutor exceptions and then shuts down)

> SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit
> -----------------------------------------------------------------------
>
>                 Key: HDDS-8934
>                 URL: https://issues.apache.org/jira/browse/HDDS-8934
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Pratyush Bhatt
>            Assignee: Attila Doroszlai
>            Priority: Major
>
> SCM logs a java.lang.reflect.UndeclaredThrowableException from org.apache.hadoop.hdds.server.events.SingleThreadExecutor while handling a dead-node event:
> {noformat}
> 2023-06-25 10:37:34,775 [IPC Server listener on 9860] INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 9860
> 2023-06-25 10:37:34,775 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping Storage Container Manager HTTP server.
> 2023-06-25 10:37:34,776 [IPC Server Responder] INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2023-06-25 10:37:34,776 [EventQueue-DeadNodeForDeadNodeHandler] ERROR org.apache.hadoop.hdds.server.events.SingleThreadExecutor: Error on execution message dd5867e2-90ba-44bf-b58e-1163e6fd8058(ozn-lease112-5.ozn-lease112.root.hwx.site/172.27.130.138)
> java.lang.reflect.UndeclaredThrowableException
>         at com.sun.proxy.$Proxy15.updatePipelineState(Unknown Source)
>         at org.apache.hadoop.hdds.scm.pipeline.PipelineManagerImpl.closePipeline(PipelineManagerImpl.java:440)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$null$0(DeadNodeHandler.java:126)
>         at java.lang.Iterable.forEach(Iterable.java:75)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$destroyPipelines$1(DeadNodeHandler.java:124)
>         at java.util.Optional.ifPresent(Optional.java:159)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.destroyPipelines(DeadNodeHandler.java:123)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:84)
>         at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:50)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.ServerNotReadyException: 70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6 is not in [RUNNING]: current state is CLOSING
>         at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
>         at org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.submitRequest(SCMRatisServerImpl.java:229)
>         at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invokeRatis(SCMHAInvocationHandler.java:115)
>         at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invoke(SCMHAInvocationHandler.java:71)
>         ... 13 more
> Caused by: org.apache.ratis.protocol.exceptions.ServerNotReadyException: 70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6 is not in [RUNNING]: current state is CLOSING
>         at org.apache.ratis.server.impl.RaftServerImpl.lambda$assertLifeCycleState$9(RaftServerImpl.java:749)
>         at org.apache.ratis.util.LifeCycle.assertCurrentState(LifeCycle.java:253)
>         at org.apache.ratis.server.impl.RaftServerImpl.assertLifeCycleState(RaftServerImpl.java:748)
>         at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:838)
>         at org.apache.ratis.server.impl.RaftServerImpl.lambda$null$12(RaftServerImpl.java:831)
>         at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117)
>         at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$13(RaftServerImpl.java:831)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>         ... 3 more{noformat}
> SCM then shuts down:
> {noformat}
> 2023-06-25 10:37:34,807 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO SCMHATransactionMonitor: SCMHATransactionMonitor Service is not running, skip stop.
> 2023-06-25 10:37:34,807 [Lease Manager-LeaseManager#LeaseMonitor] WARN org.apache.hadoop.ozone.lease.LeaseManager: Lease manager is interrupted. Shutting down...
> java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>         at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
>         at org.apache.hadoop.ozone.lease.LeaseManager$LeaseMonitor.run(LeaseManager.java:270)
>         at java.lang.Thread.run(Thread.java:748)
> 2023-06-25 10:37:34,808 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping SCM MetadataStore.
> 2023-06-25 10:37:34,852 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Terminating with exit status 0: scm statemachine is closed by ratis, terminate SCM
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down StorageContainerManager at ozn-lease112-2.ozn-lease112.root.hwx.site/172.27.214.76
> ************************************************************/
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Storage Container Manager is not running.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping Replication Manager Service.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Replication Monitor Thread is not running.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.SCMSecurityProtocolServer: Join RPC server for SCMSecurityProtocolServer.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.SCMSecurityProtocolServer: Join gRPC server for SCMSecurityProtocolServer.{noformat}
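
For readers unfamiliar with the first stack trace: a java.lang.reflect.Proxy wraps any checked exception that escapes InvocationHandler.invoke in UndeclaredThrowableException unless the proxied interface method declares it. In the trace, the ExecutionException from CompletableFuture.get() appears to escape SCMHAInvocationHandler.invoke that way. The following is a minimal sketch of that mechanism, not the Ozone code and not necessarily the fix adopted for this issue. StateManager, failedRequest, LEAKY, TRANSLATING and outcome are hypothetical names, and IOException is only an assumed stand-in for whatever the real interface declares.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class UndeclaredThrowableDemo {

  /** Hypothetical stand-in for the proxied interface; it declares only IOException. */
  interface StateManager {
    void updateState(String id) throws IOException;
  }

  /** An already-failed future, standing in for a request rejected by a closing server. */
  static CompletableFuture<Void> failedRequest() {
    CompletableFuture<Void> future = new CompletableFuture<>();
    future.completeExceptionally(new IllegalStateException("server is not in [RUNNING]"));
    return future;
  }

  /** Lets ExecutionException escape; updateState does not declare it. */
  public static final InvocationHandler LEAKY =
      (proxy, method, args) -> failedRequest().get();

  /** Translates the failure into an exception the interface does declare. */
  public static final InvocationHandler TRANSLATING = (proxy, method, args) -> {
    try {
      return failedRequest().get();
    } catch (ExecutionException e) {
      throw new IOException("request failed", e.getCause());
    }
  };

  /** Calls updateState through a proxy and reports what the caller observes. */
  public static String outcome(InvocationHandler handler) {
    StateManager manager = (StateManager) Proxy.newProxyInstance(
        StateManager.class.getClassLoader(),
        new Class<?>[] {StateManager.class},
        handler);
    try {
      manager.updateState("pipeline-1");
      return "ok";
    } catch (UndeclaredThrowableException e) {
      // The proxy wrapped a checked exception that the method does not declare.
      return "undeclared: " + e.getUndeclaredThrowable().getClass().getSimpleName();
    } catch (IOException e) {
      // The caller sees an exception type it can handle explicitly.
      return "declared: " + e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(outcome(LEAKY));        // prints "undeclared: ExecutionException"
    System.out.println(outcome(TRANSLATING));  // prints "declared: IOException"
  }
}
```

With the translating handler, callers such as an event handler can catch the declared type instead of receiving an unchecked wrapper they do not expect.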



