Posted to issues@ozone.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2023/06/27 17:32:00 UTC
[jira] [Updated] (HDDS-8934) SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit
[ https://issues.apache.org/jira/browse/HDDS-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Doroszlai updated HDDS-8934:
-----------------------------------
Summary: SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit (was: SCM throws SingleThreadExecutor exceptions and then shuts down)
> SCMHAInvocationHandler throws undeclared exceptions, causes SCM to exit
> -----------------------------------------------------------------------
>
> Key: HDDS-8934
> URL: https://issues.apache.org/jira/browse/HDDS-8934
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Pratyush Bhatt
> Assignee: Attila Doroszlai
> Priority: Major
>
> SCM logs a java.lang.reflect.UndeclaredThrowableException from org.apache.hadoop.hdds.server.events.SingleThreadExecutor while handling a dead-node event:
> {noformat}
> 2023-06-25 10:37:34,775 [IPC Server listener on 9860] INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 9860
> 2023-06-25 10:37:34,775 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping Storage Container Manager HTTP server.
> 2023-06-25 10:37:34,776 [IPC Server Responder] INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2023-06-25 10:37:34,776 [EventQueue-DeadNodeForDeadNodeHandler] ERROR org.apache.hadoop.hdds.server.events.SingleThreadExecutor: Error on execution message dd5867e2-90ba-44bf-b58e-1163e6fd8058(ozn-lease112-5.ozn-lease112.root.hwx.site/172.27.130.138)
> java.lang.reflect.UndeclaredThrowableException
> at com.sun.proxy.$Proxy15.updatePipelineState(Unknown Source)
> at org.apache.hadoop.hdds.scm.pipeline.PipelineManagerImpl.closePipeline(PipelineManagerImpl.java:440)
> at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$null$0(DeadNodeHandler.java:126)
> at java.lang.Iterable.forEach(Iterable.java:75)
> at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.lambda$destroyPipelines$1(DeadNodeHandler.java:124)
> at java.util.Optional.ifPresent(Optional.java:159)
> at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.destroyPipelines(DeadNodeHandler.java:123)
> at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:84)
> at org.apache.hadoop.hdds.scm.node.DeadNodeHandler.onMessage(DeadNodeHandler.java:50)
> at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.ServerNotReadyException: 70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6 is not in [RUNNING]: current state is CLOSING
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
> at org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.submitRequest(SCMRatisServerImpl.java:229)
> at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invokeRatis(SCMHAInvocationHandler.java:115)
> at org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invoke(SCMHAInvocationHandler.java:71)
> ... 13 more
> Caused by: org.apache.ratis.protocol.exceptions.ServerNotReadyException: 70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6 is not in [RUNNING]: current state is CLOSING
> at org.apache.ratis.server.impl.RaftServerImpl.lambda$assertLifeCycleState$9(RaftServerImpl.java:749)
> at org.apache.ratis.util.LifeCycle.assertCurrentState(LifeCycle.java:253)
> at org.apache.ratis.server.impl.RaftServerImpl.assertLifeCycleState(RaftServerImpl.java:748)
> at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:838)
> at org.apache.ratis.server.impl.RaftServerImpl.lambda$null$12(RaftServerImpl.java:831)
> at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117)
> at org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitClientRequestAsync$13(RaftServerImpl.java:831)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
> ... 3 more
> {noformat}
> SCM then shuts down:
> {noformat}
> 2023-06-25 10:37:34,807 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO SCMHATransactionMonitor: SCMHATransactionMonitor Service is not running, skip stop.
> 2023-06-25 10:37:34,807 [Lease Manager-LeaseManager#LeaseMonitor] WARN org.apache.hadoop.ozone.lease.LeaseManager: Lease manager is interrupted. Shutting down...
> java.lang.InterruptedException
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
> at org.apache.hadoop.ozone.lease.LeaseManager$LeaseMonitor.run(LeaseManager.java:270)
> at java.lang.Thread.run(Thread.java:748)
> 2023-06-25 10:37:34,808 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping SCM MetadataStore.
> 2023-06-25 10:37:34,852 [70d60297-a634-4bc4-99c0-0a8092d2c5d0@group-97E573E5D3A6-StateMachineUpdater] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Terminating with exit status 0: scm statemachine is closed by ratis, terminate SCM
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down StorageContainerManager at ozn-lease112-2.ozn-lease112.root.hwx.site/172.27.214.76
> ************************************************************/
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Storage Container Manager is not running.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.StorageContainerManager: Stopping Replication Manager Service.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Replication Monitor Thread is not running.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.SCMSecurityProtocolServer: Join RPC server for SCMSecurityProtocolServer.
> 2023-06-25 10:37:34,855 [shutdown-hook-0] INFO org.apache.hadoop.hdds.scm.server.SCMSecurityProtocolServer: Join gRPC server for SCMSecurityProtocolServer.
> {noformat}
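For readers unfamiliar with the failure mode in the first stack trace, the following is a minimal, self-contained sketch of how a java.lang.reflect.Proxy turns a checked exception that the proxied method does not declare into UndeclaredThrowableException, and how translating it inside the handler avoids that. It is not the Ozone code or the actual fix: StateUpdater, submit(), LEAKY and TRANSLATING are hypothetical names, the IOException mapping is an assumption chosen for illustration, and IllegalStateException stands in for the Ratis ServerNotReadyException.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class UndeclaredThrowableDemo {

  /** Hypothetical proxied interface; it declares only IOException. */
  public interface StateUpdater {
    void updateState(String id) throws IOException;
  }

  /** Simulates a request whose future has already failed (assumed stand-in for a Ratis submit). */
  static Object submit() throws ExecutionException, InterruptedException {
    CompletableFuture<Object> future = new CompletableFuture<>();
    future.completeExceptionally(new IllegalStateException("server is CLOSING"));
    return future.get(); // throws ExecutionException wrapping the failure
  }

  /** Lets ExecutionException escape; updateState does not declare it. */
  static final InvocationHandler LEAKY = (proxy, method, args) -> submit();

  /** Translates the failure into an exception type the interface declares. */
  static final InvocationHandler TRANSLATING = (proxy, method, args) -> {
    try {
      return submit();
    } catch (ExecutionException e) {
      throw new IOException("request failed", e.getCause());
    }
  };

  static StateUpdater proxyWith(InvocationHandler handler) {
    return (StateUpdater) Proxy.newProxyInstance(
        StateUpdater.class.getClassLoader(),
        new Class<?>[] {StateUpdater.class},
        handler);
  }

  /** Reports which exception type the caller actually observes. */
  static String outcome(StateUpdater updater) {
    try {
      updater.updateState("pipeline-1");
      return "ok";
    } catch (IOException e) {
      return "IOException caused by " + e.getCause().getClass().getSimpleName();
    } catch (UndeclaredThrowableException e) {
      return "UndeclaredThrowableException caused by "
          + e.getCause().getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(outcome(proxyWith(LEAKY)));
    System.out.println(outcome(proxyWith(TRANSLATING)));
  }
}
```

With the leaky handler the caller sees an unchecked UndeclaredThrowableException, which a catch block written for the declared exception types will not handle. With the translating handler the caller sees the declared IOException and can handle it normally.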