Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2024/01/22 17:35:00 UTC

[jira] [Assigned] (HDDS-10177) OM RPC server restarted by InstallSnapshotThread during shutdown

     [ https://issues.apache.org/jira/browse/HDDS-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang reassigned HDDS-10177:
--------------------------------------

    Assignee:     (was: Wei-Chiu Chuang)

> OM RPC server restarted by InstallSnapshotThread during shutdown
> ----------------------------------------------------------------
>
>                 Key: HDDS-10177
>                 URL: https://issues.apache.org/jira/browse/HDDS-10177
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Attila Doroszlai
>            Priority: Major
>         Attachments: 2024-01-20T18-36-42_926-jvmRun1.dump, org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices-output.txt, org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices.txt
>
>
> TestSnapshotBackgroundServices was successful:
> {code}
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 171.3 s -- in org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices
> {code}
> but the run timed out during post-test cluster shutdown, because the main thread was waiting indefinitely for the RPC server to stop:
> {code}
> "main" 
>    java.lang.Thread.State: WAITING
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.ipc.Server.join(Server.java:3569)
>         at org.apache.hadoop.ozone.om.OzoneManager.join(OzoneManager.java:2286)
>         at org.apache.hadoop.ozone.MiniOzoneClusterImpl.stopOM(MiniOzoneClusterImpl.java:558)
>         at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl.stop(MiniOzoneHAClusterImpl.java:311)
>         at org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:453)
>         at org.apache.hadoop.ozone.om.TestSnapshotBackgroundServices.shutdown(TestSnapshotBackgroundServices.java:202)
> {code}
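> The problem is that {{InstallSnapshotThread}} restarted the RPC server on the same port (15012) about 260 ms after the main thread had stopped it, so the join during shutdown never returned: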
> {code}
> 2024-01-20 18:37:17,649 [main] INFO  ozone.MiniOzoneHAClusterImpl (MiniOzoneHAClusterImpl.java:stop(310)) - Stopping the OzoneManager omNode-3
> 2024-01-20 18:37:17,649 [main] INFO  om.OzoneManager (OzoneManager.java:stop(2204)) - omNode-3[localhost:15012]: Stopping Ozone Manager
> 2024-01-20 18:37:17,650 [main] INFO  ipc.Server (Server.java:stop(3523)) - Stopping server on 15012
> ...
> 2024-01-20 18:37:17,913 [omNode-3-InstallSnapshotThread] INFO  ipc.Server (Server.java:<init>(1287)) - Listener at localhost:15012
> 2024-01-20 18:37:17,932 [omNode-3-InstallSnapshotThread] INFO  om.OzoneManager (OzoneManager.java:installCheckpoint(3863)) - RPC server is re-started. Spend 377 ms.
> {code}
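
The race described above (one thread stops the server while a background thread brings it back up) can be illustrated with a small guard. This is only a sketch under stated assumptions, not the actual Ozone code or the fix for this issue: the class RestartGuard and its methods are hypothetical names, and the real OzoneManager lifecycle is more involved. The idea is that once shutdown has begun, a late restart request is refused, so nothing is left running for a later join to wait on.

```java
/**
 * Hypothetical stand-in for a component that owns a restartable RPC server.
 * Not part of Ozone; names and structure are assumptions for illustration.
 */
final class RestartGuard {
    private final Object lifecycleLock = new Object();
    private boolean stopping = false;     // set once shutdown begins, never cleared
    private boolean serverRunning = true; // models whether the RPC server is up

    /** Shutdown path (the "main" thread in the report). */
    void stop() {
        synchronized (lifecycleLock) {
            stopping = true;
            serverRunning = false;
        }
    }

    /**
     * Background path (e.g. after installing a checkpoint).
     * Returns true if the server was restarted, false if shutdown already began.
     */
    boolean restartServerIfNotStopping() {
        synchronized (lifecycleLock) {
            if (stopping) {
                return false; // shutdown wins; do not bring the server back
            }
            serverRunning = true;
            return true;
        }
    }

    boolean isServerRunning() {
        synchronized (lifecycleLock) {
            return serverRunning;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RestartGuard om = new RestartGuard();
        om.stop(); // shutdown starts first, as in the log excerpt

        // A late restart attempt from a background thread is refused.
        Thread installer = new Thread(
            () -> System.out.println("restarted after stop: " + om.restartServerIfNotStopping()),
            "InstallSnapshotThread-sketch");
        installer.start();
        installer.join();

        System.out.println("server running: " + om.isServerRunning());
    }
}
```

Both the stop and the restart check happen under the same lock, so the restart either completes before shutdown begins (and is then stopped by it) or observes the stopping flag and does nothing.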



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
