Posted to issues@ozone.apache.org by "Hongbing Wang (Jira)" <ji...@apache.org> on 2023/11/02 07:40:00 UTC

[jira] [Assigned] (HDDS-9541) NPE in OMDBCheckpointServlet with ozone.om.ratis.enable=false

     [ https://issues.apache.org/jira/browse/HDDS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hongbing Wang reassigned HDDS-9541:
-----------------------------------

    Assignee: Conway Zhang

> NPE in OMDBCheckpointServlet with ozone.om.ratis.enable=false
> -------------------------------------------------------------
>
>                 Key: HDDS-9541
>                 URL: https://issues.apache.org/jira/browse/HDDS-9541
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM HA
>    Affects Versions: 1.4.0
>            Reporter: Conway Zhang
>            Assignee: Conway Zhang
>            Priority: Blocker
>
> I set ozone.om.ratis.enable in ozone.xml as follows:
> {code:java}
> //ozone.xml
> <property>
>       <name>ozone.om.ratis.enable</name>
>       <value>false</value>
> </property> {code}
> When I request http://my.om.address:9874/dbCheckpoint,
> the following error appears in om.log:
> {code:java}
> //om.log
> 2023-10-25 20:13:35,233 [JvmPauseMonitor0] INFO org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-bf45f543-0a6a-4b80-b781-93fd81953556: Started
> 2023-10-25 20:16:58,832 [qtp1013364696-100] WARN org.apache.hadoop.hdds.server.ServerUtils: Storage directory for Ratis is not configured. It is a good idea to map this to an SSD disk. Falling back to ozone.metadata.dirs
> 2023-10-25 20:16:58,833 [qtp1013364696-100] WARN org.apache.hadoop.hdds.server.ServerUtils: ozone.om.db.dirs is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
> 2023-10-25 20:17:09,441 [qtp1013364696-226] INFO org.apache.hadoop.hdds.utils.DBCheckpointServlet: Received GET request to obtain DB checkpoint snapshot
> 2023-10-25 20:17:09,444 [qtp1013364696-226] ERROR org.apache.hadoop.hdds.utils.DBCheckpointServlet: Unable to process metadata snapshot request. 
> java.lang.NullPointerException
>         at org.apache.hadoop.ozone.om.OMDBCheckpointServlet$Lock.lock(OMDBCheckpointServlet.java:672)
>         at org.apache.hadoop.hdds.utils.DBCheckpointServlet.generateSnapshotCheckpoint(DBCheckpointServlet.java:197)
>         at org.apache.hadoop.hdds.utils.DBCheckpointServlet.doGet(DBCheckpointServlet.java:303)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>         at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
>         at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
>         at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
>         at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>         at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at org.apache.hadoop.hdds.server.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1660)
>         at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>         at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at org.apache.hadoop.hdds.server.http.NoCacheFilter.doFilter(NoCacheFilter.java:48)
>         at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
>         at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>         at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
>         at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
>         at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
>         at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>         at org.eclipse.jetty.server.Server.handle(Server.java:516)
>         at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
>         at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
>         at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
>         at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>         at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
>         at java.lang.Thread.run(Thread.java:748)
> 2023-10-25 20:22:45,154 [qtp1013364696-232] INFO org.apache.hadoop.hdds.utils.DBCheckpointServlet: Received GET request to obtain DB checkpoint snapshot
> 2023-10-26 04:33:34,845 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-bf45f543-0a6a-4b80-b781-93fd81953556: Detected pause in JVM or host machine approximately 0.241s without any GCs.
> 2023-10-26 07:53:12,498 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-bf45f543-0a6a-4b80-b781-93fd81953556: Detected pause in JVM or host machine approximately 0.112s without any GCs.
> 2023-10-26 10:38:51,806 [qtp1013364696-234] INFO org.apache.hadoop.hdds.utils.DBCheckpointServlet: Received GET request to obtain DB checkpoint snapshot {code}
> Recon reports errors as well:
> {code:java}
> 2023-10-25 19:33:00,064 [pool-27-thread-1] WARN org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Unable to get and apply delta updates from OM.
> java.lang.reflect.UndeclaredThrowableException
>         at com.sun.proxy.$Proxy39.submitRequest(Unknown Source)
>         at org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransport.submitRequest(Hadoop3OmTransport.java:80)
>         at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.submitRequest(OzoneManagerProtocolClientSideTranslatorPB.java:326)
>         at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getDBUpdates(OzoneManagerProtocolClientSideTranslatorPB.java:2142)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.innerGetAndApplyDeltaUpdatesFromOM(OzoneManagerServiceProviderImpl.java:457)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.getAndApplyDeltaUpdatesFromOM(OzoneManagerServiceProviderImpl.java:422)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:522)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.lambda$startSyncDataFromOM$0(OzoneManagerServiceProviderImpl.java:265)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.InterruptedIOException: Retry interrupted
>         at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.processWaitTimeAndRetryInfo(RetryInvocationHandler.java:137)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:108)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>         ... 15 more
> Caused by: java.lang.InterruptedException: sleep interrupted
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.processWaitTimeAndRetryInfo(RetryInvocationHandler.java:131)
>         ... 17 more
> 2023-10-25 19:33:00,064 [pool-27-thread-1] INFO org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Obtaining full snapshot from Ozone Manager
> 2023-10-25 19:33:00,065 [pool-27-thread-1] ERROR org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Unable to obtain Ozone Manager DB Snapshot. 
> java.net.ConnectException: Connection refused (Connection refused)
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
>         at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:339)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:357)
>         at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
>         at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
>         at org.apache.hadoop.ozone.recon.ReconUtils.makeHttpCall(ReconUtils.java:227)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.lambda$getOzoneManagerDBSnapshot$1(OzoneManagerServiceProviderImpl.java:354)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
>         at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:551)
>         at org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:531)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.getOzoneManagerDBSnapshot(OzoneManagerServiceProviderImpl.java:353)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.updateReconOmDBWithNewSnapshot(OzoneManagerServiceProviderImpl.java:385)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:547)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.lambda$startSyncDataFromOM$0(OzoneManagerServiceProviderImpl.java:265)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2023-10-25 19:33:00,065 [pool-27-thread-1] ERROR org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Null snapshot location got from OM.
> 2023-10-25 19:33:00,199 [JvmPauseMonitor0] INFO org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-Recon: Stopped{code}
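
The report does not state the root cause. One plausible reading of the OM trace, not confirmed by this message, is that OMDBCheckpointServlet$Lock.lock() dereferences a component that is only created when ozone.om.ratis.enable=true. The following is a minimal sketch of a null-guarded lock under that assumption. Every class and method name in it (CheckpointLockSketch, RatisOnlyComponent, CheckpointLock) is hypothetical and is not taken from the Ozone code base.

```java
// Hypothetical sketch only: none of these types exist in Apache Ozone.
class CheckpointLockSketch {

    /** Stand-in for a component that is assumed to exist only when Ratis is enabled. */
    static final class RatisOnlyComponent {
        private boolean paused;

        void pause() { paused = true; }
        void resume() { paused = false; }
        boolean isPaused() { return paused; }
    }

    /** Lock taken around checkpoint creation; tolerates a missing Ratis component. */
    static final class CheckpointLock {
        // Assumption: this reference is null when ozone.om.ratis.enable=false.
        private final RatisOnlyComponent ratis;
        private boolean held;

        CheckpointLock(RatisOnlyComponent ratis) {
            this.ratis = ratis;
        }

        void lock() {
            // Guard the dereference instead of assuming Ratis is always present.
            if (ratis != null) {
                ratis.pause();
            }
            held = true;
        }

        void unlock() {
            if (ratis != null) {
                ratis.resume();
            }
            held = false;
        }

        boolean isHeld() { return held; }
    }

    public static void main(String[] args) {
        // Ratis disabled: no component is available, and lock() must not throw.
        CheckpointLock withoutRatis = new CheckpointLock(null);
        withoutRatis.lock();
        withoutRatis.unlock();

        // Ratis enabled: the component is paused while the lock is held.
        CheckpointLock withRatis = new CheckpointLock(new RatisOnlyComponent());
        withRatis.lock();
        withRatis.unlock();
    }
}
```

Whether the actual fix should skip the Ratis-specific step, as sketched here, or reject the request with a clear error when Ratis is disabled is a design decision for the patch.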


