Posted to issues@ozone.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2020/07/15 17:14:00 UTC

[jira] [Comment Edited] (HDDS-3966) Intermittent crash in TestOMRatisSnapshots

    [ https://issues.apache.org/jira/browse/HDDS-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158544#comment-17158544 ] 

Attila Doroszlai edited comment on HDDS-3966 at 7/15/20, 5:13 PM:
------------------------------------------------------------------

The problem is that the background thread {{OM StateMachine ApplyTransaction Thread}} continues processing transactions after the DB is closed. The crash log shows it inside {{RocksDB.get}}:

{code:title=hs_err_pid*.log}
J 11940  org.rocksdb.RocksDB.get(J[BIIJ)[B (0 bytes) @ 0x00007f8eaaf97d93 [0x00007f8eaaf97d40+0x53]
J 11978 C1 org.apache.hadoop.hdds.utils.db.RDBTable.get([B)[B (40 bytes) @ 0x00007f8eaafac6bc [0x00007f8eaafac580+0x13c]
J 11977 C1 org.apache.hadoop.hdds.utils.db.RDBTable.get(Ljava/lang/Object;)Ljava/lang/Object; (18 bytes) @ 0x00007f8eaafacf84 [0x00007f8eaafacde0+0x1a4]
J 10938 C1 org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(Ljava/lang/Object;)Ljava/lang/Object; (57 bytes) @ 0x00007f8eaabc8344 [0x00007f8eaabc8280+0xc4]
J 10946 C1 org.apache.hadoop.hdds.utils.db.TypedTable.get(Ljava/lang/Object;)Ljava/lang/Object; (99 bytes) @ 0x00007f8eaabed5f4 [0x00007f8eaabed380+0x274]
j  org.apache.hadoop.ozone.om.request.volume.OMVolumeCreateRequest.validateAndUpdateCache(Lorg/apache/hadoop/ozone/om/OzoneManager;JLorg/apache/hadoop/ozone/om/ratis/utils/OzoneManagerDoubleBufferHelper;)Lorg/apache/hadoop/ozone/om/response/OMClientResponse;+447
J 12916 C1 org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMRequest;J)Lorg/apache/hadoop/ozone/om/response/OMClientResponse; (58 bytes) @ 0x00007f8eab37ddd4 [0x00007f8eab37d880+0x554]
J 12915 C1 org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMRequest;J)Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMResponse; (91 bytes) @ 0x00007f8eab37ea04 [0x00007f8eab37e900+0x104]
J 12913 C1 org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine$$Lambda$555.get()Ljava/lang/Object; (16 bytes) @ 0x00007f8eab379334 [0x00007f8eab379220+0x114]
J 10879 C1 java.util.concurrent.CompletableFuture$AsyncSupply.run()V (61 bytes) @ 0x00007f8eaababf94 [0x00007f8eaababda0+0x1f4]
J 11076 C1 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x00007f8eaac5a314 [0x00007f8eaac59300+0x1014]
J 11240 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x00007f8eaacf7604 [0x00007f8eaacf7500+0x104]
J 9453 C1 java.lang.Thread.run()V (17 bytes) @ 0x00007f8eaa727f44 [0x00007f8eaa727e00+0x144]
{code}

This can be reproduced by adding a sleep to the test right after the following line, which closes the follower's store:

{code:title=https://github.com/apache/hadoop-ozone/blob/da49ca6891cc7a2f6ed44aacf8ec51331ad969f4/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOMRatisSnapshots.java#L177}
followerOM.getMetadataManager().getStore().close();
{code}
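
The ordering issue can be illustrated with a small standalone sketch. It is not Ozone code: {{FakeStore}}, {{runOnce}} and the single-thread executor are hypothetical stand-ins for the RocksDB-backed store and the apply-transaction thread, and the sketch throws an exception where a real native handle might crash the JVM. It only shows that closing the store while a task is still in flight leads to a read on a closed store, and that draining the executor first avoids it. Whether draining is the right fix for the test is an assumption, not something confirmed here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseOrderingSketch {

  /** Hypothetical stand-in for a native-backed store; not the Ozone or RocksDB API. */
  static final class FakeStore implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    byte[] get(byte[] key) {
      // A real native handle might crash the JVM here; the sketch throws instead.
      if (closed.get()) {
        throw new IllegalStateException("get() after close()");
      }
      return key;
    }

    @Override
    public void close() {
      closed.set(true);
    }
  }

  /** Returns true if the background read finished without touching a closed store. */
  static boolean runOnce(boolean drainBeforeClose) throws Exception {
    FakeStore store = new FakeStore();
    // Stand-in for the background thread that applies transactions.
    ExecutorService applier = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);

    Future<byte[]> pending = applier.submit(() -> {
      started.countDown();
      Thread.sleep(200); // simulates a transaction that is still being applied
      return store.get(new byte[] {1});
    });

    started.await(); // make sure the task is in flight before closing
    if (drainBeforeClose) {
      // Stop accepting work and wait for in-flight work to finish.
      applier.shutdown();
      applier.awaitTermination(5, TimeUnit.SECONDS);
    }
    store.close();

    try {
      pending.get(5, TimeUnit.SECONDS);
      return true;
    } catch (ExecutionException e) {
      return false; // the task read from the store after it was closed
    } finally {
      applier.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("close without draining, read succeeded: " + runOnce(false));
    System.out.println("drain then close, read succeeded: " + runOnce(true));
  }
}
```

In the sketch the sleep inside the task plays the same role as the sleep added to the test: it widens the window between {{close()}} and the background read so the failure shows up every time instead of intermittently.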

CC [~bharat], [~hanishakoneru]


was (Author: adoroszlai):
CC [~bharat], [~hanishakoneru]

> Intermittent crash in TestOMRatisSnapshots
> ------------------------------------------
>
>                 Key: HDDS-3966
>                 URL: https://issues.apache.org/jira/browse/HDDS-3966
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Attila Doroszlai
>            Priority: Major
>
> TestOMRatisSnapshots was recently enabled and is crashing intermittently:
> https://github.com/elek/ozone-build-results/tree/master/2020/07/14/1690/it-hdds-om
> https://github.com/elek/ozone-build-results/tree/master/2020/07/14/1710/it-hdds-om
> https://github.com/elek/ozone-build-results/tree/master/2020/07/15/1713/it-hdds-om



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org