Posted to issues@ozone.apache.org by "Arpit Agarwal (Jira)" <ji...@apache.org> on 2020/10/16 16:12:00 UTC

[jira] [Resolved] (HDDS-4351) DN crash while RatisApplyTransactionExecutor tries to putBlock to rocksDB

     [ https://issues.apache.org/jira/browse/HDDS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal resolved HDDS-4351.
---------------------------------
    Fix Version/s: 1.1.0
       Resolution: Done

> DN crash while RatisApplyTransactionExecutor tries to putBlock to rocksDB
> -------------------------------------------------------------------------
>
>                 Key: HDDS-4351
>                 URL: https://issues.apache.org/jira/browse/HDDS-4351
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 1.1.0
>            Reporter: Glen Geng
>            Assignee: Ethan Rose
>            Priority: Major
>             Fix For: 1.1.0
>
>
> At Tencent, we pick up the latest master monthly and deploy it to our production environment.
> This time we tested c956ce6 ([HDDS-4262. Use ClientID and CallID from Rpc Client to detect retry re…|https://github.com/apache/hadoop-ozone/commit/c956ce6b7537a0286c01b15d4963333a7ffeba90]) and frequently encountered datanode crashes during putBlock.
>  
> *The setup* is 3 DNs, each participating in 8 pipelines, plus 1 OM, 1 SCM, and 1 Gateway.
> *The repro procedure* is simple: continually write 10GB files to s3g from Python (via the AWS boto3 library); after tens of files have been written, a DN may crash while applying putBlock operations.
> After running the test for 10 hours on a build that reverts HDDS-3869, no DN crash occurred.
> We will schedule a long-running test on the latest master with HDDS-4327, to check whether wrapping BatchOperation in try-with-resources fixes the crash.
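The try-with-resources idea can be sketched with a toy stand-in for BatchOperation. This is a minimal illustration, not Ozone's actual code: `FakeBatchOperation` and its counter are hypothetical, and only the resource-management pattern matches what HDDS-4327 proposes. The point is that the object owning a native WriteBatch handle is closed exactly once, even when commit() throws, rather than being left for later cleanup:

```java
// Minimal sketch, assuming a BatchOperation-like AutoCloseable that owns a
// native handle. FakeBatchOperation is hypothetical, not Ozone's real class.
class FakeBatchOperation implements AutoCloseable {
    static int openBatches = 0; // stands in for live native WriteBatch handles

    FakeBatchOperation() { openBatches++; }

    void put(String key, String value) { /* buffered into the batch; omitted */ }

    void commit() throws Exception {
        // Simulate the failure seen in Example 3 below.
        throw new Exception("Unable to write the batch.");
    }

    @Override
    public void close() { openBatches--; } // releases the native handle
}

public class BatchCloseDemo {
    public static void main(String[] args) {
        // try-with-resources guarantees close() runs even though commit()
        // throws, so the native handle is neither leaked nor left to be
        // freed later by some other path.
        try (FakeBatchOperation batch = new FakeBatchOperation()) {
            batch.put("blockId", "blockData");
            batch.commit();
        } catch (Exception e) {
            System.out.println("commit failed: " + e.getMessage());
        }
        System.out.println("open batches after failure: "
                + FakeBatchOperation.openBatches);
    }
}
```

Without the try-with-resources block, the throwing commit() path would skip close() entirely, which is the kind of native-resource mishandling that could plausibly lead to the SIGSEGV and bad_alloc traces shown below.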
>  
> *Example 1: segmentation fault while putBlock.*
> {code:java}
> Current thread (0x00007eff34524000):  JavaThread "RatisApplyTransactionExecutor 9" daemon [_thread_in_native, id=20401, stack(0x00007efef4a14000,0x00007efef4b15000)]
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007eff37eb9000
> Registers:
> RAX=0x00007efe8bbfb024, RBX=0x0000000000000000, RCX=0x0000000000000000, RDX=0x00000000007688e4
> RSP=0x00007efef4b11e38, RBP=0x00007efef4b11f60, RSI=0x00007eff37eb8feb, RDI=0x00007efe8f892640
> R8 =0x00007efe8bbfb024, R9 =0x0000000000800000, R10=0x0000000000000022, R11=0x0000000000001000
> R12=0x00007efef4b12100, R13=0x00007eff340badc0, R14=0x00007eff340bb7b0, R15=0x0000000004400000
> RIP=0x00007eff4fa04bae, EFLAGS=0x0000000000010206, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>   TRAPNO=0x000000000000000e
> Stack: [0x00007efef4a14000,0x00007efef4b15000],  sp=0x00007efef4b11e38,  free space=1015k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libc.so.6+0x151bae]  __memmove_ssse3_back+0x192e
> C  [librocksdbjni3701435679326554484.so+0x3b2263]  rocksdb::MemTableInserter::DeleteCF(unsigned int, rocksdb::Slice const&)+0x253
> C  [librocksdbjni3701435679326554484.so+0x3a889f]  rocksdb::WriteBatchInternal::Iterate(rocksdb::WriteBatch const*, rocksdb::WriteBatch::Handler*, unsigned long, unsigned long)+0x75f
> C  [librocksdbjni3701435679326554484.so+0x3a8d44]  rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x24
> C  [librocksdbjni3701435679326554484.so+0x3ac3f9]  rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, rocksdb::TrimHistoryScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, bool)+0x249
> C  [librocksdbjni3701435679326554484.so+0x2f6308]  rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1e98
> C  [librocksdbjni3701435679326554484.so+0x2f70c1]  rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21
> C  [librocksdbjni3701435679326554484.so+0x1dd0cc]  Java_org_rocksdb_RocksDB_write0+0xcc
> j  org.rocksdb.RocksDB.write0(JJJ)V+0
> J 8597 C1 org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/helpers/BlockData;Z)J (487 bytes) @ 0x00007eff3a8dd84c [0x00007eff3a8db8e0+0x1f6c]
> J 8700 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (211 bytes) @ 0x00007eff3a927ebc [0x00007eff3a926220+0x1c9c]
> J 6685 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueHandler;Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (228 bytes) @ 0x00007eff3a2ba2c4 [0x00007eff3a2b7640+0x2c84]
> J 6684 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (11 bytes) @ 0x00007eff3a29a8ac [0x00007eff3a29a740+0x16c]
> J 7108 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (1105 bytes) @ 0x00007eff3a4a8324 [0x00007eff3a4a2b40+0x57e4]
> J 7105 C1 org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(Ljava/lang/Object;Lorg/apache/hadoop/hdds/function/FunctionWithServiceException;Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object; (205 bytes) @ 0x00007eff3a48158c [0x00007eff3a480220+0x136c]
> J 7102 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (38 bytes) @ 0x00007eff3a472bfc [0x00007eff3a4725a0+0x65c]
> J 7250 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (103 bytes) @ 0x00007eff39c397a4 [0x00007eff39c38260+0x1544]
> J 7960 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext$Builder;JLjava/util/concurrent/CompletableFuture;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (81 bytes) @ 0x00007eff3a7b654c [0x00007eff3a7b6280+0x2cc]
> J 7959 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$$Lambda$572.get()Ljava/lang/Object; (24 bytes) @ 0x00007eff3a7b2124 [0x00007eff3a7b2080+0xa4]
> J 7226 C1 java.util.concurrent.CompletableFuture$AsyncSupply.run()V (61 bytes) @ 0x00007eff3a0e672c [0x00007eff3a0e6520+0x20c]
> j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x68868b]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0xddb
> V  [libjvm.so+0x685f53]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x263
> V  [libjvm.so+0x686517]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47
> V  [libjvm.so+0x6f268c]  thread_entry(JavaThread*, Thread*)+0x6c
> V  [libjvm.so+0xa7ca9b]  JavaThread::thread_main_inner()+0xdb
> V  [libjvm.so+0xa7cda1]  JavaThread::run()+0x2d1
> V  [libjvm.so+0x90dcb2]  java_start(Thread*)+0x102
> C  [libpthread.so.0+0x7e25]  start_thread+0xc5
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.rocksdb.RocksDB.write0(JJJ)V+0
> J 8597 C1 org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/helpers/BlockData;Z)J (487 bytes) @ 0x00007eff3a8dd84c [0x00007eff3a8db8e0+0x1f6c]
> J 8700 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (211 bytes) @ 0x00007eff3a927ebc [0x00007eff3a926220+0x1c9c]
> J 6685 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueHandler;Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (228 bytes) @ 0x00007eff3a2ba2c4 [0x00007eff3a2b7640+0x2c84]
> J 6684 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (11 bytes) @ 0x00007eff3a29a8ac [0x00007eff3a29a740+0x16c]
> J 7108 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (1105 bytes) @ 0x00007eff3a4a8324 [0x00007eff3a4a2b40+0x57e4]
> J 7105 C1 org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(Ljava/lang/Object;Lorg/apache/hadoop/hdds/function/FunctionWithServiceException;Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object; (205 bytes) @ 0x00007eff3a48158c [0x00007eff3a480220+0x136c]
> J 7102 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (38 bytes) @ 0x00007eff3a472bfc [0x00007eff3a4725a0+0x65c]
> J 7250 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (103 bytes) @ 0x00007eff39c397a4 [0x00007eff39c38260+0x1544]
> J 7960 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext$Builder;JLjava/util/concurrent/CompletableFuture;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (81 bytes) @ 0x00007eff3a7b654c [0x00007eff3a7b6280+0x2cc]
> J 7959 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$$Lambda$572.get()Ljava/lang/Object; (24 bytes) @ 0x00007eff3a7b2124 [0x00007eff3a7b2080+0xa4]
> J 7226 C1 java.util.concurrent.CompletableFuture$AsyncSupply.run()V (61 bytes) @ 0x00007eff3a0e672c [0x00007eff3a0e6520+0x20c]
> j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub
> {code}
> *Example 2: operator new throws std::bad_alloc while putBlock.*
> {code:java}
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/server/libjvm.so
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/libverify.so
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/libmanagement.so
> Core was generated by `/opt/software/jdk1.8.0_231/bin/java -Dproc_datanode -Djava.net.preferIPv4Stack='.
> Program terminated with signal 6, Aborted.
> #0  0x00007f293e2b41f7 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.tl2.3.x86_64 java-1.8.0-openjdk-headless-1.8.0.71-2.b15.el7_2.x86_64 libgcc-4.8.5-5.tl2.x86_64 libstdc++-4.8.5-5.tl2.x86_64 lz4-1.7.5-2.tl2.x86_64 snappy-1.1.0-3.el7.x86_64 zlib-1.2.7-15.el7.x86_64
> (gdb) bt
> #0  0x00007f293e2b41f7 in raise () from /lib64/libc.so.6
> #1  0x00007f293e2b58e8 in abort () from /lib64/libc.so.6
> #2  0x00007f28e6eda9d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
> #3  0x00007f28e6ed8946 in ?? () from /lib64/libstdc++.so.6
> #4  0x00007f28e6ed8973 in std::terminate() () from /lib64/libstdc++.so.6
> #5  0x00007f28e6ed8b93 in __cxa_throw () from /lib64/libstdc++.so.6
> #6  0x00007f28e6ed912d in operator new(unsigned long) () from /lib64/libstdc++.so.6
> #7  0x00007f28e6f37c69 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /lib64/libstdc++.so.6
> #8  0x00007f28e6f37e56 in std::string::_M_mutate(unsigned long, unsigned long, unsigned long) () from /lib64/libstdc++.so.6
> #9  0x00007f28e6f37fe6 in std::string::_M_leak_hard() () from /lib64/libstdc++.so.6
> #10 0x00007f28e6487211 in rocksdb::WriteBatchInternal::SetSequence(rocksdb::WriteBatch*, unsigned long) () from /tmp/librocksdbjni4549469381011773502.so
> #11 0x00007f28e648a3e0 in rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, rocksdb::TrimHistoryScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, bool) () from /tmp/librocksdbjni4549469381011773502.so
> #12 0x00007f28e63d4308 in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*) () from /tmp/librocksdbjni4549469381011773502.so
> #13 0x00007f28e63d50c1 in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) () from /tmp/librocksdbjni4549469381011773502.so
> #14 0x00007f28e62bb0cc in Java_org_rocksdb_RocksDB_write0 () from /tmp/librocksdbjni4549469381011773502.so
> #15 0x00007f292ac9b57e in ?? ()
> #16 0x00000000b21578a8 in ?? ()
> #17 0x0000000093632f70 in ?? ()
> #18 0x0000000000000012 in ?? ()
> #19 0x00007f292a8a0b84 in ?? ()
> #20 0x01751bb038161c0c in ?? ()
> #21 0x0000000093632848 in ?? ()
> #22 0x00000000aa506db0 in ?? ()
> #23 0x00007f28c43f6290 in ?? ()
> #24 0x00000000a9bc47d8 in ?? ()
> #25 0x00007f292b170254 in ?? ()
> #26 0x00007f28c43f6280 in ?? ()
> #27 0x00007f293d988c48 in JVM_MonitorWait () from /opt/software/jdk1.8.0_231/jre/lib/amd64/server/libjvm.so
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> {code}
> *Example 3: put key failed, but DN does not crash.*
> {code:java}
> 2020-10-13 11:02:11,160 [RatisApplyTransactionExecutor 1] INFO org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Operation: PutBlock , Trace ID:  , Message: Put Key failed , Result: IO_EXCEPTION , StorageContainerException Occurred.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Put Key failed
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:449)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:187)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:163)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:309)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:171)
>         at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:170)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:400)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:410)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(ContainerStateMachine.java:754)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Unable to write the batch.
>         at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
>         at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:279)
>         at org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(BlockManagerImpl.java:152)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:440)
>         ... 13 more
> Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
>         at org.rocksdb.RocksDB.write0(Native Method)
>         at org.rocksdb.RocksDB.write(RocksDB.java:1586)
>         at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
>         ... 16 more
> 2020-10-13 11:02:11,439 [Datanode State Machine Thread - 0] WARN org.apache.hadoop.ozone.container.common.statemachine.StateContext: No available thread in pool for past 5 seconds.
> 2020-10-13 11:02:11,449 [RatisApplyTransactionExecutor 1] ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine: gid group-39F7E68E05D7 : ApplyTransaction failed. cmd PutBlock logIndex 1651 msg : Put Key failed Container Result: IO_EXCEPTION
> 2020-10-13 11:02:11,458 [RatisApplyTransactionExecutor 1] ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=504be054-27bc-4c67-ae2d-39f7e68e05d7.Reason : Ratis Transaction failure in datanode b65b0b6c-b0bb-429f-a23d-467c72d4b85c with role LEADER .Triggering pipeline close action.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org