Posted to user@hadoop.apache.org by Daniel Haviv <da...@gmail.com> on 2016/08/23 04:36:00 UTC

Random nodemanager crashes SIGSEGV

Hi,

Over the last 24 hours our NodeManagers have kept crashing with SIGSEGV.

The only information I could find was in the hs_err_XXXX.pid files, which
include the following Java stack:



Stack: [0x00007f756a30f000,0x00007f756a410000], sp=0x00007f756a40dea0,  free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libleveldbjni-64-1-5625225739273738004.8+0x2aaac]  leveldb::log::Writer::EmitPhysicalRecord(leveldb::log::RecordType, char const*, unsigned long)+0x7c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.fusesource.leveldbjni.internal.NativeDB$DBJNI.Put(JLorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fusesource/leveldbjni/internal/NativeSlice;)J+0
j  org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fusesource/leveldbjni/internal/NativeSlice;)V+11
j  org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeBuffer;Lorg/fusesource/leveldbjni/internal/NativeBuffer;)V+18
j  org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;[B[B)V+36
j  org.fusesource.leveldbjni.internal.JniDB.put([B[BLorg/iq80/leveldb/WriteOptions;)Lorg/iq80/leveldb/Snapshot;+28
j  org.fusesource.leveldbjni.internal.JniDB.put([B[B)V+10
j  org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeDeletionTask(ILorg/apache/hadoop/yarn/proto/YarnServerNodemanagerRecoveryProtos$DeletionServiceDeleteTaskProto;)V+32
j  org.apache.hadoop.yarn.server.nodemanager.DeletionService.recordDeletionTaskInStateStore(Lorg/apache/hadoop/yarn/server/nodemanager/DeletionService$FileDeletionTask;)V+245
j  org.apache.hadoop.yarn.server.nodemanager.DeletionService.delete(Ljava/lang/String;Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/Path;)V+44
j  org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run()V+271
v  ~StubRoutines::call_stub

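The culprit seems to be the top native frame,
[libleveldbjni-64-1-5625225739273738004.8+0x2aaac]
leveldb::log::Writer::EmitPhysicalRecord(leveldb::log::RecordType,
char const*, unsigned long)+0x7c
so the crash happens inside the native LevelDB library while the
NodeManager state store is recording a deletion task.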
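
For anyone comparing hs_err files across several nodes, here is a minimal
Python sketch that pulls the first native (C) frame out of the text of such
a file. The function name top_native_frame and the regular expression are
only illustrative assumptions based on the frame layout shown above, not
part of any Hadoop or JVM tooling, and the symbol is only picked up when it
sits on the same line as the library name.

```python
import re

# Matches a native frame line such as:
#   C  [libfoo.so+0x2aaac]  some::symbol(args)+0x7c
# (pattern is an assumption derived from the hs_err excerpt above)
FRAME_RE = re.compile(
    r"^C\s+\[(?P<lib>[^\]+]+)\+(?P<offset>0x[0-9a-fA-F]+)\]\s*(?P<symbol>.*)$"
)


def top_native_frame(hs_err_text):
    """Return (library, offset, symbol) for the first C frame listed
    after the 'Native frames:' header, or None if there is none."""
    in_native = False
    for line in hs_err_text.splitlines():
        if line.startswith("Native frames:"):
            in_native = True
            continue
        if in_native:
            if not line.strip():
                break  # a blank line ends the native frame list
            match = FRAME_RE.match(line)
            if match:
                return (
                    match.group("lib"),
                    match.group("offset"),
                    match.group("symbol").strip(),
                )
    return None
```

Running this over the crash file from each node and comparing the returned
library and offset would show whether every crash lands in the same place.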



Any ideas on what this is and how to solve it?



Thank you.

Daniel