Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/02/11 00:23:22 UTC

[GitHub] [bookkeeper] dlg99 opened a new issue #3040: [FLAKY TEST] RocksDB segfaulted during CompactionTest: rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*)

dlg99 opened a new issue #3040:
URL: https://github.com/apache/bookkeeper/issues/3040


   **BUG REPORT**
   
   ***Describe the bug***
   
   This happened during a run of `./gradlew bookkeeper-server:test --tests="org.apache.bookkeeper.bookie.CompactionByBytesTest"`.
   Not sure how easily reproducible it is.
   
   Possibly related to https://github.com/facebook/rocksdb/issues/7948 (using `rocksDb: "6.27.3"` at the time of the crash).
   
   ```
   # JRE version: OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (11.0.11+9) (build 11.0.11+9)
   # Java VM: OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (11.0.11+9, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
   # Problematic frame:
   # C  [librocksdbjni16740395254626857584.jnilib+0xe60f8]  rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*)+0x188
   #
   ...
   ---------------  T H R E A D  ---------------
   
   Current thread (0x00007f8d7d398000):  JavaThread "GarbageCollectorThread-488-1" [_thread_in_native, id=96531, stack(0x000070000c9a4000,0x000070000caa4000)]
   
   Stack: [0x000070000c9a4000,0x000070000caa4000],  sp=0x000070000caa30b0,  free space=1020k
   Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
   C  [librocksdbjni16740395254626857584.jnilib+0xe60f8]  rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*)+0x188
   C  [librocksdbjni16740395254626857584.jnilib+0x239a9]  Java_org_rocksdb_RocksDB_iterator__JJ+0xc9
   J 7099  org.rocksdb.RocksDB.iterator(JJ)J (0 bytes) @ 0x000000011fd2447d [0x000000011fd243c0+0x00000000000000bd]
   J 6886 c1 org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.iterator()Lorg/apache/bookkeeper/bookie/storage/ldb/KeyValueStorage$CloseableIterator; (35 bytes) @ 0x00000001192f6c34 [0x00000001192f6ae0+0x0000000000000154]
   J 6885 c1 org.apache.bookkeeper.bookie.storage.ldb.PersistentEntryLogMetadataMap.forEach(Ljava/util/function/BiConsumer;)V (322 bytes) @ 0x00000001192ecdc4 [0x00000001192ecca0+0x0000000000000124]
   J 7290 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcEntryLogs()V (47 bytes) @ 0x00000001193d8dfc [0x00000001193d8940+0x00000000000004bc]
   J 7024 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(ZZZ)V (362 bytes) @ 0x00000001193304cc [0x00000001193300c0+0x000000000000040c]
   J 7108 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun()V (44 bytes) @ 0x000000011937089c [0x0000000119370600+0x000000000000029c]
   J 5985 c2 org.apache.bookkeeper.common.util.SafeRunnable.run()V (22 bytes) @ 0x000000011fbf3dfc [0x000000011fbf3dc0+0x000000000000003c]
   J 4227 c1 java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; java.base@11.0.11 (14 bytes) @ 0x0000000118c55854 [0x0000000118c55740+0x0000000000000114]
   J 6428 c1 java.util.concurrent.FutureTask.runAndReset()Z java.base@11.0.11 (125 bytes) @ 0x00000001191e514c [0x00000001191e4a80+0x00000000000006cc]
   J 4559 c1 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V java.base@11.0.11 (57 bytes) @ 0x0000000118d43c24 [0x0000000118d43a40+0x00000000000001e4]
   J 6888 c2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@11.0.11 (187 bytes) @ 0x000000011fcdbaf8 [0x000000011fcdb920+0x00000000000001d8]
   J 5318 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V java.base@11.0.11 (9 bytes) @ 0x0000000118fa8f44 [0x0000000118fa8ec0+0x0000000000000084]
   J 5079 c1 io.netty.util.concurrent.FastThreadLocalRunnable.run()V (22 bytes) @ 0x0000000118f0296c [0x0000000118f02860+0x000000000000010c]
   J 4051 c1 java.lang.Thread.run()V java.base@11.0.11 (17 bytes) @ 0x0000000118be3e84 [0x0000000118be3d40+0x0000000000000144]
   v  ~StubRoutines::call_stub
   V  [libjvm.dylib+0x3b09e0]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x21a
   V  [libjvm.dylib+0x3afe2a]  JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0xee
   V  [libjvm.dylib+0x3afee6]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x62
   V  [libjvm.dylib+0x436a2e]  thread_entry(JavaThread*, Thread*)+0x78
   V  [libjvm.dylib+0x7732a6]  JavaThread::thread_main_inner()+0x82
   V  [libjvm.dylib+0x7730f0]  JavaThread::run()+0x174
   V  [libjvm.dylib+0x770fcc]  Thread::call_run()+0x68
   V  [libjvm.dylib+0x62014b]  thread_native_entry(Thread*)+0x139
   C  [libsystem_pthread.dylib+0x68fc]  _pthread_start+0xe0
   C  [libsystem_pthread.dylib+0x2443]  thread_start+0xf
   
   Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
   J 7099  org.rocksdb.RocksDB.iterator(JJ)J (0 bytes) @ 0x000000011fd24408 [0x000000011fd243c0+0x0000000000000048]
   J 6886 c1 org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.iterator()Lorg/apache/bookkeeper/bookie/storage/ldb/KeyValueStorage$CloseableIterator; (35 bytes) @ 0x00000001192f6c34 [0x00000001192f6ae0+0x0000000000000154]
   J 6885 c1 org.apache.bookkeeper.bookie.storage.ldb.PersistentEntryLogMetadataMap.forEach(Ljava/util/function/BiConsumer;)V (322 bytes) @ 0x00000001192ecdc4 [0x00000001192ecca0+0x0000000000000124]
   J 7290 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcEntryLogs()V (47 bytes) @ 0x00000001193d8dfc [0x00000001193d8940+0x00000000000004bc]
   J 7024 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(ZZZ)V (362 bytes) @ 0x00000001193304cc [0x00000001193300c0+0x000000000000040c]
   J 7108 c1 org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun()V (44 bytes) @ 0x000000011937089c [0x0000000119370600+0x000000000000029c]
   J 5985 c2 org.apache.bookkeeper.common.util.SafeRunnable.run()V (22 bytes) @ 0x000000011fbf3dfc [0x000000011fbf3dc0+0x000000000000003c]
   J 4227 c1 java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; java.base@11.0.11 (14 bytes) @ 0x0000000118c55854 [0x0000000118c55740+0x0000000000000114]
   J 6428 c1 java.util.concurrent.FutureTask.runAndReset()Z java.base@11.0.11 (125 bytes) @ 0x00000001191e514c [0x00000001191e4a80+0x00000000000006cc]
   J 4559 c1 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V java.base@11.0.11 (57 bytes) @ 0x0000000118d43c24 [0x0000000118d43a40+0x00000000000001e4]
   J 6888 c2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@11.0.11 (187 bytes) @ 0x000000011fcdbaf8 [0x000000011fcdb920+0x00000000000001d8]
   J 5318 c1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V java.base@11.0.11 (9 bytes) @ 0x0000000118fa8f44 [0x0000000118fa8ec0+0x0000000000000084]
   J 5079 c1 io.netty.util.concurrent.FastThreadLocalRunnable.run()V (22 bytes) @ 0x0000000118f0296c [0x0000000118f02860+0x000000000000010c]
   J 4051 c1 java.lang.Thread.run()V java.base@11.0.11 (17 bytes) @ 0x0000000118be3e84 [0x0000000118be3d40+0x0000000000000144]
   v  ~StubRoutines::call_stub
   ```
   
   ***Expected behavior***
   
   No crash.
   
   ***Additional context***
   
   Log:
   
   [hs_err_pid33716.log](https://github.com/apache/bookkeeper/files/8045126/hs_err_pid33716.log)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [bookkeeper] dlg99 closed issue #3040: [FLAKY TEST] RocksDB segfaulted during CompactionTest: rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*)

Posted by GitBox <gi...@apache.org>.
dlg99 closed issue #3040:
URL: https://github.com/apache/bookkeeper/issues/3040


   





[GitHub] [bookkeeper] dlg99 commented on issue #3040: [FLAKY TEST] RocksDB segfaulted during CompactionTest: rocksdb::DBImpl::NewIterator(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*)

Posted by GitBox <gi...@apache.org>.
dlg99 commented on issue #3040:
URL: https://github.com/apache/bookkeeper/issues/3040#issuecomment-1036508397


   It is tricky to mock (doing so would require otherwise useless refactoring or test injection), but I reproduced it with minimal modifications.
   The trick is to force the GC thread to create an iterator on an already closed PersistentEntryLogMetadataMap (and its underlying RocksDB). Probably any kind of access to it after close is enough.
   
   ```
   diff --git a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/GarbageCollectorThread.java b/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/GarbageCollectorThread.java
   index bf00566f1..a1a902a64 100644
   --- a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/GarbageCollectorThread.java
   +++ b/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/GarbageCollectorThread.java
   @@ -31,6 +31,7 @@ import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicBoolean;
    
   +import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.function.Supplier;
    
   @@ -474,6 +475,8 @@ public class GarbageCollectorThread extends SafeRunnable {
            });
        }
    
   +    final AtomicInteger steps = new AtomicInteger(0);
   +
        /**
         * Compact entry logs if necessary.
         *
   @@ -495,6 +498,14 @@ public class GarbageCollectorThread extends SafeRunnable {
            MutableLong end = new MutableLong(start);
            MutableLong timeDiff = new MutableLong(0);
    
   +        while (!steps.compareAndSet(1, 2)) {
   +            try {
   +                Thread.sleep(10);
   +            } catch (InterruptedException e) {
   +                e.printStackTrace();
   +            }
   +        }
   +
            entryLogMetaMap.forEach((entryLogId, meta) -> {
                int bucketIndex = calculateUsageIndex(numBuckets, meta.getUsage());
                entryLogUsageBuckets[bucketIndex]++;
   @@ -562,6 +573,7 @@ public class GarbageCollectorThread extends SafeRunnable {
            }
    
            this.running = false;
   +
            // Interrupt GC executor thread
            gcExecutor.shutdownNow();
            try {
   @@ -569,6 +581,14 @@ public class GarbageCollectorThread extends SafeRunnable {
            } catch (Exception e) {
                LOG.warn("Failed to close entryLog metadata-map", e);
            }
   +
   +        while (!steps.compareAndSet(0, 1)) {
   +            Thread.sleep(10);
   +        }
   +
   +        while (!steps.compareAndSet(2, 3)) {
   +            Thread.sleep(10);
   +        }
        }
    
        /**
   ```
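
For readers who want to see the ordering without patching BookKeeper, here is a minimal, self-contained sketch of the same step handshake. It is not BookKeeper or RocksDB code: `GuardedStore` and `UseAfterCloseDemo` are hypothetical names, the store is a pure-Java stand-in, and the closed-flag check is only one possible guard, assumed here for illustration. Where the real native call would segfault, the stand-in throws `IllegalStateException`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical demo: forces "close, then iterate" using a step counter like the patch above. */
class UseAfterCloseDemo {
    static String run() {
        GuardedStore store = new GuardedStore();
        AtomicInteger steps = new AtomicInteger(0);
        String[] outcome = new String[1];

        Thread gc = new Thread(() -> {
            // Wait until the closing thread has advanced the step from 0 to 1.
            while (!steps.compareAndSet(1, 2)) {
                Thread.yield();
            }
            try {
                outcome[0] = store.iterator();
            } catch (IllegalStateException e) {
                outcome[0] = "rejected: " + e.getMessage();
            }
        });
        gc.start();

        store.close();
        steps.compareAndSet(0, 1); // signal: the store is now closed

        try {
            gc.join(); // join also makes outcome[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "rejected: store is closed"
    }
}

/** Hypothetical stand-in for a native-backed key-value store. */
class GuardedStore implements AutoCloseable {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closed = false; // guarded by lock

    /** Stand-in for creating a native iterator; fails cleanly once closed. */
    String iterator() {
        lock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("store is closed");
            }
            return "iterator"; // a real store would call into native code here
        } finally {
            lock.readLock().unlock();
        }
    }

    @Override
    public void close() {
        lock.writeLock().lock();
        try {
            closed = true; // a real store would release the native handle here
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The read-write lock lets concurrent readers proceed while making `close()` exclusive, so an access that loses the race gets an exception instead of touching a released handle. Whether a guard like this is the right fix for `PersistentEntryLogMetadataMap` is not settled in this thread.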




