Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2017/12/21 03:17:41 UTC

[GitHub] yzang commented on a change in pull request #513: BOOKKEEPER-1106: Introduce write FileInfo cache and read FileInfo cache

yzang commented on a change in pull request #513: BOOKKEEPER-1106: Introduce write FileInfo cache and read FileInfo cache
URL: https://github.com/apache/bookkeeper/pull/513#discussion_r158189618
 
 

 ##########
 File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/IndexPersistenceMgr.java
 ##########
 @@ -100,71 +118,160 @@ public IndexPersistenceMgr(int pageSize,
         getActiveLedgers();
         ledgerDirsManager.addLedgerDirsListener(getLedgerDirsListener());
 
+        // build the file info cache
+        int concurrencyLevel = Math.max(1, Math.max(conf.getNumAddWorkerThreads(), conf.getNumReadWorkerThreads()));
+        RemovalListener<Long, FileInfo> fileInfoEvictionListener = this::handleLedgerEviction;
+        writeFileInfoCache = buildCache(
+            concurrencyLevel,
+            conf.getFileInfoCacheInitialCapacity(),
+            openFileLimit,
+            conf.getFileInfoMaxIdleTime(),
+            fileInfoEvictionListener);
+        readFileInfoCache = buildCache(
+            concurrencyLevel,
+            2 * conf.getFileInfoCacheInitialCapacity(),
+            2 * openFileLimit,
+            conf.getFileInfoMaxIdleTime(),
+            fileInfoEvictionListener);
+
         // Expose Stats
         evictedLedgersCounter = statsLogger.getCounter(LEDGER_CACHE_NUM_EVICTED_LEDGERS);
-        statsLogger.registerGauge(NUM_OPEN_LEDGERS, new Gauge<Integer>() {
+        pendingGetFileInfoCounter = statsLogger.getCounter(PENDING_GET_FILE_INFO);
+        statsLogger.registerGauge(WRITE_FILE_INFO_CACHE_SIZE, new Gauge<Number>() {
+            @Override
+            public Number getDefaultValue() {
+                return 0;
+            }
+
+            @Override
+            public Number getSample() {
+                return writeFileInfoCache.size();
+            }
+        });
+        statsLogger.registerGauge(READ_FILE_INFO_CACHE_SIZE, new Gauge<Number>() {
             @Override
-            public Integer getDefaultValue() {
+            public Number getDefaultValue() {
                 return 0;
             }
 
             @Override
-            public Integer getSample() {
-                return getNumOpenLedgers();
+            public Number getSample() {
+                return readFileInfoCache.size();
             }
         });
     }
 
-    FileInfo getFileInfo(Long ledger, byte masterKey[]) throws IOException {
-        FileInfo fi = fileInfoCache.get(ledger);
-        if (null == fi) {
-            boolean createdNewFile = false;
-            File lf = null;
-            synchronized (this) {
-                // Check if the index file exists on disk.
-                lf = findIndexFile(ledger);
-                if (null == lf) {
-                    if (null == masterKey) {
-                        throw new Bookie.NoLedgerException(ledger);
+    private static Cache<Long, FileInfo> buildCache(int concurrencyLevel,
+                                            int initialCapacity,
+                                            int maximumSize,
+                                            long expireAfterAccessSeconds,
+                                            RemovalListener<Long, FileInfo> removalListener) {
+        CacheBuilder<Long, FileInfo> builder = CacheBuilder.newBuilder()
+            .concurrencyLevel(concurrencyLevel)
+            .initialCapacity(initialCapacity)
+            .maximumSize(maximumSize)
+            .removalListener(removalListener);
+        if (expireAfterAccessSeconds > 0) {
+            builder.expireAfterAccess(expireAfterAccessSeconds, TimeUnit.SECONDS);
+        }
+        return builder.build();
+    }
+
+    /**
+     * When a ledger is evicted, we need to make sure there's no other thread
+     * trying to get FileInfo for that ledger at the same time when we close
+     * the FileInfo.
+     */
+    private void handleLedgerEviction(RemovalNotification<Long, FileInfo> notification) {
+        FileInfo fileInfo = notification.getValue();
+        Long ledgerId = notification.getKey();
+        if (null == fileInfo || null == notification.getKey()) {
+            return;
+        }
+        if (notification.wasEvicted()) {
+            evictedLedgersCounter.inc();
+            // we need to acquire the write lock in another thread,
+            // otherwise there could be dead lock happening.
+            evictionThreadPool.execute(() -> {
 
 Review comment:
   @ivankelly Good questions. The deadlock arises because eviction runs in the same thread that calls getFileInfo. A thread calling getFileInfo already holds the read lock. To rule out a race, we have to acquire the write lock before we close the FileInfo, but acquiring the write lock has to wait for every read lock to be released, including the one held by this same thread, so the thread would block forever. That's why we acquire the write lock asynchronously, in another thread, so that the calling thread can release its read lock first.
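
   To make the lock behaviour described above concrete, here is a minimal standalone sketch. It assumes the lock is a `java.util.concurrent.locks.ReentrantReadWriteLock`, which does not allow a read-to-write upgrade. The class `LockUpgradeDemo`, its method names, and the `evictionPool` executor are hypothetical stand-ins, not code from this PR, and the sketch uses a timed `tryLock` so that it terminates instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of BookKeeper.
public class LockUpgradeDemo {

    /** Tries to take the write lock while the same thread still holds the read lock. */
    public static boolean upgradeInSameThread() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // A blocking writeLock().lock() here would never return, because the
            // write lock waits for all read holds, including this thread's own.
            // The timed tryLock lets the demo give up instead of hanging.
            boolean acquired = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Hands the write-locked work to another thread, then releases the read lock. */
    public static boolean closeOnAnotherThread() throws Exception {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        ExecutorService evictionPool = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> closed;
            lock.readLock().lock();
            try {
                // The task blocks on the write lock until the caller's read lock is released.
                closed = evictionPool.submit(() -> {
                    lock.writeLock().lock();
                    try {
                        return true; // stand-in for closing the FileInfo
                    } finally {
                        lock.writeLock().unlock();
                    }
                });
            } finally {
                lock.readLock().unlock();
            }
            return closed.get(5, TimeUnit.SECONDS);
        } finally {
            evictionPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same-thread upgrade acquired: " + upgradeInSameThread());
        System.out.println("hand-off close completed: " + closeOnAnotherThread());
    }
}
```

   In the first method the same-thread attempt times out; in the second, the submitted task can only proceed once the submitting thread has released its read lock, which is the ordering the hand-off to another thread is meant to guarantee.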
   
   As for the problem you mentioned, I don't think it will happen in normal cases, but it might happen in an extreme case.
   1. Once a FileInfo is evicted, it is no longer in the cache, so whoever wants to read the file won't get stale information.
   2. There's no impact if only the read FileInfo or only the write FileInfo is evicted, because we close the file only when both have been evicted and the ref count shows that no one is using them.
   3. There's no async flush happening; we're just closing the file channel asynchronously once the ref count for that FileInfo is 0.
   4. If thread A is going to evict the FileInfo while thread B is getting it, the Guava cache handles these operations in a single thread, so they cannot interleave. If thread A goes first, thread B finds nothing in the cache and creates a new FileInfo. If thread B goes first, thread A no longer evicts the FileInfo, because it has just been accessed.
   
   (The extreme case is that thread B accesses the FileInfo and, before it has a chance to increment the ref count, thread A evicts the FileInfo from the cache and closes the file.)
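
   For illustration only, one common way to narrow a window like the extreme case above is to make "take a reference" and "close" compete on a single atomic counter, so a reader that loses the race sees a closed handle and can retry the lookup. This is a hedged sketch, not what this PR implements; `RefCountedHandle`, `tryRetain`, `tryClose`, and the `CLOSED` sentinel are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a ref-counted handle; not the FileInfo class from this PR.
public class RefCountedHandle {
    // Sentinel meaning "closed, no further references may be taken".
    private static final int CLOSED = -1;

    private final AtomicInteger refCount = new AtomicInteger(0);

    /** Takes a reference unless the handle is already closed. */
    public boolean tryRetain() {
        while (true) {
            int count = refCount.get();
            if (count == CLOSED) {
                return false; // caller should look up or create a fresh handle
            }
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
    }

    /** Drops a reference; call only after a successful tryRetain(). */
    public void release() {
        refCount.decrementAndGet();
    }

    /** Marks the handle closed only if nobody currently holds a reference. */
    public boolean tryClose() {
        return refCount.compareAndSet(0, CLOSED);
    }

    public boolean isClosed() {
        return refCount.get() == CLOSED;
    }
}
```

   With this shape, either thread B's `tryRetain()` succeeds first and thread A's `tryClose()` fails, or the close wins and thread B gets `false` rather than a handle whose file channel is already closed.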
