You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/08/15 17:40:41 UTC
[GitHub] [lucene-solr] msokolov commented on a change in pull request #1623: LUCENE-8962: Merge segments on getReader

msokolov commented on a change in pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#discussion_r471017030



##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
           }
         }
       }
+      if (onCommitMerges != null) { // only relevant if we do merge on getReader
+        boolean replaceReaderSuccess = false;
+        try {
+          mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+          onCommitMerges.await(maxCommitMergeWaitMillis, TimeUnit.MILLISECONDS);

Review comment:
       There's no need to check whether we timed out here, since we effectively abort all of the point-in-time merges we created by setting `includeMergeReader` to `false` below, right? I wonder if it would be cleaner to eliminate this AtomicBoolean that is shared with this and the merge threads, and instead using the existing merge abort technique that we have? OTOH IDK how that other mechanism works - is it aborting *all* outstanding merges?

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
           }
         }
       }
+      if (onCommitMerges != null) { // only relevant if we do merge on getReader
+        boolean replaceReaderSuccess = false;
+        try {
+          mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+          onCommitMerges.await(maxCommitMergeWaitMillis, TimeUnit.MILLISECONDS);
+          assert openingSegmentInfos != null;
+          synchronized (this) {
+            includeMergeReader.set(false);
+            boolean openNewReader = mergedReaders.isEmpty() == false;
+            if (openNewReader) {
+              StandardDirectoryReader mergedReader = StandardDirectoryReader.open(this,
+                  sci -> {
+                    // as soon as we remove the reader and return it the StandardDirectoryReader#open
+                    // will take care of closing it. We only need to handle the readers that remain in the
+                    // mergedReaders map and close them.
+                    SegmentReader remove = mergedReaders.remove(sci.info.name);
+                    if (remove == null) {
+                      remove = openedReadOnlyClones.remove(sci.info.name);
+                      assert remove != null;
+                      // each of the readers we reuse from the previous reader needs to be refInced
+                      // since we reuse them but don't have an implicit refInc in the SDR:open call
+                      remove.incRef();
+                    }
+                    return remove;
+                  }, openingSegmentInfos, applyAllDeletes, writeAllDeletes);
+              try {
+                r.close(); // close and swap in the new reader... close is cool here since we didn't leak this reader yet
+              } finally {
+                r = mergedReader;
+              }
+            }
+          }
+          replaceReaderSuccess = true;
+        } finally {
+          synchronized (this) {

Review comment:
       and this one, closeMergedReaders?

##########
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##########
@@ -607,6 +633,57 @@ DirectoryReader getReader(boolean applyAllDeletes, boolean writeAllDeletes) thro
           }
         }
       }
+      if (onCommitMerges != null) { // only relevant if we do merge on getReader
+        boolean replaceReaderSuccess = false;
+        try {
+          mergeScheduler.merge(mergeSource, MergeTrigger.GET_READER);
+          onCommitMerges.await(maxCommitMergeWaitMillis, TimeUnit.MILLISECONDS);
+          assert openingSegmentInfos != null;
+          synchronized (this) {

Review comment:
       This method is getting pretty big, and it might help readability if we named these synchronized blocks as functions. This one could be replaceReader()? OTOH maybe it requires too many parameters - it's hard to tell in code review 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org