You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/04 15:51:45 UTC

[GitHub] [lucene] mikemccand commented on a change in pull request #630: LUCENE-10371 Make IndexRearranger able to arrange segment in a determined order

mikemccand commented on a change in pull request #630:
URL: https://github.com/apache/lucene/pull/630#discussion_r799585879



##########
File path: lucene/misc/src/java/org/apache/lucene/misc/index/IndexRearranger.java
##########
@@ -84,6 +112,28 @@ public void execute() throws Exception {
       }
       executor.shutdown();
     }
+    if (determinedOrder == false) {
+      return;
+    }
+    List<SegmentCommitInfo> ordered = new ArrayList<>();
+    try (IndexReader reader = DirectoryReader.open(output)) {
+      for (DocumentSelector ds : documentSelectors) {
+        boolean found = false;
+        for (LeafReaderContext context : reader.leaves()) {
+          SegmentReader sr = (SegmentReader) context.reader();
+          if (ds.getFilteredLiveDocs(sr).nextSetBit(0) != DocIdSetIterator.NO_MORE_DOCS) {
+            assert found == false;

Review comment:
       If we will keep the `determinedOrder` `boolean`, can we upgrade this `assert` to a real `if` so we are sure to catch invalid usage instead of silently messing up the resulting index when assertions are disabled?

##########
File path: lucene/misc/src/java/org/apache/lucene/misc/index/IndexRearranger.java
##########
@@ -49,16 +53,40 @@
   protected final Directory input, output;
   protected final IndexWriterConfig config;
   protected final List<DocumentSelector> documentSelectors;
+  protected final boolean determinedOrder;
 
+  /**
+   * Constructor
+   *
+   * @param input input dir
+   * @param output output dir
+   * @param config index writer config
+   * @param documentSelectors specify what document is desired in the rearranged index segments,
+   *     each selector correspond to one segment
+   * @param determinedOrder make sure the rearranged index have the segment order aligned with the

Review comment:
       Should we maybe just hardwire to `true`?  Is there any good reason to keep `false`?  I guess if you want to duplicate documents still?  But is that really an important usage?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org