You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/06/29 16:21:47 UTC
[GitHub] [lucene] jpountz commented on a diff in pull request #995: LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code

jpountz commented on code in PR #995:
URL: https://github.com/apache/lucene/pull/995#discussion_r910164457


##########
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##########
@@ -3382,6 +3383,7 @@ private static void checkSortedSetDocValues(
         seenOrds.set(ord);
         ordCount++;
       }
+

Review Comment:
   At this point `ord` is going to be the last ord while it used to always be NO_MORE_ORDS, which I suspect may cause CheckIndex failures.



##########
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##########
@@ -304,25 +306,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   likewise here



##########
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##########
@@ -304,25 +306,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {
           // iterator should not have returned this docID if it has no ords:
           assert false;
           ord = (int) NO_MORE_ORDS;
-        } else {
-          ord = ords[(upto - 1) >>> 1];
+          return;
+        }
+
+        ords = ArrayUtil.grow(ords, docValueCount);
+        for (int i = 0; i < docValueCount; i++) {
+          ords[i] = (int) in.nextOrd();
         }
+        ord = ords[(docValueCount - 1) >>> 1];

Review Comment:
   we don't even need to buffer ords now that we know their number I think?
   We could compute the index of the ord we're interested in, then consume this number of ords, and the next ord would be the median ord?



##########
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##########
@@ -226,12 +226,14 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   docValueCount may never return 0, we can drop this branch



##########
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##########
@@ -394,25 +390,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {
           // iterator should not have returned this docID if it has no ords:
           assert false;
           ord = (int) NO_MORE_ORDS;
-        } else {
-          ord = ords[upto >>> 1];
+          return;
+        }
+
+        ords = ArrayUtil.grow(ords, docValueCount);
+        for (int i = 0; i < docValueCount; i++) {
+          ords[i] = (int) in.nextOrd();
         }
+        ord = ords[docValueCount >>> 1];

Review Comment:
   and likewise here?



##########
lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java:
##########
@@ -394,25 +390,19 @@ public int lookupTerm(BytesRef key) throws IOException {
 
     private void setOrd() throws IOException {
       if (docID() != NO_MORE_DOCS) {
-        int upto = 0;
-        while (true) {
-          long nextOrd = in.nextOrd();
-          if (nextOrd == NO_MORE_ORDS) {
-            break;
-          }
-          if (upto == ords.length) {
-            ords = ArrayUtil.grow(ords);
-          }
-          ords[upto++] = (int) nextOrd;
-        }
-
-        if (upto == 0) {
+        int docValueCount = in.docValueCount();
+        if (docValueCount == 0) {

Review Comment:
   we can drop this branch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org