You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/12/13 13:44:48 UTC

[GitHub] [solr] magibney commented on a diff in pull request #1236: SOLR-16585: Fix NPE in MatchAllDocs pagination

magibney commented on code in PR #1236:
URL: https://github.com/apache/solr/pull/1236#discussion_r1047173635


##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -2203,15 +2203,19 @@ public DocListAndSet getDocListAndSet(Query query, Sort lsort, int offset, int l
 
   private DocList constantScoreDocList(int offset, int length, DocSet docs) {
     final int size = docs.size();
-    if (length == 0 || size <= offset) {
-      return new DocSlice(0, 0, new int[0], null, size, 0f, TotalHits.Relation.EQUAL_TO);
-    }
-    final DocIterator iter = docs.iterator();
-    for (int i = offset; i > 0; i--) {
-      iter.nextDoc(); // discard
-    }
-    final int returnSize = Math.min(length, size - offset);
+
+    // NOTE: it would be possible to special-case `length == 0 || size <= offset` here
+    // (returning a DocList backed by an empty array) -- but the cases that would practically
+    // benefit from doing so would be extremely unusual, and likely pathological:
+    //   1. length==0 in conjunction with offset>0 (why?)
+    //   2. specifying offset>size (paging beyond end of results)
+    // This would require special consideration in dealing with cache handling (and generation
+    // of the final DocList via `DocSlice.subset(int, int)`), and it's just not worth it.
+
+    final int returnSize = Math.min(offset + length, size);

Review Comment:
   That's an indirect consequence of the essence of the bug. In SOLR-14765 I incorrectly assumed it was ok to trim the result list at the point of initial DocSlice creation -- i.e., if you have offset=200, len=10, that it should be possible to return a DocSlice backed by a 10-element int[]. But in fact there's an assumption, in all cases, that index 0 of the backing docs int[] is the top doc (and consequently we always need to populate the array out as far as docs[offset+length].
   
   So at this stage, DocSlice is initialized with an `offset=0` ctor arg, where offset reflects both the offset into the array _and_ the offset of the first element of the array . The _actual_ returned docList is created subsequently by calling initialDocList.subset(actualRequestedOffset, actualRequestedLength). In fact, in the codebase the initial DocSlice is almost always created with offset=0 -- I toyed with the idea of actually enforcing this, making a public ctor that always sets offset=0 and length=docs.length, and only setting explicit offset/len via a private ctor (invoked by DocSlice.subset(int, int)). But it became more complex than I'd hoped, so I figured to keep this change as small as possible.
   
   When initial DocSlices are created in this way, it simplifies the logic about how they can be cached and repurposed to serve other ranges of requested docs. The special-casing of length=0 I think would also have messed with this, hence the removal of the special-casing. Worst-case consequence of removing the special-casing is that if somebody requests offset>0, length=0, we now have to build and populate an array of `offset` elements, or if someone requests offset>docSet.size(), length>0, we now have to build and populate an array of `docSet.size()` elements. Neither of these would actually be used to satisfy the immediate request, but both cases are weird/pathological edge cases so probably not worth special-casing, given the hoops we'd have to jump through, e.g. to avoid caching them, etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org