Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/07/06 13:50:17 UTC

[GitHub] [lucene] mayya-sharipova commented on a change in pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID

mayya-sharipova commented on a change in pull request #204:
URL: https://github.com/apache/lucene/pull/204#discussion_r664573881



##########
File path: lucene/core/src/java/org/apache/lucene/search/comparators/DocComparator.java
##########
@@ -81,7 +87,12 @@ public Integer value(int slot) {
     public DocLeafComparator(LeafReaderContext context) {
       this.docBase = context.docBase;
       if (enableSkipping) {
-        this.minDoc = topValue + 1;
+        // For a single sort on _doc, we want to skip all docs before topValue.
+        // For multiple fields sort on [_doc, other fields], we want to include docs with the same
+        // docID.
+        // This is needed in a distributed search, where there are docs from different indices with
+        // the same docID.
+        this.minDoc = singleSort ? topValue + 1 : topValue;

Review comment:
       Great comment! +1 for simplifying the code at the expense of an extra single case in `DocComparator`.  
   
   For `NumericComparator`, though, this is not the case: there could be a huge number of docs with the same value, so the extra optimization for `singleSort` is important.
   
   > I guess it doesn't specifically address the case where _doc is the last sort, for example a sort on ["some_field", "_doc"], where we could also use topValue + 1.
   
   No, the sort optimizations in `DocComparator` are not applicable where `_doc` is not the 1st sort. 
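
A minimal sketch of the `minDoc` choice in the diff above, assuming `topValue` is the docID of the hit being searched after. `MinDocSketch`, `computeMinDoc` and `visitedDocs` are hypothetical names used only for illustration; they are not part of Lucene. It only shows which docIDs remain eligible under each setting.

```java
import java.util.ArrayList;
import java.util.List;

class MinDocSketch {
    // Hypothetical helper mirroring the ternary from the diff; not a Lucene API.
    static int computeMinDoc(int topValue, boolean singleSort) {
        return singleSort ? topValue + 1 : topValue;
    }

    // Returns the docIDs in [0, maxDoc) that are not skipped for a given minDoc.
    static List<Integer> visitedDocs(int maxDoc, int minDoc) {
        List<Integer> visited = new ArrayList<>();
        for (int doc = Math.max(minDoc, 0); doc < maxDoc; doc++) {
            visited.add(doc);
        }
        return visited;
    }

    public static void main(String[] args) {
        int topValue = 3; // assumed: docID of the search-after hit
        // Single sort on _doc: everything up to and including topValue is skipped.
        System.out.println(visitedDocs(6, computeMinDoc(topValue, true)));  // [4, 5]
        // Sort on [_doc, other fields]: docID 3 is kept, since a doc from another
        // index may share that docID and still be competitive on the other fields.
        System.out.println(visitedDocs(6, computeMinDoc(topValue, false))); // [3, 4, 5]
    }
}
```

In the multi-field case the doc with the same docID is not skipped by this check; whether it is actually collected would still depend on the remaining sort fields.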




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


