You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/04/04 09:23:52 UTC

[GitHub] [lucene] wjp719 opened a new pull request, #780: LUCENE-10496: disable unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

wjp719 opened a new pull request, #780:
URL: https://github.com/apache/lucene/pull/780

   users ofter write doc with indexSorting in one direction(asc or desc) , but need to search top docs both in two direction (asc and desc)
   
   if index sort and search sort are in opposite direction, NumericLeafComparator needn't to check if can skip non-competitive doc inside one segments, because the rest docs are all competitive. 
   
   
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request title.
   - [ ] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r842415780


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   Thanks, I had missed this. In my opinion, this adds too much complexity for the problem that it is solving. I think we should either accept the skipping overhead even though skipping won't help much, or disable skipping entirely.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz merged pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz merged PR #780:
URL: https://github.com/apache/lucene/pull/780


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 closed pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 closed pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction
URL: https://github.com/apache/lucene/pull/780


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r842427168


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   Ok, we should keep skipping, it help more than the skipping overhead



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1119568905

   @jpountz Hi, I add an adaptive skip interval to avoid unnecessary frequent attempts to skip docs, can you help to review again.
   
   I also rerun the rally tests, following is the result:
   
   #### data set
   es rally httpLog
   #### main operations
   1. use ssd disk
   1. add index sort `desc` of field `@timestamp`
   2. set shard_number as 2
   3. force merge to 1 segment
   4. only search one index `logs-241998` which is 13GB and 181million docs in total
   #### query result
   |                                                        Metric |                                        Task |    Baseline |   Contender |     Diff |   Unit |
   |--------------------------------------------------------------:|--------------------------------------------:|------------:|------------:|---------:|-------:|
   |                                  50th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     6750.63 |     4849.66 | -1900.97 |     ms |
   |                                  90th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     6844.34 |      4921.1 | -1923.23 |     ms |
   |                                  99th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7011.71 |     5076.31 | -1935.41 |     ms |
   |                                 100th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7014.05 |     5112.71 | -1901.33 |     ms |
   
   we can see 28% decrease  (6844.34ms->4921.1ms) of query latency if we apply this pr.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r841995188


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   we need enableSkipping to be true to skip doc between segments, because docs in one segment can be skipped successfully compared to another segment's docs. This pr only wants to disable skip in the internal of one segment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r842438315


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   I will close this pr. thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r842434529


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   OK, then my suggestion would be to close this PR if that works for you too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1112123414

   @jpountz Ok, tanks. As users ofter need to search on both directions but index data can only be index sorted by one directions. This situation may be common for users.
   
   > If you can find an approach that is less intrusive 
   
   can you give me any hits to achieve this? Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r868900109


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -42,6 +42,8 @@
  * but in this case you must override both of these methods.
  */
 public abstract class NumericComparator<T extends Number> extends FieldComparator<T> {
+  private final int minSkipInterval = 32;
+  private final int maxSkipInterval = 8192;

Review Comment:
   Can you make these constants (`private static final int MIN_SKIP_INTERVAL = 32`) and leave a comment that they need to be powers of 2?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r841911556


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -94,6 +95,7 @@ public void disableSkipping() {
     private long iteratorCost;
     private int maxDocVisited = -1;
     private int updateCounter = 0;
+    private boolean disableSegmentInternalSkip = false;

Review Comment:
   could we reuse `enableSkipping` for this purpose?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 closed pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 closed pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction
URL: https://github.com/apache/lucene/pull/780


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1112113792

   If you can find an approach that is less intrusive, maybe I would reconsider, but it still looks to me like we're adding complexity to work around a bad user decisions, which doesn't feel like the right trade-off to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r869932548


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -269,11 +276,23 @@ public PointValues.Relation compare(byte[] minPackedValue, byte[] maxPackedValue
       if (estimatedNumberOfMatches >= threshold) {
         // the new range is not selective enough to be worth materializing, it doesn't reduce number
         // of docs at least 8x
+        if (updateCounter > 256) {
+          if (tryUpdateFailCount >= 3) {
+            currentSkipInterval = Math.min(currentSkipInterval * 2, MAX_SKIP_INTERVAL);
+            tryUpdateFailCount = 0;
+          } else {
+            tryUpdateFailCount++;
+          }
+        }

Review Comment:
   this logic is not entirely trivial, so inlining it here makes the code a bit hard to read, can you maybe extract it into an `updateSkipInterval(boolean success)` helper method that would encapsulate it, plus the other code addition you made below, or something along these lines?



##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -269,11 +276,23 @@ public PointValues.Relation compare(byte[] minPackedValue, byte[] maxPackedValue
       if (estimatedNumberOfMatches >= threshold) {
         // the new range is not selective enough to be worth materializing, it doesn't reduce number
         // of docs at least 8x
+        if (updateCounter > 256) {
+          if (tryUpdateFailCount >= 3) {
+            currentSkipInterval = Math.min(currentSkipInterval * 2, MAX_SKIP_INTERVAL);
+            tryUpdateFailCount = 0;
+          } else {
+            tryUpdateFailCount++;
+          }
+        }

Review Comment:
   Maybe also add a comment about `tryUpdateFailCount` that explains that it helps be conservative about increasing the sampling interval?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1123104829

   @jpountz Hi, I modify the logic.  The skip interval is only changed when `updateCounter` is bigger than 256, and the speed of double skip interval is 3 times slower than the speed of divide skip interval. 
   
   Based on the new logic, following is the performance tests result of no index sort in multiple segments and one segment per shard:
   
   #### data set
   es rally httpLog
   #### main operations
   1. use ssd disk
   2. no index sort  
   3. set shard_number as 2 
   4. only search one index `logs-241998` which is 13GB and 181million docs in total
   #### query result
   1. case `asc_sort_timestamp` 6.28% decrease  (121.195ms->113.581ms) of query latency
   1. case ` asc-sort-timestamp-after-force-merge-1-seg` 9.36% decrease  (485.229ms->439.778ms) of query latency
   2. case `asc-sort-with-after-timestamp-after-force-merge-1-seg` 9.31% decrease  (441.676ms->400.519ms) of query latency
   3. other cases also have performance improvement
   
   |                                                        Metric |                                        Task |    Baseline |   Contender |     Diff |   Unit |
   |--------------------------------------------------------------:|--------------------------------------------:|------------:|------------:|---------:|-------:|
   |                                  50th percentile service time |            desc_sort_timestamp |     10.0582 |     9.32669 | -0.73154 |     ms |
   |                                  90th percentile service time |            desc_sort_timestamp |     10.7014 |     9.61116 | -1.09025 |     ms |
   |                                  99th percentile service time |            desc_sort_timestamp |     13.8007 |     9.93615 | -3.86456 |     ms |
   |                                 100th percentile service time |            desc_sort_timestamp |     30.3847 |     10.2724 | -20.1123 |     ms |
   |                                  50th percentile service time |             asc_sort_timestamp |     118.245 |     111.074 | -7.17049 |     ms |
   |                                  90th percentile service time |             asc_sort_timestamp |     121.195 |     113.581 | -7.61345 |     ms |
   |                                  99th percentile service time |             asc_sort_timestamp |     126.698 |      114.86 |  -11.838 |     ms |
   |                                 100th percentile service time |             asc_sort_timestamp |     131.592 |     126.641 | -4.95039 |     ms |
   |                                  50th percentile service time | desc_sort_with_after_timestamp |     71.1903 |     67.4284 | -3.76184 |     ms |
   |                                  90th percentile service time | desc_sort_with_after_timestamp |     72.1165 |     68.5731 | -3.54346 |     ms |
   |                                  99th percentile service time | desc_sort_with_after_timestamp |     81.3781 |     70.8015 | -10.5765 |     ms |
   |                                 100th percentile service time | desc_sort_with_after_timestamp |     82.0475 |     94.6865 |  12.6391 |     ms |
   |                                  50th percentile service time |  asc_sort_with_after_timestamp |     92.1529 |     94.0611 |  1.90815 |     ms |
   |                                  90th percentile service time |  asc_sort_with_after_timestamp |     96.2921 |     95.5402 | -0.75191 |     ms |
   |                                  99th percentile service time |  asc_sort_with_after_timestamp |     118.337 |     114.149 | -4.18805 |     ms |
   |                                 100th percentile service time |  asc_sort_with_after_timestamp |     119.652 |     186.802 |  67.1499 |     ms |
   |                                  50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |      131.23 |     125.879 |  -5.3516 |     ms |
   |                                  90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     133.185 |     130.755 | -2.43072 |     ms |
   |                                  99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |      154.65 |      140.28 |   -14.37 |     ms |
   |                                 100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     158.243 |     151.763 | -6.47964 |     ms |
   |                                  50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     472.373 |     429.798 | -42.5744 |     ms |
   |                                  90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     485.229 |     439.778 | -45.4507 |     ms |
   |                                  99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     504.165 |     455.842 | -48.3231 |     ms |
   |                                 100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     532.754 |     457.975 | -74.7793 |     ms |
   |                                  50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     84.5181 |     80.2796 |  -4.2385 |     ms |
   |                                  90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     85.9541 |     82.4319 | -3.52219 |     ms |
   |                                  99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     116.454 |      96.489 | -19.9651 |     ms |
   |                                 100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     120.934 |     97.8837 | -23.0508 |     ms |
   |                                  50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     428.985 |     392.431 | -36.5541 |     ms |
   |                                  90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     441.676 |     400.519 | -41.1567 |     ms |
   |                                  99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     449.748 |     411.862 | -37.8854 |     ms |
   |                                 100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     450.671 |     424.775 | -25.8964 |     ms |
    
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on a diff in pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on code in PR #780:
URL: https://github.com/apache/lucene/pull/780#discussion_r869953128


##########
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##########
@@ -269,11 +276,23 @@ public PointValues.Relation compare(byte[] minPackedValue, byte[] maxPackedValue
       if (estimatedNumberOfMatches >= threshold) {
         // the new range is not selective enough to be worth materializing, it doesn't reduce number
         // of docs at least 8x
+        if (updateCounter > 256) {
+          if (tryUpdateFailCount >= 3) {
+            currentSkipInterval = Math.min(currentSkipInterval * 2, MAX_SKIP_INTERVAL);
+            tryUpdateFailCount = 0;
+          } else {
+            tryUpdateFailCount++;
+          }
+        }

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1123279942

   > I'm curious about `tryUpdateFailCount`, did you get better results on the benchmark with it than without it?
   
   @jpountz yes, with `tryUpdateFailCount`, the case `asc_sort_with_after_timestamp` perform better than without it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
jpountz commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1112127776

   Why do you sort the index if you need to sort in both directions? Is it for range queries? Could you use points instead?
   
   Sorry I don't have good ideas for making it less intrusive.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] wjp719 commented on pull request #780: LUCENE-10496: avoid unnecessary attempts to evaluate skipping doc if index sort and search sort are in opposite direction

Posted by GitBox <gi...@apache.org>.
wjp719 commented on PR #780:
URL: https://github.com/apache/lucene/pull/780#issuecomment-1111151403

   I use es rally `httpLog` dataSet to compare the performance of this pr
   
   #### data set
   es rally httpLog
   #### main operations
   1. use ssd disk
   1. add index sort `desc` of field `@timestamp`
   2. set shard_number as 2
   3. force merge to 1 segment
   4. only search one index `logs-241998` which is 13GB and 181million docs in total
   #### query result
   |                                                        Metric |                                        Task |    Baseline |   Contender |     Diff |   Unit |
   |--------------------------------------------------------------:|--------------------------------------------:|------------:|------------:|---------:|-------:|
   |                                  50th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7148.99 |     5283.69 | -1865.29 |     ms |
   |                                  90th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7240.72 |     5530.43 | -1710.29 |     ms |
   |                                  99th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7360.77 |     5695.64 | -1665.13 |     ms |
   |                                 100th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |     7432.98 |     5743.79 | -1689.19 |     ms |
   
   @jpountz we can see 25% decrease  (7240.72ms->5530ms) of query latency if we apply this pr. As Lucene `NumericComparator` will run `PointValues#estimatePointCount`  every 32 doc, this is really cpu consuming. So maybe it's worth to apply this pr to reduce cpu?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org