Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/06/17 14:11:26 UTC

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

jpountz commented on PR #964:
URL: https://github.com/apache/lucene/pull/964#issuecomment-1158911923

   Thanks for looking @romseygeek. To make sure this new API has more than one use case, I migrated `TopScoreDocCollector` and `TopFieldCollector` to it too. The immediate benefit is that collectors that pass a `totalHitsThreshold` of `Integer.MAX_VALUE` will still be able to skip non-competitive hits if the weight supports counting hits. I also fixed some tests that assumed `TotalHitCountCollector` would naively iterate over matches; they now use a new `DummyTotalHitCountCollector` instead.
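
To illustrate the idea of handing the weight to a collector, here is a minimal, self-contained sketch. It does not use Lucene's classes: `Segment`, `SimpleWeight`, `CountingCollector` and `countHits` are hypothetical stand-ins invented for this example, and the convention that `count` returns `-1` when no cheap count is available is an assumption of the sketch. The point is only that a collector that knows the weight can take a per-segment hit count directly and fall back to iterating over matches otherwise.

```java
import java.util.Arrays;
import java.util.List;

class WeightCountSketch {

  /** Hypothetical stand-in for one index segment and the docs matching the query in it. */
  static final class Segment {
    final int[] matchingDocs;
    final boolean cheapCountAvailable;

    Segment(int[] matchingDocs, boolean cheapCountAvailable) {
      this.matchingDocs = matchingDocs;
      this.cheapCountAvailable = cheapCountAvailable;
    }
  }

  /** Hypothetical stand-in for a weight: returns -1 when it cannot count cheaply (assumed convention). */
  interface SimpleWeight {
    int count(Segment segment);
  }

  /** Hypothetical hit-counting collector that is handed the weight before collection starts. */
  static final class CountingCollector {
    private SimpleWeight weight;
    int totalHits;
    int docsVisited;

    void setWeight(SimpleWeight weight) {
      this.weight = weight;
    }

    void collectSegment(Segment segment) {
      int cheapCount = (weight == null) ? -1 : weight.count(segment);
      if (cheapCount >= 0) {
        // The weight already knows the count: skip iterating this segment.
        totalHits += cheapCount;
        return;
      }
      // Fallback: visit every matching doc, as a naive collector would.
      for (int doc : segment.matchingDocs) {
        totalHits++;
        docsVisited++;
      }
    }
  }

  /** Runs the collector over all segments and returns {totalHits, docsVisited}. */
  static int[] countHits(List<Segment> segments, SimpleWeight weight) {
    CountingCollector collector = new CountingCollector();
    collector.setWeight(weight);
    for (Segment segment : segments) {
      collector.collectSegment(segment);
    }
    return new int[] {collector.totalHits, collector.docsVisited};
  }

  public static void main(String[] args) {
    List<Segment> segments = Arrays.asList(
        new Segment(new int[] {0, 3, 7}, true),
        new Segment(new int[] {2, 5}, false));
    SimpleWeight weight = s -> s.cheapCountAvailable ? s.matchingDocs.length : -1;
    int[] result = countHits(segments, weight);
    System.out.println("totalHits=" + result[0] + " docsVisited=" + result[1]);
  }
}
```

With the two segments in `main`, the first segment's count comes straight from the weight and only the second is iterated, so the sketch reports 5 hits after visiting 2 documents. Passing `null` as the weight gives the same total but visits all 5.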
   
   I verified that there is no performance impact on luceneutil using `wikimedium10m`:
   
   ```
                               Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff p-value
                           HighTerm     2374.78      (5.1%)     2297.55      (5.2%)   -3.3% ( -12% -    7%) 0.047
                            MedTerm     2795.30      (5.4%)     2704.66      (5.6%)   -3.2% ( -13% -    8%) 0.063
                       OrNotHighMed     1448.25      (3.9%)     1427.48      (4.5%)   -1.4% (  -9% -    7%) 0.286
                      OrNotHighHigh      996.35      (3.1%)      982.37      (4.6%)   -1.4% (  -8% -    6%) 0.255
                       OrHighNotMed     1898.69      (3.8%)     1876.02      (4.7%)   -1.2% (  -9% -    7%) 0.375
                         AndHighLow     1049.40      (3.3%)     1042.92      (3.8%)   -0.6% (  -7% -    6%) 0.583
                   HighSloppyPhrase       21.77      (4.0%)       21.66      (4.8%)   -0.5% (  -8% -    8%) 0.716
                            LowTerm     2640.20      (6.3%)     2629.11      (4.2%)   -0.4% ( -10% -   10%) 0.803
                       OrHighNotLow     1667.62      (4.2%)     1660.75      (5.6%)   -0.4% (  -9% -    9%) 0.794
                       OrNotHighLow     1663.32      (3.0%)     1658.41      (4.2%)   -0.3% (  -7% -    7%) 0.801
                    LowSloppyPhrase       54.27      (3.1%)       54.15      (3.6%)   -0.2% (  -6% -    6%) 0.834
                      OrHighNotHigh     1259.39      (3.7%)     1257.03      (4.7%)   -0.2% (  -8% -    8%) 0.889
                    MedSloppyPhrase      115.91      (4.3%)      115.79      (6.1%)   -0.1% ( -10% -   10%) 0.952
                           PKLookup      249.41      (1.2%)      249.32      (1.5%)   -0.0% (  -2% -    2%) 0.934
                             Fuzzy2      118.47      (1.1%)      118.75      (1.2%)    0.2% (  -2% -    2%) 0.538
                            Respell       74.59      (1.1%)       74.90      (1.5%)    0.4% (  -2% -    3%) 0.323
                             IntNRQ      682.36      (2.8%)      685.81      (3.7%)    0.5% (  -5% -    7%) 0.628
                             Fuzzy1      124.32      (1.1%)      125.09      (1.1%)    0.6% (  -1% -    2%) 0.079
                          MedPhrase      623.13      (3.3%)      627.26      (3.0%)    0.7% (  -5% -    7%) 0.502
                          OrHighMed      130.02      (3.7%)      130.94      (4.2%)    0.7% (  -6% -    8%) 0.571
                          LowPhrase      110.49      (3.6%)      111.30      (2.5%)    0.7% (  -5% -    7%) 0.459
                           Wildcard       40.65      (1.6%)       40.95      (1.8%)    0.7% (  -2% -    4%) 0.167
                          OrHighLow     1092.12      (3.0%)     1101.15      (2.7%)    0.8% (  -4% -    6%) 0.360
                         AndHighMed      234.73      (4.5%)      236.77      (5.3%)    0.9% (  -8% -   11%) 0.575
                        MedSpanNear       28.83      (4.1%)       29.14      (3.3%)    1.1% (  -6% -    8%) 0.369
                        LowSpanNear       16.20      (4.2%)       16.38      (3.4%)    1.1% (  -6% -    9%) 0.363
                       HighSpanNear        7.51      (4.7%)        7.59      (3.5%)    1.1% (  -6% -    9%) 0.405
                        AndHighHigh       70.69      (5.3%)       71.60      (6.4%)    1.3% (  -9% -   13%) 0.486
                         OrHighHigh       30.64      (3.2%)       31.07      (4.3%)    1.4% (  -5% -    9%) 0.244
                         HighPhrase       22.89      (3.8%)       23.25      (3.6%)    1.6% (  -5% -    9%) 0.178
                            Prefix3      421.34      (3.5%)      430.69      (4.4%)    2.2% (  -5% -   10%) 0.078
                LowIntervalsOrdered       67.14      (4.8%)       69.35      (5.5%)    3.3% (  -6% -   14%) 0.043
               HighIntervalsOrdered        6.49      (7.8%)        6.73      (7.1%)    3.7% ( -10% -   20%) 0.112
                MedIntervalsOrdered       37.02      (7.8%)       38.45      (7.3%)    3.9% ( -10% -   20%) 0.108
              HighTermDayOfYearSort      144.92      (3.7%)      150.78      (4.6%)    4.0% (  -4% -   12%) 0.002
                         TermDTSort      204.11      (7.0%)      213.24      (7.7%)    4.5% (  -9% -   20%) 0.055
                  HighTermMonthSort      154.26      (4.0%)      161.70      (4.9%)    4.8% (  -3% -   14%) 0.001
               HighTermTitleBDVSort      248.08      (3.7%)      262.32      (8.8%)    5.7% (  -6% -   18%) 0.007
   ```

