Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/06/17 14:11:26 UTC
[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.
jpountz commented on PR #964:
URL: https://github.com/apache/lucene/pull/964#issuecomment-1158911923
Thanks for looking @romseygeek. To make sure this new API has more than one use-case, I also migrated `TopScoreDocCollector` and `TopFieldCollector` to it. The immediate benefit is that collectors that pass a `totalHitsThreshold` of `Integer.MAX_VALUE` can still skip non-competitive hits if the weight supports counting hits. I also fixed some tests that assumed `TotalHitCountCollector` would naively iterate over all matches; they now use a new `DummyTotalHitCountCollector` instead.
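To illustrate why having the weight available helps, here is a minimal, self-contained sketch of the counting shortcut. It does not use Lucene's classes: `SegmentWeight`, `CountingCollector`, `countable` and `uncountable` are hypothetical stand-ins, and the convention that `count()` returns `-1` when no cheap count is available is an assumption of this sketch rather than a description of the PR's code.

```java
// Simplified model only: SegmentWeight and CountingCollector are hypothetical
// stand-ins for illustration, not Lucene classes.
class WeightCountSketch {

    /** Per-segment view of a query (hypothetical). */
    interface SegmentWeight {
        /** Number of matches if it is cheap to compute, or -1 if unknown (assumed convention). */
        int count();

        /** Matching doc IDs, needed only when hits must be visited one by one. */
        int[] matches();
    }

    /** Counts hits and records how many documents it had to visit. */
    static final class CountingCollector {
        int totalHits;
        int visited;

        void collectSegment(SegmentWeight weight) {
            int count = weight.count();
            if (count != -1) {
                // Shortcut: the weight already knows the count, so skip iteration.
                totalHits += count;
                return;
            }
            // Fallback: visit every matching document.
            int[] docs = weight.matches();
            for (int i = 0; i < docs.length; i++) {
                visited++;
                totalHits++;
            }
        }
    }

    /** Builds a segment whose match count is known up front. */
    static SegmentWeight countable(int... docs) {
        return new SegmentWeight() {
            public int count() { return docs.length; }
            public int[] matches() { return docs; }
        };
    }

    /** Builds a segment that cannot report its match count cheaply. */
    static SegmentWeight uncountable(int... docs) {
        return new SegmentWeight() {
            public int count() { return -1; }
            public int[] matches() { return docs; }
        };
    }

    public static void main(String[] args) {
        CountingCollector collector = new CountingCollector();
        collector.collectSegment(countable(0, 3, 7));   // counted without visiting
        collector.collectSegment(uncountable(1, 2));    // visited one by one
        // prints "totalHits=5 visited=2"
        System.out.println("totalHits=" + collector.totalHits + " visited=" + collector.visited);
    }
}
```

In this model the collector only falls back to iterating when the weight cannot supply a count, which is the behavior the tests mentioned above could no longer rely on being absent.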
I verified that there is no performance impact on luceneutil using `wikimedium10m`:
```
Task  QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff  p-value
HighTerm 2374.78 (5.1%) 2297.55 (5.2%) -3.3% ( -12% - 7%) 0.047
MedTerm 2795.30 (5.4%) 2704.66 (5.6%) -3.2% ( -13% - 8%) 0.063
OrNotHighMed 1448.25 (3.9%) 1427.48 (4.5%) -1.4% ( -9% - 7%) 0.286
OrNotHighHigh 996.35 (3.1%) 982.37 (4.6%) -1.4% ( -8% - 6%) 0.255
OrHighNotMed 1898.69 (3.8%) 1876.02 (4.7%) -1.2% ( -9% - 7%) 0.375
AndHighLow 1049.40 (3.3%) 1042.92 (3.8%) -0.6% ( -7% - 6%) 0.583
HighSloppyPhrase 21.77 (4.0%) 21.66 (4.8%) -0.5% ( -8% - 8%) 0.716
LowTerm 2640.20 (6.3%) 2629.11 (4.2%) -0.4% ( -10% - 10%) 0.803
OrHighNotLow 1667.62 (4.2%) 1660.75 (5.6%) -0.4% ( -9% - 9%) 0.794
OrNotHighLow 1663.32 (3.0%) 1658.41 (4.2%) -0.3% ( -7% - 7%) 0.801
LowSloppyPhrase 54.27 (3.1%) 54.15 (3.6%) -0.2% ( -6% - 6%) 0.834
OrHighNotHigh 1259.39 (3.7%) 1257.03 (4.7%) -0.2% ( -8% - 8%) 0.889
MedSloppyPhrase 115.91 (4.3%) 115.79 (6.1%) -0.1% ( -10% - 10%) 0.952
PKLookup 249.41 (1.2%) 249.32 (1.5%) -0.0% ( -2% - 2%) 0.934
Fuzzy2 118.47 (1.1%) 118.75 (1.2%) 0.2% ( -2% - 2%) 0.538
Respell 74.59 (1.1%) 74.90 (1.5%) 0.4% ( -2% - 3%) 0.323
IntNRQ 682.36 (2.8%) 685.81 (3.7%) 0.5% ( -5% - 7%) 0.628
Fuzzy1 124.32 (1.1%) 125.09 (1.1%) 0.6% ( -1% - 2%) 0.079
MedPhrase 623.13 (3.3%) 627.26 (3.0%) 0.7% ( -5% - 7%) 0.502
OrHighMed 130.02 (3.7%) 130.94 (4.2%) 0.7% ( -6% - 8%) 0.571
LowPhrase 110.49 (3.6%) 111.30 (2.5%) 0.7% ( -5% - 7%) 0.459
Wildcard 40.65 (1.6%) 40.95 (1.8%) 0.7% ( -2% - 4%) 0.167
OrHighLow 1092.12 (3.0%) 1101.15 (2.7%) 0.8% ( -4% - 6%) 0.360
AndHighMed 234.73 (4.5%) 236.77 (5.3%) 0.9% ( -8% - 11%) 0.575
MedSpanNear 28.83 (4.1%) 29.14 (3.3%) 1.1% ( -6% - 8%) 0.369
LowSpanNear 16.20 (4.2%) 16.38 (3.4%) 1.1% ( -6% - 9%) 0.363
HighSpanNear 7.51 (4.7%) 7.59 (3.5%) 1.1% ( -6% - 9%) 0.405
AndHighHigh 70.69 (5.3%) 71.60 (6.4%) 1.3% ( -9% - 13%) 0.486
OrHighHigh 30.64 (3.2%) 31.07 (4.3%) 1.4% ( -5% - 9%) 0.244
HighPhrase 22.89 (3.8%) 23.25 (3.6%) 1.6% ( -5% - 9%) 0.178
Prefix3 421.34 (3.5%) 430.69 (4.4%) 2.2% ( -5% - 10%) 0.078
LowIntervalsOrdered 67.14 (4.8%) 69.35 (5.5%) 3.3% ( -6% - 14%) 0.043
HighIntervalsOrdered 6.49 (7.8%) 6.73 (7.1%) 3.7% ( -10% - 20%) 0.112
MedIntervalsOrdered 37.02 (7.8%) 38.45 (7.3%) 3.9% ( -10% - 20%) 0.108
HighTermDayOfYearSort 144.92 (3.7%) 150.78 (4.6%) 4.0% ( -4% - 12%) 0.002
TermDTSort 204.11 (7.0%) 213.24 (7.7%) 4.5% ( -9% - 20%) 0.055
HighTermMonthSort 154.26 (4.0%) 161.70 (4.9%) 4.8% ( -3% - 14%) 0.001
HighTermTitleBDVSort 248.08 (3.7%) 262.32 (8.8%) 5.7% ( -6% - 18%) 0.007
```