Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2019/10/02 11:31:39 UTC
[GitHub] [lucene-solr] jpountz commented on issue #904: LUCENE-8992: Share
minimum score across segment in concurrent search
URL: https://github.com/apache/lucene-solr/pull/904#issuecomment-537451121
> I also had to deactivate some checks that ensure that multiple runs of the same query return the same number of total hits.
FYI luceneutil has an option for this: `verifyCounts=False`, e.g. I have this in my benchmark configuration when I test changes that don't provide reproducible counts:
```python
comp = competition.Competition(verifyScores=True, verifyCounts=False)
```
One thought I had when reviewing the patch: currently we check the minimum competitive score of other collectors when the minimum competitive score of the current collector needs updating. Depending on the query, this might happen very frequently or very rarely. Another approach, which would give us more control over how often collectors exchange this information, would be to introduce checkpoints via bulk scorers. IndexSearcher could do something like this:
```java
BulkScorer scorer = weight.bulkScorer(ctx);
if (scorer != null) {
  try {
    Bits liveDocs = ctx.reader().getLiveDocs();
    if (multipleThreads && sortByScore) {
      // Split the doc ID space to introduce checkpoints where collectors
      // exchange information about their minimum competitive scores
      int maxDoc = ctx.reader().maxDoc();
      int windowSize = 1 << 20; // arbitrary
      for (int doc = 0; doc < maxDoc; ) {
        int start = doc;
        // compute the window end as a long to avoid int overflow
        int end = (int) Math.min(maxDoc, (long) doc + windowSize);
        float maxOfMinCompetitiveScores = ...; // can we get it here?
        // this method would have to be added, and implemented by propagating to the underlying scorer
        scorer.setMinCompetitiveScore(maxOfMinCompetitiveScores);
        doc = scorer.score(leafCollector, liveDocs, start, end);
      }
    } else {
      scorer.score(leafCollector, liveDocs);
    }
  } catch (CollectionTerminatedException e) {
    // collection was terminated prematurely
    // continue with the following leaf
  }
}
```
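To make the "can we get it here?" question a bit more concrete, here is a minimal sketch of one way the per-thread collectors could publish their minimum competitive scores so that a checkpoint can read the maximum. `SharedMinScore` and its methods are hypothetical names, not an existing Lucene API and not what the patch does, and the sketch assumes scores are non-negative and never NaN.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper, not part of Lucene: tracks the maximum of the minimum
 * competitive scores published by collectors running on different threads.
 */
final class SharedMinScore {
  // Assumption: scores are non-negative and not NaN. For such floats, the
  // ordering of the raw IEEE 754 bit patterns (as ints) matches the ordering
  // of the float values, so an int max is also a float max.
  private final AtomicInteger bits = new AtomicInteger(Float.floatToIntBits(0f));

  /** Called by a collector when its own minimum competitive score increases. */
  void accumulate(float minCompetitiveScore) {
    if (!(minCompetitiveScore >= 0f)) { // rejects negative values and NaN
      throw new IllegalArgumentException("score must be >= 0, got " + minCompetitiveScore);
    }
    bits.accumulateAndGet(Float.floatToIntBits(minCompetitiveScore), Math::max);
  }

  /** Called at a checkpoint to read the best bound published so far. */
  float get() {
    return Float.intBitsToFloat(bits.get());
  }
}
```

With something like this, each checkpoint in the loop above could read `get()` before scoring the next window, while collectors call `accumulate` whenever their local bound improves. Whether that is cheap enough compared to the current approach would need benchmarking.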