Posted to issues@lucene.apache.org by "Tomas Eduardo Fernandez Lobbe (Jira)" <ji...@apache.org> on 2021/02/11 17:09:00 UTC
[jira] [Comment Edited] (SOLR-15114) WAND does not work correctly on multiple segments in Solr 8.6.3
[ https://issues.apache.org/jira/browse/SOLR-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282783#comment-17282783 ]
Tomas Eduardo Fernandez Lobbe edited comment on SOLR-15114 at 2/11/21, 5:08 PM:
--------------------------------------------------------------------------------
I've run a perf test on the change using Gatling:
* Using a Wikipedia snapshot (20M docs, each truncated to 1k characters)
* Using Mike McCandless's [query set|https://github.com/mikemccand/luceneutil/blob/master/tasks/wikimedium.10M.tasks]
* 10k queries per type (180k queries total)
* 2 shards, 1 replica each (on the same node).
* Each shard has ~30 segments
* Search on the article body
* 10 parallel users
* Single Solr instance (iMac Pro 3.2 GHz 8-Core Intel Xeon W with 128 GB RAM).
* Gatling running on the same machine
* The default example Solr parameters
* Always used {{rows=2}}; for the WAND runs I also added {{minExactCount=2}}
While there is some noise in the tests (I'd expect master WAND, master no-WAND and patch no-WAND to perform similarly), the WAND scenario with the patch applied is definitely faster:
||Stat (times in ms)||master WAND||master no-WAND||patch WAND||patch no-WAND||
|QPS|97.72|103.687|153.061|111.732|
|min|1|1|1|1|
|p50|39|39|23|36|
|p75|102|95|57|87|
|p95|387|350|245|322|
|p99|829|809|668|769|
|max|2405|2416|1331|2447|
|mean|95|89|59|82|
|std dev|155|147|110|139|
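The request shape used in the runs above can be sketched as follows. This is an illustration only: the host path, collection name ({{wikipedia}}), and field name ({{body}}) are assumptions, and the query string is left unencoded for readability. The point is that the two run types differ only in whether {{minExactCount}} is sent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WandQueryExample {
    // Builds the select-handler query string; minExactCount == null means
    // a baseline (non-WAND-limited) run.
    static String buildQuery(String q, int rows, Integer minExactCount) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("rows", String.valueOf(rows));
        if (minExactCount != null) {
            // Ask Solr for an exact hit count of at least this many docs;
            // past that point, WAND may skip non-competitive documents.
            params.put("minExactCount", String.valueOf(minExactCount));
        }
        StringBuilder sb = new StringBuilder("/solr/wikipedia/select?");
        params.forEach((k, v) -> sb.append(k).append('=').append(v).append('&'));
        sb.setLength(sb.length() - 1); // drop trailing '&'
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("body:lucene", 2, 2));    // WAND run
        System.out.println(buildQuery("body:lucene", 2, null)); // baseline run
    }
}
```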
> WAND does not work correctly on multiple segments in Solr 8.6.3
> ---------------------------------------------------------------
>
> Key: SOLR-15114
> URL: https://issues.apache.org/jira/browse/SOLR-15114
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 8.6.3, master (9.0)
> Reporter: Naoto Minami
> Assignee: Tomas Eduardo Fernandez Lobbe
> Priority: Blocker
> Fix For: master (9.0), 8.8.1
>
> Attachments: wand.pdf
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In Solr 8.6.3, minCompetitiveScore of WANDScorer resets to zero for each index segment and remains zero until maxScore is updated.
> There are two causes of this problem:
> - MaxScoreCollector does not set minCompetitiveScore on the MinCompetitiveScoreAwareScorable newly created for the next index segment.
> - MaxScoreCollector updates minCompetitiveScore only if maxScore is updated. This behavior is correct considering the purpose of MaxScoreCollector.
> For details, see the attached pdf.
> *Note*
> This problem occurs in distributed search (SolrCloud) or when the fl=score parameter is specified.
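The first cause listed in the description can be pictured with a self-contained sketch. These are stand-in classes, not Lucene's actual Scorable/LeafCollector APIs and not the committed patch: a collector that forgets the score floor when a new segment's scorer is installed leaves the new segment at zero, while one that re-applies the remembered floor carries it across segments.

```java
public class MinCompetitiveScoreDemo {
    // Minimal stand-in for a per-segment scorer; each segment starts at zero.
    static class Scorable {
        float minCompetitiveScore = 0f;
        void setMinCompetitiveScore(float s) { minCompetitiveScore = s; }
    }

    // Buggy behavior: the floor learned on earlier segments is lost
    // because setScorer never re-applies it to the new segment's scorer.
    static class ForgetfulCollector {
        Scorable scorer;
        void setScorer(Scorable s) { scorer = s; }
        void collect(float score) {
            if (score > scorer.minCompetitiveScore) scorer.setMinCompetitiveScore(score);
        }
    }

    // Fixed behavior: remember the floor and push it onto every new scorer.
    static class PropagatingCollector {
        Scorable scorer;
        float floor = 0f;
        void setScorer(Scorable s) {
            scorer = s;
            scorer.setMinCompetitiveScore(floor); // carry floor across segments
        }
        void collect(float score) {
            if (score > floor) { floor = score; scorer.setMinCompetitiveScore(floor); }
        }
    }

    public static void main(String[] args) {
        // Segment 1 sees a doc scoring 3.0, then we move to segment 2.
        ForgetfulCollector bad = new ForgetfulCollector();
        Scorable seg1 = new Scorable(); bad.setScorer(seg1); bad.collect(3.0f);
        Scorable seg2 = new Scorable(); bad.setScorer(seg2);
        System.out.println("buggy seg2 floor = " + seg2.minCompetitiveScore); // 0.0

        PropagatingCollector good = new PropagatingCollector();
        Scorable s1 = new Scorable(); good.setScorer(s1); good.collect(3.0f);
        Scorable s2 = new Scorable(); good.setScorer(s2);
        System.out.println("fixed seg2 floor = " + s2.minCompetitiveScore); // 3.0
    }
}
```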
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org