Posted to issues@lucene.apache.org by "Tomas Eduardo Fernandez Lobbe (Jira)" <ji...@apache.org> on 2021/02/11 17:09:00 UTC

[jira] [Comment Edited] (SOLR-15114) WAND does not work correctly on multiple segments in Solr 8.6.3

    [ https://issues.apache.org/jira/browse/SOLR-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282783#comment-17282783 ] 

Tomas Eduardo Fernandez Lobbe edited comment on SOLR-15114 at 2/11/21, 5:08 PM:
--------------------------------------------------------------------------------

I've run a perf test on the change using Gatling:
 * Using a Wikipedia snapshot (20M docs, each shortened to 1k characters)
 * Using Mike McCandless's [query set|https://github.com/mikemccand/luceneutil/blob/master/tasks/wikimedium.10M.tasks]
 * 10k queries per type (180k queries total)
 * 2 shards, 1 replica each (on the same node).
 * Each shard has ~30 segments
 * search on the article body
 * 10 parallel users
 * Single Solr instance (iMac Pro 3.2 GHz 8-Core Intel Xeon W with 128 GB RAM).
 * Gatling running on the same machine
 * The default example Solr parameters
 * Always used {{rows=2}}, in the case of WAND I also added {{minExactCount=2}}
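
For reference, the two request shapes can be sketched as below. This is only an illustration in Python (the test itself was driven by Gatling); the base URL, the collection name "wikipedia" and the field name "body" are assumptions, not the actual benchmark configuration. Only rows=2 and minExactCount=2 come from the setup above.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, use_wand):
    """Build a Solr /select URL for one benchmark query.

    rows=2 is always sent; minExactCount=2 is added only for the WAND runs.
    base_url and collection are hypothetical placeholders.
    """
    params = {"q": query, "rows": 2}
    if use_wand:
        params["minExactCount"] = 2
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical endpoint, collection and field names, for illustration only.
wand_url = build_select_url("http://localhost:8983/solr", "wikipedia", "body:apache", True)
no_wand_url = build_select_url("http://localhost:8983/solr", "wikipedia", "body:apache", False)
```

The only difference between the WAND and no-WAND scenarios in this sketch is the presence of the minExactCount parameter.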

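While there is some noise in the tests (I'd expect master WAND, master no-WAND and patch no-WAND to perform similarly), the WAND scenario with the patch applied is clearly faster: QPS goes from 97.72 on master to 153.061 (roughly 57% higher), and every latency figure from p50 up to max is lower: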
||Stat||master WAND||master no-WAND||patch WAND||patch no-WAND||
|QPS|97.72|103.687|153.061|111.732|
|min|1|1|1|1|
|p50|39|39|23|36|
|p75|102|95|57|87|
|p95|387|350|245|322|
|p99|829|809|668|769|
|max|2405|2416|1331|2447|
|mean|95|89|59|82|
|std dev|155|147|110|139|
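
To put the noise in perspective, the relative changes can be computed directly from the table. A small sketch (the dictionary keys and the helper name pct_change are just for illustration; the numbers are copied from the table above):

```python
# QPS and p50 values copied from the results table.
results = {
    "master WAND": {"qps": 97.72, "p50": 39},
    "master no-WAND": {"qps": 103.687, "p50": 39},
    "patch WAND": {"qps": 153.061, "p50": 23},
    "patch no-WAND": {"qps": 111.732, "p50": 36},
}


def pct_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


wand_qps_gain = pct_change(results["patch WAND"]["qps"], results["master WAND"]["qps"])
no_wand_qps_gain = pct_change(results["patch no-WAND"]["qps"], results["master no-WAND"]["qps"])
wand_p50_change = pct_change(results["patch WAND"]["p50"], results["master WAND"]["p50"])

print(f"WAND QPS change:    {wand_qps_gain:.1f}%")
print(f"no-WAND QPS change: {no_wand_qps_gain:.1f}%")
print(f"WAND p50 change:    {wand_p50_change:.1f}%")
```

The no-WAND runs differ by about 7.8% in QPS between master and patch, which gives a rough idea of the run-to-run noise. The WAND runs differ by about 56.6%, well beyond that.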



> WAND does not work correctly on multiple segments in Solr 8.6.3
> ---------------------------------------------------------------
>
>                 Key: SOLR-15114
>                 URL: https://issues.apache.org/jira/browse/SOLR-15114
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.6.3, master (9.0)
>            Reporter: Naoto Minami
>            Assignee: Tomas Eduardo Fernandez Lobbe
>            Priority: Blocker
>             Fix For: master (9.0), 8.8.1
>
>         Attachments: wand.pdf
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In Solr 8.6.3, minCompetitiveScore of WANDScorer resets to zero for each index segment and remains zero until maxScore is updated.
> There are two causes of this problem:
>  - MaxScoreCollector does not set minCompetitiveScore of MinCompetitiveScoreAwareScorable newly generated for another index segment.
>  - MaxScoreCollector updates minCompetitiveScore only if maxScore is updated. This behavior is correct considering the purpose of MaxScoreCollector.
> For details, see the attached pdf.
> *Note*
> This problem occurs in distributed search (SolrCloud) or when the fl=score parameter is specified.
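
The behavior described above can be illustrated with a toy model. This is not Solr or Lucene code: ToyScorer, ToyMaxScoreCollector and the propagate_on_new_segment flag are hypothetical names, and the skipping logic is heavily simplified. It only assumes what the description states: each segment gets a fresh scorer whose threshold starts at zero, and the collector raises the threshold only when maxScore increases. Passing the current threshold to each new scorer is one possible way to avoid the reset in this model; whether the actual patch works exactly this way is not stated here.

```python
class ToyScorer:
    """Stand-in for a per-segment scorer that can skip non-competitive docs."""

    def __init__(self, scores):
        self.scores = scores
        self.min_competitive_score = 0.0  # every new segment starts at zero

    def set_min_competitive_score(self, value):
        self.min_competitive_score = value


class ToyMaxScoreCollector:
    """Tracks the maximum score and counts how many docs had to be scored."""

    def __init__(self, propagate_on_new_segment):
        self.propagate_on_new_segment = propagate_on_new_segment
        self.max_score = None
        self.scored_docs = 0

    def collect_segment(self, scorer):
        # Optional behavior: hand the current threshold to the new scorer.
        if self.propagate_on_new_segment and self.max_score is not None:
            scorer.set_min_competitive_score(self.max_score)
        for score in scorer.scores:
            if score < scorer.min_competitive_score:
                continue  # below the threshold, so the doc is skipped
            self.scored_docs += 1
            # The threshold is raised only when maxScore is updated.
            if self.max_score is None or score > self.max_score:
                self.max_score = score
                scorer.set_min_competitive_score(score)


def run(segments, propagate_on_new_segment):
    """Collect all segments and return (max_score, scored_docs)."""
    collector = ToyMaxScoreCollector(propagate_on_new_segment)
    for scores in segments:
        collector.collect_segment(ToyScorer(scores))
    return collector.max_score, collector.scored_docs


# Made-up per-segment scores, for illustration only.
segments = [[1.0, 5.0, 2.0], [3.0, 4.0, 1.0], [2.0, 6.0, 1.0]]
without_propagation = run(segments, propagate_on_new_segment=False)
with_propagation = run(segments, propagate_on_new_segment=True)
```

With these made-up scores, both runs find the same maximum (6.0), but the run without propagation scores 7 documents while the run with propagation scores 3. In the second segment no document beats the current maximum, so without propagation its threshold stays at zero and nothing is skipped.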


