Posted to dev@lucene.apache.org by "Varun Thacker (JIRA)" <ji...@apache.org> on 2018/08/07 22:55:00 UTC
[jira] [Updated] (SOLR-12635) HashQParserPlugin should be run as a
post filter when executed from a ParallelStream
[ https://issues.apache.org/jira/browse/SOLR-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Varun Thacker updated SOLR-12635:
---------------------------------
Attachment: SOLR-12635.patch
> HashQParserPlugin should be run as a post filter when executed from a ParallelStream
> ------------------------------------------------------------------------------------
>
> Key: SOLR-12635
> URL: https://issues.apache.org/jira/browse/SOLR-12635
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Varun Thacker
> Assignee: Varun Thacker
> Priority: Major
> Attachments: SOLR-12635.patch
>
>
> I was doing some performance benchmarking for a user on slow streaming queries.
> The odd thing was that the same streaming expression was fast when we fired it a second time.
> We were able to isolate the slowness to the hash query parser.
> Here are the first and second executions of the query. To simplify things, this is for one shard and the same worker:
> {code:java}
> path=/export params={q=*:*&distrib=false&indent=off&fl=fields&fq=user:1&fq={!hash workers=6 worker=3}&partitionKeys=partitionKey&sort=partitionKey asc&wt=javabin&version=2.2} hits=0 status=0 QTime=6821
> path=/export params={q=*:*&distrib=false&indent=off&fl=fields&fq=user:1&fq={!hash workers=6 worker=3}&partitionKeys=partitionKey&sort=partitionKey asc&wt=javabin&version=2.2} hits=0 status=0 QTime=0{code}
> Even with hits=0 the first query took 6.8 seconds. The shard has 17 million documents.
> The second query uses the queryResultCache, which is why it is lightning fast the second time around.
> When we execute the same query and add a cost, i.e. {{&fq={!hash workers=6 worker=3 cost=101}}}, the query gets executed as a post filter and is very fast even when uncached.
> I created this Jira so that we can always set cost > 100 from the parallel stream.
> However, I am happy to change the default behaviour of HashQParserPlugin and make it always run as a post filter unless explicitly specified otherwise. CollapsingQParserPlugin currently does this to make sure it runs as a post filter by default:
> {code:java}
> public int getCost() {
>   return Math.max(super.getCost(), 100);
> }{code}
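To make the effect of the quoted getCost() override concrete, here is a minimal, self-contained sketch. BaseFilter and FlooredFilter are toy stand-ins invented for this illustration and are not Solr classes; only the Math.max expression is taken from the snippet quoted above. It shows that an unset (zero) cost is raised to 100, while an explicitly higher cost is left alone.

```java
public class CostFloorSketch {

    // Toy stand-in for a query object carrying a user-supplied cost.
    // Assumption: this is NOT Solr's class hierarchy, only an illustration.
    static class BaseFilter {
        private final int cost;

        BaseFilter(int cost) {
            this.cost = cost;
        }

        public int getCost() {
            return cost;
        }
    }

    // Applies the same floor as the quoted snippet: never report less than 100.
    static class FlooredFilter extends BaseFilter {
        FlooredFilter(int cost) {
            super(cost);
        }

        @Override
        public int getCost() {
            return Math.max(super.getCost(), 100);
        }
    }

    public static void main(String[] args) {
        System.out.println(new FlooredFilter(0).getCost());   // prints 100
        System.out.println(new FlooredFilter(101).getCost()); // prints 101
    }
}
```

With a floor like this, a caller can still pass a larger cost to order the filter after other post filters, but cannot accidentally end up below the floor by omitting the parameter.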
> Thoughts anyone?
>
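For the first option described in the issue (always sending a cost from the parallel stream), the change amounts to adding a cost local param to the hash filter query. The sketch below is only an illustration of the resulting parameter string: the hashFilter helper and its signature are hypothetical and are not how ParallelStream actually assembles its requests. The workers, worker, and cost values are the ones used in the issue text.

```java
public class HashFilterSketch {

    // Hypothetical helper (not part of Solr): builds the fq value for one
    // worker, including the cost local param discussed in the issue.
    static String hashFilter(int workers, int worker, int cost) {
        return "{!hash workers=" + workers
                + " worker=" + worker
                + " cost=" + cost + "}";
    }

    public static void main(String[] args) {
        // Same worker settings as the logged requests, plus cost=101.
        System.out.println(hashFilter(6, 3, 101));
        // prints {!hash workers=6 worker=3 cost=101}
    }
}
```

The partitionKeys parameter is still sent separately, as in the logged requests; only the fq value changes.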
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)