Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2017/03/01 18:17:46 UTC

[jira] [Commented] (SOLR-10059) In SolrCloud, every fq added via <lst name="appends"> is computed twice.

    [ https://issues.apache.org/jira/browse/SOLR-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890736#comment-15890736 ] 

Hoss Man commented on SOLR-10059:
---------------------------------

Some historical context here: when "distributed search" was first added, before there was any native "cloud support", the way to trigger a distributed search was to specify a list of shard URLs (as a request param) for the coordinator node to query & aggregate the responses from.  A common configuration pattern was for people to put the shards (URLs) in their handler defaults in solrconfig.xml -- but also have a "shards.qt" param that pointed at a different handler name (some other handler registration w/o the shards list) ... alternatively, some people deployed one solrconfig.xml file to the nodes that had data on them (and included things like defaults/appends fqs), and had a completely different solrconfig.xml for their coordinator nodes that only knew about the shards param and the list of nodes to aggregate from.

You're definitely correct -- as things evolved into SolrCloud, things like appends fqs ended up being computed multiple times, because they come from both the coordinator node's init params and the individual shard's (identical) init params.
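
To make the duplication concrete, here is a minimal Python sketch (not Solr code). It assumes params can be modeled as a dict of value lists, and that the coordinator forwards its already-wrapped params to the shard; apply_appends and APPENDS are made-up names for illustration only.

```python
def apply_appends(request_params, appends):
    """Return a copy of request_params with every 'appends' value added on top."""
    merged = {name: list(values) for name, values in request_params.items()}
    for name, values in appends.items():
        merged.setdefault(name, []).extend(values)
    return merged


# Hypothetical handler config, identical on the coordinator and on each shard.
APPENDS = {"fq": ["{!collapse field=routingKey hint=top_fc}"]}

user_request = {"q": ["*:*"]}

# First wrap: the coordinator applies its own init params.
coordinator_params = apply_appends(user_request, APPENDS)

# Second wrap: the shard receives the coordinator's params and applies
# the same init params again, so the fq is now present twice.
shard_params = apply_appends(coordinator_params, APPENDS)
```

In this model shard_params["fq"] ends up holding the same collapse filter twice, which is the situation the issue describes.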

I think the general approach #2 you suggested makes the most sense ... the bit of code (in RequestHandlerBase, I believe?) where the defaults/invariants/appends are wrapped around/under the request params should be skipped in (some) SolrCloud shard requests -- but I think checking IS_SHARD is really only one piece of the puzzle. For completeness we should probably also check that the SolrCore says we are in SolrCloud mode (to ensure the user isn't rolling their own distributed search via pre-SolrCloud shard requests like I described above).
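
A rough sketch of that two-part check, in Python rather than the real Java code. Everything here is illustrative: wrap_params, effective_params and the cloud_mode flag are hypothetical stand-ins, and the sketch assumes the shard flag arrives as an "isShard" request param with the value "true".

```python
def wrap_params(request_params, defaults, appends, invariants):
    """Layer handler init params around the request params."""
    merged = {name: list(values) for name, values in defaults.items()}
    # Request values take precedence over defaults.
    merged.update({name: list(values) for name, values in request_params.items()})
    # Appends are added to whatever is already there.
    for name, values in appends.items():
        merged.setdefault(name, []).extend(values)
    # Invariants always win.
    merged.update({name: list(values) for name, values in invariants.items()})
    return merged


def effective_params(request_params, handler_config, cloud_mode):
    """Skip the wrapping only for shard requests issued in cloud mode."""
    is_shard = request_params.get("isShard") == ["true"]
    if is_shard and cloud_mode:
        # Assumption: the coordinator already applied identical init params.
        return {name: list(values) for name, values in request_params.items()}
    # Top-level requests, and hand-rolled (pre-cloud) shard requests, still wrap.
    return wrap_params(request_params, **handler_config)


HANDLER_CONFIG = {
    "defaults": {"rows": ["10"]},
    "appends": {"fq": ["{!collapse field=routingKey hint=top_fc}"]},
    "invariants": {},
}
```

With this guard, a shard request that already carries the appended fq keeps a single copy in cloud mode, while a legacy shard request with cloud_mode=False still gets the handler's appends applied.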

The only other thing to worry about, I guess, is what should happen when multi-collection requests are issued -- such as when a collection alias points to multiple collections.  Shouldn't the "appends" FQ params from collection1 be applied any time a query includes collection1, and the appends FQ params from collection2 be applied any time a query includes collection2; even if those are both a single query that originated via a request to "both_collections" (which is an alias for "collection1,collection2")?

I suppose the coordinating node could include the "source collection (alias)" of the request as a param that the individual shards could compare with themselves to decide when they need to wrap the params?
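
As a sketch of how that comparison might look (Python, purely illustrative): the param name "sourceCollection" is invented here, and the sketch assumes the coordinator sends the concrete collection whose handler config it already applied, not the alias name.

```python
SOURCE_COLLECTION_PARAM = "sourceCollection"  # hypothetical param name


def tag_shard_request(params, coordinator_collection):
    """Coordinator side: record whose init params were already applied."""
    tagged = {name: list(values) for name, values in params.items()}
    tagged[SOURCE_COLLECTION_PARAM] = [coordinator_collection]
    return tagged


def shard_should_wrap(request_params, my_collection):
    """Shard side: wrap unless the coordinator used this collection's config."""
    values = request_params.get(SOURCE_COLLECTION_PARAM) or [None]
    # A missing param means we cannot tell, so fall back to wrapping.
    return values[0] != my_collection


# A request to the alias "both_collections" that happened to be
# coordinated by a core belonging to collection1.
shard_request = tag_shard_request({"q": ["*:*"]}, "collection1")
```

Here a shard in collection1 would skip the wrapping and a shard in collection2 would apply its own appends. The sketch does not address what a collection2 shard should do with fqs that the collection1 coordinator already appended.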

(just thinking out loud -- there's probably a better solution)






> In SolrCloud, every fq added via <lst name="appends"> is computed twice.
> ------------------------------------------------------------------------
>
>                 Key: SOLR-10059
>                 URL: https://issues.apache.org/jira/browse/SOLR-10059
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.4.0
>            Reporter: Marc Morissette
>              Labels: performance
>
> While researching another issue, I noticed that parameters appended to a query via SearchHandler's <lst name="appends"> are added to the query twice in SolrCloud: once on the aggregator and again on the shard.
> The FacetComponent corrects this automatically by removing duplicates. Filter queries added in this fashion are, however, computed twice, and that hinders performance on filter queries that aren't simple bitsets, such as those produced by the CollapsingQueryParser.
> To reproduce the issue, simply test this handler on a large enough collection, then replace "appends" with "defaults". You'll notice significant performance improvements.
> {code}
> <requestHandler name="/myHandler" class="solr.SearchHandler">
>     <lst name="appends">
>         <str name="fq">{!collapse field=routingKey hint=top_fc}</str>
>     </lst>
> </requestHandler>
> {code}



