Posted to dev@lucene.apache.org by "Cao Manh Dat (JIRA)" <ji...@apache.org> on 2019/05/03 11:25:00 UTC

[jira] [Comment Edited] (SOLR-13445) Preferred replicas on nodes with same system properties as the query master

    [ https://issues.apache.org/jira/browse/SOLR-13445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832436#comment-16832436 ] 

Cao Manh Dat edited comment on SOLR-13445 at 5/3/19 11:24 AM:
--------------------------------------------------------------

I had several private conversations with [~shalinmangar] about how to deal with this issue, and he helped a lot. Thanks [~shalinmangar].

Besides implementing the features mentioned in the description, the attached patch also fixes an issue in {{SolrClientNodeStateProvider}}: it always retried querying metrics from other nodes, even when it had just done so successfully.

[~shalinmangar] can you take a look at the attached patch?


> Preferred replicas on nodes with same system properties as the query master
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-13445
>                 URL: https://issues.apache.org/jira/browse/SOLR-13445
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>         Attachments: SOLR-13445.patch
>
>
> Currently, Solr chooses a random replica for each shard to fan out the query request. However, this presents a problem when running Solr in multiple availability zones.
> If one availability zone fails, it affects all Solr nodes, because they will keep trying to connect to Solr nodes in the failed availability zone until each request times out. This can lead to a buildup of threads on each Solr node until the node runs out of memory, which results in a cascading failure.
> This issue tries to solve this problem by adding:
> * another shardPreference param named {{node.sysprop}}, so that the query is routed to nodes that have the same values for the specified system properties as the current node.
> * default shardPreferences for the whole cluster, which will be stored in {{/clusterprops.json}}.
> * a cache that fetches the other nodes' system properties whenever {{/live_nodes}} changes.
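
The routing idea behind {{node.sysprop}} can be illustrated with a minimal, self-contained sketch. This is not the code from the attached patch: the class, the {{Replica}} type, the {{preferSameSysprop}} method and the {{zone}} property name are all hypothetical, and the other nodes' system properties are assumed to have been fetched and cached already. It only shows the ordering step, where replicas on nodes whose property value matches the local node are tried first and the rest are kept as a fallback.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SyspropReplicaPreference {

    /** Hypothetical stand-in for a replica: its node name and that node's (cached) system properties. */
    public static final class Replica {
        public final String nodeName;
        public final Map<String, String> nodeSysprops;

        public Replica(String nodeName, Map<String, String> nodeSysprops) {
            this.nodeName = nodeName;
            this.nodeSysprops = nodeSysprops;
        }
    }

    /**
     * Returns a copy of the replica list in which replicas hosted on nodes with the same
     * value for the given system property as the local node come first.
     * Non-matching replicas are kept at the end so they can still serve as a fallback.
     */
    public static List<Replica> preferSameSysprop(List<Replica> replicas, String sysprop, String localValue) {
        List<Replica> sorted = new ArrayList<>(replicas);
        // List.sort is stable, so the original relative order is preserved within each group.
        sorted.sort(Comparator.comparingInt((Replica r) ->
                localValue != null && localValue.equals(r.nodeSysprops.get(sysprop)) ? 0 : 1));
        return sorted;
    }

    public static void main(String[] args) {
        // Assumed setup: each node was started with a "zone" system property (hypothetical name).
        List<Replica> replicas = List.of(
                new Replica("node1", Map.of("zone", "us-east-1a")),
                new Replica("node2", Map.of("zone", "us-east-1b")),
                new Replica("node3", Map.of("zone", "us-east-1b")));

        // The node coordinating the query is assumed to run in zone us-east-1b.
        for (Replica r : preferSameSysprop(replicas, "zone", "us-east-1b")) {
            System.out.println(r.nodeName);
        }
    }
}
```

Sorting rather than filtering is a deliberate choice in this sketch: if every replica in the local zone is down, the query can still fall back to replicas in other zones.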


