Posted to dev@lucene.apache.org by "Steve Rowe (JIRA)" <ji...@apache.org> on 2014/07/02 18:42:25 UTC
[jira] [Updated] (SOLR-6220) Replica placement strategy for solrcloud
[ https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Rowe updated SOLR-6220:
-----------------------------
Summary: Replica placement strategy for solrcloud (was: Replica placement startegy for solrcloud)
> Replica placement strategy for solrcloud
> ----------------------------------------
>
> Key: SOLR-6220
> URL: https://issues.apache.org/jira/browse/SOLR-6220
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Noble Paul
> Assignee: Noble Paul
>
> h1.Objective
> Most cloud-based systems allow users to specify rules on how the replicas/nodes of a cluster are allocated. Solr should have a flexible mechanism through which we can control the allocation of replicas, or later change it to suit the needs of the system.
> All configuration is on a per-collection basis. The rules are applied whenever a replica is created in any of the shards of a given collection during
> * collection creation
> * shard splitting
> * add replica
> * create shard
> There are two aspects to how replicas are placed: snitch and placement.
> h2.snitch
> Defines how to identify the tags of nodes. Snitches are configured through the collection create command with the snitch prefix, e.g. snitch.type=EC2Snitch.
> The system provides the following implicit tag names which cannot be used by other snitches
> * node : The solr nodename
> * host : The hostname
> * ip : The ip address of the host
> * cores : This is a dynamic variable which gives the core count at any given point
> * disk : This is a dynamic variable which gives the available disk space at any given point
> There will be a few snitches provided by the system, such as
> h3.EC2Snitch
> Provides two tags called dc, rack from the region and zone values in EC2
> h3.IPSnitch
> Use the IP to infer the “dc” and “rack” values
> h3.NodePropertySnitch
> This lets users provide system properties to each node with tag name and value.
> Example: -Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b. This means this particular node will have two tags, “tag-x” and “tag-y”.
>
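As a rough illustration of the NodePropertySnitch value format quoted above, the sketch below splits a string such as tag-x:val-a,tag-y:val-b into a tag map. The class and method names are hypothetical and not part of Solr; only the property name solrcloud.snitch.vals is taken from the issue text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an existing Solr class.
class NodePropertyTags {

    // Parses "tag-x:val-a,tag-y:val-b" into an insertion-ordered tag map.
    static Map<String, String> parse(String raw) {
        Map<String, String> tags = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return tags;
        }
        for (String pair : raw.split(",")) {
            int colon = pair.indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Expected tag:value but got: " + pair);
            }
            tags.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
        }
        return tags;
    }

    public static void main(String[] args) {
        // Falls back to the example value so the demo also runs without -D (an assumption for illustration).
        String raw = System.getProperty("solrcloud.snitch.vals", "tag-x:val-a,tag-y:val-b");
        System.out.println(parse(raw));
    }
}
```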
> h3.RestSnitch
> Lets the user configure a URL which the server can invoke to get all the tags for a given node.
> This takes extra parameters in the create command.
> example: {{snitch.type=RestSnitch&snitch.url=http://snitchserverhost:port?nodename={}}}
> The response of the REST call {{http://snitchserverhost:port/?nodename=192.168.1:8080_solr}} must be in either JSON format or properties format.
> e.g.:
> {code:JavaScript}
> {
>   "tag-x":"x-val",
>   "tag-y":"y-val"
> }
> {code}
> or
> {noformat}
> tag-x=x-val
> tag-y=y-val
> {noformat}
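For the properties-format response, a minimal sketch of how a node's tags could be read using only the JDK. The class and method names are hypothetical; the issue does not say how Solr would actually fetch or parse the response, and the JSON variant is left out because it would need a JSON library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not an existing Solr class.
class RestSnitchResponse {

    // Turns a properties-format response body into a tag map sorted by tag name.
    static Map<String, String> parseProperties(String body) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(body));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked to keep the signature simple.
            throw new UncheckedIOException(e);
        }
        Map<String, String> tags = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            tags.put(name, props.getProperty(name));
        }
        return tags;
    }

    public static void main(String[] args) {
        // Body copied from the properties example in the issue description.
        System.out.println(parseProperties("tag-x=x-val\ntag-y=y-val\n"));
    }
}
```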
> h3.ManagedSnitch
> This snitch keeps a list of nodes and their tag-value pairs in ZooKeeper. The user should be able to manage the tags and values of each node through a collection API.
> h2.Placement
> This specifies how many replicas of a given shard need to be assigned to nodes with the given key-value pairs. These rules are passed to the collection CREATE API as a parameter named "placement". The values will be saved in the state of the collection as follows:
> {code:Javascript}
> {
>   "mycollection":{
>     "snitch":{
>       "type":"EC2Snitch"
>     },
>     "placement":{
>       "key1":"value1",
>       "key2":"value2"
>     }
>   }
> }
> {code}
> A rule consists of 2 parts:
> * LHS or the qualifier. The format is \{shardname}.\{replicacount}\{quantifier}. Use the wildcard “*” to match all. The quantifiers are
> ** no value means exactly equal, e.g. 2 means exactly 2
> ** "+" means greater than or equal, e.g. 2+ means 2 or more
> ** "\-" means less than, e.g. 2- means less than 2
> * RHS or conditions. The format is \{tagname}\{operand}\{value}. The tag names and values are provided by the snitch. The supported operands are
> ** -> : equals
> ** > : greater than. Only applicable to numeric tags
> ** < : less than. Only applicable to numeric tags
> ** ! : NOT or not equals
> Each collection can have any number of rules, as long as the rules do not conflict with each other. If they conflict, an error is thrown.
> Example rules:
> * “shard1.1”:“dc->dc1,rack->168” : This would assign exactly 1 replica of shard1 to nodes having tags “dc=dc1,rack=168”.
> * “shard1.1+”:“dc->dc1,rack->168” : Same as above, but assigns at least one replica to the tag-value combination
> * “*.1”:“dc->dc1” : For all shards, keep exactly one replica in dc:dc1
> * “*.1+”:“dc->dc2” : At least one replica needs to be in dc:dc2
> * “*.2-”:“dc->dc3” : Keep a maximum of 2 replicas in dc:dc3 for all shards
> * “shard1.*”:“rack->730” : All replicas of shard1 will go to rack 730
> * “shard1.1”:“node->192.167.1.2:8983_solr” : 1 replica of shard1 must go to the node 192.167.1.2:8983_solr
> * “shard1.*”:“rack!738” : No replica of shard1 should go to rack 738
> * “shard1.*”:“host!192.168.89.91” : No replica of shard1 should go to host 192.168.89.91
> * “\*.*”:“cores<5” : All replicas should be created on nodes with fewer than 5 cores
> * “\*.*”:“disk>20gb” : All replicas must be created on nodes with more than 20gb of disk space
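To make the rule grammar concrete, here is one possible way to break a single rule into its qualifier parts and its conditions. This is only a sketch under stated assumptions: the class name, field names and error handling are hypothetical, the shard name is taken to be everything before the last ".", and the quantifier is only recorded, not evaluated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one placement rule, not an existing Solr class.
class PlacementRule {
    final String shard;        // shard name, or "*" for all shards
    final String replicaCount; // digits, or "*" for all replicas
    final String quantifier;   // "" (exactly), "+" or "-"
    final List<String[]> conditions; // each entry is {tag, operand, value}

    private PlacementRule(String shard, String replicaCount, String quantifier,
                          List<String[]> conditions) {
        this.shard = shard;
        this.replicaCount = replicaCount;
        this.quantifier = quantifier;
        this.conditions = conditions;
    }

    // lhs looks like "shard1.1+", rhs like "dc->dc1,rack->168".
    static PlacementRule parse(String lhs, String rhs) {
        int dot = lhs.lastIndexOf('.');
        if (dot <= 0 || dot == lhs.length() - 1) {
            throw new IllegalArgumentException("Expected shard.count in qualifier: " + lhs);
        }
        String count = lhs.substring(dot + 1);
        String quantifier = "";
        if (count.endsWith("+") || count.endsWith("-")) {
            quantifier = count.substring(count.length() - 1);
            count = count.substring(0, count.length() - 1);
        }
        if (!count.equals("*") && !count.matches("\\d+")) {
            throw new IllegalArgumentException("Bad replica count in qualifier: " + lhs);
        }
        List<String[]> conditions = new ArrayList<>();
        for (String cond : rhs.split(",")) {
            conditions.add(parseCondition(cond.trim()));
        }
        return new PlacementRule(lhs.substring(0, dot), count, quantifier, conditions);
    }

    // Finds the first operand; "->" is checked before ">" at each position.
    static String[] parseCondition(String cond) {
        for (int i = 1; i < cond.length(); i++) {
            String op = null;
            if (cond.startsWith("->", i)) {
                op = "->";
            } else if ("><!".indexOf(cond.charAt(i)) >= 0) {
                op = String.valueOf(cond.charAt(i));
            }
            if (op != null) {
                String value = cond.substring(i + op.length());
                if (value.isEmpty()) {
                    throw new IllegalArgumentException("Missing value in condition: " + cond);
                }
                return new String[] {cond.substring(0, i), op, value};
            }
        }
        throw new IllegalArgumentException("No operand found in condition: " + cond);
    }

    public static void main(String[] args) {
        PlacementRule rule = parse("shard1.1+", "dc->dc1,rack->168");
        System.out.println(rule.shard + " " + rule.replicaCount + " " + rule.quantifier);
        for (String[] c : rule.conditions) {
            System.out.println(String.join(" ", c));
        }
    }
}
```

Conflict detection between rules, which the description requires, is not attempted here.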
> In the collection create API, all the placement rules are provided as a parameter called placement, and multiple rules are separated with "|".
> Example:
> {noformat}
> snitch.type=EC2Snitch&placement=*.1:dc->dc1|*.2-:dc->dc3|shard1.*:rack!738
> {noformat}
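A small sketch of how the placement parameter value might be split into qualifier/condition pairs, matching the key/value layout shown for the collection state. The class name is hypothetical, the value is assumed to be already URL-decoded, and each rule is split at its first ":" because a node value such as 192.167.1.2:8983_solr itself contains a colon.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not an existing Solr class.
class PlacementParam {

    // Splits "*.1:dc->dc1|*.2-:dc->dc3" into qualifier -> conditions, keeping rule order.
    static Map<String, String> split(String placement) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String rule : placement.split(Pattern.quote("|"))) {
            int colon = rule.indexOf(':'); // first colon separates qualifier from conditions
            if (colon <= 0 || colon == rule.length() - 1) {
                throw new IllegalArgumentException("Expected qualifier:conditions but got: " + rule);
            }
            // In this sketch a repeated qualifier simply overwrites the earlier entry.
            rules.put(rule.substring(0, colon), rule.substring(colon + 1));
        }
        return rules;
    }

    public static void main(String[] args) {
        // Parameter value copied from the example in the issue description.
        System.out.println(split("*.1:dc->dc1|*.2-:dc->dc3|shard1.*:rack!738"));
    }
}
```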
--
This message was sent by Atlassian JIRA
(v6.2#6252)