Posted to dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2018/06/23 01:54:00 UTC

[jira] [Comment Edited] (SOLR-11985) Allow percentage in replica attribute in policy

    [ https://issues.apache.org/jira/browse/SOLR-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520919#comment-16520919 ] 

Noble Paul edited comment on SOLR-11985 at 6/23/18 1:53 AM:
------------------------------------------------------------

bq. for the collection with 4 replicas. In the collection with 4 replicas, you could have 2 replicas on us-east-1a and 2 replicas on us-east-1b. What we really want is 1 on each before having the 4th replica on another zone...

In reality that is what happens. It starts allotting replicas one at a time, so you end up with 1 in each zone, and the remaining one lands in a random zone.

The problem is that if the cluster is already badly distributed, it won't show any violations.

Once we are done with SOLR-12511, that ceases to be a problem. Your rules will look like:
{code}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1a"}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1b"}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1c"}
{code}

This means the effective policy for a shard with 4 replicas is:
{code}
{"replica" : "1.33", "shard" : "#EACH", "sysprop:region": "us-east-1a"}
{"replica" : "1.33", "shard" : "#EACH", "sysprop:region": "us-east-1b"}
{"replica" : "1.33", "shard" : "#EACH", "sysprop:region": "us-east-1c"}
{code}

This means that any zone with 0 replicas is a violation. 
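
To make that last point concrete, here is a minimal Python sketch (not Solr code) of how such a check could behave. It assumes that a fractional target such as 1.33 is satisfied by either neighbouring whole number, i.e. 1 or 2 replicas; that rounding rule and the function name `zone_violations` are assumptions made for illustration, not something this thread confirms.

```python
import math


def zone_violations(replicas_per_zone, target):
    """Return the zones whose replica count falls outside
    [floor(target), ceil(target)].

    Assumption (hypothetical, for illustration): a fractional target is
    satisfied by either neighbouring whole number.
    """
    low, high = math.floor(target), math.ceil(target)
    return sorted(
        zone
        for zone, count in replicas_per_zone.items()
        if not low <= count <= high
    )


# The badly distributed layout from the quoted comment: 2 + 2 + 0.
bad = {"us-east-1a": 2, "us-east-1b": 2, "us-east-1c": 0}
# One replica in each zone, with the 4th placed in an arbitrary zone.
good = {"us-east-1a": 2, "us-east-1b": 1, "us-east-1c": 1}

print(zone_violations(bad, 1.33))   # ['us-east-1c']
print(zone_violations(good, 1.33))  # []
```

Under that assumption, the 2 + 2 + 0 layout is flagged because us-east-1c has 0 replicas, while any layout with at least one replica per zone passes.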



> Allow percentage in replica attribute in policy
> -----------------------------------------------
>
>                 Key: SOLR-11985
>                 URL: https://issues.apache.org/jira/browse/SOLR-11985
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling, SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Noble Paul
>            Priority: Major
>             Fix For: master (8.0), 7.5
>
>         Attachments: SOLR-11985.patch, SOLR-11985.patch
>
>
> Today we can only specify an absolute number in the 'replica' attribute in the policy rules. It'd be useful to write a percentage value to make certain use-cases easier. For example:
> {code:java}
> // Keep a third of the replicas of each shard in the east region
> {"replica" : "<34%", "shard" : "#EACH", "sysprop:region": "east"}
> // Keep two thirds of the replicas of each shard in the west region
> {"replica" : "<67%", "shard" : "#EACH", "sysprop:region": "west"}
> {code}
> Today the above must be represented by different rules for each collection if they have different replication factors. Also if the replication factor changes later, the absolute value has to be changed in tandem. So expressing a percentage removes both of these restrictions.
> This feature means that the value of the attribute {{"replica"}} is only available just in time. We call such values {{"computed values"}}. The computed value for this attribute depends on other attributes as well.
>  Take the following 2 rules:
> {code:java}
> //example 1
> {"replica" : "<34%", "shard" : "#EACH", "sysprop:region": "east"}
> //example 2
> {"replica" : "<34%",  "sysprop:region": "east"}
> {code}
> Assume we have collection {{"A"}} with 2 shards and {{replicationFactor=3}}.
> *example 1* would mean that the value of replica is computed as
> {{3 * 34 / 100 = 1.02}}
> which means *_for each shard_* keep fewer than 1.02 replicas in the east region.
>  
> *example 2* would mean that the value of replica is computed as
> {{3 * 2 * 34 / 100 = 2.04}}
>  
> which means _*for each collection*_ keep fewer than 2.04 replicas in the east region.
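
The arithmetic in the two examples of the issue description can be reproduced with a short Python sketch. The helper name `computed_replicas` and its parameters are hypothetical; it only mirrors the formulas quoted above and is not how Solr implements computed values.

```python
def computed_replicas(percent, replication_factor, shards=None):
    """Resolve a percentage 'replica' value to an absolute number.

    shards=None models a rule with "shard": "#EACH", where the percentage
    applies to the replicas of a single shard. Passing a shard count models
    a rule without the "shard" attribute, where the percentage applies to
    every replica in the collection.
    """
    total = replication_factor if shards is None else replication_factor * shards
    return total * percent / 100


# Collection "A": 2 shards, replicationFactor=3.
print(computed_replicas(34, 3))            # example 1, per shard: 1.02
print(computed_replicas(34, 3, shards=2))  # example 2, per collection: 2.04
```

Because the absolute value is derived from the replication factor each time, the same rule keeps working when the replication factor changes.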


