Posted to dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2018/10/12 06:21:00 UTC

[jira] [Comment Edited] (SOLR-11522) Suggestions/recommendations to rebalance replicas

    [ https://issues.apache.org/jira/browse/SOLR-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647457#comment-16647457 ] 

Noble Paul edited comment on SOLR-11522 at 10/12/18 6:20 AM:
-------------------------------------------------------------

Ideally, it should just be {{get(..)}}, but since this interface is implemented by a large number of other classes, a name conflict is likely. So I went with {{_get(..)}}.

The broad philosophy is that Solr has embraced JSON for everything (even some of the JUnit tests are driven by JSON).
 We need an in-memory representation of JSON which
 * is memory efficient, with as little overhead as possible
 * supports streaming, so that as little memory as possible is used
 * can be serialized with ease to any of the supported output formats (Javabin etc. too)
 * has no concrete classes, hence the {{MapWriter}} and {{IteratorWriter}} interfaces
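
To illustrate the "no concrete classes" point, here is a minimal sketch of the general idea: an object hands its key/value pairs to a callback instead of building an intermediate {{Map}}. The names {{KeyValueWriter}}, {{NodeInfo}} and {{toJson}} are made up for this example and are not Solr's actual {{MapWriter}} API.

```java
import java.util.function.BiConsumer;

final class StreamingWriterSketch {
    /** Hypothetical stand-in for a MapWriter-style interface; NOT Solr's real API. */
    interface KeyValueWriter {
        void writeEntries(BiConsumer<String, Object> sink);
    }

    /** A domain object that exposes itself as key/value pairs without building a Map. */
    static final class NodeInfo implements KeyValueWriter {
        private final String name;
        private final int cores;

        NodeInfo(String name, int cores) {
            this.name = name;
            this.cores = cores;
        }

        @Override
        public void writeEntries(BiConsumer<String, Object> sink) {
            sink.accept("node", name);
            sink.accept("cores", cores);
        }
    }

    /**
     * One possible sink: a flat JSON object. A different sink could emit a
     * binary format from the same writeEntries call. String escaping is
     * deliberately omitted to keep the sketch short.
     */
    static String toJson(KeyValueWriter writer) {
        StringBuilder sb = new StringBuilder("{");
        writer.writeEntries((key, value) -> {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append('"').append(key).append("\":");
            if (value instanceof Number || value instanceof Boolean) {
                sb.append(value);
            } else {
                sb.append('"').append(value).append('"');
            }
        });
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // prints {"node":"node1","cores":4}
        System.out.println(toJson(new NodeInfo("node1", 4)));
    }
}
```

The serializer only sees the interface, so any class can take part without being converted to a concrete collection first.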

If a class implements {{MapWriter}}, it is most likely a deeply nested object. The most common operation that you can perform is to query that object with a path like {{a/b/c[4]/d}}. Looking up a value in a {{MapWriter}} is usually an {{O(n)}} operation, and no new objects are created in the process, so it is pretty fast. The best part is the readability it offers to our tests.

Using a string path is OK for JUnit tests, but it leads to the creation of unnecessary objects. In other places, we can't afford to create new {{String}} objects. That's why I created an equivalent method, {{get(List<String>)}}, which doesn't create new objects.
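
A rough sketch of the two lookup styles, using plain nested {{Map}}/{{List}} values rather than a real {{MapWriter}}. The class and method names are hypothetical, and the sketch does not reproduce the allocation behaviour described above (it still creates substrings when it parses an index such as {{c[4]}}); it only shows how a string path and a pre-split path relate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class PathLookupSketch {
    /**
     * Walks nested Maps and Lists using pre-split segments such as
     * ["a", "b", "c[4]", "d"]. Returns null when any step is missing.
     * Hypothetical helper for illustration; not Solr's implementation.
     */
    static Object getByPath(Object root, List<String> path) {
        Object current = root;
        for (String segment : path) {
            if (!(current instanceof Map)) {
                return null;
            }
            String key = segment;
            int index = -1;
            int open = segment.indexOf('[');
            if (open >= 0 && segment.endsWith("]")) {
                // "c[4]" -> key "c", index 4
                index = Integer.parseInt(segment.substring(open + 1, segment.length() - 1));
                key = segment.substring(0, open);
            }
            current = ((Map<?, ?>) current).get(key);
            if (index >= 0) {
                if (!(current instanceof List)) {
                    return null;
                }
                List<?> list = (List<?>) current;
                current = index < list.size() ? list.get(index) : null;
            }
        }
        return current;
    }

    /** Convenience form for tests: splits the string on every call, so it allocates. */
    static Object getByPath(Object root, String path) {
        return getByPath(root, Arrays.asList(path.split("/")));
    }

    public static void main(String[] args) {
        Object root = Map.of("a", Map.of("b", Map.of("c",
                List.of(Map.of("d", "x0"), Map.of("d", "x1")))));

        // Readable form, fine for tests. Prints x1
        System.out.println(getByPath(root, "a/b/c[1]/d"));

        // A caller on a hot path can build the segment list once and reuse it. Prints x0
        List<String> reusable = List.of("a", "b", "c[0]", "d");
        System.out.println(getByPath(root, reusable));
    }
}
```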

The alternative was to use a {{Utils#getObjectByPath()}} method. It was definitely ugly.


was (Author: noble.paul):
Ideally, it should just be {{get(..)}}. 
but as this class is implemented by a million other classes there is likely to be a conflict. so I went with {{_get(..)}}. 

The broad philosophy is that Solr has embraced  JSON for everything (even some of the JUnit tests are driven by JSON). 
We need to have a cheap in-memory representation of JSON which is 
* memory efficient and as little overhead as possible
* streaming . Use as little memory as possible
* can be serialized to any of the supported outputs with ease .(supports Javabin too)
* No concrete classes . hence {{MapWriter}}, {{IteratorWriter}} interfaces

If a class implements {{MapWriter}}, most likely it is a deeply nested Object. . The most common operation that you can perform is to query that object with a proper path like {{a/b/c[4]/d}} . Looking up a MapWriter is usually an operation with a complexity of {{O(n)}} and no new Objects are created in the process. So, it is pretty fast. The best part is the readability it offers to our tests

Using a string path is OK for JUnit tests but it leads to creation of unnecessary objects. If we use it in other places, we can't afford to create new {{String}} objects. That's why I created an equivalent method {{get(List<String>)}} which doesn't create new Objects.

The alternative was to use a {{Utils#getObjectByPath()}} method . It was definitely ugly.



> Suggestions/recommendations to rebalance replicas
> -------------------------------------------------
>
>                 Key: SOLR-11522
>                 URL: https://issues.apache.org/jira/browse/SOLR-11522
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Noble Paul
>            Priority: Major
>
> It is possible that a cluster is unbalanced even if it is not breaking any of the policy rules. Some nodes may have very little load while some others may be heavily loaded. So, it is possible to move replicas around so that the load is more evenly distributed. This is going to be driven by preferences. The way we arrive at these suggestions is going to be as follows
>  # Sort the nodes according to the given preferences
>  # Choose a replica from the most loaded node ({{source-node}})
>  # Try adding it to the least loaded node ({{target-node}})
>  # See if it breaks any policy rules. If yes, try another {{target-node}} (go to #3)
>  # If no policy rules are being broken, present this as a {{suggestion}}. The suggestion contains the following information:
>  #* The {{source-node}} and {{target-node}} names
>  #* The actual v2 command that can be run to effect the operation
>  # Go to step #1
>  # Repeat until no replica can be moved without making the {{target-node}} more loaded than the {{source-node}}
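
The steps in the description above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the autoscaling framework's code: load is modelled as a plain replica count per node (standing in for the preferences), the policy check is a caller-supplied predicate, and the v2 command part of a suggestion is left out. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

final class RebalanceSketch {
    /** A proposed move of one replica from source to target. */
    static final class Suggestion {
        final String source;
        final String target;

        Suggestion(String source, String target) {
            this.source = source;
            this.target = target;
        }

        @Override
        public String toString() {
            return source + " -> " + target;
        }
    }

    /**
     * @param replicaCounts  node name -> number of replicas (assumed load measure)
     * @param violatesPolicy returns true if moving a replica from the first node
     *                       to the second would break a policy rule (assumption)
     */
    static List<Suggestion> suggest(Map<String, Integer> replicaCounts,
                                    BiPredicate<String, String> violatesPolicy) {
        Map<String, Integer> loads = new HashMap<>(replicaCounts); // working copy
        List<Suggestion> suggestions = new ArrayList<>();
        while (loads.size() >= 2) {
            // Step 1: sort nodes, most loaded first (ties broken by name).
            List<String> nodes = new ArrayList<>(loads.keySet());
            nodes.sort((a, b) -> {
                int byLoad = Integer.compare(loads.get(b), loads.get(a));
                return byLoad != 0 ? byLoad : a.compareTo(b);
            });
            // Step 2: take a replica from the most loaded node.
            String source = nodes.get(0);
            Suggestion found = null;
            // Steps 3-4: try targets starting from the least loaded node.
            for (int i = nodes.size() - 1; i > 0 && found == null; i--) {
                String target = nodes.get(i);
                // Stop condition: the move must not leave the target more
                // loaded than the source. Busier targets would fail as well.
                if (loads.get(target) + 1 > loads.get(source) - 1) {
                    break;
                }
                if (!violatesPolicy.test(source, target)) {
                    found = new Suggestion(source, target); // step 5
                }
            }
            if (found == null) {
                break; // no acceptable move is left
            }
            suggestions.add(found);
            // Apply the move to the working copy, then go back to step 1.
            loads.merge(found.source, -1, Integer::sum);
            loads.merge(found.target, 1, Integer::sum);
        }
        return suggestions;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("n1", 5, "n2", 1, "n3", 3);
        // No policy restrictions: prints [n1 -> n2, n1 -> n2]
        System.out.println(suggest(counts, (source, target) -> false));
    }
}
```

Each accepted move requires the source to hold at least two more replicas than the target, so the loop narrows the spread on every iteration and terminates.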



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org