You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@slider.apache.org by "Jaeboo Jeong (JIRA)" <ji...@apache.org> on 2017/03/24 09:08:41 UTC

[jira] [Created] (SLIDER-1222) During decreasing the number of components, we need to release the components of specific nodes.

Jaeboo Jeong created SLIDER-1222:
------------------------------------

             Summary: During decreasing the number of components, we need to release the components of specific nodes.
                 Key: SLIDER-1222
                 URL: https://issues.apache.org/jira/browse/SLIDER-1222
             Project: Slider
          Issue Type: Improvement
          Components: agent, core
    Affects Versions: Slider 0.81
            Reporter: Jaeboo Jeong


If we want to create a shard cluster, we need index numbers to split shards.
Currently we can use app_container_tag for shard index number.
And I also can add replication nodes to increases data availability.

For example, If we want to 5 shards and 2 replication cluster, shard index numbers are like this.
total desired components = 10
shard_index = app_container_tag % 5
replication = 0 if app_container_tag < 5 else 1

However using app_container_tag is a problem during decreasing the number of components.
Because we don’t know that allocated components are for shard or not.
So shard component can be released even if we want to decrease replication component.

For a shard cluster, I added the following functionality.
- write allocated component’s shard and replication informations in zookeeper
- during releasing component, replication component is released first
- add command to change shard and replication configuration

And we need to add the following configurations and script functions. 
- Application Configuration
{code}
"global": {
    "create.default.zookeeper.node": "true",

    "site.global.zookeeper.quorum": "${ZK_HOST}",
    "site.global.zookeeper.port": "2181",
    "site.global.zookeeper.znode.parent": "${DEFAULT_ZK_PATH}",

    "site.global.COMPONENT_NAME.shard": "10",
    "site.global.COMPONENT_NAME.replication": "2"

}
{code}
- Resource Specification
{code}
  "components": {

    "COMPONENT_NAME": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "10",
      "yarn.vcores": "2",
      "yarn.memory": "8192",
      "yarn.component.placement.policy": "32",
      "yarn.placement.escalate.seconds": "60"
    }
{code}
  - Placement Policy
{code}
    DEFAULT_SHARD_REPLICATION = 32
    ANTI_AFFINITY_SHARD_REPLICATION = 64
{code}
- add allocation function in start() function of App Package
{code}
import container_tag
...

    def start(self, env):
        import params
        env.set_params(params)

        shard, replication = container_tag.allocate_container_tag(params.zk_server, params.zk_parent, 'COMPONENT_NAME', params.container_id)

...
{code}
  - make argument data for container_tag.allocate_container_tag() in params.py
{code}
config = Script.get_config()
global_conf = config['configurations']['global']

zk_quorum = global_conf['zookeeper.quorum']
zk_port = global_conf['zookeeper.port']
zk_server = ','.join([h + ':' + str(nopq_zk_port) for h in nopq_zk_quorum.split(',')])
zk_parent = global_conf['zookeeper.znode.parent']

container_id = global_conf['app_container_id']
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)