Posted to dev@helix.apache.org by Lei Xia <lx...@apache.org> on 2017/06/08 05:42:33 UTC

Re: Dynamic partition management

Hi, Subramanian

  Helix does allow you to change the number of partitions in a resource
dynamically. If you are using your own customized rebalancer, i.e., the
rebalance mode set in the resource's IdealState is CUSTOMIZED, you can
manipulate the IdealState's mapFields when adding or removing partitions.

  Here is an example. Say your resource initially has 3 partitions; your
IdealState (IS) should look like:

{
  "id":"myDB"
  ,"simpleFields":{
    "NUM_PARTITIONS":"3"
    ,"REBALANCE_MODE":"CUSTOMIZED"
    ,"REPLICAS":"2"
    ,"STATE_MODEL_DEF_REF":"OnlineOffline"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "myDB_0":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_1":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_2":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
  }
}

   Now you would like to add 2 new partitions. You can update your IS with
these new partitions, and Helix will bring all replicas to ONLINE for these
two new partitions.

{
  "id":"myDB"
  ,"simpleFields":{
    "NUM_PARTITIONS":"5"
    ,"REBALANCE_MODE":"CUSTOMIZED"
    ,"REPLICAS":"2"
    ,"STATE_MODEL_DEF_REF":"OnlineOffline"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "myDB_0":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_1":{
      "node-3":"ONLINE"
      ,"node-4":"ONLINE"
    }
    ,"myDB_2":{
      "node-5":"ONLINE"
      ,"node-6":"ONLINE"
    }
    ,"myDB_3":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_4":{
      "node-3":"ONLINE"
      ,"node-4":"ONLINE"
    }
  }
}
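
The update above can be sketched in a few lines of Python. This is only an illustration that treats the IdealState as a plain dictionary with the same shape as the JSON; it does not call the Helix API, and the helper name add_partitions is hypothetical. It also assumes NUM_PARTITIONS should track the number of mapFields entries, as the examples in this mail do.

```python
import copy


def add_partitions(ideal_state, new_assignments, state="ONLINE"):
    """Return a copy of ideal_state with extra partitions in mapFields.

    new_assignments maps a partition name to the list of instances that
    should host its replicas. Hypothetical helper, not part of Helix.
    """
    updated = copy.deepcopy(ideal_state)
    map_fields = updated["mapFields"]
    for partition, instances in new_assignments.items():
        if partition in map_fields:
            raise ValueError(f"partition {partition} already exists")
        map_fields[partition] = {instance: state for instance in instances}
    # Assumption: NUM_PARTITIONS mirrors the number of mapFields entries.
    updated["simpleFields"]["NUM_PARTITIONS"] = str(len(map_fields))
    return updated


ideal_state = {
    "id": "myDB",
    "simpleFields": {
        "NUM_PARTITIONS": "3",
        "REBALANCE_MODE": "CUSTOMIZED",
        "REPLICAS": "2",
        "STATE_MODEL_DEF_REF": "OnlineOffline",
    },
    "listFields": {},
    "mapFields": {
        "myDB_0": {"node-1": "ONLINE", "node-2": "ONLINE"},
        "myDB_1": {"node-3": "ONLINE", "node-4": "ONLINE"},
        "myDB_2": {"node-5": "ONLINE", "node-6": "ONLINE"},
    },
}

grown = add_partitions(
    ideal_state,
    {"myDB_3": ["node-1", "node-2"], "myDB_4": ["node-3", "node-4"]},
)
```

The helper works on a copy, so the original record stays untouched until you decide to write the new one back to the cluster.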


    Removing an existing partition is a little trickier: you cannot simply
remove it from the IS. You first need to set all replicas of that partition
to the DROPPED state, for example as below.

{
  "id":"myDB"
  ,"simpleFields":{
    "NUM_PARTITIONS":"5"
    ,"REBALANCE_MODE":"CUSTOMIZED"
    ,"REPLICAS":"2"
    ,"STATE_MODEL_DEF_REF":"OnlineOffline"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "myDB_0":{
      "node-1":"DROPPED"
      ,"node-2":"DROPPED"
    }
    ,"myDB_1":{
      "node-3":"DROPPED"
      ,"node-4":"DROPPED"
    }
    ,"myDB_2":{
      "node-5":"ONLINE"
      ,"node-6":"ONLINE"
    }
    ,"myDB_3":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_4":{
      "node-3":"ONLINE"
      ,"node-4":"ONLINE"
    }
  }
}
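
As a rough sketch of this first removal step, the following Python marks every replica of the chosen partitions as DROPPED. It again treats the IdealState as a plain dictionary shaped like the JSON above rather than using the Helix API, and mark_partitions_dropped is a hypothetical name.

```python
import copy


def mark_partitions_dropped(ideal_state, partitions):
    """Return a copy of ideal_state with all replicas of the given
    partitions set to DROPPED. Hypothetical helper, not part of Helix.
    """
    updated = copy.deepcopy(ideal_state)
    for partition in partitions:
        # Raises KeyError if the partition is not in the IdealState.
        replicas = updated["mapFields"][partition]
        for instance in replicas:
            replicas[instance] = "DROPPED"
    return updated


current = {
    "id": "myDB",
    "simpleFields": {"NUM_PARTITIONS": "3"},
    "listFields": {},
    "mapFields": {
        "myDB_0": {"node-1": "ONLINE", "node-2": "ONLINE"},
        "myDB_1": {"node-3": "ONLINE", "node-4": "ONLINE"},
        "myDB_2": {"node-5": "ONLINE", "node-6": "ONLINE"},
    },
}

dropping = mark_partitions_dropped(current, ["myDB_0", "myDB_1"])
```

Note that the partitions stay in mapFields and NUM_PARTITIONS is left alone at this stage; only the target states change.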

 Once you have confirmed that the dropped partitions no longer appear in the
resource's ExternalView, it is safe to remove them from the IdealState.

{
  "id":"myDB"
  ,"simpleFields":{
    "NUM_PARTITIONS":"3"
    ,"REBALANCE_MODE":"CUSTOMIZED"
    ,"REPLICAS":"2"
    ,"STATE_MODEL_DEF_REF":"OnlineOffline"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "myDB_2":{
      "node-5":"ONLINE"
      ,"node-6":"ONLINE"
    }
    ,"myDB_3":{
      "node-1":"ONLINE"
      ,"node-2":"ONLINE"
    }
    ,"myDB_4":{
      "node-3":"ONLINE"
      ,"node-4":"ONLINE"
    }
  }
}
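
The final cleanup can be sketched as below. This is a minimal illustration under two assumptions: the IdealState and the ExternalView are both available as plain dictionaries, and the ExternalView carries a mapFields entry for each partition that is still being served. The function name remove_dropped_partitions is hypothetical and nothing here calls the Helix API.

```python
import copy


def remove_dropped_partitions(ideal_state, external_view):
    """Return a copy of ideal_state without partitions that are fully
    DROPPED and no longer listed in external_view.
    Hypothetical helper, not part of Helix.
    """
    updated = copy.deepcopy(ideal_state)
    still_served = external_view.get("mapFields", {})
    for partition, replicas in list(updated["mapFields"].items()):
        all_dropped = bool(replicas) and all(
            state == "DROPPED" for state in replicas.values()
        )
        # Only remove a partition once the ExternalView has let go of it.
        if all_dropped and partition not in still_served:
            del updated["mapFields"][partition]
    # Assumption: NUM_PARTITIONS mirrors the number of mapFields entries.
    updated["simpleFields"]["NUM_PARTITIONS"] = str(len(updated["mapFields"]))
    return updated


pending = {
    "id": "myDB",
    "simpleFields": {"NUM_PARTITIONS": "3"},
    "listFields": {},
    "mapFields": {
        "myDB_0": {"node-1": "DROPPED", "node-2": "DROPPED"},
        "myDB_1": {"node-3": "DROPPED", "node-4": "DROPPED"},
        "myDB_2": {"node-5": "ONLINE", "node-6": "ONLINE"},
    },
}

# In this made-up ExternalView, myDB_0 is gone but myDB_1 is still listed.
view = {
    "id": "myDB",
    "mapFields": {
        "myDB_1": {"node-3": "ONLINE"},
        "myDB_2": {"node-5": "ONLINE", "node-6": "ONLINE"},
    },
}

cleaned = remove_dropped_partitions(pending, view)
```

In this example only myDB_0 is removed; myDB_1 stays in the IdealState until a later check finds it gone from the ExternalView as well.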


  Hope this helps solve your problem.


Thanks
Lei


On Wed, Jun 7, 2017 at 2:58 PM, Subramanian Raghunathan <
subramanian.raghunathan@integral.com> wrote:

> Hi,
>
>
>
>                 Trying to follow up if there are any thoughts/suggestions
> over the approaches.
>
>
>
>                 Trying to manage a cluster with a resource that has
> Dynamic number of partitions please provide me your valuable thoughts on
> same.
>
>  Version(s) :
>
>                                 Helix: 0.7.1
>
>                                 Zookeeper:3.3.4
>
>  -Zookeeper quorum (3 nodes)
>
> - State Model: OnlineOffline
>
> - Controller (leader elected from one of the cluster nodes)
>
> - Single resources with partitions.
>
> - Custom rebalancer
>
>  Few of the approaches that I could think of ...
>
>  Approach 1: *Buffer Partitions*
>
>                 Let’s say the current number of partitions are N.
>
>                                 1 ) Define N+X (Where x could be the
> delta where it can expand maximum , expecting to be of few in number
> (1-10))
>
>                                 2 ) Use property store to identify which
> are all valid partition(s)
>
>                                 3 ) Don’t act upon invalid ones by
> looking up property store (or) Disable the unallocated partition(s)
>
>
>
>                 When a new partition to be added -
>
> 1.       Update property store of new partitions (Delta merge)
>
> 2.       Enable the partition(s)
>
>  --- The balancing algorithm should be able to balance valid & extra
> partition(s) differently. Lots of customizations.
>
>  Approach 2: *New Resource Addition*
>
>                 Let’s say the current number of partitions are N.
>
>                 Create Resource 1 with N partitions initially.
>
>                 When a new partition(s) to be added -
>
> 1.       Drop the resource
>
> 2.       Create new resource with N+X (newly added partitions )
>
>  --- Caveat that I could see is Dropping the resource would stop the
> entire workflow (It’s a bigger price to pay for us since we are real time).
>
>
>
>
>
> Thanks & Regards,
>
> Subramanian.
>
>
>
> *From:* kishore g [mailto:g.kishore@gmail.com]
> *Sent:* Friday, June 02, 2017 9:52 PM
> *To:* Subramanian Raghunathan <su...@integral.com>;
> user@helix.apache.org
> *Subject:* Re: Dynamic partition management
>
>
>
> +helix
>
>
>
> On Jun 2, 2017 6:50 PM, "Subramanian Raghunathan" <
> subramanian.raghunathan@integral.com> wrote:
>
> Hi kishore,
>
>
>
>                 Trying to manage a cluster with a resource that has
> Dynamic number of partitions please provide me your valuable thoughts on
> same.
>
>
>
> Version(s) :
>
>                                 Helix: 0.7.1
>
>                                 Zookeeper:3.3.4
>
>
>
> -Zookeeper quorum (3 nodes)
>
> - State Model: OnlineOffline
>
> - Controller (leader elected from one of the cluster nodes)
>
> - Single resources with partitions.
>
> - Custom rebalancer
>
>
>
> Few of the approaches that I could think of ...
>
>
>
> Approach 1: *Buffer Partitions*
>
>                 Let’s say the current number of partitions are N.
>
>                                 1 ) Define N+X (Where x could be the
> delta where it can expand maximum , expecting to be of few in number
> (1-10))
>
>                                 2 ) Use property store to identify which
> are all valid partition(s)
>
>                                 3 ) Don’t act upon invalid ones by
> looking up property store (or) Disable the unallocated partition(s)
>
>
>
>                 When a new partition to be added -
>
> 1.       Update property store of new partitions (Delta merge)
>
> 2.       Enable the partition(s)
>
>
>
> --- The balancing algorithm should be able to balance valid & extra
> partition(s) differently. Lots of customizations.
>
>
>
> Approach 2: *New Resource Addition*
>
>                 Let’s say the current number of partitions are N.
>
>
>
>                 Create Resource 1 with N partitions initially.
>
>                 When a new partition(s) to be added -
>
> 1.       Drop the resource
>
> 2.       Create new resource with N+X (newly added partitions )
>
>
>
> --- Caveat that I could see is Dropping the resource would stop the entire
> workflow (It’s a bigger price to pay for us since we are real time).
>
>
>
>
>
> Thanks & Regards,
>
> Subramanian.

RE: Dynamic partition management

Posted by Subramanian Raghunathan <su...@integral.com>.
Perfect, thanks Lei Xia.
